ok, alexandre: i tried moving named_pipe_read into wineserver - it's not possible to do completely, as you cannot do blocking reads in wineserver but you still need blocking-read characteristics in the client (kernel32, ntdll). if you start messing with the fd, setting or clearing O_NONBLOCK (with fcntl) in one, it interferes with the other, because the flag belongs to the shared open file description rather than to each process's own descriptor. there's a race-condition risk.
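A minimal sketch of that interference, in Python rather than wine's C, assuming a socketpair as a stand-in for the pipe's unix fd. The names `server_fd` and `client_fd` are hypothetical labels for the two copies of the descriptor; this is not wine code.

```python
import fcntl
import os
import socket


def nonblock_leaks_to_duplicate():
    """Set O_NONBLOCK through one descriptor and observe it through a dup."""
    left, right = socket.socketpair()        # stand-in for the pipe's unix fd
    server_fd = left.fileno()                # hypothetical "wineserver" copy
    client_fd = os.dup(server_fd)            # hypothetical "ntdll" copy
    try:
        before = bool(fcntl.fcntl(client_fd, fcntl.F_GETFL) & os.O_NONBLOCK)
        flags = fcntl.fcntl(server_fd, fcntl.F_GETFL)
        fcntl.fcntl(server_fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)
        # The flag was never set on client_fd, yet it is visible there too.
        after = bool(fcntl.fcntl(client_fd, fcntl.F_GETFL) & os.O_NONBLOCK)
        return before, after
    finally:
        os.close(client_fd)
        left.close()
        right.close()
```

The same sharing applies when a descriptor is passed to another process, which is why neither side can flip the flag privately.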
so, i have some ideas and questions.
1) is it possible for wineserver to tell a thread to wait indefinitely? is there a communication channel where wineserver can tell "anyone who's interested" to "go away, go to sleep, i'll tell you when to wake up"? what i want is to replace the fd blocking-read with something that still blocks, but doesn't interfere with the file descriptor.
2) would a _second_ filedescriptor fit the required criteria, as described in 1), above? in other words, if the message-lengths (4 bytes) were sent on a secondary control fd - and nothing _but_ message-lengths were sent on that fd - my intuition tells me that that could work, even without having any infrastructure in wineserver. in fact, i get the feeling that it would be _better_ - and a lot simpler - than having any infrastructure in wineserver.
the reasoning goes roughly something like this:
* 5 threads are waiting - blocking - on a message-mode read named pipe. all of them are blocking to read from the "control" channel, in userspace (kernel32/ntdll).
* client 1 does a message-mode write, and in userspace (kernel32/ntdll) 4 bytes indicating the length of the message are sent, followed by N bytes on the unix_fd (exactly as is done at the moment).
* one and ONLY one of the 5 threads, thanks to STANDARD linux kernel scheduling, happens to be awake and gets to read the 4-byte length. all the other 4 stay asleep.
damnit, that's not right, is it. you want the other 4 to wake up once ONE of them has read the 4-byte length, and for ALL of them to move on to blocking on the "message" pipe - the actual content.
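The "one and only one reader gets the length" behaviour in the bullets above can be checked with a small sketch. It assumes a socketpair as the hypothetical "control" channel and a little-endian 4-byte length; both are illustration choices, not taken from wine.

```python
import socket
import struct
import threading


def race_for_length_header(n_readers=5, msg_len=1234):
    """Block n_readers on one control socket and send a single 4-byte length."""
    writer, control = socket.socketpair()    # hypothetical "control" channel
    results = []
    lock = threading.Lock()

    def reader():
        data = control.recv(4)               # blocks until data or EOF
        with lock:
            results.append(data)

    threads = [threading.Thread(target=reader) for _ in range(n_readers)]
    for t in threads:
        t.start()
    writer.sendall(struct.pack("<I", msg_len))
    writer.shutdown(socket.SHUT_WR)          # EOF releases the readers that lost
    for t in threads:
        t.join()
    writer.close()
    control.close()
    winners = [struct.unpack("<I", d)[0] for d in results if len(d) == 4]
    losers = [d for d in results if d == b""]
    return winners, len(losers)
```

Only the shutdown at the end lets the losing readers return; without it they would stay asleep, which is exactly the problem noted above.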
so, i suppose what you could do is have a per-read-thread filedescriptor (!!!) which _would_ have to involve arbitration by wineserver, so that when a write is performed, ALL of the clients get notified "wake up now, start trying to read".
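For comparison, a sketch of a "wake everybody" notification that needs no per-thread descriptor: poll() reports readiness without consuming data, so every waiter on the same fd wakes up. The wake-up pipe and the function name are hypothetical, and this is only one possible mechanism, not something wine is known to provide.

```python
import os
import select
import threading


def wake_all_waiters(n_waiters=3, timeout_ms=5000):
    """Have n_waiters poll() one descriptor; a single byte makes it readable."""
    wake_r, wake_w = os.pipe()               # hypothetical wake-up channel
    woken = []
    lock = threading.Lock()

    def waiter(index):
        poller = select.poll()
        poller.register(wake_r, select.POLLIN)
        if poller.poll(timeout_ms):          # sleeps; does not consume the byte
            with lock:
                woken.append(index)

    threads = [threading.Thread(target=waiter, args=(i,))
               for i in range(n_waiters)]
    for t in threads:
        t.start()
    os.write(wake_w, b"\x01")                # "wake up now, start trying to read"
    for t in threads:
        t.join()
    os.close(wake_r)
    os.close(wake_w)
    return sorted(woken)
```

Because nobody reads the byte, the descriptor stays readable and every waiter sees it, in contrast to a blocking read() where one reader takes the data.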
is there _anything_ in wine which already exists that does what is required, as illustrated, above?
answers and information greatly appreciated so as to be able to improve this free software project's effectiveness.
l.
mwhaaahahah, i just came up with a _horrible_ idea :)
how about: ripping out the use of unix-pipes altogether, and replacing them with tdb (trivial database) in a mmap'd file? the nice thing about tdb is that it's LGPL'd. the messages could be saved and transferred via shared memory; tdb is multi-readable / multi-writeable, and it would save a bitch-awful job of implementing a shared memory sockets emulation layer (which i _have_ seen done, once).
l.
Hi Luke,
> how about: ripping out the use of unix-pipes altogether, and replacing them with tdb (trivial database) in a mmap'd file? the nice thing about tdb is that it's LGPL'd. the messages could be saved and transferred via shared memory; tdb is multi-readable / multi-writeable, and it would save a bitch-awful job of implementing a shared memory sockets emulation layer (which i _have_ seen done, once).
The usual response around here is, show us some patches, and we'll let you know ;-)
Some test cases showing us where we're deficient would be a great start, btw. --Juan
On Thu, Feb 5, 2009 at 1:34 AM, Juan Lang juan.lang@gmail.com wrote:
> Hi Luke,
hi juan, thanks for responding.
>> how about: ripping out the use of unix-pipes altogether, and replacing them with tdb (trivial database) in a mmap'd file? the nice thing
> The usual response around here is, show us some patches, and we'll let you know ;-)
yehh, ha ha :) i was kinda hoping to not have to go through five different implementations!
i think i've finally come up with an idea that i believe will work: double-socketing.
very simple: you write down one filedescriptor, and read from a different one. wineserver proxies the data from one to the other, whenever it's requested. here's the important bit: wineserver ONLY allows ONE lot of data into the "read" socket at any one time.
it'll be ok (and desirable) to allow multiple "readers" of the "read" socket. what you _don't_ want is more than one reader trying to indicate "please start sending a new message" whilst there are other reader(s) still grabbing the previous message.
so i believe that a critical section (copying the style of the code around server_get_unix_fd) each around "please start a new message" and "please send some more read-data" would be sufficient.
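A toy in-memory model of the double-socketing rule, assuming the two requests are called `start_new_message` and `read_more` (hypothetical names for "please start a new message" and "please send some more read-data"). It models only the arbitration, not real sockets or wineserver requests.

```python
import threading
from collections import deque


class OneMessageGate:
    """Toy model: writers queue whole messages, readers drain one at a time."""

    def __init__(self):
        self._lock = threading.Lock()        # plays the role of the critical section
        self._pending = deque()              # written, not yet on the "read" side
        self._current = b""                  # unread remainder of the released message

    def write(self, message):
        with self._lock:
            self._pending.append(bytes(message))

    def start_new_message(self):
        """Release the next message, but only once the previous one is drained."""
        with self._lock:
            if self._current or not self._pending:
                return False
            self._current = self._pending.popleft()
            return True

    def read_more(self, size):
        """Hand out up to size bytes of the currently released message."""
        with self._lock:
            chunk, self._current = self._current[:size], self._current[size:]
            return chunk
```

Several readers may call `read_more` on the same message, but `start_new_message` is refused while any of it is still unread, so data from two messages never mixes. Zero-length messages are not handled in this sketch.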
what do you think?
also the advantage of double-socketing is that when it comes to doing SMB named pipes there will be a clean "way in and out" for SMBreadX, SMBwriteX and SMBtransactNamedPipe data over the IPC$ share. the double-sockets will still be needed (one for reading, one for writing) - a _third_ socket will be needed, to communicate with the smb server (via the break-out mechanism).
> Some test cases showing us where we're deficient would be a great start, btw.
all there, in #17195. five so far.
* test1 (the first i wrote) just does send recv send recv. this one works (in existing wine)
* send2recv2 does send send recv recv and it's this one that shows _immediately_ that there's a problem with wine's messagemode code.
* shortread demonstrates the "standard" way to be able to read blocks of data when the message is _way_ beyond acceptable incoming buffer sizes (e.g. in smbs it was often the case that spoolss would result in SMBwriteX SMBwriteX SMBwriteX SMBtransactnamedpipe SMBreadX SMBreadX SMBreadX transferring _whopping_ great sized messages). again, wine keels over on this one as it doesn't know what to do at the message boundaries.
* threadread demonstrates several messages being queued followed by a number of threads performing simultaneous reads (deliberately using shortreads where the individual reads are fixed sizes). i only have this doing 1 thread at the moment because the data needs to be "reconstructed" properly otherwise (even on nt).
* threadwrite demonstrates several messages being written simultaneously, with a single process grabbing them all. this one _doesn't_ have to be limited in the number of threads that perform writes, which indicates in the internal design of NT that the writes are atomic.
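The threadwrite case in the list above can be imitated on a plain stream socket, as a sketch: several writer threads, one reader, and a lock that keeps each length header glued to its payload. The lock stands in for whatever makes NT's writes atomic; the framing (4-byte little-endian length) and all names are assumptions, and this is not one of the tests attached to #17195.

```python
import socket
import struct
import threading


def recv_exact(sock, size):
    """Read exactly size bytes, looping over short reads."""
    buf = bytearray()
    while len(buf) < size:
        chunk = sock.recv(size - len(buf))
        if not chunk:
            raise EOFError("peer closed mid-message")
        buf += chunk
    return bytes(buf)


def threaded_writes(n_threads=4, per_thread=10):
    """Several threads write framed messages; one reader collects them all."""
    tx, rx = socket.socketpair()
    write_lock = threading.Lock()            # keeps header and payload together

    def writer(tid):
        for seq in range(per_thread):
            payload = f"thread {tid} message {seq}".encode()
            with write_lock:
                tx.sendall(struct.pack("<I", len(payload)) + payload)

    threads = [threading.Thread(target=writer, args=(t,))
               for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # The total written here is small, so it all fits in the socket buffer.
    received = []
    for _ in range(n_threads * per_thread):
        (length,) = struct.unpack("<I", recv_exact(rx, 4))
        received.append(recv_exact(rx, length).decode())
    tx.close()
    rx.close()
    return received
```

The order in which messages from different threads arrive is not fixed, but every message comes out whole.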
> i think i've finally come up with an idea that i believe will work: double-socketing.
> (snip)
> it'll be ok (and desirable) to allow multiple "readers" of the "read" socket. what you _don't_ want is more than one reader trying to indicate "please start sending a new message" whilst there are other reader(s) still grabbing the previous message.
> so i believe that a critical section (copying the style of the code around server_get_unix_fd) each around "please start a new message" and "please send some more read-data" would be sufficient.
Out of curiosity, why is this better than a single socket with the length of each message prepended? I'm not advocating that that's a better approach, mind you. It just seems to me that, if you're putting critical sections of a sort into the server anyway, you could block clients just as easily with one socket as with two.
One difficulty I'm having trouble getting my head around is, isn't the data removed from a socket once it's read by any process? Or does it remain for each process to read independently? I guess I'm not that familiar with how reading from a socket works. I'd always assumed that the former was true, and that therefore the only correct approach would be to buffer message-mode named pipe data in the server. This is ugly (and slow), which is why Alexandre's stated preference has been to push it into the Linux kernel.
For what it's worth, Steve French has expressed interest in doing just that. He asked for a clear spec for what filename these sockets should have, but we didn't have a clear app that we needed to fix, so we never followed through. You might approach him again. He's the maintainer of the CIFS kernel module for Linux, and he already has his own named pipe implementation. --Juan
On Thu, Feb 5, 2009 at 4:08 PM, Juan Lang juan.lang@gmail.com wrote:
>> i think i've finally come up with an idea that i believe will work: double-socketing.
>> (snip)
>> it'll be ok (and desirable) to allow multiple "readers" of the "read" socket. what you _don't_ want is more than one reader trying to indicate "please start sending a new message" whilst there are other reader(s) still grabbing the previous message.
>> so i believe that a critical section (copying the style of the code around server_get_unix_fd) each around "please start a new message" and "please send some more read-data" would be sufficient.
> Out of curiosity, why is this better than a single socket with the length of each message prepended?
a length prepended is still needed.
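A minimal sketch of length-prepended framing over a stream socket, assuming a 4-byte little-endian length (the byte order is an assumption). It covers the send-send-recv-recv and short-read cases: a reader with a small buffer gets the message in pieces but never runs past the boundary.

```python
import socket
import struct


def send_message(sock, payload):
    """Write the 4-byte length followed by the payload."""
    sock.sendall(struct.pack("<I", len(payload)) + payload)


def recv_exact(sock, size):
    """Read exactly size bytes, looping over short reads."""
    buf = bytearray()
    while len(buf) < size:
        chunk = sock.recv(size - len(buf))
        if not chunk:
            raise EOFError("peer closed mid-message")
        buf += chunk
    return bytes(buf)


def recv_message(sock, bufsize):
    """Return one message as a list of chunks of at most bufsize bytes."""
    (remaining,) = struct.unpack("<I", recv_exact(sock, 4))
    chunks = []
    while remaining:
        chunk = recv_exact(sock, min(bufsize, remaining))
        chunks.append(chunk)
        remaining -= len(chunk)
    return chunks
```

This works for a single reader; it does nothing about several readers competing for the same header, which is the problem the rest of this message is about.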
what i had implemented so far (and demonstrated that it's flawed, thanks to the "threadread" test) is:
* writes send the length prepended to the data
* wineserver-function get_named_pipe_info
* wineserver-function read_named_pipe
then:
* in the client (ntdll) instead of using read() you use server_read_named_pipe() iff it's a pipe. BUT, before you do that, you do a poll() on the unix_handle (obtained using server_get_unix_fd()). this is your "blocking mode". so, although you tell the _server_ to get the data for you, you _still_ have to do "block on socket". and, because you can't use read() to do the "blocking", you have to use poll() instead.
* in the wineserver-function, read_named_pipe MUST NOT EVER block on reads, but it is still being asked to perform _a_ read, and so there is some code that sets O_NONBLOCK, followed by a recv using MSG_PEEK to double-check that there's data, and _then_ a read is used to actually obtain the data.
the problem here is that setting O_NONBLOCK in wineserver, in order to not end up permanently hanging wineserver, is *interfered with* by the requirement to have "blocking mode" in the client (ntdll).
so there are two conflicting requirements, by using the same filedescriptor.
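One possible way to avoid fighting over the flag, sketched under the assumption that the platform supports the per-call MSG_DONTWAIT flag on recv() (Linux and the BSDs do). This is a swapped-in alternative for illustration, not what was implemented here: the "server" side asks for a non-blocking read on that one call only, and the "client" side sleeps in poll().

```python
import fcntl
import os
import select
import socket


def try_read_without_blocking(sock, size):
    """Server-style read: never sleeps and never touches O_NONBLOCK."""
    try:
        return sock.recv(size, socket.MSG_DONTWAIT)
    except BlockingIOError:
        return None                          # nothing queued right now


def wait_until_readable(sock, timeout_ms):
    """Client-style wait: sleeps in poll() instead of in read()."""
    poller = select.poll()
    poller.register(sock.fileno(), select.POLLIN)
    return bool(poller.poll(timeout_ms))


def still_blocking(sock):
    """True if O_NONBLOCK is clear on the shared open file description."""
    flags = fcntl.fcntl(sock.fileno(), fcntl.F_GETFL)
    return not (flags & os.O_NONBLOCK)
```

Since the file status flags are never changed, the descriptor stays in blocking mode for anyone else who shares it.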
and no, you _can't_ do "everything in wineserver", because you _still_ need a mechanism to be able to tell the client (ntdll) to "block".
which is why i asked if there was a way for wineserver to tell a client wine thread/process to "go to sleep" [and didn't get an answer].
> One difficulty I'm having trouble getting my head around is, isn't the data removed from a socket once it's read by any process?
or a thread - yes. fortunately, read() and write() to/from sockets are at least atomic (whew).
> Or does it remain for each process to read independently?
no, thank god. the data is removed. except if you use recv() with MSG_PEEK, of course, but then the data is _guaranteed_ not to be removed. ever.
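A short sketch of the difference, using a socketpair: peeking returns the same bytes as often as you like, while a plain recv removes them. The final check uses MSG_DONTWAIT (assumed available, as on Linux) only so that the sketch does not hang on the now-empty socket.

```python
import socket


def peek_then_read():
    """Show that MSG_PEEK leaves data queued while a plain recv removes it."""
    tx, rx = socket.socketpair()
    try:
        tx.sendall(b"abcd")
        first_peek = rx.recv(4, socket.MSG_PEEK)
        second_peek = rx.recv(4, socket.MSG_PEEK)   # same bytes again
        consumed = rx.recv(4)                       # now the bytes are gone
        try:
            leftover = rx.recv(4, socket.MSG_DONTWAIT)
        except BlockingIOError:
            leftover = None                         # queue is empty
        return first_peek, second_peek, consumed, leftover
    finally:
        tx.close()
        rx.close()
```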
> I guess I'm not that familiar with how reading from a socket works. I'd always assumed that the former was true, and that therefore the only correct approach would be to buffer message-mode named pipe data in the server.
"relax, luther - it's much worse than you think" [Mission Impossible I]
:)
> This is ugly (and slow),
well... it's unfortunate, but that's the way it's going to have to be. if i was told "there's these absolutely fantastic guaranteed-message-size, guaranteed-message-order characteristics you can get from message-mode named pipes, but they're a bit slower than normal pipes" when developing an application i'd go "great! i don't care if it's a bit slower, the features make my life a _lot_ easier".
_somewhere_ there has to be a "break" between "messages".
and you can't stop people from sending data down the pipe (in NtWriteFile).
therefore, logically, you have to have a "firewall" between "send" and "receive". and, because you _also_ need "blocking on read" characteristics (across multiple processes and threads), the most sensible way to implement that is with a unix filedescriptor on which all of the clients (ntdll) can block.
_but_.. if you also allow any _other_ process to send data down the _same_ unix filedescriptor (from NtWriteFile), you've just gone and screwed with the message-boundaries.
... actually, it may turn out to be the case that you literally need one filedescriptor _per message_. not joking about, or anything, but it may end up being the case that a queue of "struct fd*" pointers is required (in wineserver).
in non-message-mode, that would simply be "queue length of 1". in message-mode, you'd go "oh, dear, we got an EOF (a zero-length read) when trying to read, that means that someone else got that message, wow big deal, let's grab another unix filedescriptor from wineserver and try again". if grabbing another filedescriptor fails, THEN you go "oh, whoops, let's return STATUS_PIPE_DISCONNECTED".
by having one filedescriptor per message, you are cast-iron guaranteeing 100% that individual messages will not interfere with each other.
and, the neat thing is, you wouldn't need any "buffering". you'd still need to indicate the length (and the easiest way to do that is _still_ to send it as the first 4 bytes).
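A rough single-process sketch of the one-descriptor-per-message idea, assuming a plain deque stands in for the queue of "struct fd" pointers in wineserver. The class and function names are hypothetical. For brevity the sketch uses EOF as the end-of-message marker instead of the 4-byte length, and each message must fit in the pipe buffer because it is written before anyone reads.

```python
import os
from collections import deque


class FdPerMessageQueue:
    """Toy stand-in for a server-side queue holding one read fd per message."""

    def __init__(self):
        self._read_fds = deque()

    def write_message(self, payload):
        read_fd, write_fd = os.pipe()
        os.write(write_fd, payload)          # must fit in the pipe buffer here
        os.close(write_fd)                   # closing marks the end of the message
        self._read_fds.append(read_fd)

    def next_fd(self):
        return self._read_fds.popleft() if self._read_fds else None


def read_whole_message(queue, bufsize=4096):
    """Drain one message; None models the STATUS_PIPE_DISCONNECTED case."""
    fd = queue.next_fd()
    if fd is None:
        return None
    chunks = []
    try:
        while True:
            chunk = os.read(fd, bufsize)
            if not chunk:
                break                        # EOF: this message is complete
            chunks.append(chunk)
    finally:
        os.close(fd)
    return b"".join(chunks)
```

A second write never touches the first message's descriptor, so a short read on one message cannot spill into the next.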
> which is why Alexandre's stated preference has been to push it into the Linux kernel.
well, here's the thing: you can't _guarantee_ that wine will be *exclusively* running on the latest-and-greatest version of the linux kernel, and, also, it would be a bit unfair to the FreeBSD folks (and anyone else who would like to port wine to other OSes)
so, this still has to be done in userspace, with an optimisation being "use a kernelspace implementation, if it exists".
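The "use the kernel if it can do it, otherwise do it in userspace" shape can be sketched as follows. SOCK_SEQPACKET on AF_UNIX is used purely as a stand-in for a kernel facility that preserves message boundaries; it is not the CIFS-based named pipe support discussed here, and all function names are hypothetical.

```python
import socket
import struct


def open_message_channel():
    """Prefer kernel-preserved boundaries; fall back to userspace framing."""
    try:
        tx, rx = socket.socketpair(socket.AF_UNIX, socket.SOCK_SEQPACKET)
        return tx, rx, True
    except (AttributeError, OSError):
        tx, rx = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
        return tx, rx, False


def _recv_exact(sock, size):
    buf = bytearray()
    while len(buf) < size:
        chunk = sock.recv(size - len(buf))
        if not chunk:
            raise EOFError("peer closed mid-message")
        buf += chunk
    return bytes(buf)


def send_message(sock, payload, kernel_framed):
    if kernel_framed:
        sock.send(payload)                   # the kernel keeps the boundary
    else:
        sock.sendall(struct.pack("<I", len(payload)) + payload)


def recv_message(sock, kernel_framed, max_size=65536):
    if kernel_framed:
        return sock.recv(max_size)           # one call returns one message
    (length,) = struct.unpack("<I", _recv_exact(sock, 4))
    return _recv_exact(sock, length)
```

Callers see the same two functions either way, so the fallback path keeps working on systems without the kernel feature.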
plus, i think also it would help the reactos guys out because they're in the middle - they will still need something that doesn't rely on a linux kernel that they can't have.
> For what it's worth, Steve French has expressed interest in doing just that.
that's veeery good news. i'll look forward to seeing that implemented.
> He asked for a clear spec for what filename these sockets should have, but we didn't have a clear app that we needed to fix, so we never followed through. You might approach him again. He's the maintainer of the CIFS kernel module for Linux, and he already has his own named pipe implementation.
_cool_.
well, here's where i should explain what my goal is. my goal is to "get things started". these days i tend to get involved in free software projects at "critical juncture" points, where there is clearly cross-project non-communication and/or non-cooperation (accidental or otherwise), and where it's _really_ important that the stuff actually gets done, but is sufficiently complex and misunderstood that nobody really wants to tackle it.
message-mode namedpipes falls into that category, which is why i'm doing it.
so - once i have _a_ working implementation, then please do not be offended, or surprised, if, when asked to make further enhancements, or if asked to fit specific criteria (such as doing things a "slightly different way"), i decline to do so. [advance notice: any such requests should also be accompanied by an offer of financial or other compensation.] but - of course, anyone who finds that the "working implementation" _isn't_ working, that's a _completely_ different matter and i will immediately act to remedy that (if no one else does).
also from that strategic perspective, much as i would love to collaborate with steve on a kernel-level implementation of named pipes, i believe that it's extraneous: it's an optimisation. and if i were to work on that optimisation only, as the _only_ option, it would lock out other possibilities.
so, on balance, i'll not be contacting steve right now. that, and the fact that the samba team decided to block all communication on 16th december 2005, and have neither revoked it nor issued a public apology for doing so, means that i cannot contact him _anyway_.
l.
> and no, you _can't_ do "everything in wineserver", because you _still_ need a mechanism to be able to tell the client (ntdll) to "block".
Well, if there are still two fds, but one end of each is a client (reading or writing) and the other end is the wineserver, I believe you could. That is, the server would only write complete messages to a fd, presumably prepended with the length. A reading client would then block if there weren't any data to read, and there wouldn't be any data until a complete message were written there by wineserver.
> which is why i asked if there was a way for wineserver to tell a client wine thread/process to "go to sleep" [and didn't get an answer].
Right. There isn't any such mechanism, at least as far as I know.
> _somewhere_ there has to be a "break" between "messages".
> and you can't stop people from sending data down the pipe (in NtWriteFile).
> therefore, logically, you have to have a "firewall" between "send" and "receive". and, because you _also_ need "blocking on read" characteristics (across multiple processes and threads), the most sensible way to implement that is with a unix filedescriptor on which all of the clients (ntdll) can block.
Yep, agreed. This discussion is more about what the structure of the file descriptors is. But in general, my opinion here doesn't carry much weight, Alexandre's does ;-) And he's unlikely to voice an opinion except on patches.
> well, here's the thing: you can't _guarantee_ that wine will be *exclusively* running on the latest-and-greatest version of the linux kernel, and, also, it would be a bit unfair to the FreeBSD folks (and anyone else who would like to port wine to other OSes)
Sure, that's true.
> well, here's where i should explain what my goal is. my goal is to "get things started". these days i tend to get involved in free software projects at "critical juncture" points, where there is clearly cross-project non-communication and/or non-cooperation (accidental or otherwise), and where it's _really_ important that the stuff actually gets done, but is sufficiently complex and misunderstood that nobody really wants to tackle it.
> message-mode namedpipes falls into that category, which is why i'm doing it.
And we appreciate your attention to it. Really, we do :)
> so - once i have _a_ working implementation, then please do not be offended, or surprised, if, when asked to make further enhancements, or if asked to fit specific criteria (such as doing things a "slightly different way"), i decline to do so.
Yep, that's been my assumption all along. That's why I've nudged you with respect to tests a couple of times. Thanks for putting some in the bug, by the way. Since I assumed it was unlikely you'd have the patience to work with AJ to get your patches accepted, I thought having some test cases, along with an implementation that isn't fundamentally flawed, would help point us in the right direction. Pointing out that handles can be shared between processes was a hint about one fundamental flaw that can trip you up. I'll try to look at what you're doing now and again to point out things that just can't work. For the most part, though, I'll assume you know what you're doing.
> also from that strategic perspective, much as i would love to collaborate with steve on a kernel-level implementation of named pipes, i believe that it's extraneous: it's an optimisation. and if i were to work on that optimisation only, as the _only_ option, it would lock out other possibilities.
Okay, fair enough.
> so, on balance, i'll not be contacting steve right now. that, and the fact that the samba team decided to block all communication on 16th december 2005, and have neither revoked it nor issued a public apology for doing so, means that i cannot contact him _anyway_.
Well, this is a bit off-topic, but he's a Linux kernel guy, not really part of the Samba team. But I understand that you don't want to go the Linux-kernel-module-only route, and I respect that.
Thanks, --Juan
On Thu, Feb 5, 2009 at 6:38 PM, Juan Lang juan.lang@gmail.com wrote:
>> and no, you _can't_ do "everything in wineserver", because you _still_ need a mechanism to be able to tell the client (ntdll) to "block".
> Well, if there are still two fds, but one end of each is a client (reading or writing) and the other end is the wineserver, I believe you could.
that was "the plan"
> That is, the server would only write complete messages to a fd, presumably prepended with the length.
that would be fine, if it wasn't for one thing: i've seen NT send _eight megabytes_ of data down a named pipe, on a spoolss service. i tried it out, once, with rpcclient. the limit (on NT 4.0) was around 8 mb of PDUs.
now, what i _can't_ tell you is whether that eight megabyte limit was in the RPC subsystem (or if it was in the SMB-NP subsystem), with the MSRPC infrastructure breaking it down into PDU-sized chunks [of MTU-size e.g. 1500] and i certainly can't tell you whether those PDUs were put into a single 8mb buffer that was then handed to a single NtWriteFile() call on a Named Pipe, or if it was done on a per-pdu basis....
but, ultimately, i would be _very_ wary of getting wineserver to malloc exactly the amount of memory that it was told was being written down the pipe.
not least is the issue that any such large mallocs or writes would put a _significant_ latency onto the response time of other wineserver calls (which is something i wanted to raise as an issue, separately. another time)
so, i'm doing it as a loop, and... *thinks*... i like the queue of "struct fd"s more than i like the double-socket idea, the more i think about it.
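A sketch of the "do it as a loop" approach: copy an announced length from one place to another through a fixed-size buffer, so memory use is bounded by the chunk size rather than by what the writer claims to be sending. The function name and the 4096-byte default are arbitrary choices for illustration; `src` and `dst` are any objects with `read` and `write`.

```python
import io


def relay_in_chunks(src, dst, length, chunk_size=4096):
    """Copy length bytes from src to dst; return the largest chunk held."""
    remaining = length
    largest = 0
    while remaining:
        chunk = src.read(min(chunk_size, remaining))
        if not chunk:
            raise EOFError("source ended before the announced length")
        dst.write(chunk)
        largest = max(largest, len(chunk))
        remaining -= len(chunk)
    return largest
```

For example, relaying a 1 MiB message from an `io.BytesIO` never holds more than `chunk_size` bytes at once and leaves whatever follows the message unread in the source.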
> A reading client would then block if there weren't any data to read, and there wouldn't be any data until a complete message were written there by wineserver.
yes. this would be _very_ nice.
... except... what do you do when there is another message being written?
* NtWriteFile (puts the complete message into the fd)
* NtWriteFile (what do you do now? you can't block the writer)
* NtReadFile (gets the first complete message)
* NtReadFile (what do you do now?)
this is why having a queue of "struct fd"s, one per message, could end up being the only viable solution.
>> which is why i asked if there was a way for wineserver to tell a client wine thread/process to "go to sleep" [and didn't get an answer].
> Right. There isn't any such mechanism, at least as far as I know.
bugger :) mrrm... i encountered async_queue - that looks suspiciously like it's a wake-up mechanism.
>> _somewhere_ there has to be a "break" between "messages".
>> and you can't stop people from sending data down the pipe (in NtWriteFile).
>> therefore, logically, you have to have a "firewall" between "send" and "receive". and, because you _also_ need "blocking on read" characteristics (across multiple processes and threads), the most sensible way to implement that is with a unix filedescriptor on which all of the clients (ntdll) can block.
> Yep, agreed. This discussion is more about what the structure of the file descriptors is. But in general, my opinion here doesn't carry much weight, Alexandre's does ;-) And he's unlikely to voice an opinion except on patches.
ha ha :)
anyone who helps crystallise ideas is valuable.
>> message-mode namedpipes falls into that category, which is why i'm doing it.
> And we appreciate your attention to it. Really, we do :)
thanks :)
>> so - once i have _a_ working implementation, then please do not be offended, or surprised, if, when asked to make further enhancements, or if asked to fit specific criteria (such as doing things a "slightly different way"), i decline to do so.
> Yep, that's been my assumption all along. That's why I've nudged you with respect to tests a couple of times. Thanks for putting some in the bug, by the way.
no problem. there will need to be more - a lot more.
e.g. PeekNamedPipe actually does an optional read, and _then_ tells you what's left. wine's implementation skips that, because nobody ever really _does_ that "optional read"; they only do the "peek".
> Since I assumed it was unlikely you'd have the patience to work with AJ to get your patches accepted, I thought having some test cases, along with an implementation that isn't fundamentally flawed, would help point us in the right direction.
... well... now i'm... deeply impressed (with your foresight).
> Pointing out that handles can be shared between processes was a hint about one fundamental flaw that can trip you up.
mrmmrmmm *grumble-about-how-nt-works* yeah that was appreciated, although it won't be until i do some tests using dup on handles, or simply blatantly passing the 32-bit handle pointer value between processes (over a socket or a file), that i'll know what is and isn't possible.
> I'll try to look at what you're doing now and again to point out things that just can't work.
he he. appreciate it.
> For the most part, though, I'll assume you know what you're doing.
*cackle*. are you sure that's wise? :)
> Well, this is a bit off-topic, but he's a Linux kernel guy, not really part of the Samba team.
ahh. thank you for clarifying. i remember steve from when he was with ibm, he was always at the cifs conferences.
> But I understand that you don't want to go the Linux-kernel-module-only route, and I respect that.
*nods*.
thank you, juan.
l.