http://bugs.winehq.org/show_bug.cgi?id=17195
with only one significant bug - some memory corruption - the semantics are now correct in the namedpipe messagemode patch. i drew a diagram outlining the data structures / design http://bugs.winehq.org/attachment.cgi?id=19449 because it's so horrendous that even i want to throw up. assistance in finding the memory corruption would be really helpful. an assert is thrown after more than 4 short reads are performed on the same namedpipe, with this straaaange error:
ead: file.c:631: server_get_named_pipe_fd: Assertion `wine_server_ptr_handle(fd_handle) == handle' failed.
ead: server.c:628: server_get_unix_fd: Assertion `wine_server_ptr_handle(fd_handle) == handle' failed.
wine: Assertion failed at address 0xf7ca8d7f (thread 000f), starting debugger...
err:ntdll:RtlpWaitForCriticalSection section 0x7bc99ee0 "server.c: fd_cache_section" wait timed out in thread 000f, blocked by 0014, retrying (60 sec)
i'll try to write a simplified version of the "threadread" test, since that's the one on which this occurs.
i can't quite believe that all 5 tests can be passed [by reducing the number of short reads to less than 4]. it's been a bit of a "drunken walk" development style, i'm afraid :)
notes:
* in order to preserve message-mode semantics correctly, each message gets ITS OWN socketpair. the socketpair is used EXCLUSIVELY for that one message and is NOT re-used by the other end. so if the client writes one message to the server, that's one socketpair; if the server writes one message to the client, that's another socketpair. if the client writes 20 messages to the server and, at the same time, the server writes 25 messages to the client, and nobody has done any reading yet, that's a total of 45 outstanding socketpairs - 90 filedescriptors.
this is not a hoax. it's the only way i could think of that preserves messagemode semantics correctly.
* for each end - client and server - to be able to "block", waiting at any time, there MUST be at LEAST one socketpair in each of the queues. so, initially, that's what's done: a pair of socketpairs is created, one for the client, one for the server. the FIRST write (to server, or to client) goes to this initial "empty" socketpair; each subsequent write (to server or to client) gets a NEW socketpair created for it.
* on a read, a socketpair must "hang about" - MUST not be shut down - until there is a new one created (which happens only on a new write). in this way, multiple threads will always have a filedescriptor to block on.
* when a read has read all available data, the socketpair is marked as "finished" but is NOT shut down.
* on each write, if a new socketpair is created, then and ONLY then can "old" marked socketpairs be shut down.
* shutdown of read-sockets on which [potentially many] reader[s] are waiting will result in an EPIPE. this is an ACCEPTABLE error IFF no data has been read yet, i.e. at the start of a new message. what happens, basically, is that the "old" socketpair - even though it has already been used by the previous reader - is what the thread "blocks" on (because you have to block on _something_). on the _very_ first message, of course, you block on the specially-prepared initial socketpair, the one that was created and marked "as-yet unwritten-to".
* you can see at line 872 of ntdll/file.c (after the patch is applied) that when oh whoopsie, EPIPE occurred but we've not read any data at all, we just grab another filedescriptor from wineserver and try again. in this way, wineserver hands out the next available message from the filedescriptor queue. if yet _another_ EPIPE occurs, all that means is that another thread "got there first", wow big deal, around you go again.
* indicating that data has been read is performed using a special wineserver message "NAMED_PIPE_SUB_AVAILDATA" using server_set_named_pipe_info(). this horrible hack passes information to wineserver telling it to subtract the amount of data read in userspace from the filedescriptor data structure (fdqueue) representing that socketpair.
* indicating that data _will_ be available is performed again using server_set_named_pipe_info(), this time with the message "NAMED_PIPE_SET_AVAILDATA". the set is performed BEFORE the actual write, otherwise a reader could "unblock" and potentially drive the available-data count negative!!! this would be baaad.
* a critical section is held around all uses of server_set_named_pipe_info() and server_get_named_pipe_fd(). this is NOT the same critical section as the one used by server_get_handle_fd(). i don't know if this makes any difference.
things that cannot be done:
* you cannot do the read or the write inside wineserver (server_named_pipe_read() will be deleted, later). the reason is complex to describe, but boils down to "blocking in userspace on the socket in order to wait for a message" and "non-blocking read on the same socket in order to get the data" are unfortunately mutually incompatible. one screws up the other if you set or clear O_NONBLOCK.
* you cannot send the length as a 4-byte "header" down the socket, to be read as a length and then that much data read out. if you have multiple readers blocking on the same filedescriptor, how do you make sure that only one of them reads the length?? they're all blocking on the same socket! send the length multiple times?? how do you know how many readers there are?? short answer: you can't. so - scratch that idea. [ok - maybe you could do a critical section, but you'd have to do one per file, and i'm not ready or willing to add an array of per-file-handle critical sections to ntdll without further discussion]
issues:
* multiple readers of the same socket, when reading data amounts that are larger than the SO_RCVBUF and/or SO_SNDBUF, could potentially end up corrupting the data because of overlaps in the reads. given that this is _already_ a problem in the existing Wine code, i am not going to be dealing with this issue in this iteration - particularly as solutions involve per-file critical sections, and YES, that's a generic problem: not just pipes, not just messagemode pipes, but standard files too. ALL files.
* running out of socketpairs; protecting the message queue length. i believe there's a default message queue length in NT, but even so, the number of file descriptors this [necessary] design uses up is _horrendous_. one pair per message! but - tough luck. the options are pretty limited and constrained by the requirements.
so: if anyone says to me, "no this cannot work, you cannot _possibly_ have got correct messagemode semantics purely in userspace and without sending length fields, you _must_ do the reads and writes inside wineserver", then... well... that will be a good way to get me to unsubscribe from wine-devel. if anyone can think of any other ideas, or implementations, or optimisations, then... well... unless they're _really_ good ideas, they'll have to wait :) ideas such as using a separate socketpair for communicating "length" messages are welcome, if accompanied by well-laid-out explanations, because my brain is turning to jelly with this stuff and i could do with some light reading instead of working it out myself :)
l.
On Sat, Feb 14, 2009 at 7:02 PM, Luke Kenneth Casson Leighton luke.leighton@googlemail.com wrote:
http://bugs.winehq.org/show_bug.cgi?id=17195
with only one significant bug - some memory corruption - the semantics are now correct in the namedpipe messagemode patch.
found it. as suspected: once the same critical section, fd_cache_section, as used in ntdll/server.c was employed, the issue went away. it wasn't memory corruption at all; it was the use of the wineserver socket to send filedescriptors using ioctls: server_get_named_pipe_fd() was picking up a filedescriptor that was intended for server_get_unix_fd(), and vice-versa.
so - fixed! done! working!
... sort-of :)
the one remaining "fly in the ointment", which i'm going to ignore, is demonstrated by the threadread and threadwrite tests: running unix out of socketpair filedescriptors.
20 writes by 5 threads are queued up by e.g. threadwrite, resulting in a whopping two _hundred_ filedescriptors being handed out. this page: http://msdn.microsoft.com/en-us/library/aa365150.aspx describes how a NamedPipe write should block:
"The write operation will block until the data is read from the pipe so that the additional buffer quota can be released. Therefore, if your specified buffer size is too small, the system will grow the buffer as needed, but the downside is that the operation will block. If the operation is overlapped, a system thread is blocked; otherwise, the application thread is blocked."
ignoring the bollocks about "buffer sizes", and "buffer quotas", which give an indication of how microsoft internally implements NamedPipes, it's clear then that blocking the application is perfectly acceptable.
exactly how this is to be achieved is yet to be determined, but as i keep mentioning, it's highly likely that a per-file-handle mutex is required.
quick tests:
1) test1, shortread, send2recv2: pass.
2) reducing BUF_SIZE in the tests code to 128, threadread and threadwrite: success.
3) removing the interim use of ncacn_ip_tcp in SVCCTL and returning to ncacn_np: success.
4) going back to running simple python2.7.exe "multiprocessing" tests, which use message-mode named pipes - the whole damn _point_ of this exercise: success.
5) running MSYS: fail.
bugger.
a very cursory look at the output suggests it's a non-message-mode pipe. arg. no surprise there, then. *sigh*.
will let you know how i get on.
l.
- running MSYS: fail.
found it: rxvt.exe is looking for libX11.dll (!!), which has nothing to do with NamedPipes. running c:/msys/bin/sh.exe --login -i: success.