[GSoC] Message mode pipes proposal - wine-devel

28 Mar 2011


Hi, all,
...
While reading the GSoC project ideas page, I became interested in the
proposal for implementing message-based pipes (see bug:
http://bugs.winehq.org/show_bug.cgi?id=17195). The most recent patches
attached to that bug appear to use a separate socketpair for each
message sent down the pipe, and may need multiple wineserver calls for
each interaction with the pipe. This will clearly begin to have
problems once a large number of file descriptors are in flight.
While a discussion on #winehackers suggested that implementing
message-based pipes efficiently and safely entirely in userspace would
be quite difficult, I feel that producing a _correct_ implementation
(regardless of efficiency) would be a practical goal. Once that
exists, work could then be done to improve performance.
First, a bit about me: I am a computer science student at the
University of Massachusetts, heading into my senior year. I have had a
lot of experience with Linux development, and somewhat less on the
Windows side. I am fluent in English and Japanese, although I doubt Japanese
language experience would be helpful for this project :)
In previous summer internships I have developed code for interprocess
communication; in one instance, I wrote a ring buffer synchronization
protocol to transfer data from a Windows user-space program to a
real-time process running in kernel mode (hosted via a third-party
Windows real-time HAL).
I propose the following overall plan:
* First, I will prepare a comprehensive test suite for message-based
pipes. Benchmarks would also be prepared to help determine whether
optimization is necessary. This will likely borrow from the tests
already posted to the bug I linked above.
* I will write a wineserver-internal implementation of message-based
pipes. That is, all NtReadFile/NtWriteFile() requests would be
completely redirected to wineserver; there would be no attempt to
expose a file descriptor for the client process to access directly.
Wineserver in turn would manage a plain in-memory queue of messages.
Although this does not scale and incurs a lot of context-switch and
copying overhead, it is simple, and would allow programs making
light use of message-based pipes to work. It also provides a clear
place to hook on a new, faster protocol that requires a custom
NtReadFile/NtWriteFile.
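
To make the server-side queue concrete, here is a rough sketch in
Python (for brevity only; wineserver itself is C). The class and
method names are made up for illustration. It assumes the usual
message-mode read behaviour: a read with a buffer smaller than the
message returns the front of the message and reports that more data
remains, and the rest is returned by later reads.

```python
from collections import deque


class MessagePipeQueue:
    """Hypothetical stand-in for the server-side state of one pipe direction."""

    def __init__(self):
        self._messages = deque()

    def write(self, data):
        # Each write becomes exactly one message, even if it is empty.
        self._messages.append(bytes(data))

    def read(self, max_bytes):
        """Return (data, more_data) for the oldest message, or None if empty."""
        if not self._messages:
            return None  # a real server would park the read until data arrives
        message = self._messages[0]
        if len(message) <= max_bytes:
            self._messages.popleft()
            return message, False
        # Short read: hand back the front part and keep the rest queued.
        self._messages[0] = message[max_bytes:]
        return message[:max_bytes], True
```

Blocking reads, overlapped I/O and buffer size limits are all left
out; the point is only that message boundaries are trivial to keep
when the server owns the queue.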
After this point, I will begin work on optimizing the implementation.
Some ideas on how to do this:
* Implement the pipe using a ring buffer in shared memory. Client
processes can then directly access the pipe buffer to pass data
without wineserver's involvement. This has the downside that a
user-mode process can inadvertently corrupt the pipe's state; this may
be acceptable if the effects can be limited to a trashed pipe buffer,
instead of crashing unrelated processes.
* Implement the pipe using a ring buffer in shared memory, but expose
only a read-only file descriptor to client processes. This avoids the
corruption issues, but writes must be managed by the server and will
incur overhead.
* Implement the pipe as a single server-managed socketpair over which
shared memory backing files are passed with SCM_RIGHTS, plus a lock
(possibly implemented in shared memory using futexes, or by flock on a
/dev/shm file). Clients acquire the lock, then recvmsg() with MSG_PEEK
on the socketpair to retrieve an anonymous (unlinked immediately after
creation) shared memory file descriptor. The shared memory buffer holds
a count of remaining bytes, plus the actual message data. If a client
has read the entire message, it then performs a normal recvmsg() to
dequeue it. This approach is complex, and it is unclear how passing so
many files around would affect performance. However, it does avoid the
context-switch costs and wineserver load that are a problem in the
implementations currently on the bug tracker.
* Write a Linux kernel module that implements message-based pipe
semantics natively. Although quite efficient, this would only apply to
Linux, and may make it difficult to keep up with Linux kernel
changes in the future. It is also unclear whether distributions and/or
upstream Linux maintainers will welcome this approach.
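
As a rough illustration of the shared-memory ring buffer idea, the
sketch below (again Python for brevity) keeps message boundaries by
prefixing each message with its length. The header layout, the names,
and the single-reader/single-writer assumption are all mine, and no
locking is shown; a real version would need the synchronization and
validation that this omits.

```python
import mmap
import struct

COUNTER = struct.Struct("<Q")  # hypothetical header: two byte counters
LENGTH = struct.Struct("<I")   # each message is prefixed with its length
HEAD_OFF, TAIL_OFF, DATA_OFF = 0, 8, 16


class MessageRing:
    def __init__(self, capacity):
        self.capacity = capacity
        # Anonymous shared mapping; processes forked later would share it.
        self.mem = mmap.mmap(-1, DATA_OFF + capacity)

    def _counter(self, offset):
        return COUNTER.unpack_from(self.mem, offset)[0]

    def _put(self, count, data):
        start = count % self.capacity
        first = min(len(data), self.capacity - start)
        self.mem[DATA_OFF + start:DATA_OFF + start + first] = data[:first]
        rest = data[first:]
        if rest:  # wrapped around the end of the buffer
            self.mem[DATA_OFF:DATA_OFF + len(rest)] = rest

    def _get(self, count, size):
        start = count % self.capacity
        first = min(size, self.capacity - start)
        return (self.mem[DATA_OFF + start:DATA_OFF + start + first]
                + self.mem[DATA_OFF:DATA_OFF + size - first])

    def write(self, payload):
        head, tail = self._counter(HEAD_OFF), self._counter(TAIL_OFF)
        needed = LENGTH.size + len(payload)
        if needed > self.capacity - (tail - head):
            return False  # no room; a real pipe would block or fail here
        self._put(tail, LENGTH.pack(len(payload)) + payload)
        # Publish the new tail only after the data is in place.
        COUNTER.pack_into(self.mem, TAIL_OFF, tail + needed)
        return True

    def read(self):
        head, tail = self._counter(HEAD_OFF), self._counter(TAIL_OFF)
        if head == tail:
            return None  # empty
        (size,) = LENGTH.unpack(self._get(head, LENGTH.size))
        payload = self._get(head + LENGTH.size, size)
        COUNTER.pack_into(self.mem, HEAD_OFF, head + LENGTH.size + size)
        return payload
```

Note that the reader trusts the counters and the length prefix
completely, which is exactly the corruption risk mentioned for the
first ring buffer option: any client that can write to the mapping can
break the pipe for its peer.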
I will likely end up trying out several of these approaches and
comparing actual performance results.
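
For the socketpair/SCM_RIGHTS option above, the peek-then-dequeue
sequence might look roughly like the following Python sketch. It
assumes Linux, a datagram socketpair, and a made-up buffer layout (a
4-byte count of remaining bytes followed by the data); the lock is
left out entirely, and the function names are only for illustration.

```python
import mmap
import os
import socket
import struct
import tempfile

REMAINING = struct.Struct("<I")  # hypothetical header: bytes not yet consumed
FD_SIZE = struct.calcsize("i")


def send_message(sock, payload):
    """Store one message in a nameless backing file and queue its descriptor."""
    with tempfile.TemporaryFile() as backing:  # has no name on disk
        backing.write(REMAINING.pack(len(payload)) + payload)
        backing.flush()
        rights = struct.pack("i", backing.fileno())
        # One datagram per message; the descriptor rides along as SCM_RIGHTS.
        sock.sendmsg([b"m"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, rights)])


def _recv_fd(sock, flags=0):
    _, ancdata, _, _ = sock.recvmsg(1, socket.CMSG_LEN(FD_SIZE), flags)
    for level, kind, data in ancdata:
        if level == socket.SOL_SOCKET and kind == socket.SCM_RIGHTS:
            return struct.unpack("i", data[:FD_SIZE])[0]
    raise OSError("no descriptor attached to message")


def read_message(sock, max_bytes):
    """Read up to max_bytes of the oldest message; return (data, more_data)."""
    # MSG_PEEK leaves the datagram queued but still hands us a descriptor.
    fd = _recv_fd(sock, socket.MSG_PEEK)
    try:
        size = os.fstat(fd).st_size
        with mmap.mmap(fd, size) as view:
            (remaining,) = REMAINING.unpack_from(view, 0)
            start = size - remaining
            take = min(max_bytes, remaining)
            data = view[start:start + take]
            REMAINING.pack_into(view, 0, remaining - take)
    finally:
        os.close(fd)
    more_data = remaining > take
    if not more_data:
        # Whole message consumed: dequeue it and drop the extra descriptor.
        os.close(_recv_fd(sock))
    return data, more_data
```

Even in this toy form, every read costs a descriptor install, an
fstat, an mmap and a close, which is the kind of overhead that would
need to be measured against the simpler designs.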
Finally, if time allows, I will also investigate integration with
Samba, in order to support connecting to named pipes on remote
servers.
I would appreciate any comments on this proposal prior to actual submission.
Thanks,
Bryan Donlan