David Howells <dhowells(a)cambridge.redhat.com> writes:
> So this saves you the cost of the fd transfer net packet. Though you still
> have to do the two context switches, which is my main contention.
I suspect we are doing more than two switches (though I haven't proved
it), which is why I think there is a margin for improvement. You'll
obviously always have the context switch cost unless everything is in
the kernel.
> True, but I'd have thought that the context switches involved are still a cost
> you can't get rid of so easily. Out of interest, how do you plan on doing the
> locking stuff for Read/WriteFile? Cache it locally? It is unfortunate, but you
> can't really make use of UNIX file locking, since this is mostly advisory and
> as such doesn't actively stop read/write calls.
Yes, we'll need to store the locks in the server and check them before
each read/write (and probably also release them afterwards if
necessary). There may be some optimisations possible, but we should
probably do it the easy way first.
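
As a rough sketch of the "easy way" described above (the server keeps a table of byte-range locks and consults it before each read or write), something like the following could work. The names `range_lock` and `io_allowed` are hypothetical and do not come from the Wine source, and the sharing rules are simplified assumptions: an exclusive lock blocks all access by other owners, while a shared lock blocks writes by everyone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical server-side record of one byte-range lock. */
struct range_lock {
    int      owner;      /* id of the process holding the lock */
    uint64_t start;      /* first locked byte */
    uint64_t len;        /* number of locked bytes, must be > 0 */
    bool     exclusive;  /* true: exclusive lock, false: shared lock */
};

/* Return true if `owner` may access [start, start + len).
 * Assumed rules (simplified): an exclusive lock denies reads and writes
 * to every other owner; a shared lock denies writes to everyone. */
static bool io_allowed(const struct range_lock *locks, size_t count,
                       int owner, uint64_t start, uint64_t len, bool is_write)
{
    assert(len > 0);
    for (size_t i = 0; i < count; i++) {
        const struct range_lock *l = &locks[i];
        /* Two half-open ranges overlap if each starts before the other ends. */
        bool overlap = start < l->start + l->len && l->start < start + len;
        if (!overlap)
            continue;
        if (l->exclusive) {
            if (l->owner != owner)
                return false;
        } else if (is_write) {
            return false;
        }
    }
    return true;
}
```

The server would call such a check at the start of every read or write request and fail the request when it returns false. A linear scan is the simplest thing that works; any optimisation can come later.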
> Seriously, though, whilst this'd be a lot easier in many ways (and it would
> allow you to avoid the context-switch penalties), you wouldn't be able to take
> full advantage of the available support in the kernel, which is more capable
> than the standard UNIX userspace API suggests.
I don't see why. I'm not suggesting keeping the current socket stuff,
just reusing the structures. So basically instead of passing the
address of the stack arguments (which is really ugly IMO) to your
ioctl, you pass one of the server request structures. This allows your
changes to be localized to wine_server_call and doesn't require
changing any of the routines that make server calls. Obviously you'd
need some more changes for a few calls like ReadFile/WriteFile, but
most operations could switch to your mechanism without needing any
change. You simply cannot require people to recompile all of Wine to
use your module.
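
To illustrate the point about localizing the change, here is a minimal sketch in which every caller fills in a request structure and goes through one entry point, so only that entry point needs to know which transport is in use. Everything here is hypothetical: `demo_request`, `demo_server_call` and the two transports are stand-ins, not the real Wine request structures or the real `wine_server_call`, and the "kernel" transport only simulates what a module would do.

```c
#include <stddef.h>

/* Hypothetical request layout; the real server request structures differ. */
struct demo_request {
    int code;   /* which operation is requested */
    int arg;    /* input value */
    int reply;  /* output value filled in by the transport */
};

typedef int (*demo_transport)(struct demo_request *req);

static int demo_kernel_calls;  /* counts calls routed to the kernel path */

/* Stand-in for the existing socket path to the user-space server. */
static int demo_socket_transport(struct demo_request *req)
{
    req->reply = req->arg + 1;
    return 0;
}

/* Stand-in for a kernel module path; a real one would pass `req`
 * to the module (for example through an ioctl) instead. */
static int demo_kernel_transport(struct demo_request *req)
{
    demo_kernel_calls++;
    req->reply = req->arg + 1;
    return 0;
}

static demo_transport current_transport = demo_socket_transport;

/* Single entry point, playing the role of wine_server_call here.
 * Callers never see which transport handled the request. */
static int demo_server_call(struct demo_request *req)
{
    return current_transport(req);
}
```

Switching `current_transport` at startup, depending on whether the module is present, would let the same binaries run with or without it.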
> > I still think that it should be possible to improve that by a small
> > kernel hack. It will never be as fast as doing everything in the
> > kernel of course, but it may just be fast enough to avoid the need to
> > reimplement the whole server.
>
> If you want to suggest exactly what you'd like to see as a hack...
I don't know exactly; there are many ways of doing it: you can have a
specialized fifo, a network protocol, an ioctl, etc. Basically any
mechanism that ensures that we do the strict minimum number of context
switches and schedule() calls for a server call. And probably also a
way to transfer chunks of memory from the client address space so that
we don't need the shared memory area.
> As far as I've observed (I've got Win2000 available), most Windows DLL's have
> 512-byte (sector) alignment internally, _not_ 4096-byte (page) alignment for
> the sections. This means that the separate sections can't be mmap'd (or else
> they'd lose their required relative relationships):
Actually the file alignment doesn't need to be 4096, it needs to match
the filesystem block size. On a FAT filesystem the block size is 512
so Linux will happily mmap every section. On a 1k-block ext2 fs it
will be able to mmap about 50% of them.
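
The rule stated above (a section can be mapped when its file offset is a multiple of the filesystem block size) can be sketched as a simple check. The function names and the sample offsets in the usage note are made up for illustration; they are not taken from any particular DLL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if a section starting at file_offset begins on a filesystem
 * block boundary. block_size must be non-zero. */
static bool section_is_mappable(uint32_t file_offset, uint32_t block_size)
{
    return file_offset % block_size == 0;
}

/* Count how many of the given section offsets could be mmap'd directly. */
static size_t count_mappable(const uint32_t *offsets, size_t n,
                             uint32_t block_size)
{
    size_t ok = 0;
    for (size_t i = 0; i < n; i++)
        if (section_is_mappable(offsets[i], block_size))
            ok++;
    return ok;
}
```

For hypothetical 512-byte-aligned section offsets 0x400, 0x600, 0x1000 and 0x1200, all four pass with a 512-byte block size and two of the four pass with a 1024-byte block size, which is where the "about 50%" estimate comes from.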
> Also, since DLLs and EXEs are not compiled as PIC (the MSDEV compiler not
> having such an option as far as I can recall), the fixup tables usually seem
> to apply to just about every page in the code section.
Only if the dll cannot be loaded at the preferred address, which
shouldn't happen too often. I'm not saying your patch is useless, but
I doubt the gain is as large as you seem to think.
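
A small sketch of why fixups cost nothing in the common case: the adjustment is the difference between the actual and the preferred load address, and when that difference is zero no word of the image is touched. The function name and the flat list of word indices are simplifications assumed for this example; real PE relocation tables are organised in per-page blocks.

```c
#include <stddef.h>
#include <stdint.h>

/* Apply 32-bit base relocations to an in-memory image.
 * fixup_index lists the indices of the 32-bit words that hold absolute
 * addresses. Returns the number of words that were patched. */
static size_t apply_fixups(uint32_t *image, const size_t *fixup_index,
                           size_t n, uint32_t preferred_base,
                           uint32_t actual_base)
{
    uint32_t delta = actual_base - preferred_base;

    /* Loaded at the preferred address: nothing to patch, so the
     * mapped pages stay unmodified. */
    if (delta == 0)
        return 0;

    for (size_t i = 0; i < n; i++)
        image[fixup_index[i]] += delta;
    return n;
}
```

Only when the preferred address is unavailable does the loader have to write to every page that contains a fixup.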
--
Alexandre Julliard
julliard(a)winehq.com