Hi. I have recently been working on a multithreaded application that makes heavy use of the mutex functions WaitForSingleObject and ReleaseMutex.
After some benchmarks I found that in Wine these functions are 50 to 100 times (sic!) slower than on Windows, and that is with only one thread using the mutex, i.e. there is nothing to wait for.
The test program compiled natively for Linux (using the pthreads library) runs only about 1.5 times slower than the same program with no mutexes at all, which is quite acceptable.
So, what can I do to speed up these functions? Is it a bug (that needs to be reported and fixed) or some fundamental architecture problem? Should I think about a different way of implementing mutexes?
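For anyone who wants to reproduce the native Linux comparison, here is a minimal sketch of that kind of measurement. It is not the original test program: the function name run_loop and the loop body are assumptions made for illustration. It times a simple counter loop with and without an uncontended pthread mutex around each increment.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical helper (not from the original program): performs `iters`
   increments, optionally locking an uncontended pthread mutex around each
   one. Returns the final counter value and stores the elapsed wall-clock
   time in *seconds. */
long run_loop(long iters, int use_mutex, double *seconds)
{
    pthread_mutex_t lock;
    struct timespec t0, t1;
    volatile long counter = 0;   /* volatile so the loop is not optimized away */

    int rc = pthread_mutex_init(&lock, NULL);
    assert(rc == 0);
    (void)rc;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < iters; i++) {
        if (use_mutex)
            pthread_mutex_lock(&lock);
        counter++;
        if (use_mutex)
            pthread_mutex_unlock(&lock);
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    pthread_mutex_destroy(&lock);
    *seconds = (double)(t1.tv_sec - t0.tv_sec)
             + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    return counter;
}
```

Calling run_loop once with use_mutex set to 0 and once with it set to 1, then dividing the two timings, gives the slowdown factor for the uncontended case. The actual ratio depends on the machine and on how much work the loop body does.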
This is because every call involving a kernel object handle is done via RPC to the wineserver process.
The semantics of things like DuplicateHandle, and all of the various types of waitable kernel objects, need to be reproduced exactly. Even in the case of a single object used by a single thread, in order to optimize out the wineserver call you'd have to somehow be sure no one had duplicated that object into another process. Or you'd have to give the wineserver enough information to duplicate it while letting you wait for/manipulate the object without an RPC call.
So, I don't know that it's necessarily a fundamental architecture problem, but there's a lot you have to think about. And I can't recommend taking on a project like this to a new Wine developer.
So, thinking about changes in my code, are critical sections faster than mutexes in WINE?
(In reply to Vincent Povirk madewokherd@gmail.com, Tue, 29 Apr 2014 12:40:01 -0500.)
Hi John,
If this is your own application, then depending on exactly which features you need (synchronization within one application or between multiple processes?), the easiest way is just to use a different set of synchronization primitives.
You could either use CriticalSections (which internally use very fast futex commands on Linux) or slim reader/writer locks (see: http://msdn.microsoft.com/en-us/library/windows/desktop/ms681930(v=vs.85).as... ), which only use wineserver calls when they are blocking. Both methods should definitely bring some performance boost.
Regards, Sebastian
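A minimal sketch of the critical-section variant suggested here, assuming the lock only has to synchronize threads inside one process. The app_lock_* names and guarded_count are hypothetical and not taken from any real application. The pthread branch is only there so the sketch also builds outside Windows.

```c
#include <assert.h>
#include <stddef.h>
#ifdef _WIN32
#include <windows.h>
typedef CRITICAL_SECTION app_lock_t;   /* hypothetical wrapper type */
#else
#include <pthread.h>
typedef pthread_mutex_t app_lock_t;    /* fallback for non-Windows builds */
#endif

void app_lock_init(app_lock_t *l)
{
#ifdef _WIN32
    InitializeCriticalSection(l);
#else
    int rc = pthread_mutex_init(l, NULL);
    assert(rc == 0);
    (void)rc;
#endif
}

void app_lock_enter(app_lock_t *l)
{
#ifdef _WIN32
    EnterCriticalSection(l);     /* used in place of WaitForSingleObject */
#else
    pthread_mutex_lock(l);
#endif
}

void app_lock_leave(app_lock_t *l)
{
#ifdef _WIN32
    LeaveCriticalSection(l);     /* used in place of ReleaseMutex */
#else
    pthread_mutex_unlock(l);
#endif
}

void app_lock_destroy(app_lock_t *l)
{
#ifdef _WIN32
    DeleteCriticalSection(l);
#else
    pthread_mutex_destroy(l);
#endif
}

/* Example use: increments a counter `iters` times under the lock. */
long guarded_count(long iters)
{
    app_lock_t lock;
    long counter = 0;

    app_lock_init(&lock);
    for (long i = 0; i < iters; i++) {
        app_lock_enter(&lock);
        counter++;
        app_lock_leave(&lock);
    }
    app_lock_destroy(&lock);
    return counter;
}
```

Unlike a mutex created with CreateMutex, a CRITICAL_SECTION has no kernel handle and cannot be shared between processes, so this swap only applies when cross-process synchronization is not needed.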
Thanks, Sebastian.
I changed my program to use critical sections instead of mutexes (it needs only thread synchronization), and it became up to 10 times faster in Wine.
On Windows the speedup is not as large.
I know that it is not recommended to tailor an application to Wine, but my program is actually a hybrid anyway: it uses Linux system calls and other Linux features when running on Linux, so it is not a big deal.
Thanks also to all responders for the help.
P.S.: If anyone is interested, the application we're talking about is this: http://fresh.flatassembler.net
(In reply to Sebastian Lackner sebastian@fds-team.de, Tue, 29 Apr 2014 20:21:31 +0200.)