Hi all,
Are there any restrictions on synchronizing Wine-generated threads (generated through the Win32 API) using native pthreads library calls?
I have an app that is experiencing extremely slow performance (an operation takes 17 seconds on native Windows, over 3 minutes on Wine). One idea for solving this problem was to switch some of the synchronization operations over to native pthreads (the app uses Wine as part of the process of migrating to Linux).
The problem is that when I try to call pthread_mutex_init, pthread_mutex_lock and friends, I get quite a few deadlocks, and even when not deadlocking, it still takes over 42 seconds. When doing the same without the mutexes (I'm locking write access to a pipe - I'm not sure it's even necessary, and it is working without the lock too), Wine performs at 21-26 seconds. I don't think that obtaining a POSIX mutex should have such a harsh effect on performance.
Is it at all OK to try to synchronize threads created with "CreateThread" using POSIX constructs?
Shachar
On Wed, 21 Jul 2004 13:08:57 +0300, Shachar Shemesh wrote:
Are there any restrictions on synchronizing Wine-generated threads (generated through the Win32 API) using native pthreads library calls?
On NPTL (pthread) systems I think this should work. On kthread systems pthreads is emulated by Wine anyway so you'd gain nothing.
I have an app that is experiencing extremely slow performance (an operation takes 17 seconds on native Windows, over 3 minutes on Wine). One idea for solving this problem was to switch some of the synchronization operations over to native pthreads (the app uses Wine as part of the process of migrating to Linux).
Do you know for sure it's the synchro primitives that are causing the slowdown?
The problem is that when I try to call pthread_mutex_init, pthread_mutex_lock and friends, I get quite a few deadlocks, and even when not deadlocking, it still takes over 42 seconds. When doing the same without the mutexes (I'm locking write access to a pipe - I'm not sure it's even necessary, and it is working without the lock too), Wine performs at 21-26 seconds. I don't think that obtaining a POSIX mutex should have such a harsh effect on performance.
What kind of mutex are you using? Critical sections should only RPC to the wineserver (which is the slow bit) when contended, iirc ... every time you block this involves a wineserver RPC so with high lock contention that could be a problem.
thanks -mike
Mike Hearn wrote:
On Wed, 21 Jul 2004 13:08:57 +0300, Shachar Shemesh wrote:
The problem is that when I try to call pthread_mutex_init, pthread_mutex_lock and friends, I get quite a few deadlocks, and even when not deadlocking, it still takes over 42 seconds. When doing the same without the mutexes (I'm locking write access to a pipe - I'm not sure it's even necessary, and it is working without the lock too), Wine performs at 21-26 seconds. I don't think that obtaining a POSIX mutex should have such a harsh effect on performance.
What kind of mutex are you using? Critical sections should only RPC to the wineserver (which is the slow bit) when contended, iirc ... every time you block this involves a wineserver RPC so with high lock contention that could be a problem.
If you (Shachar) have access to the source then it might be worth adding spin locks similar to the work just done with critical sections, assuming this program is two processes communicating and the lock is in high contention. In fact, it might be worth making a version of our existing critical section code that works over shared memory and use that.
Rob
Mike Hearn wrote:
On Wed, 21 Jul 2004 13:08:57 +0300, Shachar Shemesh wrote:
Are there any restrictions on synchronizing Wine-generated threads (generated through the Win32 API) using native pthreads library calls?
On NPTL (pthread) systems I think this should work. On kthread systems pthreads is emulated by Wine anyway so you'd gain nothing.
This is RedHat 9, kernel 2.4.20-31.9. It has NPTL, but we are not using it at the moment (LD_ASSUME_KERNEL=2.4.1). This does explain the problems we're having.
I have an app that is experiencing extremely slow performance (an operation takes 17 seconds on native Windows, over 3 minutes on Wine). One idea for solving this problem was to switch some of the synchronization operations over to native pthreads (the app uses Wine as part of the process of migrating to Linux).
Do you know for sure it's the synchro primitives that are causing the slowdown?
Pretty much. We tried writing an app that does little else, and the same ratios were kept.
The problem is that when I try to call pthread_mutex_init, pthread_mutex_lock and friends, I get quite a few deadlocks, and even when not deadlocking, it still takes over 42 seconds. When doing the same without the mutexes (I'm locking write access to a pipe - I'm not sure it's even necessary, and it is working without the lock too), Wine performs at 21-26 seconds. I don't think that obtaining a POSIX mutex should have such a harsh effect on performance.
What kind of mutex are you using?
Already said - pthread_mutex_lock and friends. Before, we were using Mutex and Event, with a call to "WaitForMultipleObjects".
Critical sections should only RPC to the wineserver (which is the slow bit) when contended, iirc ...
How can you tell whether you are in contention without RPCing? That makes no sense to me. It's also not how I read the code.
every time you block this involves a wineserver RPC so with high lock contention that could be a problem.
thanks -mike
Shachar
On Wed, 2004-07-21 at 14:40 +0300, Shachar Shemesh wrote:
This is RedHat 9, kernel 2.4.20-31.9. It has NPTL, but we are not using it at the moment (LD_ASSUME_KERNEL=2.4.1). This does explain the problems we're having.
I'm pretty sure LD_ASSUME_KERNEL doesn't work with Wine in the way it does with most apps, but only Alexandre would know for sure.
Certainly, if you have it available, using NPTL is recommended.
Pretty much. We tried writing an app that does little else, and the same ratios were kept.
Sounds similar to the problems TransGaming were having that prompted the SHM wineserver.
Already said - pthread_mutex_lock and friends. Before, we were using Mutex and Event, with a call to "WaitForMultipleObjects".
Ok, that's what I was after (what win32 mutexes) ...
Critical sections should only RPC to the wineserver (which is the slow bit) when contended, iirc ...
How can you tell whether you are in contention without RPCing? That makes no sense to me. It's also not how I read the code.
You have a lock count that you do an InterlockedIncrement/Decrement on, so if nothing was holding the lock you just sail right through it. If something already changed the lock count then you block on a wait fd provided by the wineserver:
    if (interlocked_inc( &crit->LockCount ))
    {
        if (crit->OwningThread == (HANDLE)GetCurrentThreadId())
        {
            crit->RecursionCount++;
            return STATUS_SUCCESS;
        }

        /* Now wait for it */
        RtlpWaitForCriticalSection( crit );
    }
Obviously this is only the case for this kind of lock. WaitForSingleObject will usually RPC.
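To make the fast-path idea concrete, here is a minimal standalone sketch in C. It is not Wine's code: every name in it (fast_lock, fast_lock_enter, run_demo and so on) is hypothetical, it is non-recursive, and a POSIX semaphore stands in for the wait object the wineserver would provide. It only illustrates that an atomic counter lets an uncontended thread take the lock without ever blocking.

```c
/* Sketch only: a non-recursive lock with an uncontended fast path.
 * All names are hypothetical; this is not Wine's implementation. */
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>
#include <stddef.h>

typedef struct {
    atomic_int lock_count;  /* -1 when free, in the spirit of LockCount above */
    sem_t      wait_sem;    /* assumption: stands in for the server-side wait */
    atomic_int slow_waits;  /* number of times a thread had to block */
} fast_lock;

static int fast_lock_init(fast_lock *l)
{
    atomic_init(&l->lock_count, -1);
    atomic_init(&l->slow_waits, 0);
    return sem_init(&l->wait_sem, 0, 0);
}

static void fast_lock_enter(fast_lock *l)
{
    /* atomic_fetch_add returns the old value, so old + 1 is the new count. */
    if (atomic_fetch_add(&l->lock_count, 1) + 1 != 0) {
        /* Someone else holds the lock: only now pay for a blocking wait. */
        atomic_fetch_add(&l->slow_waits, 1);
        while (sem_wait(&l->wait_sem) != 0) { /* retry if interrupted */ }
    }
}

static void fast_lock_leave(fast_lock *l)
{
    /* A new count >= 0 means at least one thread is waiting: wake one. */
    if (atomic_fetch_sub(&l->lock_count, 1) - 1 >= 0)
        sem_post(&l->wait_sem);
}

typedef struct {
    fast_lock *lock;
    long      *counter;
    int        iterations;
} worker_arg;

static void *worker(void *p)
{
    worker_arg *a = p;
    for (int i = 0; i < a->iterations; i++) {
        fast_lock_enter(a->lock);
        (*a->counter)++;            /* protected by the lock */
        fast_lock_leave(a->lock);
    }
    return NULL;
}

/* Runs nthreads (1..16) workers; returns the final counter, or -1 on error. */
long run_demo(int nthreads, int iterations)
{
    pthread_t  tids[16];
    fast_lock  lock;
    long       counter = 0;
    worker_arg arg = { &lock, &counter, iterations };
    int        started = 0;

    assert(nthreads >= 1 && nthreads <= 16);
    if (fast_lock_init(&lock) != 0)
        return -1;
    for (; started < nthreads; started++)
        if (pthread_create(&tids[started], NULL, worker, &arg) != 0)
            break;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    sem_destroy(&lock.wait_sem);
    return started == nthreads ? counter : -1;
}

/* Locks and unlocks from a single thread; returns how often it blocked. */
int uncontended_slow_waits(void)
{
    fast_lock lock;
    if (fast_lock_init(&lock) != 0)
        return -1;
    for (int i = 0; i < 1000; i++) {
        fast_lock_enter(&lock);
        fast_lock_leave(&lock);
    }
    int waits = atomic_load(&lock.slow_waits);
    sem_destroy(&lock.wait_sem);
    return waits;
}
```

With a single thread, uncontended_slow_waits() never touches the semaphore, which is the point of the counter check. Under contention (for example run_demo(4, 10000)) some acquisitions fall through to the blocking wait, and that is the part that would cost a wineserver round trip in Wine. Build with -pthread on older glibc.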
thanks -mike
Mike Hearn <mike@navi.cx> writes:
I'm pretty sure LD_ASSUME_KERNEL doesn't work with Wine in the way it does with most apps, but only Alexandre would know for sure.
It works just like with other apps, it disables NPTL.
Certainly, if you have it available, using NPTL is recommended.
Definitely.
On Wed, 21 Jul 2004 11:43:59 -0700, Alexandre Julliard wrote:
It works just like with other apps, it disables NPTL.
I said that because I remember when NPTL first became a problem for us, LD_ASSUME_KERNEL was not enough to make Wine work properly again. It would appear to work but you'd still get odd deadlocks and such. Am I misremembering?
thanks -mike
Mike Hearn <mike@navi.cx> writes:
On Wed, 21 Jul 2004 11:43:59 -0700, Alexandre Julliard wrote:
It works just like with other apps, it disables NPTL.
I said that because I remember when NPTL first became a problem for us, LD_ASSUME_KERNEL was not enough to make Wine work properly again. It would appear to work but you'd still get odd deadlocks and such. Am I misremembering?
There were other issues; non-NPTL glibc caused various breakages too. Anyway, at this point there is no reason at all to use LD_ASSUME_KERNEL.