http://bugs.winehq.org/show_bug.cgi?id=26500
Summary: Critical section busy wait
Product: Wine
Version: 1.3.16
Platform: All
OS/Version: All
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: ntdll
AssignedTo: wine-bugs@winehq.org
ReportedBy: vvoznesensky@gmail.com
The critical section machinery (EnterCriticalSection, LeaveCriticalSection, TryEnterCriticalSection) is implemented suboptimally: a waiting thread consumes CPU cycles polling LockCount instead of using a system lock as pthread_mutex_lock does (please look at http://www.jbox.dk/sanos/source/lib/pthread/mutex.c.html).
Suggested solution: throw out the custom realization and use pthread_mutex_t with pthread_mutex_lock, pthread_mutex_unlock and pthread_mutex_trylock instead.
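For illustration only, the suggestion could look roughly like the sketch below. The cs_mutex type and its functions are hypothetical names made up for this example; they are not Wine or Win32 API. The sketch assumes that a recursive mutex is needed, because a critical section may be re-entered by the thread that already owns it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

/* Hypothetical wrapper; the cs_mutex names are not Wine or Win32 API. */
typedef struct {
    pthread_mutex_t mutex;
} cs_mutex;

/* A critical section may be re-entered by its owner, so the mutex
 * has to be recursive as well. Returns 0 on success. */
int cs_mutex_init(cs_mutex *cs)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);

    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (rc == 0)
        rc = pthread_mutex_init(&cs->mutex, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Counterpart of EnterCriticalSection: blocks until the lock is owned. */
void cs_mutex_enter(cs_mutex *cs)
{
    pthread_mutex_lock(&cs->mutex);
}

/* Counterpart of TryEnterCriticalSection: returns 1 if the lock was taken. */
int cs_mutex_try_enter(cs_mutex *cs)
{
    return pthread_mutex_trylock(&cs->mutex) == 0;
}

/* Counterpart of LeaveCriticalSection: releases one level of ownership. */
void cs_mutex_leave(cs_mutex *cs)
{
    pthread_mutex_unlock(&cs->mutex);
}

void cs_mutex_destroy(cs_mutex *cs)
{
    pthread_mutex_destroy(&cs->mutex);
}
```

Every successful cs_mutex_enter or cs_mutex_try_enter has to be paired with one cs_mutex_leave, as with the Win32 calls.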
Dmitry Timoshkov dmitry@codeweavers.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
           Platform|All                         |Other
         Resolution|                            |INVALID
            Summary|Critical section busy wait  |Suggestion: throw out
                   |                            |custom realization and use
                   |                            |pthread_mutex_t
         OS/Version|All                         |other
           Severity|normal                      |enhancement
--- Comment #1 from Dmitry Timoshkov dmitry@codeweavers.com 2011-03-21 04:00:15 CDT --- (In reply to comment #0)
> The critical section machinery (EnterCriticalSection, LeaveCriticalSection, TryEnterCriticalSection) is implemented suboptimally: a waiting thread consumes CPU cycles polling LockCount
That's exactly the intent of that API usage. If you are interested in why, read MSDN or any other source.
Dmitry Timoshkov dmitry@codeweavers.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |CLOSED
--- Comment #2 from Dmitry Timoshkov dmitry@codeweavers.com 2011-03-21 04:00:27 CDT --- Closing invalid.
--- Comment #3 from Dmitry Timoshkov dmitry@codeweavers.com 2011-03-21 04:04:31 CDT --- Besides, critical section implementation uses futexes on Linux to speed up waiting.
Vladimir Voznesensky vvoznesensky@gmail.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|CLOSED                      |UNCONFIRMED
         Resolution|INVALID                     |
--- Comment #4 from Vladimir Voznesensky vvoznesensky@gmail.com 2011-03-21 05:00:12 CDT --- (In reply to comment #3)
> Besides, critical section implementation uses futexes on Linux to speed up waiting.
Dear Dmitry, I'm sorry. Please, look _carefully_ at pthread_mutex_lock realization. It uses both futexes and events!
Declaration:

  struct pthread_mutex {
    long lock;        // Exclusive access to mutex state:
                      //  0: unlocked/free
                      //  1: locked - no other waiters
                      // -1: locked - with possible other waiters
    long recursion;   // Number of unlocks a thread needs to perform
                      // before the lock is released (recursive mutexes only)
    int kind;         // Mutex type
    pthread_t owner;  // Thread owning the mutex
    handle_t event;   // Mutex release notification to waiting threads
  };

Look here for realization: http://www.jbox.dk/sanos/source/lib/pthread/mutex.c.html#:105
lock variable is used for futex AND event is used for kernel synchronization!
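To make the point concrete, here is a minimal sketch of a lock built on the same three states. It is not the sanos code: the tri_mutex names are hypothetical, the recursion, kind and owner fields are left out, and a POSIX semaphore stands in for the event (an assumption, since a semaphore counts its posts and an event does not). The uncontended path touches only the lock variable; the kernel object is used only when a thread has to wait.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical names; a stand-in for the struct quoted above, not sanos code. */
typedef struct {
    atomic_int lock;  /* 0: free, 1: locked, -1: locked with possible waiters */
    sem_t event;      /* stand-in for the release-notification event */
} tri_mutex;

int tri_mutex_init(tri_mutex *m)
{
    atomic_init(&m->lock, 0);
    return sem_init(&m->event, 0, 0);
}

void tri_mutex_lock(tri_mutex *m)
{
    int expected = 0;

    /* Fast path: uncontended acquire, no kernel call. */
    if (atomic_compare_exchange_strong(&m->lock, &expected, 1))
        return;
    /* Slow path: mark the lock contended, then sleep until it is released. */
    while (atomic_exchange(&m->lock, -1) != 0) {
        while (sem_wait(&m->event) != 0)
            ;  /* retry if the wait was interrupted by a signal */
    }
}

void tri_mutex_unlock(tri_mutex *m)
{
    /* Wake one sleeper only if someone may be waiting. */
    if (atomic_exchange(&m->lock, 0) == -1)
        sem_post(&m->event);
}

typedef struct {
    tri_mutex *m;
    long *counter;
    int iters;
} tri_job;

static void *tri_worker(void *arg)
{
    tri_job *job = arg;

    for (int i = 0; i < job->iters; i++) {
        tri_mutex_lock(job->m);
        ++*job->counter;  /* protected by the lock */
        tri_mutex_unlock(job->m);
    }
    return NULL;
}

/* Runs up to 16 threads that each increment a shared counter iters times. */
long tri_mutex_demo(int nthreads, int iters)
{
    pthread_t tids[16];
    tri_mutex m;
    long counter = 0;
    tri_job job = { &m, &counter, iters };
    int rc = tri_mutex_init(&m);

    assert(rc == 0);
    if (nthreads > 16)
        nthreads = 16;
    for (int i = 0; i < nthreads; i++) {
        rc = pthread_create(&tids[i], NULL, tri_worker, &job);
        assert(rc == 0);
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    sem_destroy(&m.event);
    return counter;
}
```

If the lock is correct, tri_mutex_demo(4, 20000) returns 80000, because no increment is lost.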
--- Comment #5 from Dmitry Timoshkov dmitry@codeweavers.com 2011-03-21 06:35:06 CDT --- (In reply to comment #4)
> Please, look _carefully_ at pthread_mutex_lock realization. It uses both futexes and events!
> ...
> lock variable is used for futex AND event is used for kernel synchronization!
How does it make this bug more valid?
--- Comment #6 from Vladimir Voznesensky vvoznesensky@gmail.com 2011-03-22 04:16:14 CDT --- (In reply to comment #5)
> (In reply to comment #4)
> > Please, look _carefully_ at pthread_mutex_lock realization. It uses both futexes and events!
> > ...
> > lock variable is used for futex AND event is used for kernel synchronization!
> How does it make this bug more valid?
The Windows implementation does not have to busy-wait forever. See http://msdn.microsoft.com/en-us/library/ms682530.aspx :
Spinning means that when a thread tries to acquire a critical section that is locked, the thread enters a loop, checks to see if the lock is released, and if the lock is not released, the thread goes to sleep. ... if the critical section is unavailable, the calling thread spins dwSpinCount times before performing a wait operation on a semaphore that is associated with the critical section. If the critical section becomes free during the spin operation, the calling thread avoids the wait operation.
As you can see, Windows uses a kernel synchronization object to wait for the thread that owns the lock. libpthread realizes the same behaviour (except for the spinning). So why should WineLib have a very different, suboptimal behaviour?
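The behaviour described in the MSDN text can be sketched as follows. This is only an illustration under stated assumptions: the spin_cs names are hypothetical, a POSIX semaphore stands in for the semaphore associated with the critical section, a lock_count of -1 is assumed to mean "free", and recursion is not handled. It is neither the Windows nor the Wine code.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical names; neither the Windows nor the Wine implementation. */
typedef struct {
    atomic_long lock_count;  /* -1: free; n >= 0: held, with n threads waiting */
    unsigned spin_count;     /* how often to retry before sleeping */
    sem_t sem;               /* waiters sleep here */
} spin_cs;

int spin_cs_init(spin_cs *cs, unsigned spin_count)
{
    atomic_init(&cs->lock_count, -1);
    cs->spin_count = spin_count;
    return sem_init(&cs->sem, 0, 0);
}

void spin_cs_enter(spin_cs *cs)
{
    /* Spin phase: a bounded number of attempts to grab a free lock. */
    for (unsigned i = 0; i < cs->spin_count; i++) {
        long expected = -1;
        if (atomic_compare_exchange_strong(&cs->lock_count, &expected, 0))
            return;
    }
    /* Register as a contender; an old value of -1 means the lock was free. */
    if (atomic_fetch_add(&cs->lock_count, 1) == -1)
        return;
    /* Wait phase: sleep until the owner hands the lock over. */
    while (sem_wait(&cs->sem) != 0)
        ;  /* retry if the wait was interrupted by a signal */
}

void spin_cs_leave(spin_cs *cs)
{
    /* If other threads are registered as waiters, wake exactly one of them;
     * ownership passes directly to the thread that wakes up. */
    if (atomic_fetch_sub(&cs->lock_count, 1) > 0)
        sem_post(&cs->sem);
}

typedef struct {
    spin_cs *cs;
    long *counter;
    int iters;
} spin_job;

static void *spin_worker(void *arg)
{
    spin_job *job = arg;

    for (int i = 0; i < job->iters; i++) {
        spin_cs_enter(job->cs);
        ++*job->counter;  /* protected by the critical section */
        spin_cs_leave(job->cs);
    }
    return NULL;
}

/* Runs up to 16 threads that each increment a shared counter iters times. */
long spin_cs_demo(int nthreads, int iters, unsigned spin_count)
{
    pthread_t tids[16];
    spin_cs cs;
    long counter = 0;
    spin_job job = { &cs, &counter, iters };
    int rc = spin_cs_init(&cs, spin_count);

    assert(rc == 0);
    if (nthreads > 16)
        nthreads = 16;
    for (int i = 0; i < nthreads; i++) {
        rc = pthread_create(&tids[i], NULL, spin_worker, &job);
        assert(rc == 0);
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    sem_destroy(&cs.sem);
    return counter;
}
```

With a spin count of 0 the sketch goes straight to the wait phase, so a waiting thread never polls; a larger spin count only bounds how long it polls before sleeping.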
Dmitry Timoshkov dmitry@codeweavers.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |INVALID
--- Comment #7 from Dmitry Timoshkov dmitry@codeweavers.com 2011-03-22 04:21:38 CDT --- (In reply to comment #6)
> As you can see, Windows uses a kernel synchronization object to wait for the thread that owns the lock. libpthread realizes the same behaviour (except for the spinning). So why should WineLib have a very different, suboptimal behaviour?
Because Wine implements the Win32 API and not the POSIX one. Spinning is an important part of the implementation.
Dmitry Timoshkov dmitry@codeweavers.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |CLOSED
--- Comment #8 from Dmitry Timoshkov dmitry@codeweavers.com 2011-03-22 04:21:50 CDT --- Closing.
--- Comment #9 from Dmitry Timoshkov dmitry@codeweavers.com 2011-03-22 04:23:09 CDT --- And calling the Wine implementation "suboptimal" deserves at least some argumentation.
--- Comment #10 from Vladimir Voznesensky vvoznesensky@gmail.com 2011-03-22 05:14:26 CDT --- Dear Dmitry,
1. Windows uses a synchronization object to prevent a waiting thread from consuming too many CPU cycles: by waiting on this object, the thread gives the CPU to other running threads. That's what the MSDN doc says. Am I right?
2. Winelib does not use a kernel object, so it:
2.a: Does not behave the way the Windows API does;
2.b: Is suboptimal because of busy waits.
3. Linux pthread goes into a kernel wait after the first futex check, without spinning, but, anyway, this realization is much less suboptimal (if at all) than the Windows one.
--- Comment #11 from Dmitry Timoshkov dmitry@codeweavers.com 2011-03-22 05:33:25 CDT --- (In reply to comment #10)
> 1. Windows uses a synchronization object to prevent a waiting thread from consuming too many CPU cycles: by waiting on this object, the thread gives the CPU to other running threads. That's what the MSDN doc says. Am I right?
> 2. Winelib does not use a kernel object, so it:
> 2.a: Does not behave the way the Windows API does;
Kernel objects in Wine are implemented in wineserver, which makes them behave like kernel-side objects, but that doesn't really matter.
> 2.b: Is suboptimal because of busy waits.
Where do you see busy waits? Using spin counts? If yes, that's exactly how spinning is supposed to work.
--- Comment #12 from Henri Verbeet hverbeet@gmail.com 2011-03-22 05:44:44 CDT --- (In reply to comment #10)
> 2. Winelib does not use a kernel object, so it:
> 2.a: Does not behave the way the Windows API does;
> 2.b: Is suboptimal because of busy waits.
Please read RtlEnterCriticalSection() more carefully. Pay particular attention to wait_semaphore().
--- Comment #13 from Vladimir Voznesensky vvoznesensky@gmail.com 2011-03-22 06:21:26 CDT --- wait_semaphore is not in RtlEnterCriticalSection itself, but it is in RtlpWaitForCriticalSection, which is called from the former function. Thank you.