I'm looking into Bug 15323, and it seems to come down to a particular undocumented behaviour of CreateThread on Windows, which is probably also a bug in the usage of it by warpatch.bin, the Warhammer Online patcher.
Problem code outline: Main thread: flag = false; createThread( newThreadFun ) sleep until flag = false;
newThreadFun: flag = true CreateHiddenWindowAndWaitForMessageFromMainThread; flag = false; return;
On inspection, the only way this can work is if the main thread's "sleep until" test is done before newThreadFun gets started.
And this is what we see. If the sleep is hit, the patcher never shows any windows or continues on. If the patcher works, we _never_ slept.
I've attached a hack patch against kernel32's thread test suite which reduces it to a single test of this behaviour.
Under Windows, the test always passes, at least in my limited testing.
Under Wine, the test usually fails if wineserver is not already running, and usually passes if wineserver is already running. However, both cases can also give the opposite result every so often. This matches the observed problems starting the Warhammer Online patcher.
So anyway, I guess the question is, am I right about this behaviour or have I overlooked something relevant?
And the follow-up question, if this is a threading behaviour difference between Wine and MS Win32, should Wine replicate it?
And then the follow-up to the follow-up... How would this be replicated?
Maybe a new thread should start at a slightly lower priority than its parent thread? Or simply have ntdll/thread.c's start_thread call sched_yield(); before call_thread_func.
I tried this last idea, and the test passed more often, but still failed frequently with no wineserver running. It also improved the chances of warpatch starting correctly (I only had one failure in about ten tries, rather than nine failures in ten tries).
On Sun, Mar 01, 2009 at 06:08:53PM +1100, Paul TBBle Hampson wrote:
I'm looking into Bug 15323, and it seems to come down to a particular undocumented behaviour of CreateThread on Windows, which is probably also a bug in the usage of it by warpatch.bin, the Warhammer Online patcher.
Problem code outline: Main thread: flag = false; createThread( newThreadFun ) sleep until flag = false;
newThreadFun: flag = true CreateHiddenWindowAndWaitForMessageFromMainThread; flag = false; return;
On inspection, the only way this can work is if the main thread's "sleep until" test is done before newThreadFun gets started.
Isn't that backwards? Doesn't the above rely on the new thread pre-empting the main thread? (The above example seems to be implementing a function call!)
In any case the above example is doomed to failure. I don't expect that windows defines which thread runs first. And on a multi-processor system they are likely to both run.
So I'd guess that warpatch.bin or broken.
David
On Mon, Mar 2, 2009 at 12:25 AM, David Laight david@l8s.co.uk wrote:
On Sun, Mar 01, 2009 at 06:08:53PM +1100, Paul TBBle Hampson wrote:
I'm looking into Bug 15323, and it seems to come down to a particular undocumented behaviour of CreateThread on Windows, which is probably also a bug in the usage of it by warpatch.bin, the Warhammer Online patcher.
Problem code outline: Main thread: flag = false; createThread( newThreadFun ) sleep until flag = false;
newThreadFun: flag = true CreateHiddenWindowAndWaitForMessageFromMainThread; flag = false; return;
On inspection, the only way this can work is if the main thread's "sleep until" test is done before newThreadFun gets started.
Isn't that backwards? Doesn't the above rely on the new thread pre-empting the main thread? (The above example seems to be implementing a function call!)
In any case the above example is doomed to failure. I don't expect that windows defines which thread runs first. And on a multi-processor system they are likely to both run.
So I'd guess that warpatch.bin or broken.
David
-- David Laight: david@l8s.co.uk
Msdn sayes: "The ExitProcess, ExitThread, CreateThread, CreateRemoteThread functions, and a process that is starting (as the result of a call by CreateProcess) are serialized between each other within a process. Only one of these events can happen in an address space at a time. This means that the following restrictions hold:
* During process startup and DLL initialization routines, new threads can be created, but they do not begin execution until DLL initialization is done for the process. * Only one thread in a process can be in a DLL initialization or detach routine at a time. * ExitProcess does not complete until there are no threads in their DLL initialization or detach routines."
So maybe this hidden synchronization is causing also newly created thread not to run until CreateThread has exited.
Does wine even follow these document restrictions?
On Mon, Mar 02, 2009 at 12:29:14AM +0200, Pauli Nieminen wrote:
Msdn sayes: "The ExitProcess, ExitThread, CreateThread, CreateRemoteThread functions, and a process that is starting (as the result of a call by CreateProcess) are serialized between each other within a process. Only one of these events can happen in an address space at a time. This means that the following restrictions hold:
* During process startup and DLL initialization routines, new
threads can be created, but they do not begin execution until DLL initialization is done for the process. * Only one thread in a process can be in a DLL initialization or detach routine at a time. * ExitProcess does not complete until there are no threads in their DLL initialization or detach routines."
So maybe this hidden synchronization is causing also newly created thread not to run until CreateThread has exited.
How would I know if one of these was the case? 'cause if that's what's going on, it'd explain the problem quite neatly.
I should mention that the actual code being run here is inside a dll file (libpatchui.dll) not the main warpatch.bin.
On Mon, Mar 2, 2009 at 5:28 AM, Paul TBBle Hampson Paul.Hampson@pobox.com wrote:
On Mon, Mar 02, 2009 at 12:29:14AM +0200, Pauli Nieminen wrote:
Msdn sayes: "The ExitProcess, ExitThread, CreateThread, CreateRemoteThread functions, and a process that is starting (as the result of a call by CreateProcess) are serialized between each other within a process. Only one of these events can happen in an address space at a time. This means that the following restrictions hold:
* During process startup and DLL initialization routines, new threads can be created, but they do not begin execution until DLL initialization is done for the process. * Only one thread in a process can be in a DLL initialization or detach routine at a time. * ExitProcess does not complete until there are no threads in their DLL initialization or detach routines."
So maybe this hidden synchronization is causing also newly created thread not to run until CreateThread has exited.
How would I know if one of these was the case? 'cause if that's what's going on, it'd explain the problem quite neatly.
I should mention that the actual code being run here is inside a dll file (libpatchui.dll) not the main warpatch.bin.
There seems to be at least partial synchronization for dll loading but threading part doesn't include this lock. It seems like wine doesn't need it so maybe it is enough to just simulate windows with following small patch.
For me above restriction just sounds like performance bottleneck for future CPUs with x cores where X>100 cores :)
Pauli
disclaimer: I didn't test this so might not be complete solution. If it isn't enough then you could try add lock calls to create thread and exit thread paths too.
diff --git a/dlls/ntdll/thread.c b/dlls/ntdll/thread.c index 87b9d57..88acc3e 100644 --- a/dlls/ntdll/thread.c +++ b/dlls/ntdll/thread.c @@ -466,7 +466,9 @@ static void start_thread( struct startup_info *info ) PRTL_THREAD_START_ROUTINE func = info->entry_point; void *arg = info->entry_arg; struct debug_info debug_info; - + ULONG magic; + + LdrLockLoaderLock(0, NULL, &magic); debug_info.str_pos = debug_info.strings; debug_info.out_pos = debug_info.output; thread_data->debug_info = &debug_info; @@ -480,6 +482,7 @@ static void start_thread( struct startup_info *info ) InsertHeadList( &tls_links, &teb->TlsLinks ); RtlReleasePebLock();
+ LdrUnlockLoaderLock(0, &magic); /* NOTE: Windows does not have an exception handler around the call to * the thread attach. We do for ease of debugging */ if (unhandled_exception_filter)
On Mon, Mar 02, 2009 at 06:51:06AM +0200, Pauli Nieminen wrote:
There seems to be at least partial synchronization for dll loading but threading part doesn't include this lock. It seems like wine doesn't need it so maybe it is enough to just simulate windows with following small patch.
disclaimer: I didn't test this so might not be complete solution. If it isn't enough then you could try add lock calls to create thread and exit thread paths too.
[patch]
This just caused all the threads and processes to fail to start, locking on the main process or something.
I'll play with this further if I have time, barring further comment on this thread, see if I can confirm that the behaviour I'm seeing is due to the code I'm looking at being somehow inside the dll attach code.
On Sun, Mar 01, 2009 at 10:25:32PM +0000, David Laight wrote:
On Sun, Mar 01, 2009 at 06:08:53PM +1100, Paul TBBle Hampson wrote:
I'm looking into Bug 15323, and it seems to come down to a particular undocumented behaviour of CreateThread on Windows, which is probably also a bug in the usage of it by warpatch.bin, the Warhammer Online patcher.
Problem code outline: Main thread: flag = false; createThread( newThreadFun ) sleep until flag = false;
newThreadFun: flag = true CreateHiddenWindowAndWaitForMessageFromMainThread; flag = false; return;
On inspection, the only way this can work is if the main thread's "sleep until" test is done before newThreadFun gets started.
Isn't that backwards? Doesn't the above rely on the new thread pre-empting the main thread? (The above example seems to be implementing a function call!)
Nope. If newThreadFun runs before the main thread checks flag, it sets the flag and spins waiting for messages (with no visible window or other way I know of to get messages) and the main thread spins on sleep.
If newThreadFun runs after mainthread has passed the sleep check, it sets the flag and spins waiting for messages with no visible window. However, the main thread pops up a window, and later on unhides the thread's window (or tells the thread to do it. I don't know exactly).
In any case the above example is doomed to failure. I don't expect that windows defines which thread runs first. And on a multi-processor system they are likely to both run.
Not that I could replicate, on my Core2Duo laptop running XP x64.
So I'd guess that warpatch.bin or broken.
That's my guess, but it'd be nice to be able to say for sure before I chase it up with Mythic, given that it's taken two weeks for the last issue I raised with them to get elevated to someone who didn't suggest I create my Windows XP Administrator account and resend them another dxdiag.