Dear Andi,
thanks for your advice.
I took backtraces with gdb; most of the time I got the first result appended below, sometimes the second one. Wine gets stuck in PeekMessageA. There are two wine processes, both calling WaitMessage and PeekMessageA (both of which need the X11 lock). Sometimes the calls are simultaneous (probably this is possible because each process has its own X11 lock, isn't it?). Everything works fine as long as these simultaneous calls are block structured, i.e. of the form:
P1 calls, P2 calls, P2 returns, P1 returns.
But the deadlock appears when the simultaneous calls are not block structured in this way, but interleaved:
P1 calls, P2 calls, P1 returns, P1 calls, P2 returns, P2 calls
Actually, just such a sequence (with some more inessential calls in between) happens just before the deadlock. Then, the last calls of both P1 and P2 never return.
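To make precise what I mean by "block structured", here is a minimal sketch (Python, not Wine code; the function name and the event encoding are my own invention for illustration). It treats calls and returns like opening and closing brackets and checks that every return closes the most recent call that is still open:

```python
def is_block_structured(events):
    """Return True if the call/return events nest like brackets.

    events is a list of (process, kind) pairs, kind being "call" or "ret".
    Calls that are still open at the end of the list are allowed.
    """
    open_calls = []  # stack of processes whose call has not returned yet
    for proc, kind in events:
        if kind == "call":
            open_calls.append(proc)
        elif not open_calls or open_calls.pop() != proc:
            return False  # this return overtook a more recent call
    return True


# The two sequences described above.
nested = [("P1", "call"), ("P2", "call"), ("P2", "ret"), ("P1", "ret")]
interleaved = [("P1", "call"), ("P2", "call"), ("P1", "ret"),
               ("P1", "call"), ("P2", "ret"), ("P2", "call")]

if __name__ == "__main__":
    print(is_block_structured(nested))       # nested case
    print(is_block_structured(interleaved))  # interleaved case
```

The first sequence passes this check and the second fails at "P1 returns". This only classifies the pattern in the trace; it says nothing about whether the pattern is the cause of the deadlock.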
Now my question is: does this "interleaved" calling cause the deadlock, or are interleaved calls as such harmless, and the problem is caused by the application, i.e. P1 and P2 are not interacting properly? (But note that the program runs without problems under Windows.)
Greetings, Till
------------------------------------------ excerpt from relay trace:
080b9c48:Call user32.WaitMessage() ret=1801776a
08074060:Call user32.PeekMessageA(406e2668,00000000,00000000,00000000,00000000) ret=004072c9
08074060:Ret user32.PeekMessageA() retval=00000000 ret=004072c9
08074060:Call user32.PeekMessageA(406e2668,00000000,00000000,00000000,00000000) ret=004072c9
080b9c48:Ret user32.WaitMessage() retval=00000001 ret=1801776a
080b9c48:Call user32.PeekMessageA(48c12c88,00000000,00000000,00000000,00000001) ret=180177b6
080b9c48:Ret user32.PeekMessageA() retval=00000000 ret=180177b6
080b9c48:Call user32.WaitMessage() ret=1801776a
08074060:Ret user32.PeekMessageA() retval=00000000 ret=004072c9
08074060:Call user32.PeekMessageA(406e2668,00000000,00000000,00000000,00000000) ret=004072c9
----------------------------------------- gdb backtrace results:
#0 0x40312cd4 in read () from /lib/libc.so.6
#1 0x40102c84 in __DTOR_END__ () from /usr/local/lib/libntdll.dll.so
#2 0x400c3c8a in WaitForMultipleObjectsEx (count=1, handles=0x406e2414, wait_all=0, timeout=60000, alertable=0) at ../../scheduler/synchro.c:265
#3 0x400c3aa3 in WaitForSingleObject (handle=892, timeout=60000) at ../../scheduler/synchro.c:205
#4 0x400ccaa0 in RtlpWaitForCriticalSection (crit=0x40c82b04) at critsection.c:123
#5 0x400ccc2b in RtlEnterCriticalSection (crit=0x40c82b04) at critsection.c:173
#6 0x40c6fdec in wine_tsx11_lock () at x11drv_main.c:140
#7 0x40c61d0a in X11DRV_MsgWaitForMultipleObjectsEx (count=0, handles=0x0, timeout=0, mask=0, flags=0) at event.c:168
#8 0x40a12f13 in PeekMessageW (msg_out=0x406e2668, hwnd=0, first=0, last=0, flags=0) at message.c:2068
#9 0x40a1306b in PeekMessageA (msg=0x406e2668, hwnd=0, first=0, last=0, flags=0) at message.c:2126
#10 0x004072c9 in ?? ()
#11 0x00407884 in ?? ()
#12 0x0040a9fc in ?? ()
#13 0x400bf758 in start_process () at ../../scheduler/process.c:564
#14 0x400c3e0d in call_on_thread_stack (func=0x400bf510) at ../../scheduler/sysdeps.c:112

#0 0x402ef421 in nanosleep () from /lib/libc.so.6
#1 0x40319a9e in usleep () from /lib/libc.so.6
#2 0x4116e0de in TIME_MMSysTimeThread (arg=0x403e1810) at time.c:174
#3 0x400c4bed in THREAD_Start () at ../../scheduler/thread.c:267
#4 0x400c3ef8 in SYSDEPS_StartThread (teb=0x47833000) at ../../scheduler/sysdeps.c:165
------------------------------------------------------------------ some more gdb backtraces got after killing the active processes:
#0 0x40312cd4 in read () from /lib/libc.so.6
#1 0x40102c84 in __DTOR_END__ () from /usr/local/lib/libntdll.dll.so
#2 0x400c3eaa in WaitForMultipleObjectsEx (count=0, handles=0x0, wait_all=0, timeout=520, alertable=0) at ../../scheduler/synchro.c:265
#3 0x400c3c51 in Sleep (timeout=520) at ../../scheduler/synchro.c:186
#4 0x00407100 in ?? ()
#5 0x00407254 in ?? ()
#6 0x00407884 in ?? ()
#7 0x0040a9fc in ?? ()
#8 0x400bf7f8 in start_process () at ../../scheduler/process.c:564
#9 0x400c402d in call_on_thread_stack (func=0x400bf5b0) at ../../scheduler/sysdeps.c:112

#0 0x40312cd4 in ?? ()
#1 0x400c3eaa in WaitForMultipleObjectsEx (count=2, handles=0x47a62e34, wait_all=0, timeout=4294967295, alertable=0) at ../../scheduler/synchro.c:265
#2 0x400c3d28 in WaitForMultipleObjects (count=2, handles=0x47a62e34, wait_all=0, timeout=4294967295) at ../../scheduler/synchro.c:225
#3 0x1a00e6ba in ?? ()
#4 0x0000e853 in ?? ()

#0 0x40318cc4 in ?? ()
#1 0x40ecb695 in _X11TransBytesReadable () from /usr/X11R6/lib/libX11.so.6
#2 0x40eaea80 in _XEventsQueued () from /usr/X11R6/lib/libX11.so.6
#3 0x40e92194 in XCheckTypedWindowEvent () from /usr/X11R6/lib/libX11.so.6
#4 0x40c6ce08 in X11DRV_GetClipboardData (wFormat=13) at clipboard.c:1138
#5 0x40c6cb22 in X11DRV_IsClipboardFormatAvailable (wFormat=13) at clipboard.c:1027
#6 0x409cadf7 in CLIPBOARD_RenderFormat (lpFormat=0x40a3fc74) at ../../windows/clipboard.c:423
#7 0x409cb21f in CLIPBOARD_RenderText (wFormat=1) at ../../windows/clipboard.c:621
#8 0x409cbd45 in GetClipboardData (wFormat=1) at ../../windows/clipboard.c:1060
#9 0x18003127 in ?? ()
#10 0x47fdbde7 in ?? ()
#11 0x1a048609 in ?? ()
#12 0x1a048609 in ?? ()
#13 0x1a0485fe in ?? ()
#14 0x1a04ac35 in ?? ()
#15 0x1a0485fe in ?? ()
#16 0x1a04ac35 in ?? ()
#17 0x1a0485fe in ?? ()
#18 0x1a0485fe in ?? ()
#19 0x1a0485fe in ?? ()
#20 0x1a0485fe in ?? ()
#21 0x1a048609 in ?? ()
#22 0x1a048609 in ?? ()
#23 0x1a048609 in ?? ()
#24 0x1a04ac35 in ?? ()
#25 0x1a04ac35 in ?? ()
#26 0x1a04ac35 in ?? ()
#27 0x1a0485fe in ?? ()
#28 0x1a048609 in ?? ()
#29 0x1a048609 in ?? ()
#30 0x1a04ac35 in ?? ()
#31 0x1a048170 in ?? ()
#32 0x486d2e34 in ?? ()
#33 0x2ee3db9b in ?? ()
Cannot access memory at address 0xc308c483

#0 0x40312cd4 in ?? ()
#1 0x400c3eaa in WaitForMultipleObjectsEx (count=2, handles=0x489c2c0c, wait_all=0, timeout=4294967295, alertable=0) at ../../scheduler/synchro.c:265
#2 0x400c3d28 in WaitForMultipleObjects (count=2, handles=0x489c2c0c, wait_all=0, timeout=4294967295) at ../../scheduler/synchro.c:225
#3 0x1a00e6ba in ?? ()
#4 0x0000e853 in ?? ()
Cannot access memory at address 0x56e58955
---------------------------------------------------------
Andreas Mohr wrote:
On Thu, Oct 17, 2002 at 10:01:33AM +0200, Till Mossakowski wrote:
Hi,
I have a problem with the X11 driver (it seems), related to bug 1027. When I run my program (Fitch.exe), wine gets stuck. All threads do a WaitForMultipleObjects, the main thread a WaitMessage. And I get an error
err:ntdll:RtlpWaitForCriticalSection section 0x40af4b04 "x11drv_main.c: X11DRV_CritSection" wait timed out, retrying (60 sec) tid=080b8780
I read the FAQ about critical section errors, and it helped me to get the following call stack:
KERNEL32.DLL.WaitForMultipleObjectsEx+0x16a
X11DRV.DLL.MsgWaitForMultipleObjectsEx+0xf3
USER32.DLL.MsgWaitForMultipleObjectsEx+0x14f
USER32.DLL.WaitMessage+0x22
WINAWT.DLL.PrintEmbeddedFrame+0x7ca
JAVAI.DLL.mmiFrameMethod+0x3625
JAVAI.DLL.mmiFrameMethod+0xb60
COMDLG32.DLL.PrintDlgExW+0x2a2dc0
JPEG.DLL..reloc+0x12e2cb9b
*** Invalid address 0xc308c483 (IBMJITC.DLL..reloc+0x1df5f482)
(Indeed, Fitch is written in Java; unfortunately, it is not possible to run it directly under Linux. I also tried to get diagnostic messages from the Java runtime system, but it seems that all classes are loaded correctly, and only then does the error occur.)
When I set "Synchronous" = "Y" in the x11drv section of config, everything works, but *very* slowly. Sometimes it takes 15 seconds until a mouse click leads to the desired action. I also played with the other x11drv settings in the config file, but without effect.
The documentation says that one should set "Synchronous" = "Y" for debugging purposes. But how should I proceed from there?
This seems to be the classical deadlock problem. If you add synchronous X11 request processing, then the X11 handling is slower, thus it's not activating a lock when some other thread also (tries to) activate(s) a lock. Could you find out which wine process is stuck exactly where by using the standard gdb attach debugging ?
gdb wine
attach <pid>
backtrace
(using current CVS, of course...)
The WINAWT.DLL.PrintEmbeddedFrame+0x7ca and then the WaitMessage could mean that it's trying to print something and then probably Wine does some WaitMessage and gets stuck as some other thread is already holding the X11 lock and can't proceed for some other reason (since it's trying to enter a different lock already established by our thread ?).
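As a rough illustration of that guess (Python, not Wine code; `x11_lock`, `other_lock` and the function name are placeholders, and nothing here is confirmed by the traces), this is the classical pattern I have in mind: two threads take the same two locks in opposite order. The sketch probes the second lock with a non-blocking acquire, so it reports the situation instead of hanging:

```python
import threading


def run_opposite_order_demo():
    """Two threads take two locks in opposite order; report what each got."""
    x11_lock = threading.Lock()    # stands in for the X11 critical section
    other_lock = threading.Lock()  # hypothetical second lock
    rendezvous = threading.Barrier(2, timeout=5)
    got_second = {}

    def worker(name, first, second):
        with first:
            rendezvous.wait()  # both threads now hold their first lock
            # A blocking acquire here would wait forever; probe instead.
            acquired = second.acquire(blocking=False)
            got_second[name] = acquired
            rendezvous.wait()  # keep the first lock until both have probed
            if acquired:
                second.release()

    threads = [
        threading.Thread(target=worker, args=("A", x11_lock, other_lock)),
        threading.Thread(target=worker, args=("B", other_lock, x11_lock)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return got_second


if __name__ == "__main__":
    print(run_opposite_order_demo())
```

Neither thread gets its second lock here; with blocking acquires both would wait on each other indefinitely. If the backtraces show two threads each waiting on a lock the other one holds, that would support this theory.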
Should be rather useful to get this deadlock nailed...
Iff WaitMessage() indeed gets called by Wine and not the program itself, then it's either in windows/win.c or controls/menu.c (maybe add some debug traces there to confirm this invocation).
Good luck ! ;-)