Hi,
I seem to have found a reproducible threading problem, and I am not convinced it is Wine's fault either. Are there any experts here who can offer advice on where to go to pursue this, please?
Problem: 'wine appname' hangs when it creates one of its threads. The hang is inside pthread_create, and control is never returned to the caller.
Environment: uname -a reports kernel 2.4.21-0.13mdk; it's basically Mandrake Linux 9.1 with no updates. Wine is CVS as of today, configured with --with-nptl.
More info: Looking through an strace, I can see a difference between a working and a failing case, and there appears to be a timing issue inside the pthread routines. Normally I would expect a Wine bug to be far more likely than a kernel or threading-library bug, so I don't really know how to proceed. I can reproduce this about 95% of the time with one app and one app only, so I really suspect a Wine problem, but I just can't see why.
Here's a working case
=====================
(Note: SY 3 and SY 4 are fixmes I added on each side of the pthread_create.)

394 issues a create thread.
[pid 394] write(2, "fixme:thread:SYSDEPS_SpawnThread"..., 38fixme:thread:SYSDEPS_SpawnThread SY 3) = 38
[pid 394] rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
[pid 394] write(12, "\240@4@\0\0\0\0\230,s@\240\177\v@\0000\20J\0\0\0\200\0"..., 148 <unfinished ...>
: 398 is the pthread thread manager thread
[pid 398] getppid() = 394
[pid 398] read(10, "\240@4@\0\0\0\0\230,s@\240\177\v@\0000\20J\0\0\0\200\0"..., 148) = 148
[pid 398] clone(child_stack=0x4a102bd0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|0x21) = 400
: It wakes up 394 to indicate it can continue
[pid 398] kill(394, SIGRTMIN) = 0
:
[pid 394] <... write resumed> ) = 148
[pid 394] rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
[pid 394] rt_sigsuspend([] <unfinished ...>
*** sigsuspend gets back a SIGRTMIN to wake it up.
[pid 394] --- SIGRTMIN (Unknown signal 32) @ 0 (0) ---
[pid 394] <... rt_sigsuspend resumed> ) = -1 EINTR (Interrupted system call)
[pid 394] sigreturn() = ? (mask now [RTMIN])
[pid 394] write(2, "fixme:thread:SYSDEPS_SpawnThread"..., 38fixme:thread:SYSDEPS_SpawnThread SY 4

Failing case
============

394 issues a create thread.
[pid 394] write(2, "fixme:thread:SYSDEPS_SpawnThread"..., 38fixme:thread:SYSDEPS_SpawnThread SY 3) = 38
[pid 394] rt_sigprocmask(SIG_SETMASK, NULL, [], 8) = 0
[pid 394] write(12, "\240@4@\0\0\0\0$&s@\240\177\v@\0000xJ\0\0\0\0\0\0\0\0`"..., 148 <unfinished ...>
: 398 is the pthread thread manager thread
[pid 398] <... poll resumed> [{fd=10, events=POLLIN, revents=POLLIN}], 1, 2000) = 1
[pid 398] getppid() = 394
[pid 398] read(10, "\240@4@\0\0\0\0$&s@\240\177\v@\0000xJ\0\0\0\0\0\0\0\0`"..., 148) = 148
[pid 398] clone(child_stack=0x4a782bd0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|0x21) = 401
: It wakes up 394 to indicate it can continue
[pid 398] kill(394, SIGRTMIN) = 0
: (Now for the difference...)
[pid 394] <... write resumed> ) = 148
: vvv What is this, and why is it getting the signal?
[pid 394] --- SIGRTMIN (Unknown signal 32) @ 0 (0) ---
[pid 394] sigreturn() = ? (mask now [])
: vvv Since the signal is absorbed already, suspend just hangs forever.
[pid 394] rt_sigprocmask(SIG_SETMASK, NULL, [], 8) = 0
[pid 394] rt_sigsuspend([] <unfinished ...>
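
For reference, here is a minimal sketch of the wait pattern as I understand it, not the actual pthread library code. In the working trace the rt_sigprocmask query returns [RTMIN], while in the failing trace it returns [], which suggests the wakeup signal is not blocked in the failing case and so can be delivered before rt_sigsuspend is entered. The sketch assumes a single-threaded test program, uses SIGUSR1 as a stand-in for SIGRTMIN, and simulates the "early" wakeup with raise(); the names on_wakeup and wait_with_signal_blocked are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Set by the handler once the wakeup signal has been delivered. */
static volatile sig_atomic_t woken = 0;

static void on_wakeup(int sig)
{
    (void)sig;
    woken = 1;
}

/*
 * Hypothetical demo: returns 1 if the early wakeup stayed pending and was
 * then consumed by sigsuspend, 0 if it was delivered too early, -1 on error.
 */
int wait_with_signal_blocked(void)
{
    const int sig = SIGUSR1;   /* stand-in for SIGRTMIN (assumption) */
    struct sigaction sa;
    sigset_t block, old, wait_mask;
    int delivered_early;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_wakeup;
    sigemptyset(&sa.sa_mask);
    if (sigaction(sig, &sa, NULL) != 0)
        return -1;

    /* Block the wakeup signal BEFORE anyone can send it. */
    sigemptyset(&block);
    sigaddset(&block, sig);
    if (sigprocmask(SIG_BLOCK, &block, &old) != 0)
        return -1;

    woken = 0;

    /* Simulate the wakeup arriving before we start waiting.
     * Because the signal is blocked, it stays pending instead of
     * being "absorbed" by the handler here. */
    raise(sig);
    delivered_early = woken;

    /* sigsuspend atomically installs a mask with the signal unblocked
     * and waits, so the pending wakeup is delivered inside the call. */
    wait_mask = old;
    sigdelset(&wait_mask, sig);
    while (!woken)
        sigsuspend(&wait_mask);

    /* Restore the original signal mask. */
    sigprocmask(SIG_SETMASK, &old, NULL);

    return delivered_early == 0 && woken == 1;
}
```

Calling wait_with_signal_blocked() from a small main() should return 1. If the sigprocmask(SIG_BLOCK, ...) call is removed and the wait is not guarded by a flag, the handler runs during raise() and a following sigsuspend would wait forever, which looks like what the failing trace shows.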

Any help? It's really annoying.

Jason