Hi,
I seem to have found a reproducible threading problem, and I am not convinced it is Wine's fault either. Are there any experts here who can offer advice on where to go to pursue this, please?
Problem: 'wine appname' hangs when it creates one of its threads. The hang is inside pthread_create, and control is never returned to the caller.
Environment: uname -a returns kernel 2.4.21-0.13mdk, and it's basically Mandrake Linux 9.1 with no updates. Wine is CVS as of today, configured with --with-nptl.
More info: Looking through an strace, I can see a difference between a working and a failing case, and it would appear there is a timing issue inside the pthread routines. However, I would normally expect a Wine bug to be far more likely than a kernel / threading bug, so I don't really know how to proceed. I can recreate this about 95% of the time with one app and one app only, so I really suspect it to be a Wine problem, but I just can't see why.
Here's a working case
=====================
(Note SY3 and SY4 are fixmes I added each side of the pthread_create)

394 issues a create thread.
[pid 394] write(2, "fixme:thread:SYSDEPS_SpawnThread"..., 38fixme:thread:SYSDEPS_SpawnThread SY 3
) = 38
[pid 394] rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
[pid 394] write(12, "\240@4@\0\0\0\0\230,s@\240\177\v@\0000\20J\0\0\0\200\0"..., 148 <unfinished ...>
:
398 is the pthread thread manager thread
[pid 398] getppid() = 394
[pid 398] read(10, "\240@4@\0\0\0\0\230,s@\240\177\v@\0000\20J\0\0\0\200\0"..., 148) = 148
[pid 398] clone(child_stack=0x4a102bd0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|0x21) = 400
:
It wakes up 394 to indicate it can continue
[pid 398] kill(394, SIGRTMIN) = 0
:
[pid 394] <... write resumed> ) = 148
[pid 394] rt_sigprocmask(SIG_SETMASK, NULL, [RTMIN], 8) = 0
[pid 394] rt_sigsuspend([] <unfinished ...>
*** sigsuspend gets back a SIGRTMIN to wake it up.
[pid 394] --- SIGRTMIN (Unknown signal 32) @ 0 (0) ---
[pid 394] <... rt_sigsuspend resumed> ) = -1 EINTR (Interrupted system call)
[pid 394] sigreturn() = ? (mask now [RTMIN])
[pid 394] write(2, "fixme:thread:SYSDEPS_SpawnThread"..., 38fixme:thread:SYSDEPS_SpawnThread SY 4

Failing case
============

394 issues a create thread.
[pid 394] write(2, "fixme:thread:SYSDEPS_SpawnThread"..., 38fixme:thread:SYSDEPS_SpawnThread SY 3
) = 38
[pid 394] rt_sigprocmask(SIG_SETMASK, NULL, [], 8) = 0
[pid 394] write(12, "\240@4@\0\0\0\0$&s@\240\177\v@\0000xJ\0\0\0\0\0\0\0\0`"..., 148 <unfinished ...>
:
398 is the pthread thread manager thread
[pid 398] <... poll resumed> [{fd=10, events=POLLIN, revents=POLLIN}], 1, 2000) = 1
[pid 398] getppid() = 394
[pid 398] read(10, "\240@4@\0\0\0\0$&s@\240\177\v@\0000xJ\0\0\0\0\0\0\0\0`"..., 148) = 148
[pid 398] clone(child_stack=0x4a782bd0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|0x21) = 401
:
It wakes up 394 to indicate it can continue
[pid 398] kill(394, SIGRTMIN) = 0
:
(Now for the difference...)
[pid 394] <... write resumed> ) = 148
:
vvv What is this, and why is it getting the signal
[pid 394] --- SIGRTMIN (Unknown signal 32) @ 0 (0) ---
[pid 394] sigreturn() = ? (mask now [])
:
vvv Since the signal is absorbed already, suspend just hangs forever.
[pid 394] rt_sigprocmask(SIG_SETMASK, NULL, [], 8) = 0
[pid 394] rt_sigsuspend([] <unfinished ...>
Any help? It's really annoying.
Jason
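The two traces differ in the mask reported by rt_sigprocmask just before the write: [RTMIN] in the working case, [] in the failing one. With the signal blocked, an early kill() stays pending until the wait begins. With it unblocked, the signal can be delivered before rt_sigsuspend is entered, and the wakeup is lost. The following is only a minimal sketch of the "block first, then wait" half of that pattern, written in Python rather than C for brevity. It assumes Linux, uses SIGUSR1 as a stand-in for SIGRTMIN, and the function name is hypothetical. It is not the code in the pthread library or in Wine.

```python
import signal
import threading

# Assumption: SIGUSR1 stands in for the SIGRTMIN restart signal in the trace.
WAKEUP = signal.SIGUSR1


def wait_for_early_wakeup(timeout=1.0):
    """Block WAKEUP, let it arrive "too early", then wait for it.

    Returns the signal number received, or None if the wait timed out.
    """
    # Block the signal first, so an early arrival is held pending instead
    # of being delivered (and consumed) before the wait starts.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {WAKEUP})
    try:
        # Simulate the manager thread's kill() winning the race:
        # the signal is sent before this thread starts waiting.
        signal.pthread_kill(threading.get_ident(), WAKEUP)
        # The signal is still pending, so the wait returns immediately.
        info = signal.sigtimedwait({WAKEUP}, timeout)
        return None if info is None else info.si_signo
    finally:
        # Restore whatever mask the caller had before.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
```

Calling wait_for_early_wakeup() should return the signal number rather than time out, because the blocked signal was queued instead of delivered. Without the initial block, the same early signal would hit the default handler before the wait, which matches the "absorbed already" behaviour in the failing trace.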
I assume you tried both with NPTL support compiled in and without, right?
Having said that, I've also seen weird freezes inside pthreads (with Rhythmbox in particular), despite most other apps working correctly. So there might be some other strange issue here. I'll try an strace on RB to see if it's a similar problem.
On Sun, 2003-08-10 at 16:33, Jason Edmeades wrote:
[snip]
Mike Hearn wrote:
I assume you tried both with NPTL support compiled in and without, right?
I've had other problems without NPTL support, so I haven't tried this exact scenario. I guess I need to do a configure, make clean, then a make for the whole of Wine to do this? If so, I'll put it off until I've fixed a graphical problem I am narrowing in on, and let you know!
Thanks,
Jason
"Jason Edmeades" us@the-edmeades.demon.co.uk wrote:
Environment: uname -a returns kernel 2.4.21-0.13mdk, and it's basically Mandrake Linux 9.1 with no updates. Wine is CVS as of today, configured with --with-nptl.
I may be wrong, but according to information posted to one of our support forums, NPTL support was backported from the 2.5 kernels by Red Hat, and nobody else has NPTL support in the 2.4 kernel series.
So you shouldn't compile --with-nptl on Mandrake.
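Whether a given system is actually running NPTL or the older LinuxThreads can be checked directly instead of inferred from the distribution. The separate thread manager (pid 398) in the trace is consistent with LinuxThreads, which uses such a manager thread. A small sketch, assuming a glibc that exposes the GNU_LIBPTHREAD_VERSION configuration string (the same value `getconf GNU_LIBPTHREAD_VERSION` prints); the function name is hypothetical:

```python
import os


def pthread_implementation():
    """Return glibc's reported threading library as a string,
    or None if the system does not expose that information."""
    name = "CS_GNU_LIBPTHREAD_VERSION"
    if name not in os.confstr_names:
        # Not a glibc system, or this Python build lacks the key.
        return None
    try:
        return os.confstr(name)
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    print(pthread_implementation())
```

A result beginning with "NPTL" indicates the newer library, and one beginning with "linuxthreads" indicates the older one. A result of None means the question has to be answered some other way.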
On Sun, 10.08.2003 at 17:33, Jason Edmeades wrote:
vvv What is this, and why is it getting the signal
[pid 394] --- SIGRTMIN (Unknown signal 32) @ 0 (0) ---
[pid 394] sigreturn() = ? (mask now [])
Perhaps an earlier sigprocmask erroneously leaves the signal unmasked?
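The rt_sigprocmask(SIG_SETMASK, NULL, ...) calls in the trace are pure queries: a NULL new mask changes nothing and only reports the current one. A rough sketch of the same kind of check, in Python for brevity, which could be dropped in at suspect points to see where a signal stops being blocked. The helper name is hypothetical and this is not Wine's or the pthread library's code:

```python
import signal


def is_blocked(sig):
    """Return True if `sig` is in the calling thread's signal mask."""
    # SIG_BLOCK with an empty set changes nothing and returns the
    # current mask, i.e. it is a query just like the ones in the trace.
    current = signal.pthread_sigmask(signal.SIG_BLOCK, set())
    return sig in current
```

For example, is_blocked(signal.SIGUSR1) reports False in a fresh process and True after signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1}) has been called in the same thread.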
I thought that too, but it didn't appear so. You have got me wondering, though, so I'll check again when I get back to Linux.
However, from the other replies I gather I don't need --with-nptl on Mandrake 9.1, so I'll try a complete recompile without it and see what other problems I have.
Jason