http://bugs.winehq.org/show_bug.cgi?id=21636
Summary: MJ12node.exe freezes after a while
Product: Wine
Version: 1.1.38
Platform: x86
URL: http://www.majestic12.co.uk/projects/dsearch/download.php
OS/Version: Linux
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: -unknown
AssignedTo: wine-bugs@winehq.org
ReportedBy: refic@psimerion.org
Created an attachment (id=26122) --> (http://bugs.winehq.org/attachment.cgi?id=26122) console output
After applying the patch from http://bugs.winehq.org/show_bug.cgi?id=21624 MJ12node.exe starts fine, but after running for a while it suddenly freezes. Console output attached. I can surely do more testing if needed.
Austin English austinenglish@gmail.com changed:
What    |Removed |Added
----------------------------------------------------------------------------
Keywords|        |download
Anastasius Focht focht@gmx.net changed:
What    |Removed |Added
----------------------------------------------------------------------------
CC      |        |focht@gmx.net
--- Comment #1 from Anastasius Focht focht@gmx.net 2010-02-08 13:24:18 --- Hello,
--- quote --- after running for a while it suddenly freezes --- quote ---
After start, is there any user interaction involved (clicking controls) until the time the app crashes? How long does it take? If you do any user interaction, please describe every step needed to reproduce it.
Please attach compressed trace log:
(remove any log.txt before that)
WINEDEBUG=+tid,+seh,+relay wine MJ12node.exe >>log.txt 2>&1
Regards
--- Comment #2 from Kari refic@psimerion.org 2010-02-08 15:27:36 --- Hi,
When I started the program for the first time I changed some options, entered registration details etc. (it never froze during this). Then I restarted it because one of the options requires a restart when changed. Then I simply let it run and do what it does. Sometimes the freeze occurs in less than a minute and sometimes after a few minutes. This time it took a few minutes and produced a very big log.txt.
Oh, one thing: when I start it a _second time_ the GUI won't draw (I see borders but the window is "empty") until I minimize and restore it. I also noticed that with those Wine debug options enabled it freezes immediately after minimizing (it won't restore). So when I tested it I just let it run and watched my network usage; when that went back to ~0 I knew the program had crashed.
Here's the log: http://refic.psimerion.org/log_freeze_1.txt.bz2
--- Comment #3 from Anastasius Focht focht@gmx.net 2010-02-08 17:17:10 --- Hello,
I see the app runs in crawler mode as soon as a user account is registered. Unfortunately the relay log doesn't show any suspicious stuff (those CLR exceptions are part of program flow).
I ran the app for several minutes, crawling several thousand URIs, but did not experience any hang or crash, so I guess it has something to do with your machine config again.
If it's heap corruption, it might be worthwhile to run the app with heap checking enabled... Beware, the log might get large and app startup/responsiveness will be very slow:
(remove any log.txt before that)
WINEDEBUG=+tid,+seh,+relay,+heap wine MJ12node.exe >>log.txt 2>&1
and attach/provide new log.
Regards
--- Comment #4 from Kari refic@psimerion.org 2010-02-09 02:49:16 --- This time the freeze occurred right after start, it didn't even crawl any URLs I think. Used a clean wineprefix and current git version of wine.
Here's the log: http://refic.psimerion.org/log_freeze_2.txt.bz2
--- Comment #5 from Kari refic@psimerion.org 2010-02-09 03:39:11 --- And here's the log from when it froze after running a few minutes: http://refic.psimerion.org/log_freeze_3.txt.bz2
--- Comment #6 from Anastasius Focht focht@gmx.net 2010-02-10 15:15:50 --- Hello,
unfortunately I can't reproduce it here. What kind of system are you on?
$ uname -a
$ /lib/libc.so.6
$ winegcc -v
Please provide additional server trace (make sure no lingering wineserver process exists prior):
WINEDEBUG=+tid,+seh,+loaddll,+process,+server wine ./MJ12node.exe >>trace.txt 2>&1
Terminate wineserver externally (wineserver -k) when you get the freeze.
---
The snippet where this occurs is always the same sequence:
From log2 (thread 0x2f gets nested sigs while being suspended):
--- snip ---
...
0009:Call KERNEL32.CreateThread(00000000,00000000,79ecafc5,001dbcd0,00000004,0033ea00) ret=79ecaf77
...
0009:Ret KERNEL32.CreateThread() retval=00000258 ret=79ecaf77
...
0039:Call KERNEL32.SuspendThread(00000258) ret=79f6e649
0039:Ret KERNEL32.SuspendThread() retval=00000000 ret=79f6e649
002f:err:seh:setup_exception_record nested exception on signal stack in thread 002f eip 7bc73480 esp 7ffbfc3c stack 0x7712000-0x7810000
0039:Call KERNEL32.GetThreadContext(00000258,0657e588) ret=79f6e75f
0039:Ret KERNEL32.GetThreadContext() retval=00000000 ret=79f6e75f
0039:Call KERNEL32.ResumeThread(00000258) ret=7a0dc002
0039:Ret KERNEL32.ResumeThread() retval=00000001 ret=7a0dc002
...
--- snip ---
From log3 (threads 0x28 and 0x31 get nested sigs while being suspended):
--- snip ---
0033:Call KERNEL32.CreateThread(00000000,00000000,79ecafc5,001db2f0,00000004,03b6e894) ret=79ecaf77
...
0033:Ret KERNEL32.CreateThread() retval=00000238 ret=79ecaf77
...
000b:Call KERNEL32.SuspendThread(00000238) ret=79f6e649
000b:Ret KERNEL32.SuspendThread() retval=00000000 ret=79f6e649
0028:err:seh:setup_exception_record nested exception on signal stack in thread 0028 eip 7bc73480 esp 7ffbfc3c stack 0x7672000-0x7770000
000b:Call KERNEL32.GetThreadContext(00000238,0572e588) ret=79f6e75f
000d:Call KERNEL32.GetLastError() ret=79e74ade
000d:Ret KERNEL32.GetLastError() retval=000003e5 ret=79e74ade
000d:Call KERNEL32.SetLastError(000003e5) ret=79e74af7
000d:Ret KERNEL32.SetLastError() retval=000003e5 ret=79e74af7
000d:Call KERNEL32.SetLastError(000003e5) ret=79e74aca
000d:Ret KERNEL32.SetLastError() retval=000003e5 ret=79e74aca
000d:Call KERNEL32.WaitForSingleObjectEx(000000d4,ffffffff,00000000) ret=79e77fd1
000b:Ret KERNEL32.GetThreadContext() retval=00000000 ret=79f6e75f
000b:Call KERNEL32.ResumeThread(00000238) ret=7a0dc002
000b:Ret KERNEL32.ResumeThread() retval=00000001 ret=7a0dc002
...
000b:Call KERNEL32.SuspendThread(0000034c) ret=79f6e649
000b:Ret KERNEL32.SuspendThread() retval=00000000 ret=79f6e649
0031:err:seh:setup_exception_record nested exception on signal stack in thread 0031 eip 7bc73480 esp 7ffa7c3c stack 0x68a2000-0x69a0000
000b:Call KERNEL32.GetThreadContext(0000034c,0572e588) ret=79f6e75f
000b:Ret KERNEL32.GetThreadContext() retval=00000000 ret=79f6e75f
000b:Call KERNEL32.ResumeThread(0000034c) ret=7a0dc002
000b:Ret KERNEL32.ResumeThread() retval=00000001 ret=7a0dc002
...
--- snip ---
Since bug 10338 the thread signal stack has been increased to 16K minimum which ought to be enough.
The API call sequence SuspendThread() + GetThreadContext() makes sure the target (remote) thread is really stopped before critical data structures are accessed (structures that might be manipulated if the target thread were still running). I remember having read that SuspendThread() queues an APC for the target (remote) thread, so it is not guaranteed that a user thread is really stopped after the call returns (especially on multi-processor machines), hence the additional GetThreadContext().
Now the question is what goes wrong that lets the signal stack get eaten up... it might even be a broken kernel+glibc combo.
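For anyone digging through similar relay logs, here is a minimal sketch for listing the threads that hit the "nested exception on signal stack" error. It assumes the log was produced with +tid, so that every line starts with the thread ID followed by a colon. The function name nested_sig_threads is made up for illustration and is not part of Wine.

```shell
# Hypothetical helper (not part of Wine): print the unique thread IDs that
# logged "nested exception on signal stack" in a +tid relay log.
# $1 = path to the (uncompressed) log file
nested_sig_threads() {
    grep 'nested exception on signal stack' "$1" \
        | cut -d: -f1 \
        | sort -u
}
```

Called as "nested_sig_threads log.txt" it prints one thread ID per line. To see the surrounding SuspendThread()/GetThreadContext() calls, running grep with -n -B2 -A4 and the same pattern should show a few lines of context around each hit.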
Regards
--- Comment #7 from Kari refic@psimerion.org 2010-02-10 15:35:59 --- Thanks for working on this.
Here's the log you asked for, freeze occurred right after start: http://refic.psimerion.org/trace_1.txt.bz2
And here's some system info: http://refic.psimerion.org/system.txt
--- Comment #8 from Anastasius Focht focht@gmx.net 2010-02-10 17:18:59 --- Created an attachment (id=26193) --> (http://bugs.winehq.org/attachment.cgi?id=26193) trace setup_exception_record (only for diagnosis)
Hello,
can you apply the attached patch to wine 1.1.38 or later (GIT head)? It doesn't fix the problem, it just emits more trace output for me to get to know the eip range causing page faults before the thread gets killed.
Put it into your wine source folder and apply:
$ patch -p1 < except_trace.patch
recompile and make install (if you don't run wine loader directly from build folder).
Generate a new log with:
WINEDEBUG=+tid,+seh,+relay wine ./MJ12node.exe >>ex_trace.txt 2>&1
wait for freeze and compress/attach it (or provide download location as usual).
Regards
--- Comment #9 from Kari refic@psimerion.org 2010-02-11 01:20:03 --- And here it is: http://refic.psimerion.org/ex_trace.txt.bz2
Applied your patch to current git head and recompiled.
--- Comment #10 from Kari refic@psimerion.org 2010-02-13 02:15:49 --- I have no idea about this, just a thought, but could this be related to http://bugs.winehq.org/show_bug.cgi?id=20380 ?
--- Comment #11 from Anastasius Focht focht@gmx.net 2010-02-24 12:51:36 --- Hello,
--- quote --- I too would like to confirm the effectiveness of the patch attached to comment #81. It fixes a sudden freezing of a .NET 2.0 app "MJ12node" (see bug #21636). Not sure if it's sound related though because MJ12node does not produce any sound. I can also be totally wrong about this, but the patch fixes the problem anyway. Perhaps someone should investigate this. --- quote ---
if the signal mask patch works for you, this bug should be marked dupe of bug 20380
Regards
--- Comment #12 from Kari refic@psimerion.org 2010-04-17 04:30:01 --- Many thanks for your help Anastasius. I've been looking into this issue again and it seems that the patch didn't actually help that much: now it dies with a segmentation fault instead after running for some time. Sorry for being too hasty about it.
I'm using Linux 2.6.34-rc4 which has a fix for the signal issue (bug #20380) but it seems that it doesn't change anything over here.
--- Comment #13 from Kari refic@psimerion.org 2010-05-15 09:23:31 --- Small update:
Using kernel 2.6.34-rc7 and latest Wine from git the node no longer freezes after a while, it just stops doing anything instead. Menus can be accessed, the window can be moved etc, but otherwise it's dead.
Maybe it has something to do with too many concurrent connections? I've set the node so that it can use 300 concurrent connections and 50mbit bandwidth when crawling. I should probably test it with lighter settings too..
One thing that I noticed from the node's own "console": it said an error about having too many open files or something like that. I've never seen it on Windows so it's most probably caused by Wine.
It seems to run pretty damn well before it stops though, so I hope it'll get fixed some day! And I'll surely help any way I can.
Anastasius Focht focht@gmx.net changed:
What       |Removed                     |Added
----------------------------------------------------------------------------
Status     |UNCONFIRMED                 |RESOLVED
Resolution |                            |INVALID
Summary    |MJ12node.exe freezes after  |MJ12node.exe freezes after
           |a while                     |a while (Linux kernel
           |                            |signal delivery bug,
           |                            |reported fixed 2.6.34-rc7+)
--- Comment #14 from Anastasius Focht focht@gmx.net 2010-08-18 11:41:44 --- Hello,
revisiting.
--- quote --- Using kernel 2.6.34-rc7 and latest Wine from git the node no longer freezes after a while, it just stops doing anything instead. Menus can be accessed, the window can be moved etc, but otherwise it's dead.
Maybe it has something to do with too many concurrent connections? I've set the node so that it can use 300 concurrent connections and 50mbit bandwidth when crawling. I should probably test it with lighter settings too..
One thing that I noticed from the node's own "console": it said an error about having too many open files or something like that. I've never seen it on Windows so it's most probably caused by Wine. --- quote ---
Well, even if the original problem reported was not a dupe of bug 20380, it was a similar, related kernel bug (signal delivery), which seems to be fixed in recent Linux kernel versions, based on your last comment.
Regarding the "too many open files" messages: it's outside the scope of Wine if the app keeps that many files open. It could also be an app bug (file handle leak). You could try to temporarily raise that limit using:
--- snip ---
$ su
# ulimit -n 4096
# su <your_username>
$ wine foo.exe
--- snip ---
You can check the current limits:
--- snip --- $ ulimit -aH --- snip ---
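To tell a handle leak apart from a limit that is simply too low, it can help to compare the limits with the number of descriptors a process actually holds. The following is only a rough sketch for Linux, assuming /proc is mounted. The helper name count_open_fds is made up for illustration.

```shell
# Hypothetical helper: count the file descriptors a process currently holds
# by listing /proc/<pid>/fd (Linux only; needs permission to read that dir).
# $1 = process ID
count_open_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Soft and hard open-file limits of the current shell; programs started
# from this shell inherit them.
echo "soft limit: $(ulimit -Sn)"
echo "hard limit: $(ulimit -Hn)"
```

Running count_open_fds with the PID of the Wine process in question (as shown by ps) a few times while the node is crawling should show whether the count keeps climbing towards the soft limit.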
Regards
Dmitry Timoshkov dmitry@codeweavers.com changed:
What    |Removed  |Added
----------------------------------------------------------------------------
Status  |RESOLVED |CLOSED
--- Comment #15 from Dmitry Timoshkov dmitry@codeweavers.com 2010-08-19 00:09:05 --- Closing invalid.