https://bugs.winehq.org/show_bug.cgi?id=54807
Bug ID: 54807 Summary: some apps hangs at exit (regression) Product: Wine Version: unspecified Hardware: x86-64 OS: Linux Status: UNCONFIRMED Severity: normal Priority: P2 Component: ntdll Assignee: wine-bugs@winehq.org Reporter: roxmail@list.ru Distribution: ---
Since wine-6.5 Keil UV4 C51 (demo) hang at exit or close project (not always, but most likely).
d144890c88c69c937b5e3ffa9b85d0d9d475c44d is the first bad commit commit d144890c88c69c937b5e3ffa9b85d0d9d475c44d Author: Zebediah Figura z.figura12@gmail.com Date: Mon Mar 22 23:04:29 2021 -0500
ntdll: Hold fd_cache_section while removing and closing handles.
Otherwise, it's possible for the old handle to be re-added to the cache between removing and closing it.
Signed-off-by: Zebediah Figura z.figura12@gmail.com Signed-off-by: Alexandre Julliard julliard@winehq.org
this change fix it (at least for wine-8.4 and early, i suppose it partially revert this commit):
diff -urN a/dlls/ntdll/unix/server.c b/dlls/ntdll/unix/server.c --- a/dlls/ntdll/unix/server.c 2023-03-25 10:41:17.844720197 +0300 +++ b/dlls/ntdll/unix/server.c 2023-03-25 10:41:34.456106724 +0300 @@ -1777,6 +1777,8 @@ * retrieve it again */ fd = remove_fd_from_cache( handle );
+ server_leave_uninterrupted_section( &fd_cache_mutex, &sigset ); + SERVER_START_REQ( close_handle ) { req->handle = wine_server_obj_handle( handle ); @@ -1784,8 +1786,6 @@ } SERVER_END_REQ;
- server_leave_uninterrupted_section( &fd_cache_mutex, &sigset ); - if (fd != -1) close( fd );
if (ret != STATUS_INVALID_HANDLE || !handle) return ret;
https://bugs.winehq.org/show_bug.cgi?id=54807
Austin English austinenglish@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Regression SHA1| |d144890c88c69c937b5e3ffa9b8 | |5d0d9d475c44d CC| |z.figura12@gmail.com Summary|some apps hangs at exit |Keil UV4 C51 (demo) hangs |(regression) |on exit Version|unspecified |6.4 Keywords| |regression
--- Comment #1 from Austin English austinenglish@gmail.com --- Is this the application? https://www.keil.com/demo/eval/c51.htm
https://bugs.winehq.org/show_bug.cgi?id=54807
--- Comment #2 from Roman roxmail@list.ru --- I am not sure about version 9.61, this report was about 9.03. I was found a bug in my old hobby project (somewhere in time wine-7), project was made with keil-9.03 and it was in my wine profile since.
https://bugs.winehq.org/show_bug.cgi?id=54807
Roman roxmail@list.ru changed:
What |Removed |Added ---------------------------------------------------------------------------- Version|6.4 |6.5 Summary|Keil UV4 C51 (demo) hangs |Keil UV4 C51 9.03 (demo) |on exit |hangs on exit
https://bugs.winehq.org/show_bug.cgi?id=54807
--- Comment #3 from Zeb Figura z.figura12@gmail.com --- Annoying that it's not freely downloadable; this may be rather hard to debug. Can you please attach a log with WINEDEBUG=+server,+seh,+pid ?
https://bugs.winehq.org/show_bug.cgi?id=54807
--- Comment #4 from Roman roxmail@list.ru --- Created attachment 75471 --> https://bugs.winehq.org/attachment.cgi?id=75471 WINEDEBUG=+server,+seh,+pid wine ./Uv4.exe &>UV4_wine_hang_log.txt
File is quite large unpacked ~162M (there are many zero bytes between text blocks). Hang probability seems somehow related with mesa version and currently it is a less reproducible. (wine-8.20, Mesa 23.2.1 refresh - kisak-mesa PPA )
https://bugs.winehq.org/show_bug.cgi?id=54807
--- Comment #5 from Zeb Figura z.figura12@gmail.com --- (In reply to Roman from comment #4)
Created attachment 75471 [details] WINEDEBUG=+server,+seh,+pid wine ./Uv4.exe &>UV4_wine_hang_log.txt
File is quite large unpacked ~162M (there are many zero bytes between text blocks). Hang probability seems somehow related with mesa version and currently it is a less reproducible. (wine-8.20, Mesa 23.2.1 refresh - kisak-mesa PPA )
Hrm, I assume you killed the program somehow in that log? I'm kind of struggling to even figure out which thread is getting stuck (and where); it's not obvious.
I do notice some instances of thread termination, which might break things in subtle ways. I'd be interested to see if [1] helps at all.
Otherwise I probably need more channels, at least +thread,+timestamp,+file,+reg,+msg,+message and if possible +sync.
[1] https://gitlab.winehq.org/wine/wine/-/merge_requests/4445
https://bugs.winehq.org/show_bug.cgi?id=54807
--- Comment #6 from Roman roxmail@list.ru ---
Hrm, I assume you killed the program somehow in that log? I
Yes, sorry, there are ton of messages at the end of log seems caused by this, not sure how to make log without that (if it is preferred).
I do notice some instances of thread termination, which might break things in subtle ways. I'd be >interested to see if [1] helps at all.
Otherwise I probably need more channels, at least +thread,+timestamp,+file,+reg,+msg,+message >and if possible +sync.
will do, but not sure how long it will be
https://bugs.winehq.org/show_bug.cgi?id=54807
--- Comment #7 from Roman roxmail@list.ru ---
not sure how to make log without that (if it is preferred).
Make a copy log before killing ?
https://bugs.winehq.org/show_bug.cgi?id=54807
--- Comment #8 from Zeb Figura z.figura12@gmail.com --- (In reply to Roman from comment #6)
Hrm, I assume you killed the program somehow in that log? I
Yes, sorry, there are ton of messages at the end of log seems caused by this, not sure how to make log without that (if it is preferred).
No, having those messages in the log is fine, preferable actually. The problem is that it's hard for me to tell *where* they are. The messages I see look like the program exited normally, rather than dying via an external SIGKILL or anything.
https://bugs.winehq.org/show_bug.cgi?id=54807
--- Comment #9 from Zeb Figura z.figura12@gmail.com --- Oh, actually, this is probably it?
0178: terminate_thread( handle=01cc, exit_code=0 ) 01b0: *killed* exit_code=0 01b0: *sent signal* signal=3 0178: terminate_thread() = 0 { self=0 } 0170: get_thread_info( handle=fffffffe, access=00000000 ) 0170: get_thread_info() = 0 { pid=00e8, tid=0170, teb=7ff70000, entry_point=6fffffcc1cd0, affinity=0000000f, exit_code=259, priority=0, last=0, suspend_count=0, dbg_hidden=0, desc_len=44, desc=L"" } 0170: terminate_thread( handle=fffffffe, exit_code=0 ) 0170: terminate_thread() = 0 { self=1 } 0170: select( flags=2, cookie=7fffff5fda7c, timeout=0, size=0, prev_apc=0000, result={}, data={}, contexts={} ) 0170: select() = TIMEOUT { apc_handle=0000, signaled=1, call={}, contexts={} } 0170: *killed* exit_code=0 01ac: *killed* exit_code=0 019c: *killed* exit_code=0 0198: *killed* exit_code=0 0178: *killed* exit_code=0 0188: *wakeup* signaled=0 0180: *wakeup* signaled=0 0058: *wakeup* signaled=1
0170 (which isn't part of the same process actually) terminates itself cleanly, but then 01ac and the following 3 threads seem to have been killed externally. It threw me off originally because I assumed those are all from the same process.
That does point to the TerminateThread() being the problem. I guess 01b0 is in the middle of an NtClose() when it dies? It'd certainly be left holding the fd cache mutex in that case.
In that case unwinding won't be enough, I think. I think the only way is to mask off SIGQUIT.
https://bugs.winehq.org/show_bug.cgi?id=54807
--- Comment #10 from Zeb Figura z.figura12@gmail.com --- Created attachment 75487 --> https://bugs.winehq.org/attachment.cgi?id=75487 add SIGQUIT to server_block_set
As in this patch. Can you please try this?
https://bugs.winehq.org/show_bug.cgi?id=54807
--- Comment #11 from Roman roxmail@list.ru --- Created attachment 75488 --> https://bugs.winehq.org/attachment.cgi?id=75488 WINEDEBUG=+thread,+timestamp,+file,+reg,+msg,+message wine ./Uv4.exe &>UV4_wine_hang_log-1.txt
https://bugs.winehq.org/show_bug.cgi?id=54807
--- Comment #12 from Roman roxmail@list.ru --- Created attachment 75489 --> https://bugs.winehq.org/attachment.cgi?id=75489 WINEDEBUG=+thread,+timestamp,+file,+reg,+msg,+message,+sync wine ./Uv4.exe &>UV4_wine_hang_log_sync.txt
https://bugs.winehq.org/show_bug.cgi?id=54807
--- Comment #13 from Roman roxmail@list.ru --- no luck with wine-9196fecb639b433e6a6f687e0ce62669c9cc2bf0 + https://gitlab.winehq.org/wine/wine/-/merge_requests/4445 + https://bugs.winehq.org/attachment.cgi?id=75487&action=diff nor wine-8.20+https://bugs.winehq.org/attachment.cgi?id=75487&action=diff , logs from second
https://bugs.winehq.org/show_bug.cgi?id=54807
--- Comment #14 from Zeb Figura z.figura12@gmail.com --- Created attachment 75670 --> https://bugs.winehq.org/attachment.cgi?id=75670 more traces
Ech, sorry, I meant those channels in addition to +server,+seh,+pid.
While you're at it can you please also attach this patch that adds extra traces?
https://bugs.winehq.org/show_bug.cgi?id=54807
--- Comment #15 from Roman roxmail@list.ru --- What exactly WINEDEBUG should contain?
https://bugs.winehq.org/show_bug.cgi?id=54807
--- Comment #16 from Zeb Figura z.figura12@gmail.com --- (In reply to Roman from comment #15)
What exactly WINEDEBUG should contain?
WINEDEBUG=+server,+seh,+pid,+thread,+timestamp,+file,+reg,+msg,+message,+sync
If that makes it hard or impossible to reproduce then remove +sync. If it's still impossible I'll try to find some other combination that gives enough information. Thanks!
https://bugs.winehq.org/show_bug.cgi?id=54807
--- Comment #17 from Roman roxmail@list.ru --- Created attachment 75702 --> https://bugs.winehq.org/attachment.cgi?id=75702 WINEDEBUG=+server,+seh,+pid,+thread,+timestamp,+file,+reg,+msg,+message,+sync wine ~/.wine/drive_c/Keil/UV4/Uv4.exe &>UV4_wine_hang_log-18.12.2024.txt
wine-9.0-rc2 + https://bugs.winehq.org/attachment.cgi?id=75670&action=diff
https://bugs.winehq.org/show_bug.cgi?id=54807
--- Comment #18 from Zeb Figura z.figura12@gmail.com --- I see the problem now. It *is* related to killing threads while they're holding the mutex.
The reason that adding SIGQUIT to the set isn't enough to fix it, is that the server closes the remote end of the socket, which generates EPIPE, which in turn causes wine_server_call() to abort_thread() with the mutex still held.
https://bugs.winehq.org/show_bug.cgi?id=54807
--- Comment #19 from Roman roxmail@list.ru --- Great that you find a reason, but it fixable?
https://bugs.winehq.org/show_bug.cgi?id=54807
--- Comment #20 from Zeb Figura z.figura12@gmail.com --- (In reply to Roman from comment #19)
Great that you find a reason, but it fixable?
I sent https://gitlab.winehq.org/wine/wine/-/merge_requests/5118. The problem with that merge request is that, while it's probably reliably good enough for this application, it isn't really enough to solve the broader problem of kernel-side code getting into an inconsistent state when threads are terminated. It needs more thought.
https://bugs.winehq.org/show_bug.cgi?id=54807
--- Comment #21 from Zeb Figura z.figura12@gmail.com --- (In reply to Zeb Figura from comment #20)
(In reply to Roman from comment #19)
Great that you find a reason, but it fixable?
I sent https://gitlab.winehq.org/wine/wine/-/merge_requests/5118. The problem with that merge request is that, while it's probably reliably good enough for this application, it isn't really enough to solve the broader problem of kernel-side code getting into an inconsistent state when threads are terminated. It needs more thought.
For posterity, this is the wrong merge request, the right one is https://gitlab.winehq.org/wine/wine/-/merge_requests/4811.