https://bugs.winehq.org/show_bug.cgi?id=45546
Anastasius Focht focht@gmx.net changed:
What |Removed |Added ---------------------------------------------------------------------------- Component|-unknown |ntdll Status|UNCONFIRMED |NEW Summary|Magic The Gathering Arena |Magic The Gathering Arena |updater: Unity fork of |updater: Unity fork of |Mono-runtime reports 'Fatal |Mono-runtime reports 'Fatal |error in gc, |error in gc, |GetThreadContext failed' |GetThreadContext failed' | |(suspension of Mono-managed | |threads sometimes exceed | |100ms) Ever confirmed|0 |1
--- Comment #8 from Anastasius Focht focht@gmx.net --- Hello folks,
confirming, although very hard to reproduce on my machine. While trying to reproduce I downloaded around 200 GB (2 GB each try) in the process .. hope they don't ban me ^^
Installer: https://mtgarena.downloads.wizards.com/Live/Windows32/updates.txt
--- snip --- [Latest] Name = MTGArena ReleaseDate = 09\25\2018 ProductVersion = 0.1.845.0 FilePath = [APPDIR]MTGALauncher.exe MSIURL = https://mtgarena.downloads.wizards.com/Installers/Windows32/0.1.845.0/Live/M... URL = https://mtgarena.downloads.wizards.com/Live/Windows32/MTGAInstaller.exe Version = 0.1.845.0 BinaryUpdater = false Localized = false ProductCode = {3F0CBA48-C470-429F-BDF3-5FEC0940B4EA} --- snip ---
It seems they now use a newer, partially updated the Mono runtime (bdwgc) for Unity.
Protection ID scan:
--- snip --- -=[ ProtectionID v0.6.9.0 DECEMBER]=- (c) 2003-2017 CDKiLLER & TippeX Build 24/12/17-21:05:42 Ready... Scanning -> C:\Program Files (x86)\Wizards of the Coast\MTGA\MonoBleedingEdge\EmbedRuntime\mono-2.0-bdwgc.dll File Type : 32-Bit Dll (Subsystem : Win GUI / 2), Size : 3886528 (03B4DC0h) Byte(s) | Machine: 0x14C (I386) Compilation TimeStamp : 0x5AF21305 -> Tue 08th May 2018 21:13:41 (GMT) [TimeStamp] 0x5AF21305 -> Tue 08th May 2018 21:13:41 (GMT) | PE Header | - | Offset: 0x00000120 | VA: 0x10000120 | - [TimeStamp] 0x5AF21305 -> Tue 08th May 2018 21:13:41 (GMT) | Export | - | Offset: 0x0038A1D4 | VA: 0x1038B9D4 | - [TimeStamp] 0x5AF21305 -> Tue 08th May 2018 21:13:41 (GMT) | DebugDirectory | - | Offset: 0x00388D44 | VA: 0x1038A544 | - [TimeStamp] 0x5AF21305 -> Tue 08th May 2018 21:13:41 (GMT) | DebugDirectory | - | Offset: 0x00388D60 | VA: 0x1038A560 | - [TimeStamp] 0x5AF21305 -> Tue 08th May 2018 21:13:41 (GMT) | DebugDirectory | - | Offset: 0x00388D7C | VA: 0x1038A57C | - -> File Appears to be Digitally Signed @ Offset 03B3600h, size : 017C0h / 06080 byte(s) [LoadConfig] Struct determined as v8 (Expected size 140 | Actual size 64) [!] Executable uses SEH Tables (/SAFESEH) (3 calculated 3 recorded... 0 invalid addresses) [LoadConfig] CodeIntegrity -> Flags 0x0 | Catalog 0x0 (0) | Catalog Offset 0x0 | Reserved 0x0 [LoadConfig] GuardAddressTakenIatEntryTable 0x1039A750 | Count 0x1038A608 (272147976) [LoadConfig] GuardLongJumpTargetTable 0x0 | Count 0x0 (0) [LoadConfig] HybridMetadataPointer 0x1 | DynamicValueRelocTable 0x1038A618 [LoadConfig] FailFastIndirectProc 0x1038A620 | FailFastPointer 0x0 [LoadConfig] UnknownZero1 0x1039A750 [File Heuristics] -> Flag #1 : 00000100000001001101000100000100 (0x0404D104) [Entrypoint Section Entropy] : 6.60 (section #0) ".text " | Size : 0x2C73E6 (2913254) byte(s) [DllCharacteristics] -> Flag : (0x0140) -> ASLR | DEP [SectionCount] 6 (0x6) | ImageSize 0x425000 (4345856) byte(s) [Export] 99% of function(s) (1081 of 1084) are in file | 0 are forwarded | 1078 code | 6 data | 0 uninit data | 0 unknown | [ModuleReport] [IAT] Modules -> MSWSOCK.dll | WS2_32.dll | ole32.dll | OLEAUT32.dll | PSAPI.DLL | VERSION.dll | ADVAPI32.dll | WINMM.dll | KERNEL32.dll | USER32.dll | SHELL32.dll [Debug Info] (record 1 of 3) (file offset 0x388D40) Characteristics : 0x0 | TimeDateStamp : 0x5AF21305 (Tue 08th May 2018 21:13:41 (GMT)) | MajorVer : 0 / MinorVer : 0 -> (0.0) Type : 2 (0x2) -> CodeView | Size : 0x67 (103) AddressOfRawData : 0x38A8C4 | PointerToRawData : 0x3890C4 CvSig : 0x53445352 | SigGuid 8B1A1B99-19B1-402D-AA9C2AD249077FA6 Age : 0x1 (1) | Pdb : C:\buildslave\mono\build\msvc\build\bdwgc\Win32\bin\Release\mono-2.0-bdwgc.pdb [Debug Info] (record 2 of 3) (file offset 0x388D5C) Characteristics : 0x0 | TimeDateStamp : 0x5AF21305 (Tue 08th May 2018 21:13:41 (GMT)) | MajorVer : 0 / MinorVer : 0 -> (0.0) Type : 12 (0xC) -> Undocumented | Size : 0x14 (20) AddressOfRawData : 0x38A92C | PointerToRawData : 0x38912C [Debug Info] (record 3 of 3) (file offset 0x388D78) Characteristics : 0x0 | TimeDateStamp : 0x5AF21305 (Tue 08th May 2018 21:13:41 (GMT)) | MajorVer : 0 / MinorVer : 0 -> (0.0) Type : 13 (0xD) -> Undocumented | Size : 0x320 (800) AddressOfRawData : 0x38A940 | PointerToRawData : 0x389140 ... [CompilerDetect] -> Visual C++ 14.0 (Visual Studio 2015) [!] File appears to have no protection or is using an unknown protection - Scan Took : 1.806 Second(s) [0000004AAh (1194) tick(s)] [246 of 580 scan(s) done]
Scanning -> C:\Program Files\Wizards of the Coast\MTGA\UnityPlayer.dll File Type : 32-Bit Dll (Subsystem : Win GUI / 2), Size : 17423808 (0109DDC0h) Byte(s) | Machine: 0x14C (I386) Compilation TimeStamp : 0x5B23C5A7 -> Fri 15th Jun 2018 13:56:55 (GMT) [TimeStamp] 0x5B23C5A7 -> Fri 15th Jun 2018 13:56:55 (GMT) | PE Header | - | Offset: 0x00000178 | VA: 0x10000178 | - [TimeStamp] 0xFFFFFFFF -> Sun 07th Feb 2106 06:28:15 (GMT) | Export | - | Offset: 0x00FDFC74 | VA: 0x10FE0E74 | - [TimeStamp] 0x5B23C5A7 -> Fri 15th Jun 2018 13:56:55 (GMT) | DebugDirectory | - | Offset: 0x00FDA444 | VA: 0x10FDB644 | - [TimeStamp] 0x5B23C5A7 -> Fri 15th Jun 2018 13:56:55 (GMT) | DebugDirectory | - | Offset: 0x00FDA460 | VA: 0x10FDB660 | - [TimeStamp] 0x5B23C5A7 -> Fri 15th Jun 2018 13:56:55 (GMT) | DebugDirectory | - | Offset: 0x00FDA47C | VA: 0x10FDB67C | - -> File Appears to be Digitally Signed @ Offset 0109C600h, size : 017C0h / 06080 byte(s) [LoadConfig] Struct determined as v8 (Expected size 140 | Actual size 64) [LoadConfig] CodeIntegrity -> Flags 0x0 | Catalog 0x0 (0) | Catalog Offset 0x0 | Reserved 0x0 [LoadConfig] GuardAddressTakenIatEntryTable 0x0 | Count 0x0 (0) [LoadConfig] GuardLongJumpTargetTable 0x0 | Count 0x0 (0) [LoadConfig] HybridMetadataPointer 0x0 | DynamicValueRelocTable 0x0 [LoadConfig] FailFastIndirectProc 0x0 | FailFastPointer 0x0 [LoadConfig] UnknownZero1 0x0 [File Heuristics] -> Flag #1 : 00000100000001001101000100000100 (0x0404D104) [Entrypoint Section Entropy] : 6.64 (section #0) ".text " | Size : 0xE2D9FC (14866940) byte(s) [DllCharacteristics] -> Flag : (0x0140) -> ASLR | DEP [SectionCount] 7 (0x7) | ImageSize 0x115A000 (18194432) byte(s) [Export] 100% of function(s) (1 of 1) are in file | 0 are forwarded | 1 code | 0 data | 0 uninit data | 0 unknown | [VersionInfo] Product Version : 2018.1.6.5753908 [VersionInfo] File Version : 2018.1.6.5753908 [ModuleReport] [IAT] Modules -> HID.DLL | KERNEL32.dll | USER32.dll | VERSION.dll | ole32.dll | SHLWAPI.dll | SETUPAPI.dll | ADVAPI32.dll | GDI32.dll | SHELL32.dll | OPENGL32.dll | WINMM.dll | WS2_32.dll | OLEAUT32.dll | IMM32.dll | DNSAPI.dll | IPHLPAPI.DLL | WINHTTP.dll [Debug Info] (record 1 of 3) (file offset 0xFDA440) Characteristics : 0x0 | TimeDateStamp : 0x5B23C5A7 (Fri 15th Jun 2018 13:56:55 (GMT)) | MajorVer : 0 / MinorVer : 0 -> (0.0) Type : 2 (0x2) -> CodeView | Size : 0x88 (136) AddressOfRawData : 0xFDD0EC | PointerToRawData : 0xFDBEEC CvSig : 0x53445352 | SigGuid E6BD450A-B375-47BC-A34A5DC610712C8C Age : 0x1 (1) | Pdb : C:\buildslave\unity\build\artifacts\StandalonePlayer\win32_release_vs2017_nondev\UnityPlayer_Win32_x86_mono.pdb [Debug Info] (record 2 of 3) (file offset 0xFDA45C) Characteristics : 0x0 | TimeDateStamp : 0x5B23C5A7 (Fri 15th Jun 2018 13:56:55 (GMT)) | MajorVer : 0 / MinorVer : 0 -> (0.0) Type : 12 (0xC) -> Undocumented | Size : 0x14 (20) AddressOfRawData : 0xFDD174 | PointerToRawData : 0xFDBF74 [Debug Info] (record 3 of 3) (file offset 0xFDA478) Characteristics : 0x0 | TimeDateStamp : 0x5B23C5A7 (Fri 15th Jun 2018 13:56:55 (GMT)) | MajorVer : 0 / MinorVer : 0 -> (0.0) Type : 13 (0xD) -> Undocumented | Size : 0x384 (900) AddressOfRawData : 0xFDD188 | PointerToRawData : 0xFDBF88 ... [c] Unity Version 2018.1.6f1_57cc34175ccf detected [!] File appears to have no protection or is using an unknown protection --- snip ---
Using the Unity version info I found matching debug symbols (compressed symbol store)
https://symbolserver.unity3d.com/
https://symbolserver.unity3d.com/UnityPlayer_Win32_x86_mono.pdb/E6BD450AB375...
For the Mono dll itself it wasn't possible to find matching debug symbols hence I figured out a compatible version which works with the launcher (replaced it):
https://symbolserver.unity3d.com/mono-2.0-bdwgc.dll/5B7AB7E43de000/ https://symbolserver.unity3d.com/mono-2.0-bdwgc.pdb/3B9BBE126F9746AA9EEA4AB6...
The problem seems to be a rare occurrence when a Mono-managed worker thread fails to suspend in time (at least what Wine thinks is a reasonable amount -> 100ms). During application lifetime there are many "world-stop" garbage collection events when all registered threads are to be suspended.
Mono itself has some debugging facilities:
--- snip --- $ export MONO_LOG_DEST=syslog $ export MONO_LOG_MASK=gc $ export MONO_LOG_LEVEL=debug
$ export GC_LOOP_ON_ABORT=1 $ export GC_PRINT_VERBOSE_STATS=1 ... --- snip ---
MTGA.gc.log
--- snip --- ... Grow heap to 256 KiB after 0 bytes allocated Initiating full world-stop collection! 0 bytes in heap blacklisted for interior pointers Found new system malloc AllocationBase at 00110000
--> Marking for collection #1 after 0 allocated bytes Pushed 1 thread stacks GC #1 freed 0 bytes, heap 256 KiB (+ 0 KiB unmapped) World-stopped marking took 1 msecs (1 in average) Bytes recovered before sweep - f.l. count = 0 In-use heap: 0% (0 KiB pointers + 0 KiB other) Immediately reclaimed 0 bytes, heapsize: 262144 bytes (0 unmapped) 0 finalization entries; 0/0 short/long disappearing links alive 0 finalization-ready objects; 0/0 short/long links cleared Finalize plus initiate sweep took 0 + 0 msecs Complete collection took 1 msecs Adding block map for size of 170 granules (1360 bytes) Adding block map for size of 42 granules (336 bytes) Adding block map for size of 1 granules (8 bytes) Adding block map for size of 5 granules (40 bytes) Adding block map for size of 4 granules (32 bytes) Adding block map for size of 12 granules (96 bytes) ... --> Marking for collection #3131 after 12029608 allocated bytes Pushed 18 thread stacks GC #3131 freed 1291456 bytes, heap 52444 KiB (+ 0 KiB unmapped) World-stopped marking took 17 msecs (2 in average) Bytes recovered before sweep - f.l. count = -174624 In-use heap: 84% (12547 KiB pointers + 31721 KiB other) Immediately reclaimed -166432 bytes, heapsize: 53702656 bytes (0 unmapped) 899 finalization entries; 9/5 short/long disappearing links alive 88 finalization-ready objects; 0/0 short/long links cleared Finalize plus initiate sweep took 0 + 0 msecs Complete collection took 17 msecs Grow heap to 68828 KiB after 2090736 bytes allocated Initiating full world-stop collection! 163840 bytes in heap blacklisted for interior pointers
--> Marking for collection #3132 after 2825704 allocated bytes Pushed 18 thread stacks GC #3132 freed 1279720 bytes, heap 68828 KiB (+ 0 KiB unmapped) World-stopped marking took 14 msecs (2 in average) Bytes recovered before sweep - f.l. count = -119608 In-use heap: 65% (13151 KiB pointers + 32137 KiB other) Immediately reclaimed 30184 bytes, heapsize: 70479872 bytes (0 unmapped) 909 finalization entries; 9/5 short/long disappearing links alive 142 finalization-ready objects; 0/0 short/long links cleared Finalize plus initiate sweep took 0 + 0 msecs Complete collection took 14 msecs Grew fo table to 8192 entries Grew fo table to 16384 entries --- snip ---
Relevant thread callstacks gathered using winedbg and x32dbg on the rare failing case (manually annotated):
--- snip --- Backtracing for thread 0073 in process 0008 (C:\Program Files\Wizards of the Coast\MTGA\MTGA.exe): Backtrace: =>0 0xf7f4a047 __kernel_vsyscall+0x7() in [vdso].so (0x093bea08) 1 0xf7d178d7 __libc_read+0x66() in libpthread.so.0 (0x093bea08) 2 0x7bc90ec6 wait_select_reply+0x1e() [/home/focht/projects/wine/mainline-src/dlls/ntdll/server.c:359] in ntdll (0x093bea08) 3 0x7bc91d8a server_select+0x163() [/home/focht/projects/wine/mainline-src/dlls/ntdll/server.c:618] in ntdll (0x093beb18) 4 0x7bc9c208 wait_objects+0xb6() [/home/focht/projects/wine/mainline-src/dlls/ntdll/sync.c:1033] in ntdll (0x093bec58) 5 0x7bc9c246 NtWaitForMultipleObjects+0x38(wait_any=1, alertable=0) [/home/focht/projects/wine/mainline-src/dlls/ntdll/sync.c:1044] in ntdll (0x093bec98) 6 0x7b47ec0d WaitForMultipleObjectsEx+0xad() [/home/focht/projects/wine/mainline-src/dlls/kernel32/sync.c:221] in kernel32 (0x093bedf8) 7 0xf7aa284e X11DRV_MsgWaitForMultipleObjectsEx+0xcd() [/home/focht/projects/wine/mainline-src/dlls/winex11.drv/event.c:497] in winex11 (0x093bee48) 8 0x7e8dfa3a wait_message+0x21() [/home/focht/projects/wine/mainline-src/dlls/user32/winproc.c:1164] in user32 (0x093bee98) 9 0x7e8a2e98 wait_objects+0xe0() [/home/focht/projects/wine/mainline-src/dlls/user32/message.c:3080] in user32 (0x093bef48) 10 0x7e8a45d7 GetMessageW+0xe7() [/home/focht/projects/wine/mainline-src/dlls/user32/message.c:3841] in user32 (0x093bef88) 11 0x7e86ba0d DIALOG_DoDialogBox+0xb3() [/home/focht/projects/wine/mainline-src/dlls/user32/dialog.c:798] in user32 (0x093befe8) 12 0x7e86bd00 DialogBoxIndirectParamAorW+0x54() [/home/focht/projects/wine/mainline-src/dlls/user32/dialog.c:875] in user32 (0x093bf018) 13 0x7e86bd79 DialogBoxIndirectParamW+0x2a() [/home/focht/projects/wine/mainline-src/dlls/user32/dialog.c:895] in user32 (0x093bf058) 14 0x7e8a8ee9 MessageBoxIndirectW+0x115() [/home/focht/projects/wine/mainline-src/dlls/user32/msgbox.c:541] in user32 (0x093bf0c8) 15 0x7e8a8d73 MessageBoxIndirectA+0x100() [/home/focht/projects/wine/mainline-src/dlls/user32/msgbox.c:503] in user32 (0x093bf148) 16 0x7e8a8af1 MessageBoxExA+0x66(langid=0) [/home/focht/projects/wine/mainline-src/dlls/user32/msgbox.c:419] in user32 (0x093bf1b8) 17 0x7e8a8a48 MessageBoxA+0x27() [/home/focht/projects/wine/mainline-src/dlls/user32/msgbox.c:387] in user32 (0x093bf1f8) 18 0x0202f8c7 in mono-2.0-bdwgc (+0x25f8c6) (0x093bf220) ; <mono-2.0-bdwgc.GC_default_on_abort> 19 0x020389e2 in mono-2.0-bdwgc (+0x2689e1) (0x093bf510) ; <mono-2.0-bdwgc.GC_push_stack_for> 20 0x0203800d in mono-2.0-bdwgc (+0x26800c) (0x093bf53c) ; <mono-2.0-bdwgc.GC_push_all_stacks> 21 0x020369a5 in mono-2.0-bdwgc (+0x2669a4) (0x093bf554) ; <mono-2.0-bdwgc.GC_mark_some_inner> 22 0x0203686b in mono-2.0-bdwgc (+0x26686a) (0x093bf58c) ; <mono-2.0-bdwgc.GC_mark_some> 23 0x0203a7fe in mono-2.0-bdwgc (+0x26a7fd) (0x093bf5a8) ; <mono-2.0-bdwgc.GC_stopped_mark> 24 0x0203ada1 in mono-2.0-bdwgc (+0x26ada0) (0x093bf5c0) ; <mono-2.0-bdwgc.GC_try_to_collect_inner> 25 0x02031689 in mono-2.0-bdwgc (+0x261688) (0x093bf5d4) ; <mono-2.0-bdwgc.GC_collect_or_expand> 26 0x020308bf in mono-2.0-bdwgc (+0x2608be) (0x093bf5f4) ; <mono-2.0-bdwgc.GC_allocobj> 27 0x02033f7c in mono-2.0-bdwgc (+0x263f7b) (0x093bf610) ; <mono-2.0-bdwgc.GC_generic_malloc_inner> 28 0x02033a40 in mono-2.0-bdwgc (+0x263a3f) (0x093bf62c) ; <mono-2.0-bdwgc.GC_gcj_malloc> 29 0x01ebd012 in mono-2.0-bdwgc (+0xed011) (0x093bf648) ; <mono-2.0-bdwgc.mono_gc_alloc_obj> 30 0x01e6db9b in mono-2.0-bdwgc (+0x9db9a) (0x093bf664) ; <mono-2.0-bdwgc.mono_object_new_alloc_specific_checked> 31 0x01e6df7c in mono-2.0-bdwgc (+0x9df7b) (0x093bf680) ; <mono-2.0-bdwgc.mono_object_new_specific_checked> 32 0x01e72083 in mono-2.0-bdwgc (+0xa2082) (0x093bf6c8) ; <mono-2.0-bdwgc.ves_icall_object_new_specific> 33 0x003f1021 (0x093bf718) ; Mono JIT Code 34 0x08c2fd80 (0x093bf748) ; Mono JIT Code 35 0x08ed3e80 (0x093bf778) ; Mono JIT Code 36 0x08ed3b38 (0x093bf808) ; Mono JIT Code 37 0x08ed391b (0x093bf858) ; Mono JIT Code 38 0x08ee4199 (0x093bf8f8) ; Mono JIT Code 39 0x08edc030 (0x093bf958) ; Mono JIT Code 40 0x08edba81 (0x093bf988) ; Mono JIT Code 41 0x08edb85c (0x093bf9e8) ; Mono JIT Code 42 0x08c29e4c (0x093bfa2c) ; Mono JIT Code 43 0x01ed06df in mono-2.0-bdwgc (+0x1006de) (0x093bfa6c) ; <mono-2.0-bdwgc.mono_jit_runtime_invoke> 44 0x01e687d7 in mono-2.0-bdwgc (+0x987d6) (0x093bfa94) ; <mono-2.0-bdwgc.do_runtime_invoke> 45 0x01e6fb9b in mono-2.0-bdwgc (+0x9fb9a) (0x093bfae4) ; <mono-2.0-bdwgc.mono_runtime_try_invoke_array> 46 0x01e6c73f in mono-2.0-bdwgc (+0x9c73e) (0x093bfb4c) ; <mono-2.0-bdwgc.mono_message_invoke> 47 0x01e71e63 in mono-2.0-bdwgc (+0xa1e62) (0x093bfba8) ; <mono-2.0-bdwgc.ves_icall_System_Runtime_Remoting_Messaging_AsyncResult_Invoke> 48 0x08e6c948 (0x093bfbf8) ; Mono JIT Code 49 0x08e6c89c (0x093bfc18) ; Mono JIT Code 50 0x08d4a1f8 (0x093bfcb8) ; Mono JIT Code 51 0x08d49c8c (0x093bfcf8) ; Mono JIT Code 52 0x0711026d (0x093bfd44) ; Mono JIT Code 53 0x01ed06df in mono-2.0-bdwgc (+0x1006de) (0x093bfd84) ; <mono-2.0-bdwgc.mono_jit_runtime_invoke> 54 0x01e687d7 in mono-2.0-bdwgc (+0x987d6) (0x093bfdac) ; <mono-2.0-bdwgc.do_runtime_invoke> 55 0x01e879f1 in mono-2.0-bdwgc (+0xb79f0) (0x093bfddc) ; <mono-2.0-bdwgc.try_invoke_perform_wait_callback> 56 0x01e8822c in mono-2.0-bdwgc (+0xb822b) (0x093bfe48) ; <mono-2.0-bdwgc.worker_callback> 57 0x01e8a791 in mono-2.0-bdwgc (+0xba790) (0x093bfe70) ; <mono-2.0-bdwgc.worker_thread> 58 0x01e85431 in mono-2.0-bdwgc (+0xb5430) (0x093bfecc) ; <mono-2.0-bdwgc.start_wrapper_internal> 59 0x01e8521c in mono-2.0-bdwgc (+0xb521b) (0x093bfedc) ; <mono-2.0-bdwgc.start_wrapper> 60 0x7bc97970 call_thread_func_wrapper+0xb() in ntdll (0x093bfeec) 61 0x7bc979d6 call_thread_func+0x63() [/home/focht/projects/wine/mainline-src/dlls/ntdll/signal_i386.c:2654] in ntdll (0x093bffdc) 62 0x7bc97962 call_thread_entry+0x9() in ntdll (0x093bffec) --- snip ---
https://github.com/mono/mono/blob/c5b88ec4f323f2bdb7c7d0a595ece28dae66579c/l...
--- snip --- void GC_push_all_stacks() { DWORD thread_id = GetCurrentThreadId(); GC_bool found_me = FALSE; int i; int dummy; ptr_t sp, stack_min; GC_thread thread; LONG my_max = GC_get_max_thread_index();
for (i = 0; i <= my_max; i++) { thread = thread_table + i; if (thread -> in_use && thread -> stack_base) { if (thread -> id == thread_id) { sp = (ptr_t) &dummy; found_me = TRUE; } else { CONTEXT context; context.ContextFlags = CONTEXT_INTEGER|CONTEXT_CONTROL; if (!GetThreadContext(thread_table[i].handle, &context)) ABORT("GetThreadContext failed"); ... --- snip ---
Offending managed thread with late suspension:
--- snip --- Backtracing for thread 0056 in process 0008 (C:\Program Files\Wizards of the Coast\MTGA\MTGA.exe): Backtrace: =>0 0xf7f8c049 __kernel_vsyscall+0x9() in [vdso].so (0x0a65f5d8) 1 0xf7d599c7 __close+0x56() in libpthread.so.0 (0x0a65f5d8) 2 0x7bc76adc close_handle+0x8d() [/home/focht/projects/wine/mainline-src/dlls/ntdll/om.c:399] in ntdll (0x0a65f5d8) 3 0x7bc76bf0 NtClose+0x1c() [/home/focht/projects/wine/mainline-src/dlls/ntdll/om.c:436] in ntdll (0x0a65f5f8) 4 0x7b47298c CloseHandle+0x104() [/home/focht/projects/wine/mainline-src/dlls/kernel32/process.c:3352] in kernel32 (0x0a65f678) 5 0x01ebe93d in mono-2.0-bdwgc (+0xee93c) (0x0a65f694) ; CloseHandle 6 0x01e1bef6 in mono-2.0-bdwgc (+0x4bef5) (0x0a65f6a8) ; mono-2.0-bdwgc.mono_w32file_close+1D 7 0x071251d8 (0x0a65f6f8) ; mono-2.0-bdwgc.ves_icall_System_IO_MonoIO_Close+16 8 0x003fcd38 (0x0a65f758) ; Mono JIT Code 9 0x0712517c (0x0a65f778) ; Mono JIT Code 10 0x08fe66e4 (0x0a65f798) ; Mono JIT Code 11 0x090051bf (0x0a65f858) ; Mono JIT Code 12 0x090045a8 (0x0a65f8f8) ; Mono JIT Code 13 0x08ffd4c0 (0x0a65f958) ; Mono JIT Code 14 0x08ffcfc1 (0x0a65f988) ; Mono JIT Code 15 0x08ffcd9c (0x0a65f9e8) ; Mono JIT Code 16 0x08839e4c (0x0a65fa2c) ; Mono JIT Code 17 0x01ed06df in mono-2.0-bdwgc (+0x1006de) (0x0a65fa6c) ; mono-2.0-bdwgc.retry+9 18 0x01e687d7 in mono-2.0-bdwgc (+0x987d6) (0x0a65fa94) ; mono-2.0-bdwgc.mono_jit_runtime_invoke+2FF 19 0x01e6fb9b in mono-2.0-bdwgc (+0x9fb9a) (0x0a65fae4) ; mono-2.0-bdwgc.do_runtime_invoke+57 20 0x01e6c73f in mono-2.0-bdwgc (+0x9c73e) (0x0a65fb4c) ; mono-2.0-bdwgc.mono_runtime_try_invoke_array+38B 21 0x01e71e63 in mono-2.0-bdwgc (+0xa1e62) (0x0a65fba8) ; mono-2.0-bdwgc.mono_message_invoke+1EF 22 0x0885d878 (0x0a65fbf8) ; mono-2.0-bdwgc.ves_icall_System_Runtime_Remoting_Messaging_AsyncResult_Invoke+B3 23 0x0885d7cc (0x0a65fc18) ; Mono JIT Code 24 0x0884a7a8 (0x0a65fcb8) ; Mono JIT Code 25 0x0884a23c (0x0a65fcf8) ; Mono JIT Code 26 0x07110685 (0x0a65fd44) ; Mono JIT Code 27 0x01ed06df in mono-2.0-bdwgc (+0x1006de) (0x0a65fd84) ; mono-2.0-bdwgc.retry+9 28 0x01e687d7 in mono-2.0-bdwgc (+0x987d6) (0x0a65fdac) ; mono-2.0-bdwgc.mono_jit_runtime_invoke+2FF 29 0x01e879f1 in mono-2.0-bdwgc (+0xb79f0) (0x0a65fddc) ; mono-2.0-bdwgc.do_runtime_invoke+57 30 0x01e8822c in mono-2.0-bdwgc (+0xb822b) (0x0a65fe48) ; mono-2.0-bdwgc.try_invoke_perform_wait_callback+41 31 0x01e8a791 in mono-2.0-bdwgc (+0xba790) (0x0a65fe70) ; mono-2.0-bdwgc.worker_callback+31C 32 0x01e85431 in mono-2.0-bdwgc (+0xb5430) (0x0a65fecc) ; mono-2.0-bdwgc.worker_thread+241 33 0x01e8521c in mono-2.0-bdwgc (+0xb521b) (0x0a65fedc) ; mono-2.0-bdwgc.start_wrapper_internal+1F1 34 0x7bc979e0 call_thread_func_wrapper+0xb() in ntdll (0x0a65feec) ; mono-2.0-bdwgc.start_wrapper+3C 35 0x7bc97a46 call_thread_func+0x63() [/home/focht/projects/wine/mainline-src/dlls/ntdll/signal_i386.c:2654] in ntdll (0x0a65ffdc) 36 0x7bc979d2 call_thread_entry+0x9() in ntdll (0x0a65ffec) --- snip ---
Trace log, showing client/server side interaction, with additional diagnostics added.
NOTE: Any kind of tracing/debugging causes interference (timing) and makes it even harder to reproduce.
--- snip --- $ pwd /home/focht/.wine/drive_c/Program Files (x86)/Wizards of the Coast/MTGA
# clean download cache to force re-download $ rm -rf MTGA_Data/Downloads/
$ WINEDEBUG=+thread,+loaddll,+process,+msgbox,+server wine ./MTGA.exe >>log.txt 2>&1 ... 0055: new_thread( process=ffffffff, access=001fffff, suspend=1, request_fd=184, objattr={} ) 0055: new_thread() = 0 { tid=0056, handle=03e0 } 0055:fixme:thread:RtlCreateUserThread new thread -> tid=0x56, handle=0x3e0, entry=0x1e851e0 0055: resume_thread( handle=03e0 ) 0055: resume_thread() = 0 { count=1 } 0056: *fd* 192 <- 253 0055: close_handle( handle=03e0 ) 0055: close_handle() = 0 0056: *fd* 194 <- 254 0056: init_thread( unix_pid=3804, unix_tid=3884, debug_level=1, teb=81f58000, entry=01e851e0, reply_fd=192, wait_fd=194, cpu=x86 ) 0056: init_thread() = 0 { pid=0008, tid=0056, server_start=1d458de30de85cc (-3.5809720), info_size=0, version=572, all_cpus=00000001, suspend=0 } 0055: select( flags=2, cookie=092bf70c, timeout=infinite, prev_apc=0000, result={}, data={WAIT,handles={03dc}} ) 0055: select() = PENDING { timeout=infinite, call={APC_NONE}, apc_handle=0000 } ... 0056: create_file( access=80100080, sharing=00000001, create=1, options=00000060, attrs=00000080, objattr={rootdir=0000,attributes=00000040,sd={},name=L""}, filename="/home/focht/.wine/dosdevices/c:/Program Files/Wizards of the Coast/MTGA/MTGA_Data/Downloads/Audio/TEMP__BAT_DAR_0f5927532656771f5592ba8b19fc4e51.mtga" ) 0056: create_file() = 0 { handle=1510 } 0056: get_handle_fd( handle=1510 ) 0056: *fd* 1510 -> 54 0056: get_handle_fd() = 0 { type=1, cacheable=1, access=00120089, options=00000060 } 0056: create_file( access=c0100080, sharing=00000000, create=5, options=00000060, attrs=00000080, objattr={rootdir=0000,attributes=00000040,sd={},name=L""}, filename="/home/focht/.wine/dosdevices/c:/Program Files/Wizards of the Coast/MTGA/MTGA_Data/Downloads/Audio/BAT_DAR_0f5927532656771f5592ba8b19fc4e51.mtga" ) 0056: create_file() = 0 { handle=1514 } 0056: get_handle_fd( handle=1514 ) 0056: *fd* 1514 -> 55 0056: get_handle_fd() = 0 { type=1, cacheable=1, access=0012019f, options=00000060 } 0056: select( flags=0, cookie=0a65f0dc, timeout=+5.0000000, prev_apc=0000, result={}, data={WAIT,handles={103c}} ) 0056: select() = 0 { timeout=1d458de4db5b238 (+5.0000000), call={APC_NONE}, apc_handle=0000 } 0056: release_semaphore( handle=103c, count=00000001 ) 0056: release_semaphore() = 0 { prev_count=00000000 } ... 0062: get_thread_info( handle=03ec, tid_in=0000 ) 0062: get_thread_info() = 0 { pid=0008, tid=0056, teb=81f58000, entry_point=01e851e0, affinity=000000ff, exit_code=259, priority=0, last=0 } 0055: select( flags=2, cookie=81f5f16c, timeout=0, prev_apc=0000, result={}, data={} ) 0055: select() = PENDING { timeout=1d458de4acd8910 (+0.0000000), call={APC_NONE}, apc_handle=0000 } 0062: suspend_thread( handle=03ec ) 0056: *sent signal* signal=10 0062: suspend_thread() = 0 { count=0 } ... 0062: get_thread_context( handle=012c, flags=00000003, suspend=1 ) 0062: get_thread_context() = 0 { self=0, context={cpu=x86,eip=f7f8c047,esp=0033f590,ebp=0033f5f8,eflags=00200286,cs=0023,ss=002b,eax=00000003,ebx=00000007,ecx=0033f5d8,edx=00000010,esi=f7d65000,edi=00000000} } 0062: get_thread_context( handle=0170, flags=00000003, suspend=1 ) 0062: get_thread_context() = 0 { self=0, context={cpu=x86,eip=f7f8c047,esp=0267f930,ebp=0267f998,eflags=00000286,cs=0023,ss=002b,eax=00000003,ebx=00000011,ecx=0267f978,edx=00000010,esi=f7d65000,edi=00000000} } 0062: get_thread_context( handle=0344, flags=00000003, suspend=1 ) 0062: get_thread_context() = 0 { self=0, context={cpu=x86,eip=f7f8c047,esp=0944f9f0,ebp=0944fa58,eflags=00000286,cs=0023,ss=002b,eax=00000003,ebx=000000a7,ecx=0944fa38,edx=00000010,esi=f7d65000,edi=00000000} } 0062: get_thread_context( handle=0398, flags=00000003, suspend=1 ) 0062: get_thread_context() = 0 { self=0, context={cpu=x86,eip=f7f8c047,esp=08e8f4c0,ebp=08e8f528,eflags=00000282,cs=0023,ss=002b,eax=00000003,ebx=000000b5,ecx=08e8f508,edx=00000010,esi=f7d65000,edi=00000000} } 0062: get_thread_context( handle=03b8, flags=00000003, suspend=1 ) 0062: get_thread_context() = 0 { self=0, context={cpu=x86,eip=f7f8c047,esp=08f9fab0,ebp=08f9fb18,eflags=00000282,cs=0023,ss=002b,eax=00000003,ebx=000000b9,ecx=08f9faf8,edx=00000010,esi=f7d65000,edi=00000000} } 0062: get_thread_context( handle=03d0, flags=00000003, suspend=1 ) 0062: get_thread_context() = 0 { self=0, context={cpu=x86,eip=f7f8c047,esp=092bfa60,ebp=092bfac8,eflags=00000286,cs=0023,ss=002b,eax=00000003,ebx=000000be,ecx=092bfaa8,edx=00000010,esi=f7d65000,edi=00000000} } 0062: get_thread_context( handle=03ec, flags=00000003, suspend=1 ) 0062: get_thread_context() = PENDING { self=0, context={} } 0062: get_thread_context( handle=03ec, flags=00000003, suspend=0 ) 0062: get_thread_context() = PENDING { self=0, context={} } 0062:fixme:thread:get_thread_context **STATUS_PENDING** try 0, handle=0x3ec 0062: get_thread_context( handle=03ec, flags=00000003, suspend=0 ) 0062: get_thread_context() = PENDING { self=0, context={} } 0062:fixme:thread:get_thread_context **STATUS_PENDING** try 1, handle=0x3ec 0062: get_thread_context( handle=03ec, flags=00000003, suspend=0 ) 0062: get_thread_context() = PENDING { self=0, context={} } 0062:fixme:thread:get_thread_context **STATUS_PENDING** try 2, handle=0x3ec 0062: get_thread_context( handle=03ec, flags=00000003, suspend=0 ) 0062: get_thread_context() = PENDING { self=0, context={} } ... 0062:fixme:thread:get_thread_context **STATUS_PENDING** try 96, handle=0x3ec 0062: get_thread_context( handle=03ec, flags=00000003, suspend=0 ) 0062: get_thread_context() = PENDING { self=0, context={} } 0062:fixme:thread:get_thread_context **STATUS_PENDING** try 97, handle=0x3ec 0062: get_thread_context( handle=03ec, flags=00000003, suspend=0 ) 0062: get_thread_context() = PENDING { self=0, context={} } 0062:fixme:thread:get_thread_context **STATUS_PENDING** try 98, handle=0x3ec 0062: get_thread_context( handle=03ec, flags=00000003, suspend=0 ) 0062: get_thread_context() = PENDING { self=0, context={} } 0062:fixme:thread:get_thread_context **STATUS_PENDING** try 99, handle=0x3ec 0062:fixme:thread:get_thread_context ***STATUS_PENDING*** exceeded, handle=0x3ec 0062: get_desktop_window( force=0 ) 0062: get_desktop_window() = 0 { top_window=00010020, msg_window=00010026 } ... 0056: set_suspend_context( context={cpu=x86,eip=f7f8c049,esp=0a65f400,ebp=0a65f5d8,eflags=00000286,cs=0023,ss=002b,ds=002b,es=002b,fs=0063,gs=006b,eax=00000000,ebx=000000fb,ecx=00000002,edx=00000000,esi=f7d65000,edi=00000000,dr0=00000000,dr1=00000000,dr2=00000000,dr3=00000000,dr6=00000000,dr7=00000000,fp.ctrl=ffff027f,fp.status=ffff0120,fp.tag=ffffffff,fp.err_off=0884b3cb,fp.err_sel=00000023,fp.data_off=0a65fbb0,fp.data_sel=ffff002b,fp.cr0npx=00000120,fp.reg0=-128,fp.reg1=0,fp.reg2=1,fp.reg3=11,fp.reg4=0.232558,fp.reg5=1,fp.reg6=16,fp.reg7=12.5871,extended={...}} ) 0056: set_suspend_context() = 0 0056: select( flags=2, cookie=81f5b16c, timeout=0, prev_apc=0000, result={}, data={} ) 0056: select() = PENDING { timeout=1d458de4ae06a4e (+0.0000000), call={APC_NONE}, apc_handle=0000 } --- snip ---
The thread suspension happens - but very late. Too late for get_thread_context() to succeed. I ran some additional timing/statistics with a small patch applied, showing the outliers:
--- snip --- $ WINEDEBUG=+thread,+timestamp wine ./MTGA.exe ... 177.921:005e:fixme:thread:get_thread_context **STATUS_PENDING** try 1, handle=0x550 ... 177.936:005e:fixme:thread:get_thread_context **STATUS_PENDING** try 14, handle=0x550 ... 214.062:0062:fixme:thread:get_thread_context **STATUS_PENDING** try 1, handle=0x52c ... 215.253:0054:fixme:thread:get_thread_context **STATUS_PENDING** try 153, handle=0x52c ... 272.157:0062:fixme:thread:get_thread_context **STATUS_PENDING** try 1, handle=0x3d8 ... 272.221:0062:fixme:thread:get_thread_context **STATUS_PENDING** try 49, handle=0x3d8 ... 324.818:0057:fixme:thread:get_thread_context **STATUS_PENDING** try 1, handle=0x4ac ... 325.047:0057:fixme:thread:get_thread_context **STATUS_PENDING** try 208, handle=0x4ac --- snip ---
There are some very rare outliers that go into range of 150ms-250ms. Unfortunately it's not really possible to figure out what kind of managed code is executed at that time. As soon as a native/managed debugger is active it will always end up in final suspended state at the next interruption checkpoint. There is no "peek at running code". Although Mono itself has many interruption checkpoints on its own to manage thread suspension at managed code level there are likely some code sequences that run largely uninterruptable (no API callouts that end up in syscalls). Wine 's thread context retrieval mechanism relies on timely SIGUSR1 delivery to the target thread which in turn calls wait_suspend() to provide a suspend context on the server side.
I increased the wait time for thread suspension a bit more (loop 500 times -> 500ms).
https://source.winehq.org/git/wine.git/blob/HEAD:/dlls/ntdll/thread.c#l915
--- snip --- 915 NTSTATUS get_thread_context( HANDLE handle, context_t *context, unsigned int flags, BOOL *self ) 916 { 917 NTSTATUS ret; 918 DWORD dummy, i; 919 920 SERVER_START_REQ( get_thread_context ) 921 { 922 req->handle = wine_server_obj_handle( handle ); 923 req->flags = flags; 924 req->suspend = 1; 925 wine_server_set_reply( req, context, sizeof(*context) ); 926 ret = wine_server_call( req ); 927 *self = reply->self; 928 } 929 SERVER_END_REQ; 930 931 if (ret == STATUS_PENDING) 932 { 933 for (i = 0; i < 100; i++) 934 { 935 SERVER_START_REQ( get_thread_context ) 936 { 937 req->handle = wine_server_obj_handle( handle ); 938 req->flags = flags; 939 req->suspend = 0; 940 wine_server_set_reply( req, context, sizeof(*context) ); 941 ret = wine_server_call( req ); 942 } 943 SERVER_END_REQ; 944 if (ret == STATUS_PENDING) 945 { 946 LARGE_INTEGER timeout; 947 timeout.QuadPart = -10000; 948 NtDelayExecution( FALSE, &timeout ); 949 } 950 else break; 951 } 952 NtResumeThread( handle, &dummy ); 953 if (ret == STATUS_PENDING) ret = STATUS_ACCESS_DENIED; 954 } 955 return ret; 956 } --- snip ---
It helped the app, I didn't encounter a single failure anymore. I also put a WARN hint as developer diagnostics if the status was still STATUS_PENDING out of the loop to indicate something is fishy (thread has no interruption checkpoints?).
Again: the root cause why thread suspension in GC "world-stop collection" case sometimes takes very long is unknown (within Mono managed code).
$ sha1sum MTGAInstaller.exe bdc80950b31353e664f48154a9c22621d93b0737 MTGAInstaller.exe
$ du -sh MTGAInstaller.exe 3.7M MTGAInstaller.exe
$ wine --version wine-3.17
Regards