[Bug 49590] New: Battle.net Agent.exe hang/crash
https://bugs.winehq.org/show_bug.cgi?id=49590

            Bug ID: 49590
           Summary: Battle.net Agent.exe hang/crash
           Product: Wine-staging
           Version: 5.13
          Hardware: x86-64
               URL: https://www.blizzard.com/apps/battle.net/desktop
                OS: Linux
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: -unknown
          Assignee: wine-bugs(a)winehq.org
          Reporter: maciej.stanczew+b(a)gmail.com
                CC: leslie_alistair(a)hotmail.com, z.figura12(a)gmail.com
      Distribution: ArchLinux

On Staging 5.13, when using the Battle.net App, its Agent.exe process will often (not always) misbehave. Example behavior:
- Crashes on launch -- can be verified by processes dying and new ones being spawned, and by an empty "Crash.txt" file in 'drive_c/ProgramData/Battle.net/Agent/Agent.<version>/Errors';
- Hangs using 100% CPU and doesn't exit when the Battle.net App is closed;
- Blocks launching of games; for example, when launching Diablo III, I see 'Diablo' in the process list, but it won't actually launch the game until I kill Agent.exe (which at that time is hung at 100% CPU).

Sometimes the error message BLZBNTBNA00000005 will be shown in the Battle.net App, which is described as:
"The Blizzard Battle.net desktop app failed to communicate with the Blizzard Update Agent, which is required to install, update, launch, and uninstall Blizzard games."
https://battle.net/support/en/article/16531

This does not happen with either Staging 5.12 or vanilla Wine 5.13. With those versions, a single Agent.exe lives alongside the Battle.net App, doesn't hang, and exits when the Battle.net App is closed. I'm not able to check cooperation with games on those versions because of bug 45349 and bug 42741.

Since Agent.exe is spawned by the Battle.net App, it's difficult to get Wine logs for its execution. If I launch Agent.exe manually (without the Battle.net App), it does not seem to hang or crash.

I only managed to get two exceptions while the Battle.net App was running. The first occurred when I manually launched Agent.exe after the previous instance died:

    07c0:err:seh:NtRaiseException Unhandled exception code c0000005 flags 0 addr 0x7bc265b8

I'm not even sure how I got the second one, as it happened only once and I've been unable to reproduce it -- but hey, it might be useful:

    01f8:err:virtual:virtual_setup_exception stack overflow 976 bytes in thread 01f8 addr 0xf7a91c2e stack 0x220c30 (0x220000-0x221000-0x320000)

I'll try to do a bisection next, to find where in Staging 5.13 the problem was introduced/uncovered.

--
Do not reply to this email, post in Bugzilla using the above URL to reply.
You are receiving this mail because:
You are watching all bug changes.
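Editorial note: when a log is long, lines like the first exception above can be pulled out mechanically. The following is a minimal sketch, not part of the original report. The regular expression is inferred only from the single sample line quoted in the report, the function name `find_unhandled` is hypothetical, and the sketch assumes the log lines carry no extra prefix (logs taken with `+timestamp` would need the pattern adjusted). Exception code c0000005 is the Windows access-violation status.

```python
import re

# Pattern inferred from the one sample line in the report (an assumption,
# not an official description of Wine's log format).
UNHANDLED = re.compile(
    r"^(?P<tid>[0-9a-f]{4}):err:seh:NtRaiseException "
    r"Unhandled exception code (?P<code>[0-9a-f]+) "
    r"flags (?P<flags>[0-9a-f]+) addr (?P<addr>0x[0-9a-f]+)"
)

def find_unhandled(lines):
    """Return (thread id, exception code, fault address) for each match."""
    hits = []
    for line in lines:
        m = UNHANDLED.match(line.strip())
        if m:
            hits.append((m["tid"], int(m["code"], 16), int(m["addr"], 16)))
    return hits

# The two log lines quoted in the report; only the first is an
# "Unhandled exception" line, so only it should match.
sample = [
    "07c0:err:seh:NtRaiseException Unhandled exception code c0000005 flags 0 addr 0x7bc265b8",
    "01f8:err:virtual:virtual_setup_exception stack overflow 976 bytes in thread 01f8 "
    "addr 0xf7a91c2e stack 0x220c30 (0x220000-0x221000-0x320000)",
]
hits = find_unhandled(sample)
for tid, code, addr in hits:
    print(f"thread {tid}: code {code:#010x} at {addr:#010x}")
```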
https://bugs.winehq.org/show_bug.cgi?id=49590

--- Comment #1 from Maciej Stanczew <maciej.stanczew+b(a)gmail.com> ---
Created attachment 67765
  --> https://bugs.winehq.org/attachment.cgi?id=67765
Logs

Results are in:

f6954e6e77dfd443f5bdc28190ad478e0d6fb77d is the first bad commit
commit f6954e6e77dfd443f5bdc28190ad478e0d6fb77d
Author: Zebediah Figura <z.figura12(a)gmail.com>
Date:   Wed Jul 8 20:46:51 2020 -0500

    Rebase against 262e4ab9e0eeb126dde5cb4cba13fbf7f1d1cef0.

For reproduction with logs, I have to kill the Agent.exe process that was spawned by Battle.net, and then quickly launch Agent.exe manually (before Battle.net spawns another one automatically). Repro is sporadic, but I'm able to get it eventually if I try enough times (10+).

wine-5.12-64-ge0e3b6bc91 (Staging) [553c1cff]:
- No crashes in 20 manual retries
- 'Errors' directory in ProgramData is empty

wine-5.12-97-g262e4ab9e0 (Staging) [f6954e6e]:
- 4 crashes in 20 manual retries, plus 2 crashes of automatically spawned Agent
- 'Errors' directory already gets one entry just after starting Battle.net

I have attached logs with +seh,+ntdll,+relay,+timestamp from f6954e6e and from Staging 5.13, and also some Agent crash reports. All builds tested are with PE support.
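Editorial note: because the repro is sporadic, a bisection step can wrongly be marked "good" if too few launches are tried. The sketch below is not from the original comment. It assumes every launch is independent and crashes at the observed rate of 4 in 20, which is a simplification, and the function names are hypothetical.

```python
import math

def miss_probability(crash_rate, tries):
    """Chance that `tries` independent launches all succeed despite the bug."""
    return (1.0 - crash_rate) ** tries

def tries_needed(crash_rate, confidence):
    """Smallest number of clean launches needed to reach `confidence`."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - crash_rate))

# Observed on the first bad commit: 4 crashes in 20 manual retries.
observed_rate = 4 / 20

p10 = miss_probability(observed_rate, 10)  # risk of a false "good" after 10 tries
p20 = miss_probability(observed_rate, 20)  # risk of a false "good" after 20 tries
n99 = tries_needed(observed_rate, 0.99)    # tries for 99% confidence

print(f"10 tries: {p10:.1%} chance of missing the bug")
print(f"20 tries: {p20:.1%} chance of missing the bug")
print(f"tries needed for 99% confidence: {n99}")
```

Under these assumptions, 10 clean launches still leave roughly an 11% chance that the build is bad, while 20 launches reduce that to about 1%.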
https://bugs.winehq.org/show_bug.cgi?id=49590

Maciej Stanczew <maciej.stanczew+b(a)gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
    Regression SHA1|                            |f6954e6e77dfd443f5bdc28190a
                   |                            |d478e0d6fb77d
https://bugs.winehq.org/show_bug.cgi?id=49590

i.Dark_Templar <idarktemplar(a)mail.ru> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |idarktemplar(a)mail.ru

--- Comment #2 from i.Dark_Templar <idarktemplar(a)mail.ru> ---
Created attachment 67771
  --> https://bugs.winehq.org/attachment.cgi?id=67771
battle.net error screenshot.png

This or a similar issue appeared for me as well after I upgraded to wine-staging 5.13 with PE modules support. The issue does not reproduce in wine-staging 5.11, either with or without PE modules support.

To me it happens almost soon after launching a game from Battle.net, for example Starcraft II. Each time message box is displayed, it brings Battle.net application's window to foreground.
https://bugs.winehq.org/show_bug.cgi?id=49590

--- Comment #3 from i.Dark_Templar <idarktemplar(a)mail.ru> ---
(In reply to Maciej Stanczew from comment #1)
> Results are in:
>
> f6954e6e77dfd443f5bdc28190ad478e0d6fb77d is the first bad commit
> commit f6954e6e77dfd443f5bdc28190ad478e0d6fb77d
> Author: Zebediah Figura <z.figura12(a)gmail.com>
> Date:   Wed Jul 8 20:46:51 2020 -0500
>
>     Rebase against 262e4ab9e0eeb126dde5cb4cba13fbf7f1d1cef0.

I've looked a bit in this commit. In this commit winebuild-Fake_Dlls patchset is removed for wine-staging 5.12. Later a different implementation, patchset winebuild-pe_syscall_thunks, is added for wine-staging 5.13. I think it might be an issue in this patchset, and a regression in wine-staging 5.13 compared to wine-staging 5.11. But I might be wrong.
https://bugs.winehq.org/show_bug.cgi?id=49590

--- Comment #4 from Maciej Stanczew <maciej.stanczew+b(a)gmail.com> ---
Created attachment 67772
  --> https://bugs.winehq.org/attachment.cgi?id=67772
Logs non-staging

(In reply to i.Dark_Templar from comment #2)
> To me it happens almost soon after launching a game from Battle.net, for example Starcraft II. Each time message box is displayed, it brings Battle.net application's window to foreground.

It looks very similar to my case, so it's probably the same issue.

(In reply to i.Dark_Templar from comment #3)
> I've looked a bit in this commit. In this commit winebuild-Fake_Dlls patchset is removed for wine-staging 5.12. Later a different implementation, patchset winebuild-pe_syscall_thunks, is added for wine-staging 5.13. I think it might be an issue in this patchset, and a regression in wine-staging 5.13 compared to wine-staging 5.11. But I might be wrong.

These were also my first suspicions, but they turned out to be false, or at least incomplete.

I was able to reproduce the issue on plain Wine 5.13. Logs attached.

I have done further bisection of upstream commits, and got this as the commit that introduces the regression:

82cd85b07918a4437428497ffaf7f13286b83479 is the first bad commit
commit 82cd85b07918a4437428497ffaf7f13286b83479
Author: Zebediah Figura <z.figura12(a)gmail.com>
Date:   Tue Jul 7 18:58:34 2020 -0500

    ntdll: Set the thread creation time in NtQuerySystemInformation(SystemProcessInformation).

    Process Hacker displays this information.

With this commit reverted from Staging 5.13:
- I can use the Battle.net App and launch games without any popups appearing about Agent crashing;
- I can't reproduce the exception anymore by killing and manually launching Agent.exe.

However, the automatically spawned Agent still sometimes crashes -- I can see it happening in the process list, and by entries in the 'Errors' directory. It seems to happen most often when launching games: Agent will crash, a new one will spawn, and eventually one will "stick" and things will proceed. I don't see any functional impact -- no popups or game hangs.

I didn't see any crashes in Staging 5.10. Unfortunately, because of the gap between the disabling of Fake_Dlls and the introduction of pe_syscall_thunks, there's no way to bisect by launching games to trigger this crash.

After all the testing and reverting I'm starting to think there might be multiple issues leading to Agent crashing. But the one this bug was initially about seems to be related to 82cd85b07918a4437428497ffaf7f13286b83479.
https://bugs.winehq.org/show_bug.cgi?id=49590

Maciej Stanczew <maciej.stanczew+b(a)gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Product|Wine-staging                |Wine
          Component|-unknown                    |ntdll
    Regression SHA1|f6954e6e77dfd443f5bdc28190a |82cd85b07918a4437428497ffaf
                   |d478e0d6fb77d               |7f13286b83479
https://bugs.winehq.org/show_bug.cgi?id=49590

Paul Gofman <pgofman(a)codeweavers.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pgofman(a)codeweavers.com

--- Comment #5 from Paul Gofman <pgofman(a)codeweavers.com> ---
The blamed commit is misleading, I suggest removing it from Regression SHA1 field. Bisect showed that because it stopped working after that one, but the crash present now is not related. As far as my testing goes so far, the reintroduced syscall thunks patchset is also not at fault.

I could reproduce crashes in Agent.exe with the latest Staging and Starcraft. It looks like some memory overwrite issue. WINEDEBUG=warn+heap shows tail overwrites, and the crashes are always in ntdll heap allocation / free functions, which clearly suggests that heap control data is smashed.

Can you try Staging without ntdll-Heap_Improvements patchset (staging/patchinstall.py --all -W ntdll-Heap_Improvements). That was fixing the issue for me, would be interesting to confirm if that is the same issue I am seeing.

It is not very likely that ntdll-Heap_Improvements is at fault per se; it just introduces a different layout of memory control structures, which appears to be more vulnerable. It is yet to be verified whether the memory smash is solely due to Agent code or is caused by something in Wine.
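Editorial note: the "tail overwrite" mentioned above refers to a write that runs past the end of an allocated block. The following toy model is only an illustration of the general guard-byte idea and is not Wine's heap implementation. The class `ToyHeap`, the guard pattern, and the block sizes are all hypothetical.

```python
GUARD = b"\xab" * 8  # hypothetical guard pattern, not Wine's actual value

class ToyHeap:
    """Toy allocator: each block is followed by guard bytes in one arena."""

    def __init__(self, size):
        self.arena = bytearray(size)
        self.top = 0
        self.blocks = {}  # start offset -> user size

    def alloc(self, size):
        start = self.top
        end = start + size
        self.arena[end:end + len(GUARD)] = GUARD  # place the tail guard
        self.blocks[start] = size
        self.top = end + len(GUARD)
        return start

    def write(self, start, data):
        # Deliberately unchecked, like a raw memcpy into the block.
        self.arena[start:start + len(data)] = data

    def validate(self):
        """Return offsets of blocks whose tail guard was overwritten."""
        bad = []
        for start, size in self.blocks.items():
            end = start + size
            if bytes(self.arena[end:end + len(GUARD)]) != GUARD:
                bad.append(start)
        return bad

heap = ToyHeap(64)
a = heap.alloc(8)
b = heap.alloc(8)

heap.write(a, b"x" * 8)    # fits exactly: guard stays intact
clean = heap.validate()

heap.write(a, b"x" * 10)   # two bytes past the end of block a
damaged = heap.validate()

print("after in-bounds write:", clean)
print("after overlong write:", damaged)
```

In this model the overlong write is only noticed when `validate` runs, which mirrors why such corruption tends to surface later, inside allocation or free calls, rather than at the faulty write itself.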
https://bugs.winehq.org/show_bug.cgi?id=49590

--- Comment #6 from Paul Gofman <pgofman(a)codeweavers.com> ---
As a separate note, mainstream Wine with syscall thunks patchset applied also crashes for me but in a different executable (always the same address in libcef.dll), that does not look related.
https://bugs.winehq.org/show_bug.cgi?id=49590

Maciej Stanczew <maciej.stanczew+b(a)gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
    Regression SHA1|82cd85b07918a4437428497ffaf |
                   |7f13286b83479               |
https://bugs.winehq.org/show_bug.cgi?id=49590

--- Comment #7 from Maciej Stanczew <maciej.stanczew+b(a)gmail.com> ---
(In reply to Paul Gofman from comment #5)
> The blamed commit is misleading, I suggest removing it from Regression SHA1 field. Bisect showed that because it stopped working after that one, but the crash present now is not related.

True, after some more time of using Battle.net I see that those crashes during launching of games can in fact lead to the broken state ("Whoops!" popups and game hangs).

> Can you try Staging without ntdll-Heap_Improvements patchset (staging/patchinstall.py --all -W ntdll-Heap_Improvements). That was fixing the issue for me, would be interesting to confirm if that is the same issue I am seeing.

Initially it looked better, but then I got one Agent crash during game launch, and another just after launching Battle.net. I haven't managed to get it into the same broken state, but based on previous debugging data, I might have just not tried enough times. For now I'll stay on this configuration and report anything new if it happens.
https://bugs.winehq.org/show_bug.cgi?id=49590

--- Comment #8 from Maciej Stanczew <maciej.stanczew+b(a)gmail.com> ---
> I might have just not tried enough times

Yup, that was it. Just now I launched Battle.net and then Diablo III, and I got 4 back to back Agent crashes, resulting in the popup appearing and the game being delayed from starting. (The 5th Agent didn't crash and the game finally launched after about 30 s of delay.)
https://bugs.winehq.org/show_bug.cgi?id=49590

--- Comment #9 from i.Dark_Templar <idarktemplar(a)mail.ru> ---
(In reply to Paul Gofman from comment #5)
> Can you try Staging without ntdll-Heap_Improvements patchset (staging/patchinstall.py --all -W ntdll-Heap_Improvements). That was fixing the issue for me, would be interesting to confirm if that is the same issue I am seeing.

I've rebuilt wine-staging 5.13 without the ntdll-Heap_Improvements patchset, and the issue hasn't reproduced for me since. Thank you for the workaround.
https://bugs.winehq.org/show_bug.cgi?id=49590

--- Comment #10 from Paul Gofman <pgofman(a)codeweavers.com> ---
I think I found at least one reason for the heap corruption: Wine frees a random pointer (which accidentally happens to be a pointer already freed previously). I've sent a patch for that [1].

However, I still see crashes in memory management with the ntdll-Heap_Improvements patchset. I am not sure yet whether this is a bug in the patchset itself or is caused by some memory naughtiness in Wine or Agent.exe; this needs further debugging.

Does the linked patch without the ntdll-Heap_Improvements patchset (that is, ntdll-Heap_Improvements excluded from the Staging patches by -W ntdll-Heap_Improvements) solve the issue completely?

1. https://source.winehq.org/patches/data/189533
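Editorial note: freeing a pointer that was already freed can corrupt a heap even when nothing crashes at the moment of the second free. The toy free-list model below illustrates one way this can go wrong. It is a generic sketch, not Wine's allocator and not the code touched by the linked patch, and the class `ToyFreeList` is hypothetical.

```python
class ToyFreeList:
    """Toy fixed-size block allocator with a LIFO free list and no checks."""

    def __init__(self, nblocks):
        self.free_list = list(range(nblocks))

    def alloc(self):
        # Hand out the most recently freed block.
        return self.free_list.pop()

    def free(self, block):
        # No check that `block` is currently allocated.
        self.free_list.append(block)

heap = ToyFreeList(4)
p = heap.alloc()
heap.free(p)
heap.free(p)        # stale pointer freed a second time

q = heap.alloc()
r = heap.alloc()    # the same block is handed out again
aliased = (q == r)

print("q =", q, "r =", r, "aliased:", aliased)
```

After the double free, two independent allocations share one block, so each owner silently overwrites the other's data.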
https://bugs.winehq.org/show_bug.cgi?id=49590

--- Comment #11 from Paul Gofman <pgofman(a)codeweavers.com> ---
Here is the fix for another heap corruption I found:
https://www.winehq.org/pipermail/wine-devel/2020-July/170551.html

With the patch from comment #10 and this one Agent is not crashing for me anymore (no Staging patches disabled, ntdll-Heap_Improvements is in place) and I don't see heap validation errors so far.
https://bugs.winehq.org/show_bug.cgi?id=49590

--- Comment #12 from i.Dark_Templar <idarktemplar(a)mail.ru> ---
I haven't tested it for long yet, but it looks like this issue doesn't reproduce for me with wine-staging 5.13 with the ntdll-Heap_Improvements patchset and the 2 patches from the last comments. Thank you!
https://bugs.winehq.org/show_bug.cgi?id=49590

--- Comment #13 from Maciej Stanczew <maciej.stanczew+b(a)gmail.com> ---
Staging 5.13 (all patchsets) + the two patches linked = no crash in about 2 hours of testing. Looks like that was it :)
https://bugs.winehq.org/show_bug.cgi?id=49590

Maciej Stanczew <maciej.stanczew+b(a)gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|ntdll                       |-unknown
https://bugs.winehq.org/show_bug.cgi?id=49590

--- Comment #14 from Paul Gofman <pgofman(a)codeweavers.com> ---
Great, thanks for testing and bisecting. Those patches have been committed upstream as [1] and [2] and will appear in the next Staging rebase, so this can probably be marked as fixed.

1. https://source.winehq.org/git/wine.git/commit/3d54677586eb0a9f379839cd06c04d...
2. https://source.winehq.org/git/wine.git/commit/3feaca754613df248bc576b801d885...
https://bugs.winehq.org/show_bug.cgi?id=49590

Maciej Stanczew <maciej.stanczew+b(a)gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Fixed by SHA1|                            |3feaca754613df248bc576b801d
                   |                            |885baa8637050
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #15 from Maciej Stanczew <maciej.stanczew+b(a)gmail.com> ---
Briefly tested Wine 07030059486e0121051b452c94d37f12931cabf4 + Staging 02be23fa5213c0cb0b377b5120ea256d6b5f1af4 -- also no crashes. Marking as fixed, thank you for all the work!
https://bugs.winehq.org/show_bug.cgi?id=49590

Alexandre Julliard <julliard(a)winehq.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |CLOSED

--- Comment #16 from Alexandre Julliard <julliard(a)winehq.org> ---
Closing bugs fixed in 5.14.
https://bugs.winehq.org/show_bug.cgi?id=49590

Justin King-Lacroix <justin.kinglacroix(a)gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |justin.kinglacroix(a)gmail.co
                   |                            |m

--- Comment #17 from Justin King-Lacroix <justin.kinglacroix(a)gmail.com> ---
(In reply to Maciej Stanczew from comment #15)
> Briefly tested Wine 07030059486e0121051b452c94d37f12931cabf4 + Staging 02be23fa5213c0cb0b377b5120ea256d6b5f1af4 -- also no crashes. Marking as fixed, thank you for all the work!

+1 -- Starcraft II on Wine 5.14 + Staging v5.14, and this bug goes away completely.