Unwinding should stop on access violation. This matches windows behavior.
Below is a test that demonstrates the problem: ```c #include <windows.h> #include <stdio.h>
int main() { void *retaddr = __builtin_extract_return_addr(__builtin_return_address(0));
DWORD64 imagebase; UNWIND_HISTORY_TABLE table; RUNTIME_FUNCTION *rf = RtlLookupFunctionEntry((DWORD64)retaddr, &imagebase, &table);
DWORD old; VirtualProtect(rf, 0x100, PAGE_READWRITE, &old);
rf->UnwindData = 0x12345678;
VirtualProtect(rf, 0x100, old, &old);
void *bt[100]; RtlCaptureStackBackTrace(0, 100, bt, NULL);
return 0; } ``` This crashes on wine but not on windows.
-- v2: ntdll: Stop unwinding on access violation.
From: Dylan Donnell dylan.donnell@student.griffith.ie
--- dlls/ntdll/signal_x86_64.c | 35 ++++++++++++++++++++++++++--------- 1 file changed, 26 insertions(+), 9 deletions(-)
diff --git a/dlls/ntdll/signal_x86_64.c b/dlls/ntdll/signal_x86_64.c index 3d4bd23e8ff..245d07abe37 100644 --- a/dlls/ntdll/signal_x86_64.c +++ b/dlls/ntdll/signal_x86_64.c @@ -866,6 +866,13 @@ BOOLEAN WINAPI RtlIsProcessorFeaturePresent( UINT feature ) return feature < PROCESSOR_FEATURE_MAX && user_shared_data->ProcessorFeatures[feature]; }
+static LONG WINAPI walk_frame_chain_handler( EXCEPTION_POINTERS *eptr ) +{ + if (eptr->ExceptionRecord->ExceptionCode == STATUS_ACCESS_VIOLATION) + return EXCEPTION_EXECUTE_HANDLER; + + return EXCEPTION_CONTINUE_SEARCH; +}
/************************************************************************* * RtlWalkFrameChain (NTDLL.@) @@ -882,17 +889,27 @@ ULONG WINAPI RtlWalkFrameChain( void **buffer, ULONG count, ULONG flags )
RtlCaptureContext( &context );
- for (i = 0; i < count; i++) + __TRY { - func = RtlLookupFunctionEntry( context.Rip, &base, &table ); - if (RtlVirtualUnwind2( UNW_FLAG_NHANDLER, base, context.Rip, func, &context, NULL, - &data, &frame, NULL, NULL, NULL, &handler, 0 )) - break; - if (!context.Rip) break; - if (!frame || !is_valid_frame( frame )) break; - if (context.Rsp == (ULONG_PTR)NtCurrentTeb()->Tib.StackBase) break; - if (i >= skip) buffer[num_entries++] = (void *)context.Rip; + for (i = 0; i < count; i++) + { + func = RtlLookupFunctionEntry( context.Rip, &base, &table ); + if (RtlVirtualUnwind2( UNW_FLAG_NHANDLER, base, context.Rip, func, &context, NULL, + &data, &frame, NULL, NULL, NULL, &handler, 0 )) + break; + if (!context.Rip) break; + if (!frame || !is_valid_frame( frame )) break; + if (context.Rsp == (ULONG_PTR)NtCurrentTeb()->Tib.StackBase) break; + if (i >= skip) buffer[num_entries++] = (void *)context.Rip; + } + } + __EXCEPT(walk_frame_chain_handler) + { + TRACE( "access violation when unwinding with context.Rip %p\n", (void *)context.Rip ); + return 0; } + __ENDTRY + return num_entries; }
I notice you're catching exception from RtlVirtualUnwind2().
Does RtlVirtualUnwind2() crash on Windows? Like, crash when encountering invalid stack / pointer? Do you have tests to verify it? Or the name of the application that triggers the behavior?
Yes I have a test that crashes under wine but not windows, on windows it returns 0 so I replicated that in my mr: ```c #include <windows.h> #include <stdio.h>
int main() { void *retaddr = __builtin_extract_return_addr(__builtin_return_address(0));
DWORD64 imagebase; UNWIND_HISTORY_TABLE table; RUNTIME_FUNCTION *rf = RtlLookupFunctionEntry((DWORD64)retaddr, &imagebase, &table);
DWORD old; VirtualProtect(rf, 0x100, PAGE_READWRITE, &old);
rf->UnwindData = 0x12345678;
VirtualProtect(rf, 0x100, old, &old);
void *bt[100]; RtlCaptureStackBackTrace(0, 100, bt, NULL);
return 0; } ```
This behavior was also observed in the game "Blue Protocol: Star Resonance".
The fact it doesn't crash in RtlCaptureStackBackTrace on Windows doesn't say that it doesn't crash in RtlVirtualUnwind2 though, there are more possibilities WRT handling this: 1. It might be indeed catching exceptions but in RtlWalkFrameChain 2. It might not be catching exceptions but instead doing some checks which would prevent the crash. To test if anything is catching exceptions there one may try to either protect some page from reading instead of injecting invalid data before unwind, or set vectored handler and seeing if it is called.
Is it known what the game actually does there and why does it hit this exception, is invalid frame data also happens on Windows or maybe this is the consequences of something else going different under Wine?
This is a good point. The test executable does throw 3 access violations but I should check if the invalid frame data is also present on windows.
Looks like this on windows: 
Ran the below: ```c #include <windows.h> #include <stdio.h>
int main() { void *retaddr = __builtin_extract_return_addr(__builtin_return_address(0));
DWORD64 imagebase; UNWIND_HISTORY_TABLE table; RUNTIME_FUNCTION *rf = RtlLookupFunctionEntry((DWORD64)retaddr, &imagebase, &table);
DWORD old; VirtualProtect(rf, 0x100, PAGE_NOACCESS, &old);
/* rf->UnwindData = 0x12345678;
VirtualProtect(rf, 0x100, old, &old);*/
void *bt[100]; int count = RtlCaptureStackBackTrace(0, 100, bt, NULL); for (int i = 0; i < count; i++) { printf("Backtrace %d: %p\n", i, bt[i]); }
return 0; } ```
Windows output: ``` Backtrace 0: 00007FF6D93629CB ```
Wine with this mr has no output, while current wine stack overflows. Didn't expect that output from windows.
I am still curious if the mentioned crash RtlCaptureStackBackTrace is the core issue Blue Protocol hits or rather some other bug leads to invalid unwind info. So I tried to look at that with current Proton Experimental with things from here (https://github.com/ValveSoftware/Proton/issues/9093#issuecomment-3422275851) on top without this MR (only coremessaging MR, limiting address space for syscall emulation to work with the game) and the game starts for me this way without this MR. Do I need to do anything specific to trigger the mentioned crash?
Regardless, if Windows indeed triggers and catches the exceptions something between the lines of this MR might be worth it I guess, but we still need to know where exactly the crash is supposed to be caught (RtlVirtualUnwind2? WalkStackTrace?) and probably have a test for that as a part of Wine testsuite included in the patches.
The crash occurs when you have more than one character on screen and after some time with `-fno-omit-frame-pointer` (making it occur more easily).
So I managed to reproduce the access violations in ntdll's unwind code with the game (not exactly the same way as described but I hope this is the same case). Does https://gitlab.winehq.org/wine/wine/-/merge_requests/9236 help? This, instead of fixing the consequences of access violation, avoids that (as I think it is avoided on Windows). RtlCaptureStackBackTrace goes wrong when trying to unwind through BaseThreadInitThunk when BaseThreadInitThunk has rbp based frame because rbp is lost in anticheat initial thread code before calling game entry point, while there is no unwind info to restore rbp. On Windows that doesn't break anything and doesn't cause access violation, as my test attached to that MR shows.
`-fno-omit-frame-pointer` is probably not useful on x64, however compiler can generate (or not) rbp-based frame on its own discretion.
Once again, I guess what this MR is trying to do might be useful anyway (provided proper test and confirmed correct place to handle access violation), but WRT the concerned issue I believe it is better to fix the core reason and just avoid access violation in ntdll. UnityPlayer calls that RtlCaptureStackBackTrace routinely on debug logging and that is going to work wrong with those access violations.
Yeah it seems like !9236 improves performance of the game as well which is nice (by stopping the constant nullpointer dereferences). It also stops the game from triggering the issue in this MR which is nice, I do agree that this is still useful though.