If wine dlls are built with frame pointers enabled, the frame pointer will be used during unwinding.
If we don't restore the frame pointer before calling the user mode callback, then when the unwinder later reaches the user mode callback frame, it sets the frame pointer to something unexpected (whatever it happened to be during `call_user_mode_callback`). For the subsequent frame it then adjusts the stack pointer based on that frame pointer, derailing the unwinding process.
-- v3: ntdll: Also restore rbp before calling user mode callback.
From: Yuxuan Shui yshui@codeweavers.com
If wine dlls are built with frame pointers enabled, the frame pointer will be used during unwinding.
If we don't restore frame pointer before calling the user mode callback, then later when the unwinder encounters the user mode callback frame, it will set the frame pointer to something unexpected (depends on what it was during `call_user_mode_callback`). Then for the subsequent frame it adjusts the stack pointer based on the frame pointer, thus derailing the unwinding process.
---
 dlls/ntdll/unix/signal_x86_64.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/dlls/ntdll/unix/signal_x86_64.c b/dlls/ntdll/unix/signal_x86_64.c
index 6ec662b9850..a3adf24c64f 100644
--- a/dlls/ntdll/unix/signal_x86_64.c
+++ b/dlls/ntdll/unix/signal_x86_64.c
@@ -1613,7 +1613,7 @@ NTSTATUS call_user_exception_dispatcher( EXCEPTION_RECORD *rec, CONTEXT *context
 /***********************************************************************
  *           call_user_mode_callback
  */
-extern NTSTATUS call_user_mode_callback( ULONG64 user_rsp, void **ret_ptr, ULONG *ret_len, void *func, TEB *teb );
+extern NTSTATUS call_user_mode_callback( ULONG64 user_rsp, void **ret_ptr, ULONG *ret_len, void *func, TEB *teb, ULONG64 user_rbp );
 __ASM_GLOBAL_FUNC( call_user_mode_callback,
                    "subq $0x58,%rsp\n\t"
                    __ASM_CFI(".cfi_adjust_cfa_offset 0x58\n\t")
@@ -1648,6 +1648,7 @@ __ASM_GLOBAL_FUNC( call_user_mode_callback,
                    "testl $1,0x380(%r13)\n\t"  /* thread_data->syscall_trace */
                    "jz 1f\n\t"
                    "movq %rdi,%r12\n\t"        /* user_rsp */
+                   "movq %r9,%r14\n\t"         /* user_rbp */
                    "movl 0x2c(%r12),%edi\n\t"  /* id */
                    "movq %rdi,-0x50(%rbp)\n\t"
                    "movq %rcx,%r15\n\t"        /* func */
@@ -1655,9 +1656,11 @@ __ASM_GLOBAL_FUNC( call_user_mode_callback,
                    "movl 0x28(%r12),%edx\n\t"  /* len */
                    "call " __ASM_NAME("trace_usercall") "\n\t"
                    "movq %r12,%rdi\n\t"        /* user_rsp */
+                   "movq %r14,%r9\n\t"         /* user_rbp */
                    "movq %r15,%rcx\n"          /* func */
                    /* switch to user stack */
                    "1:\tmovq %rdi,%rsp\n\t"    /* user_rsp */
+                   "movq %r9,%rbp\n\t"         /* user_rbp */
 #ifdef __linux__
                    "movw 0x338(%r13),%ax\n"    /* amd64_thread_data()->fs */
                    "testw %ax,%ax\n\t"
@@ -1768,7 +1771,7 @@ NTSTATUS KeUserModeCallback( ULONG id, const void *args, ULONG len, void **ret_p
     stack->machine_frame.rip = frame->rip;
     stack->machine_frame.rsp = frame->rsp;
     memcpy( stack->args_data, args, len );
-    return call_user_mode_callback( rsp, ret_ptr, ret_len, pKiUserCallbackDispatcher, NtCurrentTeb() );
+    return call_user_mode_callback( rsp, ret_ptr, ret_len, pKiUserCallbackDispatcher, NtCurrentTeb(), frame->rbp );
 }
btw @bernhardu also had this problem (i.e. wine crashing during unwind) when running wine with PE ASan, and had a workaround for it: https://gitlab.winehq.org/bernhardu/wine/-/commit/1b4215acbbe578be456ef29547... (though i think this works around more cases than this MR fixes)
i believe this MR is the proper fix for one of those cases, though idk if i broke anything unintentionally in the process.
Which exact unwinding do you mean, PE side Wine implemented (probably?), or something Unix side?
rbp has no special meaning on x64, it is just one of the non-volatile registers. If it is not properly restored during PE unwind, the same is probably true for many other non-volatile registers as well (e.g., r12, xmm6)? Then all of those should probably be restored. I'd guess that yeah, call_user_mode_callback() might be the only place for it because there is no context saved in callback_stack_layout (unlike exception_stack_layout or apc_stack_layout). But you probably don't need to pass any of the registers to restore to call_user_mode_callback() and can just take them from frame (including FP state probably).
And that maybe means that it is saner to find a way to use __wine_syscall_dispatcher_return in call_user_mode_callback than to replicate all that state restore in call_user_mode_callback().
On Tue Jul 1 00:54:29 2025 +0000, Paul Gofman wrote:
> Which exact unwinding do you mean, PE side Wine implemented (probably?), or something Unix side? [...]
PE side.
The rbp is important because unwind op code `UWOP_SET_FPREG` uses it to restore the rsp.
The problem is that the wrong rbp gets saved onto the stack frame.
When something like this happens:
some user mode functions (1) -> syscall -> kernel mode -> `call_user_mode_callback` -> callback -> more user mode functions
the callback function (`KiUserCallbackDispatcher`) saves whatever rbp it gets onto the stack. if this rbp differs from what it was at point (1), unwind breaks.
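To make the dependency on rbp concrete, here is a minimal sketch of what undoing `UWOP_SET_FPREG` amounts to when rbp is the frame register. The function name and the sample values are hypothetical and chosen only for illustration; the arithmetic follows the documented x64 unwind format, where the `FrameOffset` field of `UNWIND_INFO` is scaled by 16. This is not Wine's actual unwinder code.

```c
#include <stdint.h>

/*
 * Hypothetical helper: recover rsp from the frame register, as an unwinder
 * does when it undoes UWOP_SET_FPREG. The prologue established the frame
 * register as rsp + frame_offset * 16, so the unwinder subtracts the same
 * amount from whatever value it currently believes the register holds.
 */
static uint64_t rsp_from_frame_reg( uint64_t frame_reg_value, unsigned int frame_offset )
{
    return frame_reg_value - (uint64_t)frame_offset * 16;
}
```

With an illustrative `frame_offset` of 2, a correct rbp of `0x7ffffe2fff70` yields rsp `0x7ffffe2fff50`. If the callback frame recorded a stale rbp instead, the same subtraction produces an unrelated address, and every later frame is read from the wrong place.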
On Tue Jul 1 00:54:29 2025 +0000, Yuxuan Shui wrote:
> PE side. The rbp is important because unwind op code `UWOP_SET_FPREG` uses it to restore the rsp. [...]
yeah, i think you are right about restoring other registers. if we don't do that, exception handlers located in frames below `call_user_mode_callback` probably will break if we try to invoke them.
> The rbp is important because unwind op code
Sure, rbp is important, it is just that all of the non-volatile registers are also important, and I don't see how any of those would be restored now in unwind.
On Tue Jul 1 00:59:19 2025 +0000, Paul Gofman wrote:
> Sure, rbp is important, it is just that all of the non-volatile registers are also important [...]
do we have a test case that unwinds through a user mode callback?
What's the best way of reusing `__wine_syscall_dispatcher_return` for user mode callbacks? I think I need some guidance.
I think I need to set up desired states in `frame`, and then jump to `__wine_syscall_dispatcher_return`? Kind of similar to `NtContinueEx` maybe?
> What's the best way of reusing `__wine_syscall_dispatcher_return` for user mode callbacks? I think I need some guidance.
That doesn't seem necessary, you can simply restore the needed registers from the parent frame. Copying them to the current frame just so you can call `__wine_syscall_dispatcher_return` is not going to simplify anything.
On Wed Jul 2 01:37:27 2025 +0000, Alexandre Julliard wrote:
> That doesn't seem necessary, you can simply restore the needed registers from the parent frame. [...]
I think @gofman's suggestion is to restore all registers? (I hope I didn't misunderstand him). And that would be easier to do with `__wine_syscall_dispatcher_return`
> I think @gofman's suggestion is to restore all registers?
No, only non-volatile. If I understand what Alexandre suggests correctly, it is better to just restore those in call_user_mode_callback, in principle similar to what you did but:
- for all non-volatile regs;
- no need to pass additional parameters, you can get those right from frame.
I don't know if that is really necessary, but I personally would go for some test for RtlVirtualUnwind called from window procedure (with something like CreateWindow called from asm code to save non-volatile registers before / after), to make sure we are getting it right.
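As a rough illustration of the two points above (restore every non-volatile register, and take the values from the frame rather than from extra arguments), here is a small C sketch. `struct nv_state` and both helpers are hypothetical and do not match Wine's real `struct syscall_frame` layout; the register set is the Win64 non-volatile set (rbx, rbp, rsi, rdi, r12-r15, xmm6-xmm15), with rsp left out because the stack switch handles it separately.

```c
#include <stdint.h>
#include <string.h>

/* Win64 non-volatile integer registers, excluding rsp. */
enum nv_reg { NV_RBX, NV_RBP, NV_RSI, NV_RDI, NV_R12, NV_R13, NV_R14, NV_R15, NV_COUNT };

/* Hypothetical layout for illustration only, NOT Wine's struct syscall_frame. */
struct nv_state
{
    uint64_t gpr[NV_COUNT];
    uint8_t  xmm[10][16];   /* xmm6..xmm15 */
};

/* Copy all non-volatile registers from the saved frame in one place,
 * instead of passing each one as an additional parameter. */
static void restore_nonvolatile( struct nv_state *cpu, const struct nv_state *frame )
{
    memcpy( cpu->gpr, frame->gpr, sizeof(cpu->gpr) );
    memcpy( cpu->xmm, frame->xmm, sizeof(cpu->xmm) );
}

/* Returns 1 if every non-volatile register matches, which is the kind of
 * check a before/after test around the callback would make. */
static int nonvolatile_equal( const struct nv_state *a, const struct nv_state *b )
{
    return !memcmp( a->gpr, b->gpr, sizeof(a->gpr) ) &&
           !memcmp( a->xmm, b->xmm, sizeof(a->xmm) );
}
```

A test along the lines suggested here would fill the registers with known markers before the call that triggers the callback, capture them again inside the window procedure via unwinding, and compare the two snapshots with something like `nonvolatile_equal`.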
On Wed Jul 2 01:42:35 2025 +0000, Paul Gofman wrote:
> No, only non-volatile. [...]
looks like I cannot unwind through user mode callback on native. unwind stops with a `STATUS_FATAL_USER_CALLBACK_EXCEPTION`.
i can capture stacktrace though, which is not working on wine without this rbp fix.
On Fri Jul 4 19:20:56 2025 +0000, Yuxuan Shui wrote:
> looks like I cannot unwind through user mode callback on native. [...]
What do you mean by unwind here: unwind as in executing an exception handler above the user mode callback, or RtlVirtualUnwindEx without an actual jump? It would be overly weird if the latter can't work: that is the mechanism behind RtlCaptureStackBackTrace, and it is hard to see how you could unwind through RtlCaptureStackBackTrace on Windows but not when you do directly what it does (according to Wine code). Maybe there is some problem in the test setup?
On Fri Jul 4 19:27:53 2025 +0000, Paul Gofman wrote:
> What do you mean by unwind here: unwind as in execute exception handler above user mode callback, or RtlVirtualUnwindEx without actual jump? [...]
Do you mean `RtlVirtualUnwind` or `RtlUnwindEx`?
I was using the latter. I don't know how to inspect the registers when `RtlVirtualUnwind` is unwinding a frame.
On Fri Jul 4 19:30:36 2025 +0000, Yuxuan Shui wrote:
> Do you mean `RtlVirtualUnwind` or `RtlUnwindEx`? [...]
There is a difference between unwind per se and executing an exception handler: there is no problem doing the unwind (let's use "unwind" strictly as establishing the target frame context; if you are going to jump there, it is not unwind, it is executing a handler or a sort of long jump). But to do the actual jump correctly it would need to pop the kernel side frame somehow, and apparently that is not possible on Windows either, if your results were about executing a handler.
The easiest is probably to look at the RtlWalkFrameChain() Wine implementation. RtlVirtualUnwind2() doesn't jump anywhere there, and while it is not used in RtlWalkFrameChain, the effect of unwinding is that the context passed to RtlVirtualUnwind should get all the non-volatile registers set to the values corresponding to the destination frame / ip.
On Fri Jul 4 19:45:34 2025 +0000, Paul Gofman wrote:
> There is a difference between unwind per se and executing exception handler [...]
Thanks. So this is the loop I am using to unwind:
```c
for (frames = 0; frames < 16; frames++)
{
    func = RtlLookupFunctionEntry(context.Rip, &base, &table);
    if (RtlVirtualUnwind(UNW_FLAG_NHANDLER, base, context.Rip, func, &context, &data, &frame, NULL))
        break;
    if (!context.Rip) break;
    if (!frame) break;
    WINE_ERR("%d: %p\n", frames, (void *)context.Rip);
    if (context.Rip == (DWORD64)unwind_target || frame == (DWORD64)target_frame)
    {
        WINE_ERR("found\n");
        TRACE_CONTEXT(&context);
        break;
    }
}
if (frames >= 16) WINE_ERR("not found\n");
```
and looks like nothing except rbp is saved. here's a register print out:
```
during unwind:
  rip=00007ff684951a25 rsp=000000ce835ff8b0 rbp=000000ce835ffa20 eflags=00000202
  rax=00007ffc18854a00 rbx=0000000000000000 rcx=000000ce835ff080 rdx=0000000000000082
  rsi=0000000000000000 rdi=0000000000000000 r8=0000000000000000 r9=0000000000000000
  r10=0000000000000000 r11=000000ce835ff598 r12=0000000000000000 r13=0000000000000000
  r14=0000000000000000 r15=0000000000000000 mxcsr=00001f80

before unwind:
  rip=00007ff684951a16 rsp=000000ce835ff8b0 rbp=000000ce835ffa20 eflags=00000202
  rax=00007ffc18854a00 rbx=00000000deadbeaf rcx=00007ff684957040 rdx=00000000deadbeaf
  rsi=00000000deadbeaf rdi=00000000deadbeaf r8=00000000deadbeaf r9=00000000deadbeaf
  r10=00000000deadbeaf r11=00000000deadbeaf r12=00000000deadbeaf r13=00000000deadbeaf
  r14=00000000deadbeaf r15=0000000000000000 mxcsr=00001f80
```
if i make wine **save all non-volatile registers**, this is the output i got:
```
during unwind:
  rip=0000000140001a25 rsp=00007ffffe2ffe00 rbp=00007ffffe2fff70 eflags=00000202
  rax=00006ffffd2bf91c rbx=00000000deadbeaf rcx=00007ffffe2ff0b0 rdx=0000000000000082
  rsi=00007ffffe0ff138 rdi=00007ffffe2ffd20 r8=0000000000000000 r9=0000000000000000
  r10=00000001400010c1 r11=0000000000000202 r12=00000000deadbeaf r13=00000000deadbeaf
  r14=00000000deadbeaf r15=0000000000000000 mxcsr=00001fa0
```
given that windows disallows unwinding through a user callback, it's not surprising they chose to not restore registers before calling the user callback.
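For reference, the comparison of the two native dumps can be written down as a small check. This is only a sketch: the struct and function names are hypothetical, the values are copied from the native Windows dumps above, and it assumes the "before unwind" dump is the reference state while "during unwind" is what the virtual unwind recovered.

```c
#include <stddef.h>
#include <stdint.h>

/* One non-volatile register as seen in the reference dump and after unwinding. */
struct reg_sample
{
    const char *name;
    uint64_t    before;   /* value in the "before unwind" dump */
    uint64_t    during;   /* value in the "during unwind" dump */
};

/* Non-volatile integer registers, values copied from the native dumps. */
static const struct reg_sample native_dump[] =
{
    { "rbx", 0x00000000deadbeaf, 0x0000000000000000 },
    { "rbp", 0x000000ce835ffa20, 0x000000ce835ffa20 },
    { "rsi", 0x00000000deadbeaf, 0x0000000000000000 },
    { "rdi", 0x00000000deadbeaf, 0x0000000000000000 },
    { "r12", 0x00000000deadbeaf, 0x0000000000000000 },
    { "r13", 0x00000000deadbeaf, 0x0000000000000000 },
    { "r14", 0x00000000deadbeaf, 0x0000000000000000 },
    { "r15", 0x0000000000000000, 0x0000000000000000 },
};

/* Count registers whose unwound value differs from the reference value. */
static size_t count_not_recovered( const struct reg_sample *regs, size_t count )
{
    size_t i, lost = 0;

    for (i = 0; i < count; i++)
        if (regs[i].before != regs[i].during) lost++;
    return lost;
}
```

On this data six of the eight registers differ; rbp matches, and r15 matches only because it was zero in both dumps.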
On Sat Jul 5 00:27:21 2025 +0000, Yuxuan Shui wrote:
> Thanks. So this is the loop I am using to unwind: [...]
I'd say it is a bit unexpected, but that's what tests are for. A test which confirms that would probably be great to have with the patch. You can still avoid passing an additional parameter in the original patch and just get rbp from frame, and also avoid moving that register back and forth: read / restore it just once.
On Sat Jul 5 00:36:41 2025 +0000, Paul Gofman wrote:
> I'd say it is a bit unexpected, but that's what tests are for. [...]
That would also allow us to see the test in full and look for potential caveats, in case some detail of the setup leads to such a result. It is also unclear why you have rbx as `deadbeaf` on Wine if you saved all non-volatile registers.