 
            Overwatch 2 doesn't like the win32u `-syscall` conversion that happened a while back, because it hooks KiUserCallbackDispatcher. This MR fixes it up to match what Overwatch 2 expects.
-- v2: ntdll: Pass KiUserCallbackDispatcher parameters on stack. ntdll: Add 5-byte nop to start of KiUserCallbackDispatcher. ntdll: Align stack pointer when calling KiUserCallbackDispatcher.
 
            From: Torge Matthies tmatthies@codeweavers.com
Overwatch hooks KiUserCallbackDispatcher and expects the stack pointer to be aligned to a multiple of 16 bytes, instead of the usual 8-byte misalignment, otherwise it will crash on a misaligned movaps.
Fix this by aligning the stack pointer when calling the dispatcher and again inside the dispatcher.
Signed-off-by: Torge Matthies openglfreak@googlemail.com --- dlls/ntdll/signal_x86_64.c | 25 +++++++++++++++++++------ dlls/ntdll/unix/signal_x86_64.c | 2 +- 2 files changed, 20 insertions(+), 7 deletions(-)
diff --git a/dlls/ntdll/signal_x86_64.c b/dlls/ntdll/signal_x86_64.c index b1ab4933b93..31beb608965 100644 --- a/dlls/ntdll/signal_x86_64.c +++ b/dlls/ntdll/signal_x86_64.c @@ -654,12 +654,7 @@ __ASM_GLOBAL_FUNC( KiUserApcDispatcher, "int3")
-/******************************************************************* - * KiUserCallbackDispatcher (NTDLL.@) - * - * FIXME: not binary compatible - */ -void WINAPI KiUserCallbackDispatcher( ULONG id, void *args, ULONG len ) +void WINAPI user_callback_dispatcher( ULONG id, void *args, ULONG len ) { NTSTATUS status;
@@ -678,6 +673,24 @@ void WINAPI KiUserCallbackDispatcher( ULONG id, void *args, ULONG len ) RtlRaiseStatus( status ); }
+/******************************************************************* + * KiUserCallbackDispatcher (NTDLL.@) + * + * FIXME: not binary compatible + */ +#ifdef __x86_64__ +__ASM_GLOBAL_FUNC( KiUserCallbackDispatcher, + "andq $0xFFFFFFFFFFFFFFF0, %rsp\n\t" + __ASM_SEH(".seh_endprologue\n\t") + "call " __ASM_NAME("user_callback_dispatcher") "\n\t" + "int3") +#else +void WINAPI DECLSPEC_HOTPATCH KiUserCallbackDispatcher( ULONG id, void *args, ULONG len ) +{ + return user_callback_dispatcher( id, args, len ); +} +#endif +
static ULONG64 get_int_reg( CONTEXT *context, int reg ) { diff --git a/dlls/ntdll/unix/signal_x86_64.c b/dlls/ntdll/unix/signal_x86_64.c index 5787f1dc6f9..0526db9d762 100644 --- a/dlls/ntdll/unix/signal_x86_64.c +++ b/dlls/ntdll/unix/signal_x86_64.c @@ -2399,7 +2399,7 @@ NTSTATUS WINAPI KeUserModeCallback( ULONG id, const void *args, ULONG len, void if (!__wine_setjmpex( &callback_frame.jmpbuf, NULL )) { struct syscall_frame *frame = amd64_thread_data()->syscall_frame; - void *args_data = (void *)((frame->rsp - len) & ~15); + void *args_data = (void *)(((frame->rsp - len) & ~15) - 8);
memcpy( args_data, args, len );
 
            From: Torge Matthies tmatthies@codeweavers.com
Overwatch 2 hooks KiUserCallbackDispatcher by overwriting the first five bytes with a jump, and returning to just after the jump. Make sure there is a five-byte instruction for it to replace.
Signed-off-by: Torge Matthies openglfreak@googlemail.com --- dlls/ntdll/signal_x86_64.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/dlls/ntdll/signal_x86_64.c b/dlls/ntdll/signal_x86_64.c index 31beb608965..abc28019643 100644 --- a/dlls/ntdll/signal_x86_64.c +++ b/dlls/ntdll/signal_x86_64.c @@ -680,6 +680,7 @@ void WINAPI user_callback_dispatcher( ULONG id, void *args, ULONG len ) */ #ifdef __x86_64__ __ASM_GLOBAL_FUNC( KiUserCallbackDispatcher, + ".byte 0x0f, 0x1f, 0x44, 0x00, 0x00\n\t" /* Overwatch 2 replaces the first 5 bytes with a jump */ "andq $0xFFFFFFFFFFFFFFF0, %rsp\n\t" __ASM_SEH(".seh_endprologue\n\t") "call " __ASM_NAME("user_callback_dispatcher") "\n\t"
 
            From: Torge Matthies tmatthies@codeweavers.com
In addition to in registers.
Overwatch 2 hooks KiUserCallbackDispatcher and expects to be able to use all the caller-saved registers, but also expects the callback id to be in ecx.
Signed-off-by: Torge Matthies openglfreak@googlemail.com --- dlls/ntdll/signal_x86_64.c | 3 +++ dlls/ntdll/unix/signal_x86_64.c | 10 +++++++++- 2 files changed, 12 insertions(+), 1 deletion(-)
diff --git a/dlls/ntdll/signal_x86_64.c b/dlls/ntdll/signal_x86_64.c index abc28019643..ec78cacbaf6 100644 --- a/dlls/ntdll/signal_x86_64.c +++ b/dlls/ntdll/signal_x86_64.c @@ -681,6 +681,9 @@ void WINAPI user_callback_dispatcher( ULONG id, void *args, ULONG len ) #ifdef __x86_64__ __ASM_GLOBAL_FUNC( KiUserCallbackDispatcher, ".byte 0x0f, 0x1f, 0x44, 0x00, 0x00\n\t" /* Overwatch 2 replaces the first 5 bytes with a jump */ + "movq 0x28(%rsp), %rdx\n\t" + "movl 0x30(%rsp), %ecx\n\t" + "movl 0x34(%rsp), %r8d\n\t" "andq $0xFFFFFFFFFFFFFFF0, %rsp\n\t" __ASM_SEH(".seh_endprologue\n\t") "call " __ASM_NAME("user_callback_dispatcher") "\n\t" diff --git a/dlls/ntdll/unix/signal_x86_64.c b/dlls/ntdll/unix/signal_x86_64.c index 0526db9d762..019d1ecefa2 100644 --- a/dlls/ntdll/unix/signal_x86_64.c +++ b/dlls/ntdll/unix/signal_x86_64.c @@ -2400,8 +2400,16 @@ NTSTATUS WINAPI KeUserModeCallback( ULONG id, const void *args, ULONG len, void { struct syscall_frame *frame = amd64_thread_data()->syscall_frame; void *args_data = (void *)(((frame->rsp - len) & ~15) - 8); + struct { + void *args; + ULONG id; + ULONG len; + } *params = (void *)((ULONG_PTR)args_data - 0x10);
memcpy( args_data, args, len ); + params->args = args_data; + params->id = id; + params->len = len;
callback_frame.frame.rcx = id; callback_frame.frame.rdx = (ULONG_PTR)args; @@ -2410,7 +2418,7 @@ NTSTATUS WINAPI KeUserModeCallback( ULONG id, const void *args, ULONG len, void callback_frame.frame.fs = amd64_thread_data()->fs; callback_frame.frame.gs = ds64_sel; callback_frame.frame.ss = ds64_sel; - callback_frame.frame.rsp = (ULONG_PTR)args_data - 0x28; + callback_frame.frame.rsp = (ULONG_PTR)params - 0x28; callback_frame.frame.rip = (ULONG_PTR)pKiUserCallbackDispatcher; callback_frame.frame.eflags = 0x200; callback_frame.frame.restore_flags = CONTEXT_CONTROL | CONTEXT_INTEGER;
 
            Nope, I couldn't get backtraces to work, but I removed some unnecessary stuff from the asm.
 
            Zebediah Figura (@zfigura) commented about dlls/ntdll/signal_x86_64.c:
RtlRaiseStatus( status );}
+/*******************************************************************
KiUserCallbackDispatcher (NTDLL.@)
- FIXME: not binary compatible
- */
+#ifdef __x86_64__
No need for this ifdef.
 
            Zebediah Figura (@zfigura) commented about dlls/ntdll/signal_x86_64.c:
RtlRaiseStatus( status );}
+/*******************************************************************
KiUserCallbackDispatcher (NTDLL.@)
- FIXME: not binary compatible
- */
+#ifdef __x86_64__ +__ASM_GLOBAL_FUNC( KiUserCallbackDispatcher,
"andq $0xFFFFFFFFFFFFFFF0, %rsp\n\t"
__ASM_SEH(".seh_endprologue\n\t")
I was going to comment that this isn't a valid SEH prologue, but I guess it doesn't matter, since we do the same thing in the other Ki* functions. Perhaps those deserve a comment, or perhaps it's just clear enough that they're special...
 
            On Tue Oct 25 21:11:03 2022 +0000, Zebediah Figura wrote:
I was going to comment that this isn't a valid SEH prologue, but I guess it doesn't matter, since we do the same thing in the other Ki* functions. Perhaps those deserve a comment, or perhaps it's just clear enough that they're special...
Should I just remove it? We can't unwind / backtrace through here anyway, so this SEH directive is useless.
 
            On 10/25/22 22:20, Torge Matthies (@tmatthies) wrote:
On Tue Oct 25 21:11:03 2022 +0000, Zebediah Figura wrote:
I was going to comment that this isn't a valid SEH prologue, but I guess it doesn't matter, since we do the same thing in the other Ki* functions. Perhaps those deserve a comment, or perhaps it's just clear enough that they're special...
Should I just remove it? We can't unwind / backtrace through here anyway, so this SEH directive is useless.
I don't think I can answer that without understanding why the SEH directives are there in the first place. Paul, since you wrote them, can you by chance explain?
 
            I am sure that unwind from all the Ki functions is supposed to work. I tested that for functions that I touched. Of course unwind can’t go throw kernel (Unix) part, stack and .seh are such that the unwind goes throw the PE part only.
So instead of removing I’d suggest to make (possibly an out of tree) test and make sure that unwind works. It is not about debugging or getting backtraces. An app might be processing the exception thrown from below Ki function higher in call chain.
On 25 Oct 2022, at 23:02, Zebediah Figura zfigura@codeweavers.com wrote:
On 10/25/22 22:20, Torge Matthies (@tmatthies) wrote:
On Tue Oct 25 21:11:03 2022 +0000, Zebediah Figura wrote: I was going to comment that this isn't a valid SEH prologue, but I guess it doesn't matter, since we do the same thing in the other Ki* functions. Perhaps those deserve a comment, or perhaps it's just clear enough that they're special...
Should I just remove it? We can't unwind / backtrace through here anyway, so this SEH directive is useless.
I don't think I can answer that without understanding why the SEH directives are there in the first place. Paul, since you wrote them, can you by chance explain?
 
            On Tue Oct 25 21:11:03 2022 +0000, Zebediah Figura wrote:
No need for this ifdef.
I forgot I was in x86-64 specific code :D
 
            Alexandre Julliard (@julliard) commented about dlls/ntdll/unix/signal_x86_64.c:
if (!__wine_setjmpex( &callback_frame.jmpbuf, NULL )) { struct syscall_frame *frame = amd64_thread_data()->syscall_frame;
void *args_data = (void *)((frame->rsp - len) & ~15);
void *args_data = (void *)(((frame->rsp - len) & ~15) - 8);
Mis-aligning the arguments doesn't seem like an improvement. Probably it doesn't need a return address, like KiUserExceptionDispatcher.
 
            On Wed Oct 26 14:27:01 2022 +0000, Alexandre Julliard wrote:
Mis-aligning the arguments doesn't seem like an improvement. Probably it doesn't need a return address, like KiUserExceptionDispatcher.
Changing the offset of `params` from `callback_frame.frame.rsp` actually breaks Overwatch again, so it's either subtracting 8 here or subtracting an extra 8 when calculating the address for `params`.





