https://bugs.winehq.org/show_bug.cgi?id=53682
--- Comment #3 from Kevin Puetz PuetzKevinA@JohnDeere.com ---
Well, no, that patch won't work, since NtCallbackReturn assumes the arm64_thread_data()->syscall_frame pointer is actually a user_callback_frame *, and casts it accordingly to get the jmpbuf. call_user_mode_callback didn't copy (or even allocate) that, so it just jumps off into space.
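To make the cast problem concrete, here is a minimal standalone sketch in portable C. It is not Wine's code: the struct layouts, `current_frame`, `callback_return`, `run_callback` and `user_callback` are all hypothetical stand-ins, and plain `setjmp`/`longjmp` stand in for `__wine_setjmpex`/`__wine_longjmp`. It only shows why the return path works when the frame pointer really points into a larger callback frame that holds an initialized jmpbuf.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical stand-ins, not Wine's real definitions. */
struct syscall_frame
{
    unsigned long regs[4];
};

struct user_callback_frame
{
    struct syscall_frame frame;   /* first member, so the cast below is valid */
    jmp_buf              jmpbuf;  /* where the callback return jumps back to */
    volatile int         status;  /* written between setjmp and longjmp */
};

/* Stand-in for arm64_thread_data()->syscall_frame. */
static struct syscall_frame *current_frame;

/* Stand-in for NtCallbackReturn: it trusts that current_frame is embedded
 * in a user_callback_frame and casts accordingly to reach the jmpbuf. */
static void callback_return( int status )
{
    struct user_callback_frame *cb = (struct user_callback_frame *)current_frame;

    assert( cb != NULL );
    cb->status = status;
    longjmp( cb->jmpbuf, 1 );
}

/* Stand-in for the user-mode callback body. */
static void user_callback( void )
{
    callback_return( 42 );
}

/* Stand-in for the kernel-side entry: allocates the full callback frame,
 * saves the return point, then publishes the embedded syscall_frame. */
static int run_callback( void (*func)(void) )
{
    struct user_callback_frame cb;
    struct syscall_frame *prev = current_frame;

    cb.status = 0;
    if (!setjmp( cb.jmpbuf ))
    {
        current_frame = &cb.frame;
        func();  /* does not return normally; comes back through longjmp */
    }
    current_frame = prev;
    return cb.status;
}
```

If `current_frame` instead pointed at a bare `struct syscall_frame` with no enclosing `user_callback_frame`, the cast in `callback_return` would read a jmpbuf that was never allocated or filled in, which is the "jumps off into space" failure described above.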
But yeah, you're on the same track I've been on for the shape of a possible solution. And this could be made to work; it just has to copy even more.
The other idea I've been trying to think of a way to actually accomplish is to make KeUserModeCallback->__wine_setjmpex into a tail call, i.e. we'd get the jmpbuf to have the return bp/lr values for the *caller* of KeUserModeCallback. Then NtCallbackReturn could just pass status to __wine_longjmp (to make that the return value), and we never return into KeUserModeCallback, so trashing its stack (below the syscall_frame) is fine.
The problem with that is that we need __wine_jmpbuf to save the callee-saved register values from the very beginning of KeUserModeCallback, not somewhere in the middle of it. There are plenty of scratch registers on aarch64 to pull that off; it's just hard to get the code inserted in the right place.