Currently, NtContinue() on ARM64 does not restore X16 and X17 from the context.
This is because X16 and X17 are used as scratch registers for restoring SP and PC respectively in __wine_syscall_dispatcher. Scratch registers are required because ARMv8 does not have an unprivileged (EL0) instruction that loads SP and PC from memory or non-GPR architectural state.
Fix this by making ARM64 __wine_syscall_dispatcher perform a full context restore via raise(SIGUSR2) when NtContinue() is used.
Since raising a signal is quite expensive, it should be done only when necessary. To achieve this, split the return path of the ARM64 syscall dispatcher into a fast path (that does not involve signals) and a slow path (that involves signals):
- If CONTEXT_INTEGER is not set, the dispatcher takes the fast path: the X16 and X17 registers are clobbered as usual.
- If X16 == SP and X17 == PC, the dispatcher also takes the fast path: it can safely use X16 and X17 without corrupting the register values, since those two registers already have the desired values.
  This fast path is used in call_user_apc_dispatcher(), call_user_exception_dispatcher(), and call_init_thunk().
- Otherwise, the dispatcher takes the slow path: it raises SIGUSR2 and performs a full context restore in the signal handler.
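
The path selection above can be sketched in C as follows. This is only an illustration, not the dispatcher code itself: struct sketch_frame, SKETCH_CONTEXT_INTEGER and needs_slow_path() are hypothetical names, the flag value is a placeholder, and the check assumes the scratch-register roles described earlier (X16 for SP, X17 for PC).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the "restore integer registers" flag;
 * the value is a placeholder, not the real CONTEXT_INTEGER. */
#define SKETCH_CONTEXT_INTEGER 0x2u

/* Hypothetical, simplified frame; not Wine's struct syscall_frame. */
struct sketch_frame
{
    uint64_t x[29];          /* X0-X28 */
    uint64_t sp;             /* stack pointer to restore */
    uint64_t pc;             /* program counter to restore */
    uint32_t restore_flags;  /* which register groups must be restored */
};

/* Returns true when the return must take the signal-based slow path. */
static bool needs_slow_path( const struct sketch_frame *frame )
{
    /* Integer registers were not requested: X16/X17 may be clobbered. */
    if (!(frame->restore_flags & SKETCH_CONTEXT_INTEGER)) return false;

    /* X16/X17 already equal the SP/PC values they will be loaded with,
     * so using them as scratch registers leaves them correct. */
    if (frame->x[16] == frame->sp && frame->x[17] == frame->pc) return false;

    /* Anything else needs the full context restore. */
    return true;
}
```

With this shape, the expensive signal is only raised for a context that asks for integer registers and carries X16/X17 values different from the target SP/PC.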
Fixes: 88e336214db94318b6657d641919fcce6be4a328