On 12/1/21 19:15, Jinoh Kang wrote:
+static void test_syscall_clobbered_regs(void) +{
- struct regs
- {
UINT64 rcx;
- };
- static const BYTE code[] =
- {
0x48, 0x8d, 0x05, 0x00, 0x10, 0x00, 0x00,
/* leaq 0x1000(%rip),%rax */
0x48, 0x25, 0x00, 0xf0, 0xff, 0xff,
/* andq $~0xfff,%rax */
code_mem is already page-aligned, so &~0xfff should be unnecessary. Even if this wasn't case, it might lead to the stack overlapping the code if e.g. (rax & 0xfff) == 0xfff.
But I am adding 0x1000 to rip beforehand, how would it overlap the code?
0x48, 0x83, 0xe8, 0x08, /* subq $8,%rax */
0x48, 0x89, 0x20, /* movq %rsp,0(%rax) */
0x48, 0x89, 0xc4, /* movq %rax,%rsp */
I think "push %rsp" would make it clear that we're saving RSP.
But it will push that on the current stack while I want to save it to the new temporary stack, before clobbering rsp. I could first push the rsp on the initial stack and then copy but it is not obvious to me that these extra operations are an improvement.
All in all, my suggestion goes:
0x48, 0x8d, 0x25, 0xf9, 0x0f, 0x00, 0x00, /* leaq 0xff9(%rip),%rsp */ 0x54, /* push %rsp */
0x41, 0x50, /* push %r8 */
0x53, 0x55, 0x57, 0x56, 0x41, 0x54, 0x41, 0x55, 0x41, 0x56, 0x41, 0x57,
/* push %rbx, %rbp, %rdi, %rsi, %r12, %r13, %r14, %r15 */
0x41, 0xff, 0xd1, /* callq *r9 */
0x41, 0x5f, 0x41, 0x5e, 0x41, 0x5d, 0x41, 0x5c, 0x5e, 0x5f, 0x5d, 0x5b,
/* pop %r15, %r14, %r13, %r12, %rsi, %rdi, %rbp, %rbx */
I think we should reload RSP beforehand, since we're assuming the stack pointer isn't reliable at all after the system call.
The stack pointer is reliable after the system call, it is always (((code_mem + 0x1000) ~0xfff) - saved_regs_size). This is not a universal procedure, only a part of specific test case and frankly I don't find the motivation to make it universal at the cost of complication. The reason I introduced this at all in the second patch version is that previously the test implicitly relied on the compiler calling the asm block with the same %rsp. Now it doesn't, the %rsp is set to the same known value before NtGetContextThread().
"testl $0x3,%edx\n\t" /* CONTEXT_CONTROL | CONTEXT_INTEGER */ "jnz 1f\n\t" "movq 0x88(%rcx),%rsp\n\t"
"jmpq *0x70(%rcx)\n" /* frame->rip */
"movq 0x70(%rcx),%rcx\n\t" /* frame->rip */
"jmpq *%rcx\n\t" "1:\tleaq 0x70(%rcx),%rsp\n\t" "testl $0x2,%edx\n\t" /* CONTEXT_INTEGER */
"jz 1f\n\t"
"movq 0x00(%rcx),%rax\n\t"
"jnz 1f\n\t"
"movq (%rsp),%rcx\n\t" /* frame->rip */
CONTEXT_CONTROL means we should restore the CS segment register as well. Since SYSRET cannot restore CS, Windows would have to use plain IRETQ instead of SYSRET. In this case I suspect there should be no reason to clobber RCX at all.
Note that this claim is unconfirmed; I might need some testing.
Does my patch change anything in this regard? I hope it doesn't, if it does that is an oversight. If there is something looking wrong with how it works now fixing that should not go in the same patch. But I think we currently restore CS (both before and after my patch) by using iretq in case of CONTEXT_CONTROL, isn't that the case?
CONTEXT_CONTROL