Regarding "[PATCH 08/13] ntdll: Allocate a truly separate stack for the kernel stack.":
Why do we need to do this? Specifically, do we know why this is broken in Valgrind?
I ask because I would guess a priori that this should be fixed on the Valgrind side. Valgrind should have all of the information regarding our stacks, so I wouldn't expect it to need to resort to heuristics.