On Sun, Oct 21, 2012 at 10:56:51AM -0600, James Eder wrote:
static inline void save_fpux( CONTEXT *context )
{
#ifdef __GNUC__
    /* we have to enforce alignment by hand */
    char buffer[sizeof(XMM_SAVE_AREA32) + 16];
    XMM_SAVE_AREA32 *state = (XMM_SAVE_AREA32 *)(((ULONG_PTR)buffer + 15) & ~15);

    __asm__ __volatile__( "fxsave %0" : "=m" (*state) );
    context->ContextFlags |= CONTEXT_EXTENDED_REGISTERS;
    memcpy( context->ExtendedRegisters, state, sizeof(*state) );
#endif
}
That is an on-stack buffer, so all bets are off!
Traditionally the SYSV ABI (used by everyone except Microsoft) only required the stack to be 4-byte aligned. At some point the gcc bods unilaterally decided to maintain 16-byte alignment (so the SSE2 registers can be saved on the stack).
The Linux kernel aligns the stack when main (strictly _start) is called, but anything not compiled by gcc (or compiled with other options) can lose that alignment.
If you request 32 byte alignment for the save area, (recent enough) gcc will align the stack before allocating the locals.
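As a rough sketch of what requesting the alignment could look like (not the actual Wine code): the struct below is a hypothetical stand-in for XMM_SAVE_AREA32, sized to the 512 bytes that fxsave writes, and save_area_misalignment() is a made-up helper name. It assumes gcc (or a compatible compiler) that honours __attribute__((aligned)) on automatic variables.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for XMM_SAVE_AREA32: fxsave writes a 512-byte image. */
struct fxsave_area {
    unsigned char bytes[512];
};

/* Returns how far the over-aligned local is from a 32-byte boundary
 * (0 means the compiler aligned it as requested). */
static unsigned int save_area_misalignment(void)
{
    /* Ask the compiler for the alignment instead of rounding a char
     * buffer by hand; a recent enough gcc realigns the stack for this. */
    struct fxsave_area state __attribute__((aligned(32)));

    /* Go through a volatile so the check uses the real run-time address
     * rather than being folded away from the declared alignment. */
    volatile uintptr_t addr = (uintptr_t)&state;

    state.bytes[0] = 0;   /* touch the buffer so it is really allocated */
    return (unsigned int)(addr & 31);
}
```

Calling assert(save_area_misalignment() == 0) from a test program should pass with such a compiler; with an older gcc the request may be silently ignored, in which case the by-hand rounding in the quoted code is still needed.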
David