Just to let you guys know, there's a bug in gcc 5.3 and 5.4 that causes Wine's 64-bit code to be generated as if the incoming stack boundary were 8 bytes. This bloats the prologues/epilogues with unaligned movups for the SSE register saves/restores. It can be avoided by *not* using force_align_arg_pointer or -mstackrealign. Note, however, that when using either of those, we should supply -mincoming-stack-boundary=4 (i.e. 2^4 = 16 bytes) so that gcc knows what alignment to force the arg pointer to... except that that is also expertly broken.
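For anyone who hasn't touched these knobs, here is a rough sketch of what is involved. It is not taken from the Wine tree: FORCE_ALIGN, local_is_16_aligned() and stack_boundary_bytes() are names made up for illustration. It only shows where the attribute goes and that the -mincoming-stack-boundary argument is a power-of-two exponent; it does not reproduce the bug itself, which you'd have to spot by reading the generated prologue/epilogue.

```c
#include <stdint.h>

/* The attribute only exists on x86 targets; define it away elsewhere.
 * FORCE_ALIGN is a hypothetical macro name, not something Wine defines. */
#if defined(__i386__) || defined(__x86_64__)
#define FORCE_ALIGN __attribute__((force_align_arg_pointer))
#else
#define FORCE_ALIGN
#endif

/* Hypothetical helper: the compiler must realign the stack (if needed) so
 * that a local with 16-byte alignment really ends up 16-byte aligned.
 * Returns 1 when the local is aligned, 0 otherwise. */
FORCE_ALIGN int local_is_16_aligned(void)
{
    volatile char buf[32] __attribute__((aligned(16)));
    buf[0] = 0; /* touch the buffer so it is not optimized away */
    return ((uintptr_t)buf % 16) == 0;
}

/* -mincoming-stack-boundary=N tells gcc to assume the incoming stack is
 * aligned to 2^N bytes, so N=4 means 16 bytes and N=3 means 8 bytes. */
unsigned stack_boundary_bytes(unsigned n)
{
    return 1u << n;
}
```

Building something like this with and without the attribute and diffing the assembly (gcc -S) should be the quickest way to see whether a given compiler emits the extra realignment code.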
I'm not aware of this actually causing any crashes or UB; it just produces slower, bloated code. There was also PR69140, which caused build breakage when the stack pointer wasn't properly aligned, but I guess that is a flaw that got exposed once force_align_arg_pointer started breaking alignment on the x64 ABI.
Daniel