It was added because of calls into the Unix side, back when there was no separation between it and PE via "syscalls". It makes no sense to add it back, especially because you can compile PE modules with a preferred stack boundary of 2 (that is, 4 bytes, as on Windows), and then force_align_arg_pointer won't do anything.
Sure, we could, and that'd be another way to solve the problem, but currently we don't. The fact is that gcc assumes the stack is 16-byte-aligned, period. That means we need to force alignment not only so that we don't break the Unix ABI, but also so that aligned types and variables actually end up aligned.
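To make the point about forced alignment concrete, here is a minimal sketch, not taken from the actual patch. The macro name FORCE_ALIGN and the function entry_with_aligned_local are hypothetical, chosen only for illustration. It assumes GCC or Clang; on non-x86 targets the attribute is compiled out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper macro: force_align_arg_pointer is an x86-only
 * GCC/Clang function attribute, so it expands to nothing elsewhere. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define FORCE_ALIGN __attribute__((force_align_arg_pointer))
#else
#define FORCE_ALIGN
#endif

/* Hypothetical entry point that may be reached from a caller which only
 * guarantees 4-byte stack alignment. The attribute asks the compiler to
 * realign the stack in the prologue if necessary, so the over-aligned
 * local below ends up where the compiler assumes it is. */
FORCE_ALIGN int entry_with_aligned_local(void)
{
    char buf[16] __attribute__((aligned(16)));
    buf[0] = 0;

    /* The compiler is free to fold this check to a constant, since it
     * already assumes the alignment holds; it is here for illustration. */
    int ok = ((uintptr_t)buf % 16) == 0;
    assert(ok);
    return ok;
}
```

Called from an ordinary C caller this returns 1. Called from a caller with a 4-byte-aligned stack, the attribute is what lets the compiler's alignment assumption hold.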
There may be motion upstream to change that and assume 4-byte alignment for the i686-w64-mingw32 target, but that will take time to propagate. We should introduce some solution before then.
What's this compiler bug about? Did you try annotating the variable declaration with DECLSPEC_ALIGN(16), not just the struct type?
That shouldn't make a difference.
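For reference, a small sketch of the two annotation styles being compared. MY_ALIGN is a hypothetical stand-in for DECLSPEC_ALIGN, assumed here to expand to the GCC aligned attribute; the struct and function names are likewise made up for illustration. It assumes GCC or Clang in C11 mode or later.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for DECLSPEC_ALIGN, assuming a GCC-style compiler. */
#define MY_ALIGN(x) __attribute__((aligned(x)))

/* Alignment requested on the struct type. */
struct MY_ALIGN(16) vec4 { float v[4]; };

/* No alignment on the type; it is requested per variable instead. */
struct plain4 { float v[4]; };

/* The type-level annotation is visible in the type's alignment. */
static_assert(_Alignof(struct vec4) == 16, "type annotation sets alignment");

int type_annotation_alignment(void)
{
    return (int)_Alignof(struct vec4);
}

int both_forms_aligned(void)
{
    struct vec4 a = {{0}};                 /* alignment comes from the type */
    struct plain4 b MY_ALIGN(16) = {{0}};  /* alignment comes from the variable */

    /* The compiler may fold these checks to constants, since it assumes
     * both requests are honored. */
    return ((uintptr_t)&a % 16) == 0 && ((uintptr_t)&b % 16) == 0;
}
```

Either form asks the compiler for the same 16-byte alignment of the object. Whether the stack slot really ends up aligned still depends on the stack alignment the compiler assumes on entry.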