On Mon May 4 11:41:50 2026 +0000, Matteo Bruni wrote:
Otherwise, using compiler intrinsics could be an option. I guess it would leave some performance on the table by handing register allocation to the compiler, but maybe not that much?

I made some measurements, and the performance difference turns out to be very small (less than 1%). I'll try to come up with a new version, although I have no idea how to integrate this into our build system. The SSE code would have to live in a separate .c file that is compiled with `-msse`, but only on x86.

IIRC !9588 took care of that via a couple of `#ifdef`s, essentially only building the SSE version when `-msse` is included or implied in the CFLAGS (e.g. because of `-march=nocona`). If I understand correctly, doing it like this would mean SSE is disabled by default on 32-bit x86, and Wine would have to be recompiled with different `i386_CFLAGS` to enable it.
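For illustration, the `#ifdef` approach could look roughly like the sketch below. It is not the code from !9588 or from this merge request. `scale_floats` is a hypothetical helper made up for the example. The sketch relies on GCC and Clang predefining `__SSE__` whenever `-msse` is in effect, whether passed explicitly or implied by `-march` or by targeting x86-64.

```c
#include <stddef.h>

#if defined(__SSE__)
#include <xmmintrin.h> /* SSE intrinsics, only usable when SSE is enabled */
#endif

/* Hypothetical helper, not taken from Wine: multiply n floats in place. */
static void scale_floats(float *data, size_t n, float factor)
{
    size_t i = 0;

#if defined(__SSE__)
    /* Vector path: four floats per iteration, unaligned loads and stores.
     * Register allocation is left to the compiler. */
    const __m128 f = _mm_set1_ps(factor);

    for (; i + 4 <= n; i += 4)
    {
        __m128 v = _mm_loadu_ps(data + i);
        _mm_storeu_ps(data + i, _mm_mul_ps(v, f));
    }
#endif

    /* Scalar path: handles the tail, or everything when SSE is disabled. */
    for (; i < n; i++)
        data[i] *= factor;
}
```

With this layout, a default 32-bit x86 build without `-msse` silently compiles only the scalar loop, which is the downside mentioned above.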
-- https://gitlab.winehq.org/wine/wine/-/merge_requests/10716#note_138852