Thanks for the input. I agree that separating these would probably make it easier.
more idiomatic way to do this would probably be making an asm wrapper (like we already have for call_seh_handler, for different reasons)
I did initially start off like this in my testing, but isn't it cleaner and more readable to just use a single-line instruction for the compiler?
Even if it affects movaps generation, isn't its purpose to disable automatic loop vectorization rather than to prevent SSE instructions?
Yes, but it does also affect movaps generation for some reason, and there are no loops in OpenThread, which is why I suggested it.
looks most straightforward; I guess it could just use removal of the spuriously added newline and better formatting of the added condition to match the surrounding style.
Copy that, I will split up the two ntdll changes into separate branches and make MRs for each individually. I'll check and reconsider the OpenThread change for now. -- https://gitlab.winehq.org/wine/wine/-/merge_requests/11069#note_142232