Based on [a patch](https://www.winehq.org/mailman3/hyperkitty/list/wine-devel(a)winehq.org/mess...) by Jinoh Kang (@iamahuman) from February 2022. I removed the need for the event object and implemented fast paths for Linux.

On macOS 10.14+, `thread_get_register_pointer_values` is called on every thread of the process. On Linux 4.14+, `membarrier(MEMBARRIER_CMD_GLOBAL_EXPEDITED, ...)` is used. On x86 Linux <= 4.13 and on other platforms, `madvise(..., MADV_DONTNEED)` is used, which sends IPIs to all cores, causing them to execute a memory barrier.

--
v11: ntdll: Add thread_get_register_pointer_values-based implementation of NtFlushProcessWriteBuffers.
     ntdll: Add sys_membarrier-based implementation of NtFlushProcessWriteBuffers.
     ntdll: Add MADV_DONTNEED-based implementation of NtFlushProcessWriteBuffers.

https://gitlab.winehq.org/wine/wine/-/merge_requests/741
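For readers who have not seen the patches, the membarrier and `madvise` paths described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the merge request: the function names and the `SKETCH_` constants are hypothetical, the registration step is my reading of how the global expedited command is meant to be used, the scratch page is mapped on every call for simplicity, and the macOS `thread_get_register_pointer_values` path is left out entirely.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>
#ifdef __linux__
#include <sys/syscall.h>
#endif

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

/* Command values as listed in the Linux UAPI header <linux/membarrier.h>,
 * spelled out so the sketch also builds against older kernel headers.
 * The SKETCH_ names are hypothetical. */
#define SKETCH_MEMBARRIER_CMD_GLOBAL_EXPEDITED          (1 << 1)
#define SKETCH_MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED (1 << 2)

/* Hypothetical helper: returns 0 if membarrier() issued the barrier,
 * -1 if the syscall is unavailable or rejected the command. */
static int try_membarrier_flush(void)
{
#if defined(__linux__) && defined(__NR_membarrier)
    /* Assumption: the process registers itself once so that its own threads
     * are targeted by the global expedited command. This flag is not
     * thread-safe; registering twice is harmless, but a real implementation
     * would still guard it. */
    static int registered;

    if (!registered)
    {
        if (syscall(__NR_membarrier, SKETCH_MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED, 0, 0) != 0)
            return -1;
        registered = 1;
    }
    if (syscall(__NR_membarrier, SKETCH_MEMBARRIER_CMD_GLOBAL_EXPEDITED, 0, 0) != 0)
        return -1;
    return 0;
#else
    return -1;
#endif
}

/* Hypothetical fallback: touch a scratch page so it is resident, then
 * discard it with MADV_DONTNEED. The description relies on the kernel
 * interrupting the other cores while it drops the mapping. */
static int madvise_flush(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    void *page;
    int ret;

    if (page_size <= 0) return -1;
    page = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) return -1;

    *(volatile char *)page = 1;  /* make sure the page is actually mapped */
    ret = madvise(page, (size_t)page_size, MADV_DONTNEED);
    munmap(page, (size_t)page_size);
    return ret == 0 ? 0 : -1;
}

/* Try the membarrier path first and fall back to madvise otherwise.
 * Returns 0 if either path succeeded. */
int flush_process_write_buffers_sketch(void)
{
    if (try_membarrier_flush() == 0) return 0;
    return madvise_flush();
}
```

A caller would invoke `flush_process_write_buffers_sketch()` wherever it needs the other threads of the process to have executed a barrier. Mapping a fresh page per call avoids any locking in the sketch; an implementation that cares about cost would more likely keep one page around and serialize access to it.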