Re: [PATCH v3 0/5] MR741: Draft: ntdll: Implement NtFlushProcessWriteBuffers.

7 Sep 2022


On Sun Sep  4 13:46:59 2022 +0000, Jinoh Kang wrote:
...
...
I don't think I can/should do that, since then if thread 2 issues a
memory barrier while thread 1 is already waiting on a memory barrier,
then thread 1 will also wait for the second memory barrier to complete
instead of just its own.
We can serialize the barrier calls with a mutex here, too.  It will avoid
excessive APCs in case multiple threads call NtFlushProcessWriteBuffers
(e.g. RCU with concurrent writers, GC in multiple arenas/isolates).  If
we don't serialize the barrier or coalesce APCs, the total number of
simultaneous APCs will be `N * M`, where N = number of threads in the
process, and M = number of concurrent calls to NtFlushProcessWriteBuffers.
Since not all applications use membarrier in the first place, we can
also avoid extra object allocation for threads that will never end up
using the global memory barrier.
(Yet another approach to solve this problem would be keeping track of
generations.  It would let us coalesce APCs, but this sounds like overkill.)
In general, we want to minimize the complexity and overhead of the
fallback path since its use will not be very common: newest operating
systems will just use mprotect/membarrier/mach calls, and the fallback
is only used when all else fails.
I protected the APC path with a mutex and made the memory barrier object a global object that is only created once. This means that the `wake_up(...)` calls might do a little unnecessary work if multiple processes issue a memory barrier at the same time, but I don't think that matters much, and in exchange we don't have to create one object per process (or thread).
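
For readers following along, the shape of that change can be sketched roughly as below. This is only an illustration in plain pthreads, not the code from the merge request: `struct membarrier_object`, `run_apc_round()` and `flush_process_write_buffers_fallback()` are made-up names, and the sketch lives in a single process, whereas the real object sits on the server side.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the shared barrier object (not Wine's type). */
struct membarrier_object
{
    int rounds; /* number of APC rounds run so far, for illustration only */
};

static pthread_once_t barrier_once = PTHREAD_ONCE_INIT;
static pthread_mutex_t barrier_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct membarrier_object *barrier_obj;

/* Runs at most once, on the first call that needs the fallback path. */
static void create_barrier_object(void)
{
    barrier_obj = calloc(1, sizeof(*barrier_obj));
}

/* Hypothetical placeholder: a real version would queue one APC per
 * thread and wait until every one of them has run. */
static void run_apc_round(struct membarrier_object *obj)
{
    assert(obj != NULL);
    obj->rounds++;
}

/* Returns 0 on success, -1 if the global object could not be created. */
int flush_process_write_buffers_fallback(void)
{
    pthread_once(&barrier_once, create_barrier_object);
    if (!barrier_obj) return -1;

    /* Serialize callers: at most N APCs are in flight instead of N * M. */
    pthread_mutex_lock(&barrier_mutex);
    run_apc_round(barrier_obj);
    pthread_mutex_unlock(&barrier_mutex);
    return 0;
}
```

Because the object is created lazily, processes that never reach the fallback path pay nothing for it. The cost is that concurrent callers queue up behind the mutex instead of sharing a round.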
I have thought up a way to coalesce the APCs too, but it's more complex and doesn't easily allow reporting errors back to the originating thread. I'm not sure whether I should implement it.
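
To make the generation idea mentioned earlier in the thread concrete, here is one possible sketch, again in plain pthreads with made-up names (`struct coalescer`, `queue_apcs_and_wait()`, `coalesced_barrier()`). It is not necessarily the scheme I have in mind for the patch, just the usual "leader runs the round, followers wait for a round that started after they arrived" pattern.

```c
#include <assert.h>
#include <pthread.h>

struct coalescer
{
    pthread_mutex_t lock;
    pthread_cond_t cond;
    unsigned long started;    /* generation of the newest round begun */
    unsigned long completed;  /* generation of the newest round finished */
    unsigned long rounds_run; /* statistics only */
};

static struct coalescer barrier_gen =
{
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0, 0
};

/* Hypothetical placeholder for "queue one APC per thread and wait". */
static void queue_apcs_and_wait(void)
{
}

void coalesced_barrier(struct coalescer *c)
{
    unsigned long target;

    pthread_mutex_lock(&c->lock);
    /* A round already in flight may have begun before this call, so
     * only a round that starts from now on satisfies this caller. */
    target = c->started + 1;
    while (c->completed < target)
    {
        if (c->started == c->completed)
        {
            /* No round is running: this caller becomes the leader. */
            unsigned long mine = ++c->started;

            pthread_mutex_unlock(&c->lock);
            queue_apcs_and_wait();
            pthread_mutex_lock(&c->lock);

            assert(c->completed < mine);
            c->completed = mine;
            c->rounds_run++;
            pthread_cond_broadcast(&c->cond);
        }
        else
        {
            /* Another caller is running a round: wait for it to end,
             * then re-check whether it was new enough for us. */
            pthread_cond_wait(&c->cond, &c->lock);
        }
    }
    pthread_mutex_unlock(&c->lock);
}
```

All callers that arrive while one round is in flight share the next round, so the number of rounds no longer grows with the number of concurrent callers. The sketch also shows the error-reporting problem: a follower never runs `queue_apcs_and_wait()` itself, so any failure would have to be recorded per generation for it to see.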
-- 
https://gitlab.winehq.org/wine/wine/-/merge_requests/741#note_7885
