Jinoh Kang (@iamahuman) commented about server/queue.c:
+
+#define SHARED_WRITE_BEGIN( x ) \
+    do { \
+        assert( (*(x) & SEQUENCE_MASK) != SEQUENCE_MASK ); \
+        if ((__atomic_add_fetch( x, 1, __ATOMIC_RELAXED ) & SEQUENCE_MASK) == 1) \
+            __atomic_thread_fence( __ATOMIC_RELEASE ); \
+    } while(0)
+
+#define SHARED_WRITE_END( x ) \
+    do { \
+        assert( (*(x) & SEQUENCE_MASK) != 0 ); \
+        if ((*(x) & SEQUENCE_MASK) > 1) \
+            __atomic_sub_fetch( x, 1, __ATOMIC_RELAXED ); \
+        else { \
+            __atomic_thread_fence( __ATOMIC_RELEASE ); \
+            __atomic_add_fetch( x, SEQUENCE_MASK, __ATOMIC_RELAXED ); \

Can we just coalesce these two calls into one for better optimization opportunities (e.g., use of STLR with RCpc semantics on ARM64, instead of `DMB ISH`)?
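For illustration, here is a minimal standalone sketch of the two `SHARED_WRITE_END` variants side by side. The 4-bit `SEQUENCE_MASK`, the `unsigned int` counter type, and the function names are assumptions made for this sketch; the quoted hunk does not show the real definitions. It only demonstrates that both variants leave the same counter value; any difference in generated code would have to be checked per target.

```c
#include <assert.h>

/* Assumption: the low 4 bits count nested writers. The real
 * SEQUENCE_MASK definition is not part of the quoted hunk. */
#define SEQUENCE_MASK_BITS 4
#define SEQUENCE_MASK ((1u << SEQUENCE_MASK_BITS) - 1)

/* Mirrors SHARED_WRITE_BEGIN from the hunk (hypothetical function form). */
static inline void write_begin( unsigned int *x )
{
    assert( (*x & SEQUENCE_MASK) != SEQUENCE_MASK );
    if ((__atomic_add_fetch( x, 1, __ATOMIC_RELAXED ) & SEQUENCE_MASK) == 1)
        __atomic_thread_fence( __ATOMIC_RELEASE );
}

/* Variant in the hunk: standalone release fence, then a relaxed add. */
static inline void write_end_fence( unsigned int *x )
{
    assert( (*x & SEQUENCE_MASK) != 0 );
    if ((*x & SEQUENCE_MASK) > 1)
        __atomic_sub_fetch( x, 1, __ATOMIC_RELAXED );
    else
    {
        __atomic_thread_fence( __ATOMIC_RELEASE );
        __atomic_add_fetch( x, SEQUENCE_MASK, __ATOMIC_RELAXED );
    }
}

/* Coalesced variant: the release ordering is attached to the add itself. */
static inline void write_end_release( unsigned int *x )
{
    assert( (*x & SEQUENCE_MASK) != 0 );
    if ((*x & SEQUENCE_MASK) > 1)
        __atomic_sub_fetch( x, 1, __ATOMIC_RELAXED );
    else
        __atomic_add_fetch( x, SEQUENCE_MASK, __ATOMIC_RELEASE );
}
```

In both variants, adding `SEQUENCE_MASK` to a counter whose low bits equal 1 clears the writer count and carries into the sequence number, so the outermost end of a write section advances the sequence by one step.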
```suggestion:-1+0
            __atomic_add_fetch( x, SEQUENCE_MASK, __ATOMIC_RELEASE ); \
```

-- 
https://gitlab.winehq.org/wine/wine/-/merge_requests/3103#note_36122