Jinoh Kang (@iamahuman) commented about server/queue.c:
+#define SHARED_WRITE_BEGIN( x ) \
+    do { \
+        assert( (*(x) & SEQUENCE_MASK) != SEQUENCE_MASK ); \
+        if ((__atomic_add_fetch( x, 1, __ATOMIC_RELAXED ) & SEQUENCE_MASK) == 1) \
+            __atomic_thread_fence( __ATOMIC_RELEASE ); \
+    } while(0)
+#define SHARED_WRITE_END( x ) \
+    do { \
+        assert( (*(x) & SEQUENCE_MASK) != 0 ); \
+        if ((*(x) & SEQUENCE_MASK) > 1) \
+            __atomic_sub_fetch( x, 1, __ATOMIC_RELAXED ); \
+        else { \
+            __atomic_thread_fence( __ATOMIC_RELEASE ); \
+            __atomic_add_fetch( x, SEQUENCE_MASK, __ATOMIC_RELAXED ); \
Can we just coalesce these two calls (the release fence and the relaxed add) into a single release RMW? That leaves the compiler better optimization opportunities (e.g., use of STLR with RCpc semantics on ARM64, instead of `DMB ISH`).
```suggestion:-1+0
            __atomic_add_fetch( x, SEQUENCE_MASK, __ATOMIC_RELEASE ); \
```
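
For illustration, here is a minimal standalone sketch of the two forms side by side. The `SEQUENCE_MASK` value of `0xff` and the helper names `write_end_fenced` / `write_end_coalesced` are assumptions made for this example only; the real definition is not part of the quoted hunk. Both helpers leave the same value in the counter and differ only in how the release ordering is expressed.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the low 8 bits count nested writers. The real SEQUENCE_MASK
 * is defined elsewhere and may differ. */
#define SEQUENCE_MASK 0xffu

/* Current form (hypothetical helper): a standalone release fence followed
 * by a relaxed read-modify-write. */
static uint32_t write_end_fenced( uint32_t *x )
{
    assert( (*x & SEQUENCE_MASK) == 1 );
    __atomic_thread_fence( __ATOMIC_RELEASE );
    return __atomic_add_fetch( x, SEQUENCE_MASK, __ATOMIC_RELAXED );
}

/* Suggested form (hypothetical helper): one read-modify-write that carries
 * the release ordering itself. */
static uint32_t write_end_coalesced( uint32_t *x )
{
    assert( (*x & SEQUENCE_MASK) == 1 );
    return __atomic_add_fetch( x, SEQUENCE_MASK, __ATOMIC_RELEASE );
}
```

With the assumed mask, a counter of `1` becomes `0x100` in both cases: the nesting count drops to zero and the carry bumps the sequence number. A release fence is formally somewhat stronger than a release operation, since it orders earlier writes before every later store rather than only before this one. As far as I can tell that extra strength is not needed here, provided readers pair the final increment with an acquire load of the same counter.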