GCC, clang, and other compilers do not actually provide acquire/release semantics on volatile memory accesses: volatile only keeps the compiler from eliding or reordering volatile accesses relative to each other, and imposes no ordering at all on the surrounding non-volatile accesses. This is also true of MSVC with the /volatile:iso flag (use strict ISO C semantics for volatile accesses).
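For contrast, a store with real release semantics would go through the C11 atomics interface. This sketch (not part of the patch) uses atomic_store_explicit from <stdatomic.h>, which does order the preceding plain store:

#include <stdatomic.h>

void publish(int *data, atomic_int *flag)
{
    *data = 42; /* plain store */
    /* release store: all earlier writes are ordered before the flag update */
    atomic_store_explicit(flag, 1, memory_order_release);
}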
Consider the following test program:
void func(int *foo, volatile int *bar)
{
    *foo = 1;
    *bar = 2;
    /* NOTE: *not* immune to reordering! */
    *foo = 3;
}
After store reordering and dead store elimination, the function above compiles into the following x86-64 assembly on GCC 11.2 (gcc -O2):
        movl    $2, (%rsi)
        movl    $3, (%rdi)
        ret
Note that the first write to "*foo" has been elided: the compiler decided that it is safe to reorder writes to "*foo" around writes to the volatile variable "*bar", so it simply merged the first "*foo" write into the second one.
Fix this by explicitly placing a compiler memory barrier before the atomic store. As a workaround for GCC bug #81316, we do this even on the other architectures, where we explicitly use the atomic memory access builtin.
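To illustrate (this snippet is not part of the patch), adding such a barrier to the test program above keeps the first "*foo" store from being merged away, since the compiler must assume the barrier reads and writes all of memory:

void func(int *foo, volatile int *bar)
{
    *foo = 1;
    /* compiler barrier: no memory access may be moved across this point */
    __asm__ __volatile__("" ::: "memory");
    *bar = 2;
    *foo = 3;
}

With the barrier in place all three stores are emitted in program order, and on x86 the hardware's total store order then guarantees that other CPUs observe them in that order.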
Signed-off-by: Jinoh Kang <jinoh.kang.kr@gmail.com>
---
 server/fd.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/server/fd.c b/server/fd.c
index 1b4b98b0e76..5865db89c33 100644
--- a/server/fd.c
+++ b/server/fd.c
@@ -381,10 +381,15 @@ static const int user_shared_data_timeout = 16;
 static void atomic_store_ulong(volatile ULONG *ptr, ULONG value)
 {
-    /* on x86 there should be total store order guarantees, so volatile is
-     * enough to ensure the stores aren't reordered by the compiler, and then
-     * they will always be seen in-order from other CPUs. On other archs, we
-     * need atomic intrinsics to guarantee that. */
+    /* on x86 there should be total store order guarantees, so a compiler
+     * barrier is enough to ensure the stores aren't reordered by the compiler;
+     * then, they will always be seen in-order from other CPUs. On other archs,
+     * we need atomic intrinsics to guarantee that.
+     *
+     * even when using atomic intrinsics, we explicitly place a memory barrier
+     * to work around GCC bug #81316 (affects all GCC versions prior to 7.x).
+     */
+    __asm__ __volatile__("" ::: "memory");
 #if defined(__i386__) || defined(__x86_64__)
     *ptr = value;
 #else
@@ -394,6 +399,10 @@ static void atomic_store_ulong(volatile ULONG *ptr, ULONG value)
 static void atomic_store_long(volatile LONG *ptr, LONG value)
 {
+    /* even when using atomic intrinsics, we explicitly place a memory barrier
+     * to work around GCC bug #81316 (affects all GCC versions prior to 7.x).
+     */
+    __asm__ __volatile__("" ::: "memory");
 #if defined(__i386__) || defined(__x86_64__)
     *ptr = value;
 #else
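For readers outside the tree, here is a self-contained sketch of how the patched ULONG helper ends up looking. Two assumptions, since the hunks above are truncated: ULONG is spelled as unsigned long, and the non-x86 branch uses the __atomic_store_n builtin (the "atomic memory access builtin" mentioned in the message):

static void atomic_store_ulong(volatile unsigned long *ptr, unsigned long value)
{
    /* compiler barrier: keeps earlier stores from being reordered or merged
     * across this point (also the GCC bug #81316 workaround) */
    __asm__ __volatile__("" ::: "memory");
#if defined(__i386__) || defined(__x86_64__)
    *ptr = value; /* x86 TSO: a plain store is observed in order by other CPUs */
#else
    /* assumed: the atomic memory access builtin referred to above */
    __atomic_store_n(ptr, value, __ATOMIC_SEQ_CST);
#endif
}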