[PATCH 0/1] MR7281: include: Prefer builtin atomics on x86 unless known broken.

List overview All Threads

newer

older

[PATCH v3 0/7] MR7432: mp3dmod:...

[PATCH v3 0/2] MR7119: ntdll,...

William Horvath (＠whrvt)

6 Feb 2025 6 Feb '25

5 p.m.

Clang presents as GCC 4.2.1, so it was always using the fallbacks for `InterlockedExchange{,Pointer}` on the non-PE side.

The fix for the GCC `__atomic_*` bug linked in the comment for the `__WINE_ATOMIC_*` macros first appeared in GCC 8: https://github.com/gcc-mirror/gcc/commit/d8c40eff56f69877b33c697ded756d50fde...

-- https://gitlab.winehq.org/wine/wine/-/merge_requests/7281

Show replies by date

William Horvath

6 Feb 6 Feb

5 p.m.

New subject: [PATCH 1/1] include: Prefer builtin atomics on x86 unless known broken.

From: William Horvath william@horvath.blog

Clang presents as GCC 4.2.1, so it was always using the fallbacks for InterlockedExchange{,Pointer} on the non-PE side.

The fix for the GCC __atomic_* bug linked in the comment for the __WINE_ATOMIC_* macros first appeared in GCC 8: https://github.com/gcc-mirror/gcc/commit/d8c40eff56f69877b33c697ded756d50fde... --- include/winnt.h | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/winnt.h b/include/winnt.h index 2547f841b7c..2992e86622d 100644 --- a/include/winnt.h +++ b/include/winnt.h @@ -7189,7 +7189,7 @@ static FORCEINLINE LONGLONG WINAPI InterlockedCompareExchange64( LONGLONG volati static FORCEINLINE LONG WINAPI InterlockedExchange( LONG volatile *dest, LONG val ) { LONG ret; -#if (__GNUC__ > 4) || ((__GNUC__ == 4) && (__GNUC_MINOR__ >= 7)) +#if defined(__clang__) || ((__GNUC__ > 4) || ((__GNUC__ == 4) && (__GNUC_MINOR__ >= 7))) ret = __atomic_exchange_n( dest, val, __ATOMIC_SEQ_CST ); #elif defined(__i386__) || defined(__x86_64__) __asm__ __volatile__( "lock; xchgl %0,(%1)" @@ -7248,7 +7248,7 @@ static FORCEINLINE LONGLONG WINAPI InterlockedDecrement64( LONGLONG volatile *de static FORCEINLINE void * WINAPI InterlockedExchangePointer( void *volatile *dest, void *val ) { void *ret; -#if (__GNUC__ > 4) || ((__GNUC__ == 4) && (__GNUC_MINOR__ >= 7)) +#if defined(__clang__) || ((__GNUC__ > 4) || ((__GNUC__ == 4) && (__GNUC_MINOR__ >= 7))) ret = __atomic_exchange_n( dest, val, __ATOMIC_SEQ_CST ); #elif defined(__x86_64__) __asm__ __volatile__( "lock; xchgq %0,(%1)" : "=r" (ret) :"r" (dest), "0" (val) : "memory" ); @@ -7285,7 +7285,7 @@ static FORCEINLINE void MemoryBarrier(void) __sync_synchronize(); }

-#if defined(__x86_64__) || defined(__i386__) +#if !defined(__clang__) && ((__GNUC__ < 8) && (defined(__x86_64__) || defined(__i386__))) /* On x86, Support old GCC with either no or buggy (GCC BZ#81316) __atomic_* support */ #define __WINE_ATOMIC_LOAD_ACQUIRE(ptr, ret) do { *(ret) = *(ptr); __asm__ __volatile__( "" ::: "memory" ); } while (0) #define __WINE_ATOMIC_LOAD_RELAXED(ptr, ret) do { *(ret) = *(ptr); } while (0) @@ -7296,7 +7296,7 @@ static FORCEINLINE void MemoryBarrier(void) #define __WINE_ATOMIC_LOAD_RELAXED(ptr, ret) __atomic_load(ptr, ret, __ATOMIC_RELAXED) #define __WINE_ATOMIC_STORE_RELEASE(ptr, val) __atomic_store(ptr, val, __ATOMIC_RELEASE) #define __WINE_ATOMIC_STORE_RELAXED(ptr, val) __atomic_store(ptr, val, __ATOMIC_RELAXED) -#endif /* defined(__x86_64__) || defined(__i386__) */ +#endif /* !defined(__clang__) && ((__GNUC__ < 8) && (defined(__x86_64__) || defined(__i386__))) */

static FORCEINLINE LONG ReadAcquire( LONG const volatile *src ) {

-- GitLab https://gitlab.winehq.org/wine/wine/-/merge_requests/7281

Jinoh Kang (＠iamahuman)

5:11 p.m.

Is there a known performance or correctness benefit for using the C11 atomic variants? Either a game with improved perf or an example of an enabled compiler optimization (reordering) would work.

We should also look into server/fd.c too, it's where I originally borrowed the pattern from.

-- https://gitlab.winehq.org/wine/wine/-/merge_requests/7281#note_93920

William Horvath (＠whrvt)

5:31 p.m.

...

Is there a known performance or correctness benefit for using the C11 atomic variants?

Not that I know of. I just think it makes sense to use them if they're available (and work correctly) instead of needing to write our own.

...

We should also look into server/fd.c...

`server/fd.c` defines `atomic_store_*(volatile *ptr, value)` as `*ptr = value` on x86/x86_64. I'm not sure why it defines its own set of atomic helpers, when it also has access to the macros in `include/winnt.h` indirectly.

-- https://gitlab.winehq.org/wine/wine/-/merge_requests/7281#note_93924

William Horvath (＠whrvt)

6:10 p.m.

On Thu Feb 6 17:31:08 2025 +0000, William Horvath wrote:

...

...
Is there a known performance or correctness benefit for using the C11

atomic variants? Not that I know of. I just think it makes sense to use them if they're available (and work correctly) instead of needing to write our own.

...
We should also look into server/fd.c...

`server/fd.c` defines `atomic_store_*(volatile *ptr, value)` as `*ptr = value` on x86/x86_64. I'm not sure why it defines its own set of atomic helpers, when it also has access to the macros in `include/winnt.h` indirectly.

Also, note that these aren't C11 atomics, which would be `atomic_store` and `atomic_load`, without the `__` prefix. Underscore prefixed atomics are compiler builtins.

-- https://gitlab.winehq.org/wine/wine/-/merge_requests/7281#note_93937

Jinoh Kang (＠iamahuman)

11:58 p.m.

On Thu Feb 6 18:10:41 2025 +0000, William Horvath wrote:

...

Also, note that these aren't C11 atomics, which would be `atomic_store` and `atomic_load`, without the `__` prefix. Underscore prefixed atomics are compiler builtins.

I'm not particularly opposed to the idea, but without a convincing case for it I'm not sure further modification will be well received. (I'm not a maintainer, so maybe ask for others' opinion.)

Maybe when we drop support for GCC <8, then `__sync_` variants as well as open-coded inline asms could be dropped at once, although we may still need a stronger case for it.

Note that the 32-bit atomics have been effectively tested for years. ReadNoFence64 was relatively new though.

-- https://gitlab.winehq.org/wine/wine/-/merge_requests/7281#note_93985

William Horvath (＠whrvt)

4 Mar 4 Mar

12:17 a.m.

New subject: [PATCH 0/1] MR7281: include: Prefer builtin atomics on x86 unless known broken. - closed

This merge request was closed by William Horvath.

-- https://gitlab.winehq.org/wine/wine/-/merge_requests/7281

136

Age (days ago)

162

Last active (days ago)

wine-gitlab@winehq.org

6 comments

3 participants

tags (0)

participants (3)

Jinoh Kang (＠iamahuman)
William Horvath
William Horvath (＠whrvt)