[PATCH v2 0/4] MR11267: ntdll: Use getauxval(AT_HWCAP) to detect ARM/ARM64 processor features
The motivation here is to simplify processor feature detection on ARM/ARM64 by using `getauxval(AT_HWCAP)` instead of parsing `/proc/cpuinfo` (which is meant for human readability). Corresponding `FEAT_` strings for each feature are also included for use on macOS. FreeBSD uses the same `HWCAP_` constants as Linux, so in the future this should work there with minor changes.

@mstorsjo, this should build with fairly old headers too. The flags we use on ARM32 are all quite old, so I don’t think there’s a need to define them. For ARM64 I’m defining almost everything we use (`AT_HWCAP2`, and everything newer than the earliest `HWCAP_` flags). Let me know if additional defines are needed.

References:

- For most ARM64 `PF_` features, Microsoft’s [documentation for `IsProcessorFeaturePresent()`](https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-pro...) documents the corresponding ARM64 `FEAT_`.
- ARM’s own documentation is clunky, but searching for a `FEAT_` string will get you ARM’s documentation and the registers that can be used to identify it (e.g. for `FEAT_SHA3`, [the documentation](https://developer.arm.com/documentation/109697/2026_03/Feature-descriptions/...) says that it sets `ID_AA64ISAR0_EL1.SHA3`).
- The Linux kernel documents which `HWCAP_` constant corresponds to various register values: https://docs.kernel.org/arch/arm64/elf_hwcaps.html

--
v2: ntdll: Define HWCAP_CPUID on ARM64.
    ntdll: Detect ARM64 processor features using getauxval(AT_HWCAP).

https://gitlab.winehq.org/wine/wine/-/merge_requests/11267
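As a rough illustration of the approach described in the cover letter, the sketch below reads the kernel-provided `AT_HWCAP` word with `getauxval()` and tests a bit mask against it. The helper names `hwcap_word_has` and `cpu_has_hwcap` are hypothetical and do not appear in the patches; the meaning of each bit depends on the architecture.

```c
#include <stdbool.h>
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP (Linux with glibc or musl) */

/* Hypothetical helper: true if every bit in `mask` is set in `word`. */
static bool hwcap_word_has(unsigned long word, unsigned long mask)
{
    return (word & mask) == mask;
}

/* Hypothetical helper: fetch the AT_HWCAP word from the auxiliary vector
 * and test `mask` against it. No file parsing is involved. */
static bool cpu_has_hwcap(unsigned long mask)
{
    return hwcap_word_has(getauxval(AT_HWCAP), mask);
}
```

Compared with scanning the `Features` line of `/proc/cpuinfo`, this needs no string matching, and `getauxval()` returns 0 for a key that is not present, so an unknown key simply reports no features.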
From: Brendan Shanks <bshanks@codeweavers.com>

Windows sets all the ARM32 features (except EXTERNAL_CACHE_AVAILABLE) even on a CPU and OS that doesn't support ARM32.
---
 dlls/ntdll/unix/system.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/dlls/ntdll/unix/system.c b/dlls/ntdll/unix/system.c
index cf2883f05ae..85e17899df9 100644
--- a/dlls/ntdll/unix/system.c
+++ b/dlls/ntdll/unix/system.c
@@ -707,6 +707,12 @@ void init_shared_data_cpuinfo( KUSER_SHARED_DATA *data )
     if (native_machine == IMAGE_FILE_MACHINE_ARMNT) return;
+    features[PF_ARM_VFP_32_REGISTERS_AVAILABLE] = TRUE;
+    features[PF_ARM_NEON_INSTRUCTIONS_AVAILABLE] = TRUE;
+    features[PF_ARM_DIVIDE_INSTRUCTION_AVAILABLE] = TRUE;
+    features[PF_ARM_64BIT_LOADSTORE_ATOMIC] = TRUE;
+    features[PF_ARM_FMAC_INSTRUCTIONS_AVAILABLE] = TRUE;
+    features[PF_ARM_V8_INSTRUCTIONS_AVAILABLE] = TRUE;
     features[PF_NX_ENABLED] = TRUE;
@@ -715,11 +721,6 @@ void init_shared_data_cpuinfo( KUSER_SHARED_DATA *data )
     {
         switch (supported_machines[i])
         {
-        case IMAGE_FILE_MACHINE_ARMNT:
-            features[PF_ARM_VFP_32_REGISTERS_AVAILABLE] = TRUE;
-            features[PF_ARM_NEON_INSTRUCTIONS_AVAILABLE] = TRUE;
-            features[PF_ARM_DIVIDE_INSTRUCTION_AVAILABLE] = TRUE;
-            break;
         case IMAGE_FILE_MACHINE_I386:
             features[PF_MMX_INSTRUCTIONS_AVAILABLE] = TRUE;
             features[PF_XMMI_INSTRUCTIONS_AVAILABLE] = TRUE;
--
GitLab

https://gitlab.winehq.org/wine/wine/-/merge_requests/11267
From: Brendan Shanks <bshanks@codeweavers.com>

---
 configure.ac             |  1 +
 dlls/ntdll/unix/system.c | 33 ++++++++++++++++++++++++++-------
 2 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/configure.ac b/configure.ac
index a8e3bf77076..b8f497eaaf5 100644
--- a/configure.ac
+++ b/configure.ac
@@ -726,6 +726,7 @@ AC_CHECK_HEADERS(\
 	OpenCL/opencl.h \
 	arpa/inet.h \
 	arpa/nameser.h \
+	asm/hwcap.h \
 	asm/termbits.h \
 	asm/types.h \
 	asm/user.h \
diff --git a/dlls/ntdll/unix/system.c b/dlls/ntdll/unix/system.c
index 85e17899df9..87eb1c60add 100644
--- a/dlls/ntdll/unix/system.c
+++ b/dlls/ntdll/unix/system.c
@@ -36,6 +36,9 @@
 #include <sys/time.h>
 #include <time.h>
 #include <dirent.h>
+#ifdef HAVE_ASM_HWCAP_H
+# include <asm/hwcap.h>
+#endif
 #ifdef HAVE_SYS_PARAM_H
 # include <sys/param.h>
 #endif
@@ -558,6 +561,23 @@ static int has_feature( const char *line, const char *feat )
     return 0;
 }
+#if defined(AT_HWCAP) && defined(__arm__)
+static BOOLEAN has_capability( int hwcap, unsigned long hwcap_bit )
+{
+    unsigned long type;
+    switch (hwcap)
+    {
+    case 1: type = AT_HWCAP; break;
+    default: return FALSE;
+    }
+    return !!(getauxval( type ) & hwcap_bit);
+}
+
+#define HAS_FEATURE(hwcap, hwcap_bit, ...) has_capability( hwcap, hwcap_bit )
+#else
+#define HAS_FEATURE(hwcap, hwcap_bit, ...) 0
+#endif
+
 static void init_cpu_model(void)
 {
     unsigned int implementer = 0x41, part = 0, variant = 0, revision = 0;
@@ -671,9 +691,6 @@ void init_shared_data_cpuinfo( KUSER_SHARED_DATA *data )
             value++;
             if ((s = strchr( value, '\n' ))) *s = 0;
             if (strcmp( line, "Features" )) continue;
-            features[PF_ARM_VFP_32_REGISTERS_AVAILABLE] = has_feature( value, "vfpv3" );
-            features[PF_ARM_NEON_INSTRUCTIONS_AVAILABLE] = has_feature( value, "neon" );
-            features[PF_ARM_DIVIDE_INSTRUCTION_AVAILABLE] = has_feature( value, "idivt" );
             if (native_machine == IMAGE_FILE_MACHINE_ARMNT) break;
             features[PF_ARM_V8_CRC32_INSTRUCTIONS_AVAILABLE] = has_feature( value, "crc32" );
             features[PF_ARM_V8_CRYPTO_INSTRUCTIONS_AVAILABLE] = has_feature( value, "aes" );
@@ -705,8 +722,11 @@ void init_shared_data_cpuinfo( KUSER_SHARED_DATA *data )
     features[PF_FASTFAIL_AVAILABLE] = TRUE;
     features[PF_COMPARE_EXCHANGE_DOUBLE] = TRUE;
-    if (native_machine == IMAGE_FILE_MACHINE_ARMNT) return;
-
+#ifdef __arm__
+    features[PF_ARM_VFP_32_REGISTERS_AVAILABLE] = HAS_FEATURE( 1, HWCAP_VFPv3 );
+    features[PF_ARM_NEON_INSTRUCTIONS_AVAILABLE] = HAS_FEATURE( 1, HWCAP_NEON );
+    features[PF_ARM_DIVIDE_INSTRUCTION_AVAILABLE] = HAS_FEATURE( 1, HWCAP_IDIVT );
+#else
     features[PF_ARM_VFP_32_REGISTERS_AVAILABLE] = TRUE;
     features[PF_ARM_NEON_INSTRUCTIONS_AVAILABLE] = TRUE;
     features[PF_ARM_DIVIDE_INSTRUCTION_AVAILABLE] = TRUE;
@@ -735,6 +755,7 @@ void init_shared_data_cpuinfo( KUSER_SHARED_DATA *data )
             break;
         }
     }
+#endif
 }
 #endif /* End architecture specific feature detection for CPUs */
@@ -1947,8 +1968,6 @@ static WORD append_smbios_boot_info( struct smbios_buffer *buf )
 #ifdef __aarch64__
 #ifdef linux
-#include <asm/hwcap.h>
-
 static DWORD get_core_id_regs_arm64( struct smbios_wine_id_reg_value_arm64 *regs, WORD logical_thread_id )
 {
--
GitLab

https://gitlab.winehq.org/wine/wine/-/merge_requests/11267
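The patch above selects an auxiliary-vector word with a small integer and wraps the check in a variadic macro so that extra arguments can be ignored on platforms that do not need them. A minimal sketch of that pattern follows; the names `hwcap_key`, `has_capability_sketch` and `HAS_FEATURE_SKETCH` are hypothetical, and treating key 0 (`AT_NULL`) as "no such word" is an assumption of this sketch rather than something the patch does.

```c
#include <stdbool.h>
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP, possibly AT_HWCAP2 */

/* Map a selector (1 or 2) to an auxv key; 0 means "no such word". */
static unsigned long hwcap_key(int which)
{
    switch (which)
    {
    case 1: return AT_HWCAP;
#ifdef AT_HWCAP2
    case 2: return AT_HWCAP2;
#endif
    default: return 0;
    }
}

/* True if any bit of `bit` is set in the selected hwcap word. */
static bool has_capability_sketch(int which, unsigned long bit)
{
    unsigned long key = hwcap_key(which);
    if (!key) return false;
    return (getauxval(key) & bit) != 0;
}

/* Trailing arguments (for example FEAT_ names meant for another platform)
 * are accepted and discarded by this variant of the macro. */
#define HAS_FEATURE_SKETCH(which, bit, ...) has_capability_sketch(which, bit)
```

Keeping the call sites identical across platforms is the point of the macro: only its definition changes between the `getauxval()` variant, a name-based variant, and a constant-0 fallback.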
From: Brendan Shanks <bshanks@codeweavers.com>

---
 dlls/ntdll/unix/system.c | 251 +++++++++++++++++++++++----------------
 1 file changed, 150 insertions(+), 101 deletions(-)

diff --git a/dlls/ntdll/unix/system.c b/dlls/ntdll/unix/system.c
index 87eb1c60add..56c5f21aa79 100644
--- a/dlls/ntdll/unix/system.c
+++ b/dlls/ntdll/unix/system.c
@@ -24,6 +24,7 @@
 #include "config.h"
+#include <assert.h>
 #include <fcntl.h>
 #include <string.h>
 #include <stdarg.h>
@@ -76,6 +77,55 @@
 # include <hwloc.h>
 #endif
+#ifdef __linux__
+# define AT_HWCAP2 26
+#endif
+
+#if defined(__aarch64__) && defined(AT_HWCAP)
+# define HWCAP_ATOMICS (1 << 8)
+# define HWCAP_FPHP (1 << 9)
+# define HWCAP_JSCVT (1 << 13)
+# define HWCAP_LRCPC (1 << 15)
+# define HWCAP_SHA3 (1 << 17)
+# define HWCAP_ASIMDDP (1 << 20)
+# define HWCAP_SHA512 (1 << 21)
+# define HWCAP_SVE (1 << 22)
+# define HWCAP_USCAT (1 << 25)
+# define HWCAP_SME2P2 (1UL << 42)
+# define HWCAP_SME_SBITPERM (1UL << 43)
+# define HWCAP_SME_AES (1UL << 44)
+# define HWCAP2_SVE2 (1 << 1)
+# define HWCAP2_SVEAES (1 << 2)
+# define HWCAP2_SVEPMULL (1 << 3)
+# define HWCAP2_SVEBITPERM (1 << 4)
+# define HWCAP2_SVESHA3 (1 << 5)
+# define HWCAP2_SVESM4 (1 << 6)
+# define HWCAP2_SVEI8MM (1 << 9)
+# define HWCAP2_SVEF32MM (1 << 10)
+# define HWCAP2_SVEF64MM (1 << 11)
+# define HWCAP2_SVEBF16 (1 << 12)
+# define HWCAP2_I8MM (1 << 13)
+# define HWCAP2_BF16 (1 << 14)
+# define HWCAP2_SME (1 << 23)
+# define HWCAP2_SME_I16I64 (1 << 24)
+# define HWCAP2_SME_F64F64 (1 << 25)
+# define HWCAP2_SME_FA64 (1 << 30)
+# define HWCAP2_EBF16 (1UL << 32)
+# define HWCAP2_SVE_EBF16 (1UL << 33)
+# define HWCAP2_SVE2P1 (1UL << 36)
+# define HWCAP2_SME2 (1UL << 37)
+# define HWCAP2_SME2P1 (1UL << 38)
+# define HWCAP2_SME_F16F16 (1UL << 42)
+# define HWCAP2_SME_B16B16 (1UL << 41)
+# define HWCAP2_SVE_B16B16 (1UL << 45)
+# define HWCAP2_SME_LUTV2 (1UL << 57)
+# define HWCAP2_SME_F8F16 (1UL << 58)
+# define HWCAP2_SME_F8F32 (1UL << 59)
+# define HWCAP2_SME_SF8FMA (1UL << 60)
+# define HWCAP2_SME_SF8DP4 (1UL << 61)
+# define HWCAP2_SME_SF8DP2 (1UL << 62)
+#endif
+
 #include "ntstatus.h"
 #include "windef.h"
 #include "winternl.h"
@@ -526,7 +576,7 @@ void init_shared_data_cpuinfo( KUSER_SHARED_DATA *data )
             features[PF_AVX512F_INSTRUCTIONS_AVAILABLE] = !!(regs[1] & (1 << 16));
             features[PF_RDPID_INSTRUCTION_AVAILABLE] = !!(regs[2] & (1 << 22));
             features[PF_MOVDIR64B_INSTRUCTION_AVAILABLE]= !!(regs[2] & (1 << 28));
-#if defined(__linux__) && defined(AT_HWCAP2)
+#if defined(__linux__)
             features[PF_RDWRFSGSBASE_AVAILABLE] &= !!(getauxval( AT_HWCAP2 ) & 2);
 #endif
         }
@@ -548,36 +598,66 @@ void init_shared_data_cpuinfo( KUSER_SHARED_DATA *data )
 #elif defined(__arm__) || defined(__aarch64__)
-static int has_feature( const char *line, const char *feat )
-{
-    size_t len = strlen(feat);
-
-    while (*line)
-    {
-        while (*line == ' ' || *line == '\t') line++;
-        if (!strncmp( line, feat, len ) && (!line[len] || isspace(line[len]))) return 1;
-        while (*line && *line != ' ' && *line != '\t') line++;
-    }
-    return 0;
-}
-
-#if defined(AT_HWCAP) && defined(__arm__)
+#if defined(AT_HWCAP)
 static BOOLEAN has_capability( int hwcap, unsigned long hwcap_bit )
 {
     unsigned long type;
     switch (hwcap)
     {
     case 1: type = AT_HWCAP; break;
+    case 2: type = AT_HWCAP2; break;
     default: return FALSE;
     }
     return !!(getauxval( type ) & hwcap_bit);
 }
 #define HAS_FEATURE(hwcap, hwcap_bit, ...) has_capability( hwcap, hwcap_bit )
+#elif defined(__APPLE__)
+static BOOLEAN has_feature( const char *feature )
+{
+    char buf[200];
+    int val;
+    size_t size = sizeof(val);
+
+    snprintf( buf, sizeof(buf), "hw.optional.arm.%s", feature );
+    if (!sysctlbyname( buf, &val, &size, NULL, 0 ))
+        return !!val;
+    return FALSE;
+}
+
+static BOOLEAN has_features( int dummy, ... )
+{
+    BOOLEAN ret = TRUE;
+    const char *feature;
+    va_list args;
+
+    va_start( args, dummy );
+
+    while ((feature = va_arg( args, const char * )))
+    {
+        if (has_feature( feature ))
+            continue;
+
+        ret = FALSE;
+        break;
+    }
+
+    va_end( args );
+    return ret;
+}
+#define HAS_FEATURE(hwcap, hwcap_bit, ...) has_features( 0, __VA_ARGS__, NULL )
 #else
 #define HAS_FEATURE(hwcap, hwcap_bit, ...) 0
 #endif
+static void set_feature_bitmap( ULONG flag, BOOLEAN enabled )
+{
+    assert( flag >= PROCESSOR_FEATURE_MAX );
+    if (!enabled) return;
+    flag -= PROCESSOR_FEATURE_MAX;
+    cpu_features_bitmap[flag / 64] |= 1ull << (flag % 64);
+}
+
 static void init_cpu_model(void)
 {
     unsigned int implementer = 0x41, part = 0, variant = 0, revision = 0;
@@ -603,46 +683,6 @@ static void init_cpu_model(void)
             else if (!strcmp( line, "CPU part" )) part = strtoul( value, NULL, 0);
             else if (!strcmp( line, "CPU variant" )) variant = strtoul( value, NULL, 0);
             else if (!strcmp( line, "CPU revision" )) revision = strtoul( value, NULL, 0);
-            else if (!strcmp( line, "Features" ))
-            {
-                static const struct { ULONG flag; const char *name; } features[] =
-                {
-                    { PF_ARM_SHA3_INSTRUCTIONS_AVAILABLE, "sha3" },
-                    { PF_ARM_SHA512_INSTRUCTIONS_AVAILABLE, "sha512" },
-                    { PF_ARM_V82_I8MM_INSTRUCTIONS_AVAILABLE, "i8mm" },
-                    { PF_ARM_V82_FP16_INSTRUCTIONS_AVAILABLE, "fphp" },
-                    { PF_ARM_V86_BF16_INSTRUCTIONS_AVAILABLE, "bf16" },
-                    { PF_ARM_V86_EBF16_INSTRUCTIONS_AVAILABLE, "ebf16" },
-                    { PF_ARM_SME_INSTRUCTIONS_AVAILABLE, "sme" },
-                    { PF_ARM_SME2_INSTRUCTIONS_AVAILABLE, "sme2" },
-                    { PF_ARM_SME2_1_INSTRUCTIONS_AVAILABLE, "sme2p1" },
-                    { PF_ARM_SME2_2_INSTRUCTIONS_AVAILABLE, "sme2p2" },
-                    { PF_ARM_SME_AES_INSTRUCTIONS_AVAILABLE, "smeaes" },
-                    { PF_ARM_SME_SBITPERM_INSTRUCTIONS_AVAILABLE, "smesbitperm" },
-                    /* The PF_ARM_SME_SF8MM4_INSTRUCTIONS_AVAILABLE and
-                     * PF_ARM_SME_SF8MM8_INSTRUCTIONS_AVAILABLE flags aren't exposed by
-                     * the Linux kernel, see
-                     * https://lists.infradead.org/pipermail/linux-arm-kernel/2025-January/991187.h... */
-                    { PF_ARM_SME_SF8DP2_INSTRUCTIONS_AVAILABLE, "smesf8dp2" },
-                    { PF_ARM_SME_SF8DP4_INSTRUCTIONS_AVAILABLE, "smesf8dp4" },
-                    { PF_ARM_SME_SF8FMA_INSTRUCTIONS_AVAILABLE, "smesf8fma" },
-                    { PF_ARM_SME_F8F32_INSTRUCTIONS_AVAILABLE, "smef8f32" },
-                    { PF_ARM_SME_F8F16_INSTRUCTIONS_AVAILABLE, "smef8f16" },
-                    { PF_ARM_SME_F16F16_INSTRUCTIONS_AVAILABLE, "smef16f16" },
-                    { PF_ARM_SME_B16B16_INSTRUCTIONS_AVAILABLE, "smeb16b16" },
-                    { PF_ARM_SME_F64F64_INSTRUCTIONS_AVAILABLE, "smef64f64" },
-                    { PF_ARM_SME_I16I64_INSTRUCTIONS_AVAILABLE, "smei16i64" },
-                    { PF_ARM_SME_LUTv2_INSTRUCTIONS_AVAILABLE, "smelutv2" },
-                    { PF_ARM_SME_FA64_INSTRUCTIONS_AVAILABLE, "smefa64" },
-                };
-
-                for (unsigned int i = 0; i < ARRAY_SIZE(features); i++)
-                {
-                    ULONG flag = features[i].flag - PROCESSOR_FEATURE_MAX;
-                    if (!has_feature( value, features[i].name )) continue;
-                    cpu_features_bitmap[flag / 64] |= 1ull << (flag % 64);
-                }
-            }
         }
         fclose( f );
     }
@@ -664,6 +704,38 @@ static void init_cpu_model(void)
     case 0x66: strcpy( cpu_vendor, "Faraday" ); break;
     case 0x69: strcpy( cpu_vendor, "Intel" ); break;
     }
+
+#ifdef __aarch64__
+    set_feature_bitmap( PF_ARM_SHA3_INSTRUCTIONS_AVAILABLE, HAS_FEATURE( 1, HWCAP_SHA3, "FEAT_SHA3" ) );
+    set_feature_bitmap( PF_ARM_SHA512_INSTRUCTIONS_AVAILABLE, HAS_FEATURE( 1, HWCAP_SHA512, "FEAT_SHA512" ) );
+    set_feature_bitmap( PF_ARM_V82_I8MM_INSTRUCTIONS_AVAILABLE, HAS_FEATURE( 2, HWCAP2_I8MM, "FEAT_I8MM" ) );
+    set_feature_bitmap( PF_ARM_V82_FP16_INSTRUCTIONS_AVAILABLE, HAS_FEATURE( 1, HWCAP_FPHP, "FEAT_FP16" ) );
+    set_feature_bitmap( PF_ARM_V86_BF16_INSTRUCTIONS_AVAILABLE, HAS_FEATURE( 2, HWCAP2_BF16, "FEAT_BF16" ) );
+    set_feature_bitmap( PF_ARM_V86_EBF16_INSTRUCTIONS_AVAILABLE, HAS_FEATURE( 2, HWCAP2_EBF16, "FEAT_EBF16" ) );
+    set_feature_bitmap( PF_ARM_SME_INSTRUCTIONS_AVAILABLE, HAS_FEATURE( 2, HWCAP2_SME, "FEAT_SME" ) );
+    set_feature_bitmap( PF_ARM_SME2_INSTRUCTIONS_AVAILABLE, HAS_FEATURE( 2, HWCAP2_SME2, "FEAT_SME2" ) );
+    set_feature_bitmap( PF_ARM_SME2_1_INSTRUCTIONS_AVAILABLE, HAS_FEATURE( 2, HWCAP2_SME2P1, "FEAT_SME2p1" ) );
+    set_feature_bitmap( PF_ARM_SME2_2_INSTRUCTIONS_AVAILABLE, HAS_FEATURE( 1, HWCAP_SME2P2, "FEAT_SME2p2" ) );
+    set_feature_bitmap( PF_ARM_SME_AES_INSTRUCTIONS_AVAILABLE, HAS_FEATURE( 1, HWCAP_SME_AES, "FEAT_SSVE_AES" ) );
+    set_feature_bitmap( PF_ARM_SME_SBITPERM_INSTRUCTIONS_AVAILABLE, HAS_FEATURE( 1, HWCAP_SME_SBITPERM, "FEAT_SSVE_BitPerm" ) );
+    /* The PF_ARM_SME_SF8MM4_INSTRUCTIONS_AVAILABLE and
+     * PF_ARM_SME_SF8MM8_INSTRUCTIONS_AVAILABLE flags aren't exposed by
+     * the Linux kernel, see
+     * https://lists.infradead.org/pipermail/linux-arm-kernel/2025-January/991187.h... */
+    set_feature_bitmap( PF_ARM_SME_SF8MM4_INSTRUCTIONS_AVAILABLE, HAS_FEATURE( 0, 0, "FEAT_SSVE_F8F16MM" ) );
+    set_feature_bitmap( PF_ARM_SME_SF8MM8_INSTRUCTIONS_AVAILABLE, HAS_FEATURE( 0, 0, "FEAT_SSVE_F8F32MM" ) );
+    set_feature_bitmap( PF_ARM_SME_SF8DP2_INSTRUCTIONS_AVAILABLE, HAS_FEATURE( 2, HWCAP2_SME_SF8DP2, "FEAT_SSVE_FP8DOT2" ) );
+    set_feature_bitmap( PF_ARM_SME_SF8DP4_INSTRUCTIONS_AVAILABLE, HAS_FEATURE( 2, HWCAP2_SME_SF8DP4, "FEAT_SSVE_FP8DOT4" ) );
+    set_feature_bitmap( PF_ARM_SME_SF8FMA_INSTRUCTIONS_AVAILABLE, HAS_FEATURE( 2, HWCAP2_SME_SF8FMA, "FEAT_SSVE_FP8FMA" ) );
+    set_feature_bitmap( PF_ARM_SME_F8F32_INSTRUCTIONS_AVAILABLE, HAS_FEATURE( 2, HWCAP2_SME_F8F32, "FEAT_SME_F8F32" ) );
+    set_feature_bitmap( PF_ARM_SME_F8F16_INSTRUCTIONS_AVAILABLE, HAS_FEATURE( 2, HWCAP2_SME_F8F16, "FEAT_SME_F8F16" ) );
+    set_feature_bitmap( PF_ARM_SME_F16F16_INSTRUCTIONS_AVAILABLE, HAS_FEATURE( 2, HWCAP2_SME_F16F16, "FEAT_SME_F16F16" ) );
+    set_feature_bitmap( PF_ARM_SME_B16B16_INSTRUCTIONS_AVAILABLE, HAS_FEATURE( 2, HWCAP2_SME_B16B16, "FEAT_SME_B16B16" ) );
+    set_feature_bitmap( PF_ARM_SME_F64F64_INSTRUCTIONS_AVAILABLE, HAS_FEATURE( 2, HWCAP2_SME_F64F64, "FEAT_SME_F64F64" ) );
+    set_feature_bitmap( PF_ARM_SME_I16I64_INSTRUCTIONS_AVAILABLE, HAS_FEATURE( 2, HWCAP2_SME_I16I64, "FEAT_SME_I16I64" ) );
+    set_feature_bitmap( PF_ARM_SME_LUTv2_INSTRUCTIONS_AVAILABLE, HAS_FEATURE( 2, HWCAP2_SME_LUTV2, "FEAT_SME_LUTv2" ) );
+    set_feature_bitmap( PF_ARM_SME_FA64_INSTRUCTIONS_AVAILABLE, HAS_FEATURE( 2, HWCAP2_SME_FA64, "FEAT_SME_FA64" ) );
+#endif
 }
 static ULONGLONG get_cpu_features(void)
@@ -675,50 +747,6 @@ void init_shared_data_cpuinfo( KUSER_SHARED_DATA *data )
 {
     BOOLEAN *features = data->ProcessorFeatures;
-#ifdef linux
-    FILE *f = fopen("/proc/cpuinfo", "r");
-    if (f)
-    {
-        char *s, *value, line[512];
-        while (fgets( line, sizeof(line), f ))
-        {
-            /* NOTE: the ':' is the only character we can rely on */
-            if (!(value = strchr(line,':'))) continue;
-            /* terminate the valuename */
-            s = value - 1;
-            while ((s >= line) && (*s == ' ' || *s == '\t')) s--;
-            s[1] = 0;
-            value++;
-            if ((s = strchr( value, '\n' ))) *s = 0;
-            if (strcmp( line, "Features" )) continue;
-            if (native_machine == IMAGE_FILE_MACHINE_ARMNT) break;
-            features[PF_ARM_V8_CRC32_INSTRUCTIONS_AVAILABLE] = has_feature( value, "crc32" );
-            features[PF_ARM_V8_CRYPTO_INSTRUCTIONS_AVAILABLE] = has_feature( value, "aes" );
-            features[PF_ARM_V81_ATOMIC_INSTRUCTIONS_AVAILABLE] = has_feature( value, "atomics" );
-            features[PF_ARM_V82_DP_INSTRUCTIONS_AVAILABLE] = has_feature( value, "asimddp" );
-            features[PF_ARM_V83_JSCVT_INSTRUCTIONS_AVAILABLE] = has_feature( value, "jscvt" );
-            features[PF_ARM_V83_LRCPC_INSTRUCTIONS_AVAILABLE] = has_feature( value, "lrcpc" );
-            features[PF_ARM_SVE_INSTRUCTIONS_AVAILABLE] = has_feature( value, "sve" );
-            features[PF_ARM_SVE2_INSTRUCTIONS_AVAILABLE] = has_feature( value, "sve2" );
-            features[PF_ARM_SVE2_1_INSTRUCTIONS_AVAILABLE] = has_feature( value, "sve2p1" );
-            features[PF_ARM_SVE_AES_INSTRUCTIONS_AVAILABLE] = has_feature( value, "sveaes" );
-            features[PF_ARM_SVE_PMULL128_INSTRUCTIONS_AVAILABLE] = has_feature( value, "svepmull" );
-            features[PF_ARM_SVE_BITPERM_INSTRUCTIONS_AVAILABLE] = has_feature( value, "svebitperm" );
-            features[PF_ARM_SVE_BF16_INSTRUCTIONS_AVAILABLE] = has_feature( value, "svebf16" );
-            features[PF_ARM_SVE_EBF16_INSTRUCTIONS_AVAILABLE] = has_feature( value, "sveebf16" );
-            features[PF_ARM_SVE_B16B16_INSTRUCTIONS_AVAILABLE] = has_feature( value, "sveb16b16" );
-            features[PF_ARM_SVE_SHA3_INSTRUCTIONS_AVAILABLE] = has_feature( value, "svesha3" );
-            features[PF_ARM_SVE_SM4_INSTRUCTIONS_AVAILABLE] = has_feature( value, "svesm4" );
-            features[PF_ARM_SVE_I8MM_INSTRUCTIONS_AVAILABLE] = has_feature( value, "svei8mm" );
-            features[PF_ARM_SVE_F32MM_INSTRUCTIONS_AVAILABLE] = has_feature( value, "svef32mm" );
-            features[PF_ARM_SVE_F64MM_INSTRUCTIONS_AVAILABLE] = has_feature( value, "svef64mm" );
-            features[PF_ARM_LSE2_AVAILABLE] = has_feature( value, "uscat" );
-            break;
-        }
-        fclose( f );
-    }
-#endif
-
     features[PF_FASTFAIL_AVAILABLE] = TRUE;
     features[PF_COMPARE_EXCHANGE_DOUBLE] = TRUE;
@@ -733,8 +761,29 @@ void init_shared_data_cpuinfo( KUSER_SHARED_DATA *data )
     features[PF_ARM_64BIT_LOADSTORE_ATOMIC] = TRUE;
     features[PF_ARM_FMAC_INSTRUCTIONS_AVAILABLE] = TRUE;
-    features[PF_ARM_V8_INSTRUCTIONS_AVAILABLE] = TRUE;
-    features[PF_NX_ENABLED] = TRUE;
+    features[PF_ARM_V8_INSTRUCTIONS_AVAILABLE] = TRUE;
+    features[PF_NX_ENABLED] = TRUE;
+    features[PF_ARM_V8_CRYPTO_INSTRUCTIONS_AVAILABLE] = HAS_FEATURE( 1, HWCAP_AES, "FEAT_AES" );
+    features[PF_ARM_V8_CRC32_INSTRUCTIONS_AVAILABLE] = HAS_FEATURE( 1, HWCAP_CRC32, "FEAT_CRC32" );
+    features[PF_ARM_V81_ATOMIC_INSTRUCTIONS_AVAILABLE] = HAS_FEATURE( 1, HWCAP_ATOMICS, "FEAT_LSE" );
+    features[PF_ARM_V82_DP_INSTRUCTIONS_AVAILABLE] = HAS_FEATURE( 1, HWCAP_ASIMDDP, "FEAT_DotProd" );
+    features[PF_ARM_V83_JSCVT_INSTRUCTIONS_AVAILABLE] = HAS_FEATURE( 1, HWCAP_JSCVT, "FEAT_JSCVT" );
+    features[PF_ARM_V83_LRCPC_INSTRUCTIONS_AVAILABLE] = HAS_FEATURE( 1, HWCAP_LRCPC, "FEAT_LRCPC" );
+    features[PF_ARM_SVE_INSTRUCTIONS_AVAILABLE] = HAS_FEATURE( 1, HWCAP_SVE, "FEAT_SVE" );
+    features[PF_ARM_SVE2_INSTRUCTIONS_AVAILABLE] = HAS_FEATURE( 2, HWCAP2_SVE2, "FEAT_SVE2" );
+    features[PF_ARM_SVE2_1_INSTRUCTIONS_AVAILABLE] = HAS_FEATURE( 2, HWCAP2_SVE2P1, "FEAT_SVE2p1" );
+    features[PF_ARM_SVE_AES_INSTRUCTIONS_AVAILABLE] = HAS_FEATURE( 2, HWCAP2_SVEAES, "FEAT_SVE_AES" );
+    features[PF_ARM_SVE_PMULL128_INSTRUCTIONS_AVAILABLE] = HAS_FEATURE( 2, HWCAP2_SVEPMULL, "FEAT_SVE_PMULL128" );
+    features[PF_ARM_SVE_BITPERM_INSTRUCTIONS_AVAILABLE] = HAS_FEATURE( 2, HWCAP2_SVEBITPERM, "FEAT_SVE_BitPerm" );
+    features[PF_ARM_SVE_BF16_INSTRUCTIONS_AVAILABLE] = HAS_FEATURE( 2, HWCAP2_SVEBF16, "FEAT_SVE", "FEAT_BF16" );
+    features[PF_ARM_SVE_EBF16_INSTRUCTIONS_AVAILABLE] = HAS_FEATURE( 2, HWCAP2_SVE_EBF16, "FEAT_SVE", "FEAT_EBF16" );
+    features[PF_ARM_SVE_B16B16_INSTRUCTIONS_AVAILABLE] = HAS_FEATURE( 2, HWCAP2_SVE_B16B16, "FEAT_SVE_B16B16" );
+    features[PF_ARM_SVE_SHA3_INSTRUCTIONS_AVAILABLE] = HAS_FEATURE( 2, HWCAP2_SVESHA3, "FEAT_SVE_SHA3" );
+    features[PF_ARM_SVE_SM4_INSTRUCTIONS_AVAILABLE] = HAS_FEATURE( 2, HWCAP2_SVESM4, "FEAT_SVE_SM4" );
+    features[PF_ARM_SVE_I8MM_INSTRUCTIONS_AVAILABLE] = HAS_FEATURE( 2, HWCAP2_SVEI8MM, "FEAT_SVE", "FEAT_I8MM" );
+    features[PF_ARM_SVE_F32MM_INSTRUCTIONS_AVAILABLE] = HAS_FEATURE( 2, HWCAP2_SVEF32MM, "FEAT_F32MM" );
+    features[PF_ARM_SVE_F64MM_INSTRUCTIONS_AVAILABLE] = HAS_FEATURE( 2, HWCAP2_SVEF64MM, "FEAT_F64MM" );
+    features[PF_ARM_LSE2_AVAILABLE] = HAS_FEATURE( 1, HWCAP_USCAT, "FEAT_LSE2" );
     /* add features for other architectures supported by wow64 */
     for (unsigned int i = 0; i < supported_machines_count; i++)
--
GitLab

https://gitlab.winehq.org/wine/wine/-/merge_requests/11267
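The `set_feature_bitmap()` helper in this patch stores feature flags numbered at or above `PROCESSOR_FEATURE_MAX` in an array of 64-bit words, using `flag / 64` as the word index and `flag % 64` as the bit index after subtracting the base. The sketch below reproduces that arithmetic in isolation; `EX_FEATURE_MAX`, `EX_BITMAP_WORDS`, `ex_bitmap` and the `ex_` functions are hypothetical stand-ins, and the extra bounds assertion is an addition of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values for illustration only; the real base and bitmap
 * size come from Wine's headers. */
#define EX_FEATURE_MAX   64u
#define EX_BITMAP_WORDS  2u

static uint64_t ex_bitmap[EX_BITMAP_WORDS];

/* Set the bit for an extended feature flag if `enabled` is true. */
static void ex_set_feature(unsigned int flag, bool enabled)
{
    assert(flag >= EX_FEATURE_MAX);
    if (!enabled) return;
    flag -= EX_FEATURE_MAX;                 /* rebase to 0 */
    assert(flag / 64 < EX_BITMAP_WORDS);    /* stay inside the array */
    ex_bitmap[flag / 64] |= UINT64_C(1) << (flag % 64);
}

/* Read back the bit for an extended feature flag. */
static bool ex_get_feature(unsigned int flag)
{
    flag -= EX_FEATURE_MAX;
    return (ex_bitmap[flag / 64] >> (flag % 64)) & 1;
}
```

With these stand-in values, flag 67 lands in word 0 at bit 3 and flag 128 lands in word 1 at bit 0. A flag passed with `enabled` false leaves the bitmap untouched, which matches the early return in the patch.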
From: Brendan Shanks <bshanks@codeweavers.com>

---
 dlls/ntdll/unix/system.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/dlls/ntdll/unix/system.c b/dlls/ntdll/unix/system.c
index 56c5f21aa79..041a992c8e3 100644
--- a/dlls/ntdll/unix/system.c
+++ b/dlls/ntdll/unix/system.c
@@ -84,6 +84,7 @@
 #if defined(__aarch64__) && defined(AT_HWCAP)
 # define HWCAP_ATOMICS (1 << 8)
 # define HWCAP_FPHP (1 << 9)
+# define HWCAP_CPUID (1 << 11)
 # define HWCAP_JSCVT (1 << 13)
 # define HWCAP_LRCPC (1 << 15)
 # define HWCAP_SHA3 (1 << 17)
@@ -2035,7 +2036,6 @@ static DWORD get_core_id_regs_arm64( struct smbios_wine_id_reg_value_arm64 *regs
         regs[regidx++] = (struct smbios_wine_id_reg_value_arm64){ 0x4000, value };
     }
-#ifdef HWCAP_CPUID
     if (!(getauxval(AT_HWCAP) & HWCAP_CPUID))
     {
         WARN( "Skipping ID register population as kernel is missing emulation support.\n" );
@@ -2068,9 +2068,6 @@ static DWORD get_core_id_regs_arm64( struct smbios_wine_id_reg_value_arm64 *regs
     READ_ID_REG( 0x5801 ); /* CTR_EL0 */
     /* Windows exposes SCTLR_EL1, ACTLR_EL1, TTBR0_EL1 and MAIR_EL1, but these are inaccessible under
      * linux so leave them unpopulated. */
-#else
-    WARN( "Skipping ID register population as HWCAP_CPUID isn't supported.\n" );
-#endif
 #undef READ_ID_REG
 #undef STR
--
GitLab

https://gitlab.winehq.org/wine/wine/-/merge_requests/11267
Thanks for looping me in! This looks like a very reasonable direction to go in overall - this matches how we've settled on doing feature detection in many other projects as well (ffmpeg, dav1d etc).
> @mstorsjo, this should build with fairly old headers too. The flags we use on ARM32 are all quite old, so I don’t think there’s a need to define them. For ARM64 I’m defining almost everything we use (`AT_HWCAP2`, and everything newer than the earliest `HWCAP_` flags). Let me know if additional defines are needed.
In test building with my very old toolchain (which I use because I have a test device with glibc from 2015 or so), I hit two other missing ones - we'd need to define `HWCAP_AES` and `HWCAP_CRC32` too. For ARM32 I didn't hit anything missing with my toolchains though.

Where to get these defines, and whether to redefine them or not, is the main question that comes up around these...

On ARM, there's another funny discrepancy around them: Normally when using these, AFAIK, you're supposed to include `<sys/auxv.h>`. This includes `<bits/hwcap.h>`, which contains definitions of them. But they are also defined in `<asm/hwcap.h>`, which we seem to be using here (and your patches pile on). As far as I've understood, the `<asm/*>` headers are primarily meant for the kernel, and userspace shouldn't really be using them.

`<asm/hwcap.h>` defines `HWCAP_NEON` as `(1 << 12)` - while `<bits/hwcap.h>` defines `HWCAP_ARM_NEON` as `4096`. So the defines in `bits/hwcap.h` have an extra `_ARM` in between - and the literal strings they expand to differ. (I don't remember offhand what defines the BSDs provide here.)

To avoid any potential mismatch between these, as we practically do want to provide our own fallback defines at least for aarch64, we've settled on a separate namespace for the defines that we provide, distinct from the ones from system headers: https://code.ffmpeg.org/FFmpeg/FFmpeg/src/commit/38b88335f99e76ed89ff3c93f87... There we define them as `HWCAP_AARCH64_CRC32` with an extra `_AARCH64` in between - which no system headers define.

For ARM we used to match the naming from `<asm/hwcap.h>` without including it - https://code.ffmpeg.org/FFmpeg/FFmpeg/src/commit/cdae5c3639f4adcd289e643a203.... But since https://code.ffmpeg.org/FFmpeg/FFmpeg/commit/ced4a6ebc9e7cd92d0ca9b9fb8f9d10... we've switched to the same naming style as on other architectures - which happens to coincide with the naming from `<sys/auxv.h>` - which does cause redefinition warnings as they expand to a different literal string. In dav1d that's avoided by wrapping the potentially conflicting ones in an `#ifndef` - https://code.videolan.org/videolan/dav1d/-/blob/master/src/arm/cpu.c?ref_typ....

Sorry for the long text...

TL;DR:

- Please define `HWCAP_AES` and `HWCAP_CRC32` for aarch64.
- Consider not relying on `<asm/hwcap.h>`, as I think that's Linux kernel specific.
- As we're defining these constants unconditionally, consider placing them in a disjoint namespace to avoid redefinition warnings if system headers define them with a different literal spelling.

--
https://gitlab.winehq.org/wine/wine/-/merge_requests/11267#note_144338
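The two options discussed in this review, a disjoint namespace for fallback constants versus guarding a system-style name with `#ifndef`, might look roughly like the sketch below. The `EX_` prefix and the `ex_has_crc32` helper are hypothetical; the bit positions (AES at bit 3, CRC32 at bit 7) follow the Linux arm64 `HWCAP` layout and do not apply to 32-bit ARM.

```c
#include <sys/auxv.h>   /* may or may not define HWCAP_* names, depending on libc and arch */

/* Option 1: private namespace. No system header defines these names, so
 * they can be defined unconditionally without redefinition warnings. */
#define EX_HWCAP_AARCH64_AES    (1UL << 3)
#define EX_HWCAP_AARCH64_CRC32  (1UL << 7)

/* Option 2: reuse the system-style name, but only when it is absent. */
#ifndef HWCAP_CRC32
# define HWCAP_CRC32 (1 << 7)
#endif

/* Test a previously fetched AT_HWCAP word against the private constant. */
static int ex_has_crc32(unsigned long hwcap_word)
{
    return (hwcap_word & EX_HWCAP_AARCH64_CRC32) != 0;
}
```

Option 1 avoids any dependence on which header was included first. Option 2 keeps the familiar names but silently adopts whatever spelling the system header uses when one is present.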
Martin Storsjö (@mstorsjo) commented about dlls/ntdll/unix/system.c:
> +    case 2: type = AT_HWCAP2; break;
>      default: return FALSE;
>      }
>      return !!(getauxval( type ) & hwcap_bit);
>  }
>  #define HAS_FEATURE(hwcap, hwcap_bit, ...) has_capability( hwcap, hwcap_bit )
> +#elif defined(__APPLE__)
> +static BOOLEAN has_feature( const char *feature )
> +{
> +    char buf[200];
> +    int val;
> +    size_t size = sizeof(val);
> +
> +    snprintf( buf, sizeof(buf), "hw.optional.arm.%s", feature );
> +    if (!sysctlbyname( buf, &val, &size, NULL, 0 ))

As this commit also expands the feature detection to macOS, it would be nice to mention that in the commit message of this commit, not only in the MR description. Without that, the commit looks like a pure refactoring, which it isn’t.
-- https://gitlab.winehq.org/wine/wine/-/merge_requests/11267#note_144342
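On the macOS path, the patch passes one or more `FEAT_` names through a NULL-terminated variadic helper and reports a feature only if every name is present. The sketch below shows that all-of pattern with a fixed table standing in for the `sysctlbyname()` query so that it runs on any platform; `ex_has_feature`, `ex_has_features`, `EX_HAS_FEATURES` and the table contents are hypothetical.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the per-feature query: a hypothetical fixed table replaces
 * the sysctlbyname() call used in the patch. */
static bool ex_has_feature(const char *name)
{
    static const char *const present[] = { "FEAT_SVE", "FEAT_BF16" };
    for (size_t i = 0; i < sizeof(present) / sizeof(present[0]); i++)
        if (!strcmp(name, present[i])) return true;
    return false;
}

/* True only if every name in the NULL-terminated argument list is present. */
static bool ex_has_features(int dummy, ...)
{
    bool ret = true;
    const char *name;
    va_list args;

    (void)dummy;   /* only needed as the named parameter for va_start */
    va_start(args, dummy);
    while ((name = va_arg(args, const char *)))
    {
        if (!ex_has_feature(name)) { ret = false; break; }
    }
    va_end(args);
    return ret;
}

/* The macro appends the terminator so call sites cannot forget it. */
#define EX_HAS_FEATURES(...) ex_has_features(0, __VA_ARGS__, (const char *)NULL)
```

The sketch casts the sentinel to `const char *` because a bare `NULL` passed through `...` is not guaranteed to have pointer type on every platform; that cast is a choice made here and is not taken from the patch.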