This is an attempt to upstream a set of Proton patches that correct the value of HKEY_LOCAL_MACHINE\HARDWARE\DESCRIPTION\System\CentralProcessor\*\~MHz so that it reports the calibrated TSC frequency rather than the maximum frequency of the processor.
Some games, such as Horizon Zero Dawn, and most likely some more obscure benchmarking/profiling tools rely on this value, as indicated in this forum post: https://community.osr.com/discussion/288014/how-to-find-out-tsc-frequency
The last comment there also suggests querying the above registry key for the TSC frequency.
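To illustrate why the reported value matters, here is a minimal sketch of the conversion such an application might perform. The helper name `tsc_delta_to_us` and the 3000/4500 MHz figures are hypothetical; they are not taken from any game or from Windows.

```c
#include <stdint.h>

/* Convert a TSC tick delta to microseconds, using the frequency in MHz that
 * the application read from the ~MHz registry value (hypothetical helper). */
static uint64_t tsc_delta_to_us( uint64_t tsc_delta, uint32_t mhz )
{
    if (!mhz) return 0; /* no usable frequency reported */
    return tsc_delta / mhz; /* ticks / (ticks per microsecond) */
}
```

Assuming a TSC that ticks at 3000 MHz, one real second is 3,000,000,000 ticks. With ~MHz set to 3000 the helper yields 1,000,000 µs, but with ~MHz set to an assumed 4500 MHz boost frequency it yields only 666,666 µs, so the application believes less time has passed than really has.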
To my understanding the calibration code has been successfully in use for some time now without any known issues.
I tried to stay as faithful to the original history as possible, while separating my own changes out into their own commits.
If everything should be squashed to be prettier, just let me know!
FYI @rbernon
```
In HKEY_LOCAL_MACHINE\HARDWARE\DESCRIPTION\System\CentralProcessor

Squashed with patches from:

* Arkadiusz Hiler <ahiler@codeweavers.com>

Check if the kernel trusts TSC before using it for Qpc.

Even if the bits claim that the TSC meets our requirements, the hardware implementation may still be broken.

The Linux kernel does a lot of quality testing before deciding to use it as the clock source. If it (or the user, through an override) does not trust the TSC we should not trust it either.

* Joshua Ashton <joshua@froggi.es>

Some games such as Horizon Zero Dawn use this registry value to correlate values from rdtsc to real time.

Testing across a few devices, it seems like Windows always returns the TSC frequency in this entry, not the current/maximum frequency of the processor.

Returning the nominal/maximum cpu frequency here causes the game to run in slow motion as it may not match the tsc frequency of the processor.

Ideally we'd not have to measure this and the kernel would return tsc_khz to userspace, but this is a good enough stop-gap until https://lkml.org/lkml/2020/12/31/72 or something similar is merged.

CW-Bug-Id: #18918
CW-Bug-Id: #18958
```
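The kernel trust check described in the commit message can be sketched natively as follows. This is only an illustration under some assumptions: it uses plain stdio instead of the Win32 calls in the patch, the function names are hypothetical, and it compares the whole clocksource name rather than only its first three bytes.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if the text read from sysfs names exactly the "tsc" clocksource. */
static int clocksource_is_tsc( const char *text )
{
    size_t len = strcspn( text, "\r\n" ); /* ignore the trailing newline */
    return len == 3 && !strncmp( text, "tsc", 3 );
}

/* Returns 1 when the kernel currently uses the TSC as its clocksource, or
 * when the sysfs entry cannot be opened (optimistic default, like the patch). */
static int kernel_trusts_tsc( const char *path )
{
    char buf[64] = {0};
    FILE *file = fopen( path, "r" );
    int ret = 1;

    if (!file) return 1;
    if (fgets( buf, sizeof(buf), file )) ret = clocksource_is_tsc( buf );
    fclose( file );
    return ret;
}
```

On Linux it would be called as `kernel_trusts_tsc( "/sys/bus/clocksource/devices/clocksource0/current_clocksource" )`, which is the Unix path the patch opens through `\??\unix`.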
-- v2: wineboot: Add comment about TSC trust check.
From: Rémi Bernon <rbernon@codeweavers.com>

In HKEY_LOCAL_MACHINE\HARDWARE\DESCRIPTION\System\CentralProcessor

Squashed with patches from:

* Arkadiusz Hiler <ahiler@codeweavers.com>

Check if the kernel trusts TSC before using it for Qpc.

Even if the bits claim that the TSC meets our requirements, the hardware implementation may still be broken.

The Linux kernel does a lot of quality testing before deciding to use it as the clock source. If it (or the user, through an override) does not trust the TSC we should not trust it either.

* Joshua Ashton <joshua@froggi.es>

Some games such as Horizon Zero Dawn use this registry value to correlate values from rdtsc to real time.

Testing across a few devices, it seems like Windows always returns the TSC frequency in this entry, not the current/maximum frequency of the processor.

Returning the nominal/maximum cpu frequency here causes the game to run in slow motion as it may not match the tsc frequency of the processor.

Ideally we'd not have to measure this and the kernel would return tsc_khz to userspace, but this is a good enough stop-gap until https://lkml.org/lkml/2020/12/31/72 or something similar is merged.

CW-Bug-Id: #18918
CW-Bug-Id: #18958
---
 programs/wineboot/wineboot.c | 175 ++++++++++++++++++++++++++++++++++-
 1 file changed, 170 insertions(+), 5 deletions(-)
diff --git a/programs/wineboot/wineboot.c b/programs/wineboot/wineboot.c
index 728c41fffa9..3e8ffcb484a 100644
--- a/programs/wineboot/wineboot.c
+++ b/programs/wineboot/wineboot.c
@@ -82,6 +82,8 @@

 WINE_DEFAULT_DEBUG_CHANNEL(wineboot);

+#define TICKSPERSEC 10000000
+
 extern BOOL shutdown_close_windows( BOOL force );
 extern BOOL shutdown_all_desktops( BOOL force );
 extern void kill_processes( BOOL kill_desktop );
@@ -240,15 +242,173 @@ static void initialize_xstate_features(struct _KUSER_SHARED_DATA *data)
     TRACE("XSAVE feature 2 %#x, %#x, %#x, %#x.\n", regs[0], regs[1], regs[2], regs[3]);
 }

+static UINT64 read_tsc_frequency( BOOL has_rdtscp )
+{
+    UINT64 freq = 0;
+
+/* FIXME: Intel provides TSC freq in some CPUID but it's been slightly broken,
+   fix it properly and test it on real Intel hardware */
+
+#if 0
+    int regs[4], cpuid_level, tmp;
+    UINT64 denom, numer;
+
+    __cpuid( regs, 0 );
+    tmp = regs[2];
+    regs[2] = regs[3];
+    regs[3] = tmp;
+
+    /* only available on some intel CPUs */
+    if (memcmp( regs + 1, "GenuineIntel", 12 )) freq = 0;
+    else if ((cpuid_level = regs[0]) < 0x15) freq = 0;
+    else
+    {
+        __cpuid( regs, 0x15 );
+        if (!(denom = regs[0]) || !(numer = regs[1])) freq = 0;
+        else
+        {
+            if ((freq = regs[2])) freq = freq * numer / denom;
+            else if (cpuid_level >= 0x16)
+            {
+                __cpuid( regs, 0x16 ); /* eax is base freq in MHz */
+                freq = regs[0] * (UINT64)1000000;
+            }
+            else freq = 0;
+        }
+
+        if (!freq) WARN( "Failed to read TSC frequency from CPUID, falling back to calibration.\n" );
+        else TRACE( "TSC frequency read from CPUID, found %I64u Hz\n", freq );
+    }
+#endif
+
+    if (freq == 0)
+    {
+        LONGLONG time0, time1, tsc0, tsc1, tsc2, tsc3, freq0, freq1, error;
+        unsigned int aux;
+        UINT retries = 50;
+        int regs[4];
+
+        do
+        {
+            if (has_rdtscp)
+            {
+                tsc0 = __rdtscp( &aux );
+                time0 = RtlGetSystemTimePrecise();
+                tsc1 = __rdtscp( &aux );
+                Sleep( 1 );
+                tsc2 = __rdtscp( &aux );
+                time1 = RtlGetSystemTimePrecise();
+                tsc3 = __rdtscp( &aux );
+            }
+            else
+            {
+                tsc0 = __rdtsc(); __cpuid( regs, 0 );
+                time0 = RtlGetSystemTimePrecise();
+                tsc1 = __rdtsc(); __cpuid( regs, 0 );
+                Sleep(1);
+                tsc2 = __rdtsc(); __cpuid( regs, 0 );
+                time1 = RtlGetSystemTimePrecise();
+                tsc3 = __rdtsc(); __cpuid( regs, 0 );
+            }
+
+            freq0 = (tsc2 - tsc0) * 10000000 / (time1 - time0);
+            freq1 = (tsc3 - tsc1) * 10000000 / (time1 - time0);
+            error = llabs( (freq1 - freq0) * 1000000 / min( freq1, freq0 ) );
+        }
+        while (error > 100 && --retries);
+
+        if (!retries) WARN( "TSC frequency calibration failed, unstable TSC?\n" );
+        else
+        {
+            freq = (freq0 + freq1) / 2;
+            TRACE( "TSC frequency calibration complete, found %I64u Hz\n", freq );
+        }
+    }
+
+    return freq;
+}
+
+static BOOL is_tsc_trusted_by_the_kernel(void)
+{
+    char buf[4] = {};
+    DWORD num_read;
+    HANDLE handle;
+    BOOL ret = TRUE;
+
+    handle = CreateFileA( "\\??\\unix\\sys\\bus\\clocksource\\devices\\clocksource0\\current_clocksource",
+                          GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, 0, 0 );
+    if (handle == INVALID_HANDLE_VALUE) return TRUE;
+
+    if (ReadFile( handle, buf, sizeof(buf) - 1, &num_read, NULL ) && strcmp( "tsc", buf ))
+        ret = FALSE;
+
+    CloseHandle( handle );
+    return ret;
+}
+
+static void initialize_qpc_features( struct _KUSER_SHARED_DATA *data, UINT64 *tsc_frequency )
+{
+    BOOL has_rdtscp = FALSE;
+    int regs[4];
+
+    data->QpcBypassEnabled = 0;
+    data->QpcFrequency = TICKSPERSEC;
+    data->QpcShift = 0;
+    data->QpcBias = 0;
+    *tsc_frequency = 0;
+
+    if (!is_tsc_trusted_by_the_kernel())
+    {
+        WARN( "Failed to compute TSC frequency, not trusted by the kernel.\n" );
+        return;
+    }
+
+    if (!data->ProcessorFeatures[PF_RDTSC_INSTRUCTION_AVAILABLE])
+    {
+        WARN( "Failed to compute TSC frequency, RDTSC instruction not supported.\n" );
+        return;
+    }
+
+    __cpuid( regs, 0x80000000 );
+    if (regs[0] < 0x80000007)
+    {
+        WARN( "Failed to compute TSC frequency, unable to check invariant TSC.\n" );
+        return;
+    }
+
+    /* check for invariant tsc bit */
+    __cpuid( regs, 0x80000007 );
+    if (!(regs[3] & (1 << 8)))
+    {
+        WARN( "Failed to compute TSC frequency, no invariant TSC.\n" );
+        return;
+    }
+
+    /* check for rdtscp support bit */
+    __cpuid( regs, 0x80000001 );
+    if ((regs[3] & (1 << 27))) has_rdtscp = TRUE;
+
+    *tsc_frequency = read_tsc_frequency( has_rdtscp );
+}
+
 #else

 static void initialize_xstate_features(struct _KUSER_SHARED_DATA *data)
 {
 }

+static void initialize_qpc_features( struct _KUSER_SHARED_DATA *data, UINT64 *tsc_frequency )
+{
+    data->QpcBypassEnabled = 0;
+    data->QpcFrequency = TICKSPERSEC;
+    data->QpcShift = 0;
+    data->QpcBias = 0;
+    *tsc_frequency = 0;
+}
+
 #endif

-static void create_user_shared_data(void)
+static void create_user_shared_data( UINT64 *tsc_frequency )
 {
     struct _KUSER_SHARED_DATA *data;
     RTL_OSVERSIONINFOEXW version;
@@ -367,6 +527,7 @@ static void create_user_shared_data(void)
     data->ActiveGroupCount = 1;

     initialize_xstate_features( data );
+    initialize_qpc_features( data, tsc_frequency );

     UnmapViewOfFile( data );
 }
@@ -659,7 +820,7 @@ done:
 }

 /* create the volatile hardware registry keys */
-static void create_hardware_registry_keys(void)
+static void create_hardware_registry_keys( UINT64 tsc_frequency )
 {
     unsigned int i;
     HKEY hkey, system_key, cpu_key, fpu_key;
@@ -736,13 +897,16 @@ static void create_hardware_registry_keys(void)
         if (!RegCreateKeyExW( cpu_key, numW, 0, NULL, REG_OPTION_VOLATILE,
                               KEY_ALL_ACCESS, NULL, &hkey, NULL ))
         {
+            DWORD tsc_freq_mhz = (DWORD)(tsc_frequency / 1000000ull); /* Hz -> Mhz */
+            if (!tsc_freq_mhz) tsc_freq_mhz = power_info[i].MaxMhz;
+
             RegSetValueExW( hkey, L"FeatureSet", 0, REG_DWORD, (BYTE *)&sci.ProcessorFeatureBits, sizeof(DWORD) );
             set_reg_value( hkey, L"Identifier", id );
             /* TODO: report ARM properly */
             RegSetValueExA( hkey, "ProcessorNameString", 0, REG_SZ, (const BYTE *)name_buffer,
                             strlen( (char *)name_buffer ) + 1 );
             set_reg_value( hkey, L"VendorIdentifier", vendorid );
-            RegSetValueExW( hkey, L"~MHz", 0, REG_DWORD, (BYTE *)&power_info[i].MaxMhz, sizeof(DWORD) );
+            RegSetValueExW( hkey, L"~MHz", 0, REG_DWORD, (BYTE *)&tsc_freq_mhz, sizeof(DWORD) );
             RegCloseKey( hkey );
         }
         if (sci.ProcessorArchitecture != PROCESSOR_ARCHITECTURE_ARM &&
@@ -1627,6 +1791,7 @@ int __cdecl main( int argc, char *argv[] )
     BOOL end_session, force, init, kill, restart, shutdown, update;
     HANDLE event;
     OBJECT_ATTRIBUTES attr;
+    UINT64 tsc_frequency = 0;
     UNICODE_STRING nameW = RTL_CONSTANT_STRING( L"\\KernelObjects\\__wineboot_event" );
     BOOL is_wow64;

@@ -1712,8 +1877,8 @@ int __cdecl main( int argc, char *argv[] )

     ResetEvent( event ); /* in case this is a restart */

-    create_user_shared_data();
-    create_hardware_registry_keys();
+    create_user_shared_data( &tsc_frequency );
+    create_hardware_registry_keys( tsc_frequency );
     create_dynamic_registry_keys();
     create_environment_registry_keys();
     create_computer_name_keys();

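For readers following the CPUID checks and the ~MHz fallback in the patch above, the decision logic can be sketched as pure functions over register values. This is only an illustration: the helper names are hypothetical, and the register values are passed in rather than read with `__cpuid`, so the sketch does not depend on the host CPU.

```c
#include <stdint.h>

/* CPUID leaf 0x80000007, EDX bit 8: invariant TSC. The leaf can only be
 * queried if the maximum extended leaf (CPUID 0x80000000, EAX) reaches it. */
static int has_invariant_tsc( uint32_t max_ext_leaf, uint32_t edx_80000007 )
{
    if (max_ext_leaf < 0x80000007u) return 0;
    return (edx_80000007 >> 8) & 1;
}

/* CPUID leaf 0x80000001, EDX bit 27: RDTSCP instruction available. */
static int has_rdtscp_bit( uint32_t edx_80000001 )
{
    return (edx_80000001 >> 27) & 1;
}

/* Value written to ~MHz: the calibrated TSC frequency in MHz, or the
 * maximum CPU frequency when calibration produced nothing (tsc_hz == 0). */
static uint32_t mhz_registry_value( uint64_t tsc_hz, uint32_t max_mhz )
{
    uint32_t mhz = (uint32_t)(tsc_hz / 1000000u);
    return mhz ? mhz : max_mhz;
}
```

For instance, assuming a calibrated frequency of 2,995,200,000 Hz and a 4500 MHz maximum, `mhz_registry_value` gives 2995; if calibration was skipped or failed, it falls back to 4500.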
From: Rémi Bernon <rbernon@codeweavers.com>

---
 programs/wineboot/wineboot.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/programs/wineboot/wineboot.c b/programs/wineboot/wineboot.c
index 3e8ffcb484a..c6a4f1be358 100644
--- a/programs/wineboot/wineboot.c
+++ b/programs/wineboot/wineboot.c
@@ -315,7 +315,7 @@ static UINT64 read_tsc_frequency( BOOL has_rdtscp )
             freq1 = (tsc3 - tsc1) * 10000000 / (time1 - time0);
             error = llabs( (freq1 - freq0) * 1000000 / min( freq1, freq0 ) );
         }
-        while (error > 100 && --retries);
+        while (error > 500 && --retries);

         if (!retries) WARN( "TSC frequency calibration failed, unstable TSC?\n" );
         else

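To make the effect of the threshold change from 100 to 500 concrete, the calibration arithmetic can be sketched as a standalone function. This is a hedged illustration: the function name and the sample tick counts are hypothetical, and it assumes `time1 > time0` and non-zero frequencies, like the code in the patch.

```c
#include <stdlib.h>

#define TICKS_PER_SEC 10000000LL /* system time is in 100 ns units */

/* Computes two frequency estimates from the interleaved TSC and system time
 * samples, stores their mean in *freq and returns their relative difference
 * in parts per million. */
static long long calibration_error_ppm( long long tsc0, long long tsc1,
                                        long long tsc2, long long tsc3,
                                        long long time0, long long time1,
                                        long long *freq )
{
    long long freq0 = (tsc2 - tsc0) * TICKS_PER_SEC / (time1 - time0);
    long long freq1 = (tsc3 - tsc1) * TICKS_PER_SEC / (time1 - time0);
    long long low = freq0 < freq1 ? freq0 : freq1;

    *freq = (freq0 + freq1) / 2;
    return llabs( (freq1 - freq0) * 1000000 / low );
}
```

With an assumed 1 ms interval (`time1 - time0 = 10000`) and samples `tsc0 = 0`, `tsc1 = 600`, `tsc2 = 3000300`, `tsc3 = 3001500`, the two estimates are 3,000,300,000 Hz and 3,000,900,000 Hz, which gives an error of 199 ppm. Such a measurement would be retried with the old limit of 100 and accepted with the new limit of 500.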
From: Rémi Bernon <rbernon@codeweavers.com>

---
 programs/wineboot/wineboot.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/programs/wineboot/wineboot.c b/programs/wineboot/wineboot.c
index c6a4f1be358..e7869eb64f4 100644
--- a/programs/wineboot/wineboot.c
+++ b/programs/wineboot/wineboot.c
@@ -317,7 +317,13 @@ static UINT64 read_tsc_frequency( BOOL has_rdtscp )
         }
         while (error > 500 && --retries);

-        if (!retries) WARN( "TSC frequency calibration failed, unstable TSC?\n" );
+        if (!retries)
+        {
+            FIXME( "TSC frequency calibration failed, unstable TSC?");
+            FIXME( "time0 %I64u ns, time1 %I64u ns\n", time0 * 100, time1 * 100 );
+            FIXME( "tsc2 - tsc0 %I64u, tsc3 - tsc1 %I64u\n", tsc2 - tsc0, tsc3 - tsc1 );
+            FIXME( "freq0 %I64u Hz, freq2 %I64u Hz, error %I64u ppm\n", freq0, freq1, error );
+        }
         else
         {
             freq = (freq0 + freq1) / 2;

From: Marc-Aurel Zent <mzent@codeweavers.com>

---
 programs/wineboot/wineboot.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/programs/wineboot/wineboot.c b/programs/wineboot/wineboot.c
index a19046a6439..c1ac7c07d38 100644
--- a/programs/wineboot/wineboot.c
+++ b/programs/wineboot/wineboot.c
@@ -302,6 +302,10 @@ static BOOL is_tsc_trusted_by_the_kernel(void)
     HANDLE handle;
     BOOL ret = TRUE;

+/* Darwin for x86-64 uses the TSC internally for timekeeping, so it can always
+ * be trusted.
+ * For BSDs there seems to be no unified interface to query TSC quality.
+ * If there is a sysfs entry with clocksource information, use it to check though.*/
     handle = CreateFileA( "\\??\\unix\\sys\\bus\\clocksource\\devices\\clocksource0\\current_clocksource",
                           GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, 0, 0 );
     if (handle == INVALID_HANDLE_VALUE) return TRUE;

From: Marc-Aurel Zent <mzent@codeweavers.com>

---
 programs/wineboot/wineboot.c | 111 ++++++++++++-----------------------
 1 file changed, 36 insertions(+), 75 deletions(-)

diff --git a/programs/wineboot/wineboot.c b/programs/wineboot/wineboot.c
index e7869eb64f4..a19046a6439 100644
--- a/programs/wineboot/wineboot.c
+++ b/programs/wineboot/wineboot.c
@@ -245,90 +245,51 @@ static void initialize_xstate_features(struct _KUSER_SHARED_DATA *data)
 static UINT64 read_tsc_frequency( BOOL has_rdtscp )
 {
     UINT64 freq = 0;
+    LONGLONG time0, time1, tsc0, tsc1, tsc2, tsc3, freq0, freq1, error;
+    unsigned int aux;
+    UINT retries = 50;
+    int regs[4];

-/* FIXME: Intel provides TSC freq in some CPUID but it's been slightly broken,
-   fix it properly and test it on real Intel hardware */
-
-#if 0
-    int regs[4], cpuid_level, tmp;
-    UINT64 denom, numer;
-
-    __cpuid( regs, 0 );
-    tmp = regs[2];
-    regs[2] = regs[3];
-    regs[3] = tmp;
-
-    /* only available on some intel CPUs */
-    if (memcmp( regs + 1, "GenuineIntel", 12 )) freq = 0;
-    else if ((cpuid_level = regs[0]) < 0x15) freq = 0;
-    else
+    do
     {
-        __cpuid( regs, 0x15 );
-        if (!(denom = regs[0]) || !(numer = regs[1])) freq = 0;
+        if (has_rdtscp)
+        {
+            tsc0 = __rdtscp( &aux );
+            time0 = RtlGetSystemTimePrecise();
+            tsc1 = __rdtscp( &aux );
+            Sleep( 1 );
+            tsc2 = __rdtscp( &aux );
+            time1 = RtlGetSystemTimePrecise();
+            tsc3 = __rdtscp( &aux );
+        }
         else
         {
-            if ((freq = regs[2])) freq = freq * numer / denom;
-            else if (cpuid_level >= 0x16)
-            {
-                __cpuid( regs, 0x16 ); /* eax is base freq in MHz */
-                freq = regs[0] * (UINT64)1000000;
-            }
-            else freq = 0;
+            tsc0 = __rdtsc(); __cpuid( regs, 0 );
+            time0 = RtlGetSystemTimePrecise();
+            tsc1 = __rdtsc(); __cpuid( regs, 0 );
+            Sleep(1);
+            tsc2 = __rdtsc(); __cpuid( regs, 0 );
+            time1 = RtlGetSystemTimePrecise();
+            tsc3 = __rdtsc(); __cpuid( regs, 0 );
         }

-        if (!freq) WARN( "Failed to read TSC frequency from CPUID, falling back to calibration.\n" );
-        else TRACE( "TSC frequency read from CPUID, found %I64u Hz\n", freq );
+        freq0 = (tsc2 - tsc0) * 10000000 / (time1 - time0);
+        freq1 = (tsc3 - tsc1) * 10000000 / (time1 - time0);
+        error = llabs( (freq1 - freq0) * 1000000 / min( freq1, freq0 ) );
     }
-#endif
+    while (error > 500 && --retries);

-    if (freq == 0)
+    if (!retries)
     {
-        LONGLONG time0, time1, tsc0, tsc1, tsc2, tsc3, freq0, freq1, error;
-        unsigned int aux;
-        UINT retries = 50;
-        int regs[4];
-
-        do
-        {
-            if (has_rdtscp)
-            {
-                tsc0 = __rdtscp( &aux );
-                time0 = RtlGetSystemTimePrecise();
-                tsc1 = __rdtscp( &aux );
-                Sleep( 1 );
-                tsc2 = __rdtscp( &aux );
-                time1 = RtlGetSystemTimePrecise();
-                tsc3 = __rdtscp( &aux );
-            }
-            else
-            {
-                tsc0 = __rdtsc(); __cpuid( regs, 0 );
-                time0 = RtlGetSystemTimePrecise();
-                tsc1 = __rdtsc(); __cpuid( regs, 0 );
-                Sleep(1);
-                tsc2 = __rdtsc(); __cpuid( regs, 0 );
-                time1 = RtlGetSystemTimePrecise();
-                tsc3 = __rdtsc(); __cpuid( regs, 0 );
-            }
-
-            freq0 = (tsc2 - tsc0) * 10000000 / (time1 - time0);
-            freq1 = (tsc3 - tsc1) * 10000000 / (time1 - time0);
-            error = llabs( (freq1 - freq0) * 1000000 / min( freq1, freq0 ) );
-        }
-        while (error > 500 && --retries);
-
-        if (!retries)
-        {
-            FIXME( "TSC frequency calibration failed, unstable TSC?");
-            FIXME( "time0 %I64u ns, time1 %I64u ns\n", time0 * 100, time1 * 100 );
-            FIXME( "tsc2 - tsc0 %I64u, tsc3 - tsc1 %I64u\n", tsc2 - tsc0, tsc3 - tsc1 );
-            FIXME( "freq0 %I64u Hz, freq2 %I64u Hz, error %I64u ppm\n", freq0, freq1, error );
-        }
-        else
-        {
-            freq = (freq0 + freq1) / 2;
-            TRACE( "TSC frequency calibration complete, found %I64u Hz\n", freq );
-        }
+        FIXME( "TSC frequency calibration failed, unstable TSC?");
+        FIXME( "time0 %I64u ns, time1 %I64u ns\n", time0 * 100, time1 * 100 );
+        FIXME( "tsc2 - tsc0 %I64u, tsc3 - tsc1 %I64u\n", tsc2 - tsc0, tsc3 - tsc1 );
+        FIXME( "freq0 %I64u Hz, freq2 %I64u Hz, error %I64u ppm\n", freq0, freq1, error );
+    }
+    else
+    {
+        freq = (freq0 + freq1) / 2;
+        TRACE( "TSC frequency calibration complete, found %I64u Hz\n", freq );
     }

     return freq;

Yes, we don't want dead code in any form, so 9a637ad53c271712654ddbc938f61df9e72277d3 should be squashed (together with 912deb992cf6499a7e41a3f08da48808b8be0955) into 4b6a8a76d6dabb1ba8332d0a973b3ed530f9717a. You should also edit the commit message to something like this (make it more succinct and drop the Proton-specific tags):
```
In HKEY_LOCAL_MACHINE\HARDWARE\DESCRIPTION\System\CentralProcessor

Some games such as Horizon Zero Dawn use this registry value to correlate values from rdtsc to real time.

Returning the nominal/maximum cpu frequency here causes the game to run in slow motion as it may not match the tsc frequency of the processor.

Based on patches from Arkadiusz Hiler and Joshua Ashton.
```
d793836654bc9037418914531e9b793be9afb0d5 should be dropped; it was only there to help debug an issue with a custom CPUfreq governor.