Signed-off-by: Rémi Bernon <rbernon@codeweavers.com>
---
These are some tests to validate that RtlQueryPerformanceCounter should be able to bypass the NtQueryPerformanceCounter syscall and be optimised with rdtsc(p). The XSTATE save and restore makes the syscall much slower than it was before, and some applications are now burning CPU calling QPC in a tight loop.
It also looks like there is a new shared page since w10v1809, and it is pretty much undocumented. On previous versions the bypass is not always enabled; for instance, it is not enabled on the testbot VMs. I was able to test it in a local VM with w10v1511, where it only seems to use the USD values.
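For reference, the tick computation touched by the diff below can be sketched in isolation. This is only an illustration, not Wine code: the helper names `ticks_to_ms64` and `ticks_to_ms32` are hypothetical, and the multiplier value used in the note is an assumed one. It shows why the result needs a 32-bit truncation before it can be compared against GetTickCount(), which wraps at 32 bits.

```c
#include <stdint.h>

/* Hypothetical helper (not in Wine): scale a raw tick counter by the
 * 8.24 fixed-point multiplier, keeping the full 64-bit result. */
static uint64_t ticks_to_ms64(uint64_t tick_count, uint32_t multiplier)
{
    return (tick_count * multiplier) >> 24;
}

/* Hypothetical helper (not in Wine): same computation, truncated to
 * 32 bits so that it wraps the same way a DWORD tick count does. */
static uint32_t ticks_to_ms32(uint64_t tick_count, uint32_t multiplier)
{
    return (uint32_t)((tick_count * multiplier) >> 24);
}
```

With an assumed multiplier of `1 << 24` (one tick per millisecond) and a tick count of `0x100000005`, the 64-bit result is `0x100000005` while the truncated one is `5`, which is what a wrapped 32-bit tick count would report.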
 dlls/ntdll/tests/time.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/dlls/ntdll/tests/time.c b/dlls/ntdll/tests/time.c
index d756a8c839c..931bf765aa7 100644
--- a/dlls/ntdll/tests/time.c
+++ b/dlls/ntdll/tests/time.c
@@ -206,9 +206,9 @@ static void test_user_shared_data_time(void)
     {
         t1 = GetTickCount();
         if (user_shared_data->NtMajorVersion <= 5 && user_shared_data->NtMinorVersion <= 1)
-            t2 = (*(volatile ULONG*)&user_shared_data->TickCountLowDeprecated * (ULONG64)user_shared_data->TickCountMultiplier) >> 24;
+            t2 = (DWORD)((*(volatile ULONG*)&user_shared_data->TickCountLowDeprecated * (ULONG64)user_shared_data->TickCountMultiplier) >> 24);
         else
-            t2 = (read_ksystem_time(&user_shared_data->u.TickCount) * user_shared_data->TickCountMultiplier) >> 24;
+            t2 = (DWORD)((read_ksystem_time(&user_shared_data->u.TickCount) * user_shared_data->TickCountMultiplier) >> 24);
         t3 = GetTickCount();
     } while(t3 < t1 && i++ < 1); /* allow for wrap, but only once */