This reverts commit 5c8ea25014f ("ntdll: Use CLOCK_REALTIME_COARSE for NtQuerySystemTime() if it has sufficient resolution.")
CLOCK_*_COARSE only provides up to 1ms resolution at CONFIG_HZ=1000. OTOH, there are several ways to get up to 0.5ms resolution on modern Windows (high resolution waitable timers, NtSetTimerResolution with 0.5ms). This code path therefore has a possibility of behaving worse than native.
Since COARSE resolution is HZ dependent, this code path only runs if the kernel is configured with CONFIG_HZ=1000. Most distro ships does not ship with this. Therefore, this code path is rarely tested, and is more of a recipe for surprise. If any application rely on fast NtQuerySystemTime they are likely already broken for majority of Wine users.
Given the above reason, don't use CLOCK_REALTIME_COARSE. Use gettimeofday which is internally hooked to CLOCK_REALTIME.
-- v3: ntdll: Don't use CLOCK_REALTIME_COARSE
From: Tatsuyuki Ishi ishitatsuyuki@gmail.com
This reverts commit 5c8ea25014f ("ntdll: Use CLOCK_REALTIME_COARSE for NtQuerySystemTime() if it has sufficient resolution.")
CLOCK_*_COARSE only provides up to 1ms resolution at CONFIG_HZ=1000. OTOH, there are several ways to get up to 0.5ms resolution on modern Windows (high resolution waitable timers, NtSetTimerResolution with 0.5ms). This code path therefore has a possibility of behaving worse than native.
Since COARSE resolution is HZ dependent, this code path only runs if the kernel is configured with CONFIG_HZ=1000. Most distro ships does not ship with this. Therefore, this code path is rarely tested, and is more of a recipe for surprise. If any application rely on fast NtQuerySystemTime they are likely already broken for majority of Wine users.
Given the above reason, don't use CLOCK_REALTIME_COARSE. Use gettimeofday which is internally hooked to CLOCK_REALTIME. --- dlls/ntdll/unix/sync.c | 31 +++---------------------------- 1 file changed, 3 insertions(+), 28 deletions(-)
diff --git a/dlls/ntdll/unix/sync.c b/dlls/ntdll/unix/sync.c index 21e66f01875..19dee0ac6df 100644 --- a/dlls/ntdll/unix/sync.c +++ b/dlls/ntdll/unix/sync.c @@ -1698,35 +1698,10 @@ NTSTATUS WINAPI NtQueryPerformanceCounter( LARGE_INTEGER *counter, LARGE_INTEGER */ NTSTATUS WINAPI NtQuerySystemTime( LARGE_INTEGER *time ) { -#ifdef HAVE_CLOCK_GETTIME - struct timespec ts; - static clockid_t clock_id = CLOCK_MONOTONIC; /* placeholder */ - - if (clock_id == CLOCK_MONOTONIC) - { -#ifdef CLOCK_REALTIME_COARSE - struct timespec res; - - /* Use CLOCK_REALTIME_COARSE if it has 1 ms or better resolution */ - if (!clock_getres( CLOCK_REALTIME_COARSE, &res ) && res.tv_sec == 0 && res.tv_nsec <= 1000000) - clock_id = CLOCK_REALTIME_COARSE; - else -#endif /* CLOCK_REALTIME_COARSE */ - clock_id = CLOCK_REALTIME; - } - - if (!clock_gettime( clock_id, &ts )) - { - time->QuadPart = ticks_from_time_t( ts.tv_sec ) + (ts.tv_nsec + 50) / 100; - } - else -#endif /* HAVE_CLOCK_GETTIME */ - { - struct timeval now; + struct timeval now;
- gettimeofday( &now, 0 ); - time->QuadPart = ticks_from_time_t( now.tv_sec ) + now.tv_usec * 10; - } + gettimeofday( &now, 0 ); + time->QuadPart = ticks_from_time_t( now.tv_sec ) + now.tv_usec * 10; return STATUS_SUCCESS; }
Hi,
It looks like your patch introduced the new failures shown below. Please investigate and fix them before resubmitting your patch. If they are not new, fixing them anyway would help a lot. Otherwise please ask for the known failures list to be updated.
The tests also ran into some preexisting test failures. If you know how to fix them that would be helpful. See the TestBot job for the details:
The full results can be found at: https://testbot.winehq.org/JobDetails.pl?Key=147018
Your paranoid android.
=== debian11b (64 bit WoW report) ===
user32: input.c:5867: Test failed: got pos (49,51) input.c:4305: Test succeeded inside todo block: button_down_hwnd_todo 1: got MSG_TEST_WIN hwnd 00000000010400DC, msg WM_LBUTTONDOWN, wparam 0x1, lparam 0x320032
On Thu Jul 11 13:36:55 2024 +0000, Tatsuyuki Ishi wrote:
I know about Unixlibs and I'm not using them because they are a PITA to distribute out-of-tree due to Steam runtime compatibility. Also, this is getting off-topic.
Sorry for sounding Captain Obvious but depending on the Wine behaviour different from Windows is never a good idea. We didn't yet encounter an app which is confirmed depending on Windows sleep granularity, but if we will I personally don't see why won't we change this behaviour, even if it is not immediately trivial to do and will first require visiting some places in the code which might depend on that. There has to be a better way.
That's not related to this MR, I didn't verify but I don't know that we really need that coarse timer.
BTW did you try experimenting with waitable timers? Maybe (didn't test) waiting for them will yield a better time granularity on Windows.
On Thu Jul 11 14:48:59 2024 +0000, Paul Gofman wrote:
Sorry for sounding Captain Obvious but depending on the Wine behaviour different from Windows is never a good idea. We didn't yet encounter an app which is confirmed depending on Windows sleep granularity, but if we will I personally don't see why won't we change this behaviour, even if it is not immediately trivial to do and will first require visiting some places in the code which might depend on that. There has to be a better way. That's not related to this MR, I didn't verify but I don't know that we really need that coarse timer. BTW did you try experimenting with waitable timers? Maybe (didn't test) waiting for them will yield a better time granularity on Windows.
Yeah, I’m aware that detecting Wine isn’t a good approach in general. I’ll probably move to a Unixlib just for sleeping (kinda silly but that’s the most reliable).
I already use waitable timers on Windows and they give 0.5ms precision from my tests. There are also other sources on the internet confirming the same thing [1](https://groups.google.com/a/chromium.org/g/scheduler-dev/c/0GlSPYreJeY?pli=1).
On Thu Jul 11 15:09:15 2024 +0000, Tatsuyuki Ishi wrote:
Yeah, I’m aware that detecting Wine isn’t a good approach in general. I’ll probably move to a Unixlib just for sleeping (kinda silly but that’s the most reliable). I already use waitable timers on Windows and they give 0.5ms precision from my tests. There are also other sources on the internet confirming the same thing [1](https://groups.google.com/a/chromium.org/g/scheduler-dev/c/0GlSPYreJeY?pli=1).
@ishitatsuyuki I looked at the `NtDelayExecution` code, we can easily get rid of calling `NtQuerySystemTime`. Probably in Windows there is no indication anywhere that we must call `NtQuerySystemTime` in `NtDelayExecution`. I think we can use `CLOCK_MONOTONIC` there, if its accuracy is not enough, then try using `CLOCK_PROCESS_CPUTIME_ID`.
On Thu Jul 11 15:32:39 2024 +0000, Grigory Vasilyev wrote:
@ishitatsuyuki I looked at the `NtDelayExecution` code, we can easily get rid of calling `NtQuerySystemTime`. Probably in Windows there is no indication anywhere that we must call `NtQuerySystemTime` in `NtDelayExecution`. I think we can use `CLOCK_MONOTONIC` there, if its accuracy is not enough, then try using `CLOCK_PROCESS_CPUTIME_ID`.
Feel free to write a patch if you want to, but it really just has nothing to do with the precision problem and is out of scope of this MR.
On Thu Jul 11 15:36:50 2024 +0000, Tatsuyuki Ishi wrote:
Feel free to write a patch if you want to, but it really just has nothing to do with the precision problem and is out of scope of this MR.
@ishitatsuyuki From your description, this problem is more complex than the NtQuerySystemTime problem. No one will accept this MR in this form. Therefore, I advise you to provide test code and open a bug report so that someone more experienced can solve this problem. I don't know how frame limiting works in Windows, so it would be nice to see more code examples from open source game engines.
On Thu Jul 11 16:02:17 2024 +0000, Grigory Vasilyev wrote:
@ishitatsuyuki From your description, this problem is more complex than the NtQuerySystemTime problem. No one will accept this MR in this form. Therefore, I advise you to provide test code and open a bug report so that someone more experienced can solve this problem. I don't know how frame limiting works in Windows, so it would be nice to see more code examples from open source game engines.
From your actions I really can’t tell if you are just trolling and wasting everyone’s time on purpose.
On Thu Jul 11 16:10:27 2024 +0000, Tatsuyuki Ishi wrote:
From your actions I really can’t tell if you are just trolling and wasting everyone’s time on purpose.
@ishitatsuyuki I'll quote youPaul Gofman:
``` We didn't yet encounter an app which is confirmed depending on Windows sleep granularity ```
Provide code examples, or are you hiding secret malware developments?
``` NtDelayExecution is used to suspend execution, similiar to the Sleep() API function. This function can be used by malware for evasion purposes. ```
On Thu Jul 11 16:27:08 2024 +0000, Grigory Vasilyev wrote:
@ishitatsuyuki I'll quote youPaul Gofman:
We didn't yet encounter an app which is confirmed depending on Windows sleep granularity
Provide code examples, or are you hiding secret malware developments?
NtDelayExecution is used to suspend execution, similiar to the Sleep() API function. This function can be used by malware for evasion purposes.
I've already resolved this offline with Gofman because having a productive discussion with a confrontational, overconfident and unprofessional "academic" is just impossible.
Just leave it to Huw for review, the description already says it all.
On Thu Jul 11 16:27:08 2024 +0000, Tatsuyuki Ishi wrote:
I've already resolved this offline with Gofman because having a productive discussion with a confrontational, overconfident and unprofessional "academic" is just impossible. Just leave it to Huw for review, the description already says it all.
@ishitatsuyuki I'll try to explain to you, you suggest using `gettimeofday`, which was abandoned in the commit you want to revert. But calling `gettimeofday` may take significantly longer and on older kernels the call may not be wrapped in a `vDSO (virtual dynamic shared object)`. Also, the current behavior is to use `CLOCK_REALTIME_COARSE` in `NtDelayExecution`, I think it’s a bad idea because we can get time changes in `CLOCK_REALTIME_COARSE`, from here our timer can be offset in a non-predictable way.
``` This will only affect users running with HZ=1000. On an i7-8700 CPU @ 3.20GHz it cuts the call cost from ~30ns to ~12ns. ```
BTW, one of the reason I want this is because I want count on NtDelayExecution being accurate on Wine (but not on Windows). I'm developing a frame limiter specific for Proton use and that allows me to accurately sleep without all the redundant spinning to compensate for native Windows' poor timer accuracy.
Eh, in general, that should be a red flag already. But it also contradicts the description where you said:
CLOCK_*_COARSE only provides up to 1ms resolution at CONFIG_HZ=1000. OTOH, there are several ways to get up to 0.5ms resolution on modern Windows (high resolution waitable timers, NtSetTimerResolution with 0.5ms). This code path therefore has a possibility of behaving worse than native.
So which one is it?