On Tue, 2005-01-25 at 14:05, wine-devel-request@winehq.org wrote:
Message: 1
Date: Tue, 25 Jan 2005 19:30:04 +0100
From: Andreas Mohr andi@rhlx01.fht-esslingen.de
To: Rein Klazes wijn@wanadoo.nl
Cc: Lionel Ulmer lionel.ulmer@free.fr, wine-patches@winehq.com, wine-devel@winehq.org
Subject: Re: PerformanceCounterFrequency fix.
Hi,
On Tue, Jan 25, 2005 at 06:44:04PM +0100, Rein Klazes wrote:
On Mon, 24 Jan 2005 15:08:56 +0100, you wrote:
How bad is it to use the gettimeofday() method?
In my opinion, the RDTSC method should be removed from the code and we should always use the 'gettimeofday' method (despite the penalty of a syscall).
I was more concerned about the accuracy of gettimeofday (not incrementing in usec's). So I did a small test and I find it behaves very nicely.
That was the only reason I could see to justify the rdtsc method, so here it goes. As the cpuHz variable is not used anymore, we might as well move it to ntdll.
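For readers following the thread, here is a minimal sketch of what a counter built on gettimeofday() could look like. It is not the patch under discussion: the names sketch_timeval_to_ticks and sketch_query_counter and the 1 MHz tick rate are assumptions chosen for illustration, picked only because struct timeval already has microsecond resolution.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical tick rate: one tick per microsecond (1 MHz). This is an
 * assumption for the sketch, not the frequency the patch reports. */
#define SKETCH_TICKS_PER_SEC 1000000LL

/* Convert a timeval into ticks of the hypothetical 1 MHz counter. */
static int64_t sketch_timeval_to_ticks(const struct timeval *tv)
{
    assert(tv != NULL);
    return (int64_t)tv->tv_sec * SKETCH_TICKS_PER_SEC + (int64_t)tv->tv_usec;
}

/* Read the counter. Returns 0 on success, -1 if gettimeofday() fails. */
static int sketch_query_counter(int64_t *count)
{
    struct timeval tv;

    assert(count != NULL);
    if (gettimeofday(&tv, NULL) != 0)
        return -1;
    *count = sketch_timeval_to_ticks(&tv);
    return 0;
}
```

One design consequence of this approach: gettimeofday() reports wall-clock time, so a counter derived from it can jump if the system clock is adjusted, which a cycle counter would not do.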
(sorry for not replying earlier - no time :-) I'm not sure why you'd want to base it on gettimeofday(). This is a terrible idea IMHO. I'm quite certain that many programs use that function for extremely time critical code (games, anyone??), and that thus the Windows function is equally highly optimized, certainly much less slow than a gettimeofday() call.
This should remain based on rdtsc IMHO, or on equally suitable and fast methods (ACPI counter, ...).
Or did you actually test it with programs calling it a large number of times, or test its performance behaviour on Windows?
Andreas Mohr
Our system uses the performance counter all the time in extremely time critical code. If the call is anything more than a few cycles with absolutely no chance of blocking, we are hosed!
The application is an embedded audio plugin player. The audio is processed with SCHED_FIFO and needs to be as deterministic and fast as possible.
I hope this fix/change doesn't jeopardize our product's use of Wine...
mo
===================================
Michael Ost, Software Architect
Muse Research, Inc.
most@museresearch.com
On 25 Jan 2005 14:22:40 -0800, you wrote:
I'm quite certain that many programs use that function for extremely time critical code (games, anyone??), and that thus the Windows function is equally highly optimized, certainly much less slow than a gettimeofday() call.
This should remain based on rdtsc IMHO, or on equally suitable and fast methods (ACPI counter, ...).
Or did you actually test it with programs calling it a large number of times, or test its performance behaviour on Windows?
Andreas Mohr
Our system uses the performance counter all the time in extremely time critical code. If the call is anything more than a few cycles with absolutely no chance of blocking, we are hosed!
The application is an embedded audio plugin player. The audio is processed with SCHED_FIFO and needs to be as deterministic and fast as possible.
I hope this fix/change doesn't jeopardize our product's use of Wine...
You are of course in an excellent position to quantify better than "extremely time critical" and "few cycles". Just try the patch and tell us when you are "hosed".
I have done a few further tests. A loop like:
for(i=0;i<10000000;i++) QueryPerformanceCounter( &count);
takes 45 seconds under Windows 2k on some hardware. Under Wine on the exact same hardware it takes 13 seconds with the patch.
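Divided over the 10,000,000 iterations, those totals come to roughly 4.5 microseconds per call on Windows 2k and 1.3 microseconds per call under Wine. A comparable measurement of the underlying gettimeofday() call on Linux might be sketched as follows. This is not the test program used in the thread; the function names are hypothetical, and timing the loop with gettimeofday() itself is a simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Elapsed wall-clock seconds between two timeval readings. */
static double sketch_elapsed_seconds(const struct timeval *start,
                                     const struct timeval *end)
{
    return (double)(end->tv_sec - start->tv_sec)
         + (double)(end->tv_usec - start->tv_usec) / 1e6;
}

/* Call gettimeofday() `iterations` times and return the average cost of
 * one call in nanoseconds. */
static double sketch_average_call_ns(long iterations)
{
    struct timeval start, end, scratch;
    long i;

    assert(iterations > 0);
    gettimeofday(&start, NULL);
    for (i = 0; i < iterations; i++)
        gettimeofday(&scratch, NULL);   /* the call being measured */
    gettimeofday(&end, NULL);

    return sketch_elapsed_seconds(&start, &end) * 1e9 / (double)iterations;
}
```

Calling sketch_average_call_ns(10000000) mirrors the iteration count of the loop above. The result is an average only, so it says nothing about the slowest individual call.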
If you do a series of QueryPerformanceCounter:
QueryPerformanceCounter( carr[0]);
QueryPerformanceCounter( carr[1]);
QueryPerformanceCounter( carr[2]);
QueryPerformanceCounter( carr[3]);
QueryPerformanceCounter( carr[4]);
...
and print the results, I see on Windows that the counter increments 4 or 5 steps between the calls. Under Wine it is 1 or 2.
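The step-size observation can be reproduced in the same spirit on Linux by taking back-to-back readings and looking at the differences between neighbours. The sketch below reads gettimeofday() directly rather than QueryPerformanceCounter, so its deltas are in microseconds rather than counter ticks; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Take `n` back-to-back readings, each in microseconds since the epoch. */
static void sketch_sample_counter(int64_t *samples, size_t n)
{
    struct timeval tv;
    size_t i;

    assert(samples != NULL);
    for (i = 0; i < n; i++) {
        gettimeofday(&tv, NULL);
        samples[i] = (int64_t)tv.tv_sec * 1000000 + tv.tv_usec;
    }
}

/* Store the n-1 differences between neighbouring readings in `deltas`. */
static void sketch_deltas(const int64_t *samples, size_t n, int64_t *deltas)
{
    size_t i;

    assert(samples != NULL && deltas != NULL);
    for (i = 1; i < n; i++)
        deltas[i - 1] = samples[i] - samples[i - 1];
}
```

Sampling first and computing the deltas afterwards keeps printing and arithmetic out of the timed section, so the gaps reflect only the cost of the calls themselves.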
So in these simple benchmarks, Wine beats Windows by a factor of three. I call that satisfactory, wishing Wine would do that in other areas as well.
Also note that QueryPerformanceCounter Timer takes hundreds of cycles. In an extremely time critical application that would be hosed if it takes more than a few cycles, I would not recommend the use of this call at all.
Rein.
Hi,
On Wed, Jan 26, 2005 at 08:58:13AM +0100, Rein Klazes wrote:
On 25 Jan 2005 14:22:40 -0800, you wrote:
The application is an embedded audio plugin player. The audio is processed with SCHED_FIFO and needs to be as deterministic and fast as possible.
I hope this fix/change doesn't jeopardize our product's use of Wine...
You are of course in an excellent position to quantify better than "extremely time critical" and "few cycles". Just try the patch and tell us when you are "hosed".
Whoa, seems like I started an avalanche with my posting ;-)
I have done a few further tests. A loop like:
for(i=0;i<10000000;i++) QueryPerformanceCounter( &count);
takes 45 seconds under Windows 2k on some hardware. Under Wine on the exact same hardware it takes 13 seconds with the patch.
Wow!! I'd never have expected that. So it seems the optimized Linux syscall is just that: optimized, highly :)
If you do a series of QueryPerformanceCounter:
QueryPerformanceCounter( carr[0]);
QueryPerformanceCounter( carr[1]);
QueryPerformanceCounter( carr[2]);
QueryPerformanceCounter( carr[3]);
QueryPerformanceCounter( carr[4]);
...
and print the results, I see on Windows that the counter increments 4 or 5 steps between the calls. Under Wine it is 1 or 2.
Which is obvious as well since the execution time under Wine is lower, so the deltas should be lower as well.
So in these simple benchmarks, Wine beats Windows by a factor of three. I call that satisfactory, wishing Wine would do that in other areas as well.
Damn right!
Also note that QueryPerformanceCounter Timer takes hundreds of cycles. In an extremely time critical application that would be hosed if it takes more than a few cycles, I would not recommend the use of this call at all.
Yup. I'd say that the goal is not necessarily to be terribly faster than Windows; the goal is to have comparable behaviour (being much faster than Windows here can easily be a problem on its own, you bet!).
As such the new implementation should be perfect.
Sorry for that wonderful false alarm! ;-) (but I'd say the ensuing discussion certainly was useful)
Andreas Mohr
On Tue, 2005-01-25 at 23:58, Rein Klazes wrote:
On 25 Jan 2005 14:22:40 -0800, you wrote:
I'm quite certain that many programs use that function for extremely time critical code (games, anyone??), and that thus the Windows function is equally highly optimized, certainly much less slow than a gettimeofday() call.
This should remain based on rdtsc IMHO, or on equally suitable and fast methods (ACPI counter, ...).
Or did you actually test it with programs calling it a large number of times, or test its performance behaviour on Windows?
Andreas Mohr
Our system uses the performance counter all the time in extremely time critical code. If the call is anything more than a few cycles with absolutely no chance of blocking, we are hosed!
The application is an embedded audio plugin player. The audio is processed with SCHED_FIFO and needs to be as deterministic and fast as possible.
I hope this fix/change doesn't jeopardize our product's use of Wine...
You are of course in an excellent position to quantify better than "extremely time critical" and "few cycles". Just try the patch and tell us when you are "hosed". I have done a few further tests. A loop like:
for(i=0;i<10000000;i++) QueryPerformanceCounter( &count);
takes 45 seconds under Windows 2k on some hardware. Under Wine on the exact same hardware it takes 13 seconds with the patch.
OK. Sorry. My ebullience has once again resulted in poor mailing list style! Not the first time. I'll turn on "serious careful engineer mode". Let's see how I do... %)
My concern isn't the number of cycles. It sounds like the function runs very quickly, even faster than in Windows. That's great news.
But I am concerned about blocking or preemption. I assume that the new call doesn't hit the wineserver, right? Is there any other thread sync required (critical sections, etc) for gettimeofday() which might cause the new implementation to block? I don't know how it is implemented, but it sounds like a "system call" which makes me suspicious. I know that Windows developers class QueryPerformanceCounter as a "system call that's safe to use in real time" so that's why I am trying to keep tabs on it.
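Since the worry here is the slowest call rather than the average one, a rough way to look for stalls is to read the clock repeatedly and record the largest gap between two neighbouring readings. The sketch below is only an observation tool under stated assumptions: the names are hypothetical, it measures gettimeofday() rather than the Wine call, and a small maximum on one run does not prove the call can never block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Current time in microseconds since the epoch, or -1 on failure. */
static int64_t sketch_now_us(void)
{
    struct timeval tv;

    if (gettimeofday(&tv, NULL) != 0)
        return -1;
    return (int64_t)tv.tv_sec * 1000000 + tv.tv_usec;
}

/* Return the larger of the running maximum and a new gap. */
static int64_t sketch_update_max(int64_t current_max, int64_t gap)
{
    return gap > current_max ? gap : current_max;
}

/* Read the clock `iterations` times back to back and return the largest
 * gap, in microseconds, between two neighbouring readings. */
static int64_t sketch_max_gap_us(long iterations)
{
    int64_t prev, now, max_gap = 0;
    long i;

    assert(iterations > 0);
    prev = sketch_now_us();
    for (i = 0; i < iterations; i++) {
        now = sketch_now_us();
        max_gap = sketch_update_max(max_gap, now - prev);
        prev = now;
    }
    return max_gap;
}
```

The number is only meaningful when the loop runs under the same scheduling policy as the audio thread, for example after switching the thread to SCHED_FIFO with sched_setscheduler(), which normally requires elevated privileges. Otherwise ordinary preemption by other processes dominates the result.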
We don't have control over what APIs our hosted audio plugins use. They are written by other developers, and given to us in binary form. So we have to keep a close eye on APIs that plugins commonly use during real time processing.
Sometimes plugins make calls they shouldn't from a r/t priority thread --- like InvalidateRect. I guess in Windows such calls work without blocking almost all the time. But that isn't the case in Wine.
The Wine Semaphore implementation has caused us problems as well. Semaphores are much more likely to block in Wine than in Windows. One plugin used one where it shouldn't, and its audio processing execution time went completely erratic as it blocked on the wineserver.
Anyway, thanks for your attention to this issue. It sounds like Receptor (our product) will be fine with this change. ... mo
On 26 Jan 2005 09:59:14 -0800, you wrote:
My concern isn't the number of cycles. It sounds like the function runs very quickly, even faster than in Windows. That's great news.
But I am concerned about blocking or preemption. I assume that the new call doesn't hit the wineserver, right? Is there any other thread sync required (critical sections, etc) for gettimeofday() which might cause the new implementation to block? I don't know how it is implemented, but it sounds like a "system call" which makes me suspicious. I know that Windows developers class QueryPerformanceCounter as a "system call that's safe to use in real time" so that's why I am trying to keep tabs on it.
Because the wineserver is not involved at all, I did not believe there to be substantial differences between Windows and Linux. For both there is a switch to kernel mode, that is, a syscall, which is in itself expensive compared to a user function call. Both will need to have some mutual exclusion or critical section when they access the hardware clock. The net result will depend on optimization details, where I believe Linux has some distinct advantages.
[snip]
Anyway, thanks for your attention to this issue. It sounds like Receptor (our product) will be fine with this change. ... mo
I think so too. And if it is not, it can be fixed.
Rein.