Hi wine-devel,
Attached is a test case which demonstrates the difference between Windows and Wine. A sample program, 'TEST.CPP', is also attached.
On Windows XP
- Start 'test.exe' from a DOS box... you see some FAST counting integers
- Start another program (loop.pl) which consumes a lot of CPU time.
- The output of 'test.exe' is slower but still FAST
On Wine
- Start 'test.exe' from a DOS box... you see some FAST counting integers
- Start another program (loop.pl) which consumes a lot of CPU time.
- The output of 'test.exe' is very slow
This behavior started with Wine version 20041201. The version before was OK.
I have no patch, and it would be nice if someone could write one to fix this.
Thanks
Oliver Mössinger
Oliver Mössinger olivwork@web.de writes:
Hi wine-devel,
Attached is a test case which demonstrates the difference between Windows and Wine. A sample program, 'TEST.CPP', is also attached.
On Windows XP
- Start 'test.exe' from a DOS box... you see some FAST counting integers
- Start another program (loop.pl) which consumes a lot of CPU time.
- The output of 'test.exe' is slower but still FAST
On Wine
- Start 'test.exe' from a DOS box... you see some FAST counting integers
- Start another program (loop.pl) which consumes a lot of CPU time.
- The output of 'test.exe' is very slow
This behavior started with Wine version 20041201. The version before was OK.
I have no patch, and it would be nice if someone could write one to fix this.
You can probably fix it by passing PM_NOYIELD in the PeekMessage calls. But if your app needs a lot of CPU, restructuring the code to avoid all the needless polling would give much better results, and probably improve the behavior on Windows too.
Alexandre Julliard wrote:
You can probably fix it by passing PM_NOYIELD in the PeekMessage calls. But if your app needs a lot of CPU, restructuring the code to avoid all the needless polling would give much better results, and probably improve the behavior on Windows too.
That's just a workaround. Our PeekMessage is definitely misbehaving. I ran the attached test program in Wine and WinXP; here are the results:
Wine:
wine@majestix c $ wine foo.exe
NtYieldExecution yielded 100 times
PeekMessage(...) yielded 200 times
PeekMessage(... PM_NOYIELD) yielded 100 times
WinXP:
C:\>foo.exe
NtYieldExecution yielded 100 times
PeekMessage(...) yielded 0 times
PeekMessage(... PM_NOYIELD) yielded 0 times
(The numbers differ slightly between runs for obvious reasons, but they are close enough; with an error margin of +/- 10 we could maybe make this a real test case.)
So, PeekMessage always yields execution (it shouldn't) with PM_NOYIELD specified it yields execution twice (although it shouldn't at all).
The (real) effect of PM_NOYIELD is described at http://www.piclist.com/techref/os/win/api/win32/func/src/f67_6.htm
You can optionally combine the value PM_NOYIELD with either PM_NOREMOVE or PM_REMOVE. However, PM_NOYIELD has no effect on 32-bit Windows applications. It is defined in Win32 solely to provide compatibility with applications written for previous versions of Windows, where it was used to prevent the current task from halting and yielding system resources to another task. 32-bit Windows applications always run simultaneously.
Felix
I wrote:
So, PeekMessage always yields execution (it shouldn't) with PM_NOYIELD specified it yields execution twice (although it shouldn't at all).
Err, that should read "and without PM_NOYIELD specified".
Felix
Felix Nawothnig felix.nawothnig@t-online.de writes:
(The numbers differ slightly between runs for obvious reasons, but they are close enough; with an error margin of +/- 10 we could maybe make this a real test case.)
So, PeekMessage always yields execution (it shouldn't) with PM_NOYIELD specified it yields execution twice (although it shouldn't at all).
PeekMessage is going to call the server and wait on the result, there's no way around it. The extra yield is to avoid hammering the server with requests in stupid apps that constantly poll for messages, since a server call is much more expensive than a Windows system call.
This could certainly be changed, but it will require evidence that the changes help for common real cases, not just to make some artificial benchmark show better results.
Alexandre Julliard wrote:
So, PeekMessage always yields execution (it shouldn't) with PM_NOYIELD specified it yields execution twice (although it shouldn't at all).
PeekMessage is going to call the server and wait on the result, there's no way around it. The extra yield is to avoid hammering the server with requests in stupid apps that constantly poll for messages,
But then that "extra" NtYieldExecution should not depend on !PM_NOYIELD since PM_NOYIELD doesn't have any effect on Windows, right?
since a server call is much more expensive than a Windows system call.
Would using shm fix that?
Felix
Felix Nawothnig felix.nawothnig@t-online.de writes:
But then that "extra" NtYieldExecution should not depend on !PM_NOYIELD since PM_NOYIELD doesn't have any effect on Windows, right?
It has an effect for Win16 apps, they need to release the Win16 lock. We could add a yield in the PM_NOYIELD case, but Win32 apps won't use PM_NOYIELD anyway so I doubt it would make a difference, and keeping it that way allows Winelib apps to tweak the behavior if needed.
since a server call is much more expensive than a Windows system call.
Would using shm fix that?
No, you don't want to put the message queue in shared memory, that's not reliable.
Alexandre Julliard wrote:
Felix Nawothnig felix.nawothnig@t-online.de writes:
But then that "extra" NtYieldExecution should not depend on !PM_NOYIELD since PM_NOYIELD doesn't have any effect on Windows, right?
It has an effect for Win16 apps, they need to release the Win16 lock. We could add a yield in the PM_NOYIELD case, but Win32 apps won't use PM_NOYIELD anyway so I doubt it would make a difference, and keeping it that way allows Winelib apps to tweak the behavior if needed.
Ostensibly, it also affects whether or not a WaitForInputIdle call returns, but I'm not sure I fully understand this.
But I have to admit I'm bothered; you seem to be refusing a patch that makes Wine more correct.
PeekMessage() is more lightweight on Windows than it is on Wine, but I can still write bad code that chokes the system by spin looping on PeekMessage on Windows.
I can imagine a case where a bad programmer has two threads and depends (not intentionally, but through accident) on one thread starving the other thread of CPU time such that a race condition never occurs. I don't have an example, but I have seen behavior like that, notably with IE and PowerPoint (although I think the case was with some other signalling method, not PeekMessage).
Thus, I think it's reasonable to try to preserve relative timing on Wine as closely as we can, even if it creates some overall performance degradation for poorly designed apps. (Famous last words, I'm sure I'll shortly be screaming about why is Wine suddenly so slow <grin>).
Can you point out examples of misbehaving programs so that we can go see just how bad the impact is?
Cheers,
Jeremy
Jeremy White jwhite@codeweavers.com writes:
But I have to admit I'm bothered; you seem to be refusing a patch that makes Wine more correct.
PeekMessage() is more lightweight on Windows than it is on Wine, but I can still write bad code that chokes the system by spin looping on PeekMessage on Windows.
Yes, but the thing is you can write bad code that works fine on Windows but chokes down Wine, because we spend so much more time inside PeekMessage. The yield is an attempt to fix this by penalizing badly written apps to prevent them from hurting well written ones.
Thus, I think it's reasonable to try to preserve relative timing on Wine as closely as we can, even if it creates some overall performance degradation for poorly designed apps. (Famous last words, I'm sure I'll shortly be screaming about why is Wine suddenly so slow <grin>).
The problem is that the poorly designed apps will hammer the server, which will cause a much bigger performance degradation in unrelated apps than what you'd expect simply by hogging the CPU.
Can you point out examples of misbehaving programs so that we can go see just how bad the impact is?
If you run Oliver's test program along with some other Windows apps you can probably see the effects.
I dug into this a bit further.
Felix, the extra 100 yields are coming from code I prompted, in ntdll/sync.c; if the return from an NtWait... is TIMEOUT, then we force a yield. (The thread that points to more info is here: http://www.winehq.org/hypermail/wine-devel/2005/01/0469.html)
If I back that down and apply your patch, I can get to 100/1/1. This also makes Oliver's test program retain priority (rather than slowing to a crawl as it does today). In fact, it keeps too high a priority (the perl test is slow and jerky by comparison to the Wine one, instead of being relatively even as on Windows). I'd theorize that's due to the server calls; we smell like an X process so we get priority.
However, this makes it clear to me that the yield in message.c is largely moot; you need to remove both that one and the one in ntdll/sync.c to have any material effect on Wine timing with messages.
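One way to read the numbers reported in this thread is as simple bookkeeping over two independent yield sites. The sketch below is only that: a model inferred from the figures quoted above, not Wine's code. The names `yield_sites`, `count_yields` and `SKETCH_PM_NOYIELD` are made up for illustration, and the model assumes every PeekMessage call finds an empty queue.

```c
#include <assert.h>

/* Mirrors the PM_NOYIELD value from winuser.h; renamed so it cannot clash with it. */
#define SKETCH_PM_NOYIELD 0x0002u

/* Hypothetical switches for the two yield sites discussed in this thread. */
struct yield_sites {
    int sync_yield_on_timeout;  /* ntdll/sync.c: yield when a wait returns TIMEOUT */
    int message_yield;          /* message.c: yield in PeekMessage unless PM_NOYIELD */
};

/* Counts the yields caused by `calls` PeekMessage calls on an empty queue. */
unsigned count_yields(struct yield_sites sites, unsigned flags, unsigned calls)
{
    unsigned per_call = 0;

    assert(calls <= 1000000u);  /* keep the product well inside unsigned range */
    if (sites.sync_yield_on_timeout) per_call++;
    if (sites.message_yield && !(flags & SKETCH_PM_NOYIELD)) per_call++;
    return per_call * calls;
}
```

With both sites enabled, 100 calls give 200 yields without PM_NOYIELD and 100 with it, which matches the Wine figures in Felix's run. Only with both sites disabled does the count drop to zero, which is the point being made here about needing to remove both.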
Further, while I've been wanting to probe that yield in ntdll.c further, I haven't done my homework yet, so I think maybe I'll shut up and slink back into my corner.
Cheers,
Jeremy
Jeremy White jwhite@codeweavers.com writes:
However, this makes it clear to me that the yield in message.c is largely moot; you need to remove both that one and the one in ntdll/sync.c to have any material effect on Wine timing with messages.
Actually it should be enough to not yield in MsgWaitForMultipleObjects when we are only checking for X events, and that would be correct IMO since the check for X events is always in addition to the normal behavior. Something like this should do it:
Index: dlls/x11drv/event.c
===================================================================
RCS file: /opt/cvs-commit/wine/dlls/x11drv/event.c,v
retrieving revision 1.56
diff -u -p -r1.56 event.c
--- dlls/x11drv/event.c 25 Jul 2005 11:08:43 -0000 1.56
+++ dlls/x11drv/event.c 3 Aug 2005 08:28:52 -0000
@@ -295,12 +295,13 @@ DWORD X11DRV_MsgWaitForMultipleObjectsEx
 
     data->process_event_count++;
     if (process_events( data->display, mask )) ret = count;
-    else
+    else if (count || timeout)
     {
         ret = WaitForMultipleObjectsEx( count+1, new_handles, flags & MWMO_WAITALL,
                                         timeout, flags & MWMO_ALERTABLE );
         if (ret == count) process_events( data->display, mask );
     }
+    else ret = WAIT_TIMEOUT;
     data->process_event_count--;
     return ret;
 }
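For readers who want the branch order of the patch in isolation, here is a minimal stand-alone sketch. It is a model only: `msg_wait_sketch`, `wait_trace` and `SKETCH_WAIT_TIMEOUT` are hypothetical names, `events_pending` stands in for process_events() returning nonzero, and `wait_result` stands in for whatever WaitForMultipleObjectsEx would have returned. The second process_events() call after a successful wait is left out.

```c
#include <assert.h>

/* Same numeric value as WAIT_TIMEOUT (258); renamed for a non-Windows build. */
#define SKETCH_WAIT_TIMEOUT 258u

/* Records whether the blocking wait was reached. */
struct wait_trace { int waited; };

/* Hypothetical model of the patched branch order. */
unsigned msg_wait_sketch(int events_pending, unsigned count, unsigned timeout,
                         unsigned wait_result, struct wait_trace *trace)
{
    assert(trace != 0);
    trace->waited = 0;

    /* X events were processed: report the queue handle index, no wait needed. */
    if (events_pending) return count;

    /* The caller passed handles or is willing to block: do the real wait. */
    if (count || timeout)
    {
        trace->waited = 1;
        return wait_result;
    }

    /* Pure poll with no handles and zero timeout: answer without waiting,
       so the yield-on-TIMEOUT path in the wait code is never entered. */
    return SKETCH_WAIT_TIMEOUT;
}
```

The case that changes is the last one: a call with no handles and a zero timeout used to go through the wait and come back with a timeout, and in this sketch it returns the timeout value directly.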
We changed our program to avoid the needless polling (only two PeekMessage calls are needed now). We also added PM_NOYIELD. This restores the program's responsiveness when another process is consuming CPU time, but the slowdown is still greater than on Windows.
Your patch works for our program. It seems to be faster than before.
So now our program works; not well, but it works.
If you can find a better fix, that would be great...
Jeremy White jwhite@codeweavers.com writes:
However, this makes it clear to me that the yield in message.c is largely moot; you need to remove both that one and the one in ntdll/sync.c to have any material effect on Wine timing with messages.
Actually it should be enough to not yield in MsgWaitForMultipleObjects when we are only checking for X events, and that would be correct IMO since the check for X events is always in addition to the normal behavior. Something like this should do it:
Index: dlls/x11drv/event.c
===================================================================
RCS file: /opt/cvs-commit/wine/dlls/x11drv/event.c,v
retrieving revision 1.56
diff -u -p -r1.56 event.c
--- dlls/x11drv/event.c 25 Jul 2005 11:08:43 -0000 1.56
+++ dlls/x11drv/event.c 3 Aug 2005 08:28:52 -0000
@@ -295,12 +295,13 @@ DWORD X11DRV_MsgWaitForMultipleObjectsEx
 
     data->process_event_count++;
     if (process_events( data->display, mask )) ret = count;
-    else
+    else if (count || timeout)
     {
         ret = WaitForMultipleObjectsEx( count+1, new_handles, flags & MWMO_WAITALL,
                                         timeout, flags & MWMO_ALERTABLE );
         if (ret == count) process_events( data->display, mask );
     }
+    else ret = WAIT_TIMEOUT;
     data->process_event_count--;
     return ret;
 }
Oliver Mössinger wrote:
We changed our program to avoid the needless polling (only two PeekMessage calls are needed now). We also added PM_NOYIELD. This restores the program's responsiveness when another process is consuming CPU time, but the slowdown is still greater than on Windows.
So we actually have a real-world app which depends (depended, but we don't want apps getting fixed for Wine, right?) on PeekMessage not giving away timeslices?
Felix
I wrote:
since a server call is much more expensive than a Windows system call.
Would using shm fix that?
No, you don't want to put the message queue in shared memory, that's not reliable.
shm + kernel handles for synchronization then?
Wait, putting the message queue into shm isn't what I wanted to suggest (although it would be possible with kernel handles, no?). :-)
I meant that using shm generally would lower the cost of a server request and doing that extra yield would no longer be necessary (although we'd still have the other yield due to the request itself unless the queue is put into shm).
Felix
Felix Nawothnig felix.nawothnig@t-online.de writes:
I meant that using shm generally would lower the cost of a server request and doing that extra yield would no longer be necessary (although we'd still have the other yield due to the request itself unless the queue is put into shm).
If you mean passing the parameters through shared memory instead of a pipe, then no, it doesn't really make a difference. The cost is not the few bytes we need to copy across, it's the context switches.
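To make the round-trip shape concrete, here is a small POSIX sketch. It is not how the Wine server protocol is implemented: `server_call` and `run_round_trips` are hypothetical names, and the "server" is just a forked child that echoes each value plus one over a pair of pipes. The only point is that each request blocks the caller until the other process has been scheduled and has replied, however small the payload is.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for one server request: send an int, block for the reply. */
static int server_call(int req_fd, int reply_fd, int value)
{
    int reply = -1;

    if (write(req_fd, &value, sizeof(value)) != (ssize_t)sizeof(value)) return -1;
    /* The caller sleeps here until the server process has run and answered.
       This wait involves switching to the server and back. */
    if (read(reply_fd, &reply, sizeof(reply)) != (ssize_t)sizeof(reply)) return -1;
    return reply;
}

/* Runs n blocking round trips against a forked echo server.
   Returns the number of correct replies, or -1 on setup failure. */
int run_round_trips(int n)
{
    int req[2], rep[2], done = 0, i;
    pid_t pid;

    assert(n >= 0);
    if (pipe(req) != 0) return -1;
    if (pipe(rep) != 0) { close(req[0]); close(req[1]); return -1; }

    pid = fork();
    if (pid < 0) return -1;
    if (pid == 0)
    {
        /* Server side: reply with value + 1 until the request pipe closes. */
        int v;
        close(req[1]);
        close(rep[0]);
        while (read(req[0], &v, sizeof(v)) == (ssize_t)sizeof(v))
        {
            v++;
            if (write(rep[1], &v, sizeof(v)) != (ssize_t)sizeof(v)) break;
        }
        _exit(0);
    }

    /* Client side: keep only the ends we use. */
    close(req[0]);
    close(rep[1]);
    for (i = 0; i < n; i++)
        if (server_call(req[1], rep[0], i) == i + 1) done++;

    close(req[1]);          /* end of file tells the server to exit */
    close(rep[0]);
    waitpid(pid, NULL, 0);
    return done;
}
```

Swapping the pipes for shared memory would change how the few bytes travel, but the caller would still have to wait for the server process to run before it could continue.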