Good day to all.
Henri, Stefan, I'm addressing this message to you first of all, as the main developers of the wined3d/OpenGL code. Nevertheless, hints and help are welcome from anyone, because at the moment I'm totally confused and don't know what else to try to investigate this case.
What I've got here is an app (a localized version of the "Perfect World" MMORPG game client from "Mail.Ru Games Corp") that seems to suffer from a huge FPS regression, which I believe to be a bug in the nVIDIA drivers rather than a regression in Wine.
Details follow.
When I configure the app to run in windowed mode I get around 40 FPS on the game login screen with nVIDIA drivers 275.09.07, but switching to more recent versions causes FPS to drop to around 10. Configuring the game to use fullscreen mode fixes the issue - I get ~30-40 FPS on the game login screen no matter which driver version I use.
I suspected that this issue might be related to vsync control (and to a recent change in the nVIDIA Linux drivers that turns vsync on by default), so I exported __GL_SYNC_TO_VBLANK="0" into the environment and used nvidia-settings to turn vsync off by default. Using a small native OpenGL demo program that I wrote specifically to test the vsync state, I can confirm that vsync defaults to off on my system. I also modified opengl.c in Wine's winex11.drv so that a call to wglSwapIntervalEXT always sets swap_interval to zero, no matter what was originally requested. Nevertheless, I still get this strange FPS drop when I run the game in windowed mode with recent nVIDIA drivers.
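For reference, the timing side of a vsync check like the one described above can be sketched without any GL setup. This is an assumption about how such a test might work, not the demo program mentioned in this message: it takes timestamps recorded right after each buffer swap of a trivially cheap scene and reports whether the average frame interval sits at the display refresh period. The function names, the 60 Hz refresh rate and the 10% tolerance used in the usage note are all illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Average interval, in seconds, between consecutive swap timestamps.
 * swap_times holds one monotonic timestamp (in seconds) per buffer swap. */
double mean_frame_interval(const double *swap_times, size_t count)
{
    assert(swap_times != NULL && count >= 2);
    return (swap_times[count - 1] - swap_times[0]) / (double)(count - 1);
}

/* Heuristic: returns 1 if the average frame interval is within `tolerance`
 * (a fraction, e.g. 0.10 for 10%) of the refresh period, which is what one
 * would expect from a cheap scene when swaps are synchronized to vblank.
 * Returns 0 otherwise, e.g. when frames are produced much faster. */
int looks_vsynced(const double *swap_times, size_t count,
                  double refresh_hz, double tolerance)
{
    double period, diff;

    assert(refresh_hz > 0.0 && tolerance >= 0.0);
    period = 1.0 / refresh_hz;
    diff = mean_frame_interval(swap_times, count) - period;
    if (diff < 0.0)
        diff = -diff;
    return diff <= tolerance * period;
}
```

With a few hundred timestamps collected around the swap call, looks_vsynced(times, n, 60.0, 0.10) returning 0 while the scene renders at several hundred FPS would suggest that vsync is indeed off. It is only a heuristic: a scene that happens to take about one refresh period per frame would be misreported.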
What could be the cause of this? What I want is to track the problem down to its roots and check whether it's really a bug in the nVIDIA drivers. In the end I would like to implement a small OpenGL demo that triggers the bug, so that nVIDIA can't reject my bug report on the grounds that "it's a Wine bug, prove to us that it's not".
-- Best regards, Alexey Loukianov mailto:mooroon2@mail.ru System Engineer, Mob.:+7(926)218-1320 *nix Specialist
On Sunday, 15 April 2012, 07:22:34, Alexey Loukianov wrote:
When I configure the app to run in windowed mode I get around 40 FPS on the game login screen with nVIDIA drivers 275.09.07, but switching to more recent versions causes FPS to drop to around 10.
It could be a driver bug as you suspect, which is difficult to track down - your best bet would be using something like oprofile to find out which GL calls show performance changes.
It could also be because of some additional features added in newer drivers. 16-byte alignment for vertex buffers is a possibility; I believe it was added in the 280 drivers. You can check this by disabling GL_ARB_map_buffer_range. If this improves performance, you're probably running into an issue related to dynamic buffers. It wouldn't explain the windowed vs fullscreen difference, but it's still worth checking.
15.04.2012 21:50, Stefan Dösinger wrote:
It could also be because of some additional features added in newer drivers. 16-byte alignment for vertex buffers is a possibility; I believe it was added in the 280 drivers. You can check this by disabling GL_ARB_map_buffer_range.
That isn't the case here, as the problem also affects ancient versions of Wine that don't use this extension, like 1.2.3 (under which the app in question actually performs much better than under newer releases).
Actually, I've got a small patchset here against 1.5.2 which is required to bring the FPS for this game back to where it was with 1.2.3, and one of the patches in this patchset effectively disables the use of GL_ARB_map_buffer_range. So, yeah, GL_ARB_map_buffer_range is causing trouble for this game, but it is not the one to blame for the issue I'm trying to resolve.
Meanwhile, I've been able to reproduce this bug on another PC I've got here at home. It was originally spotted on a box with 8GB of DDR3 RAM, a GeForce GTX 550 Ti with 1GB of VRAM and an AMD FX 8120 CPU, running a Fedora 14-based LFS-like system with a 32-bit PAE-enabled kernel. The system I've been able to reproduce the bug on is a box equipped with an AMD Phenom II X4 955 CPU, 8GB of DDR2 RAM and a GeForce 8600 GT with 256MB of VRAM, running Linux Mint 9 with a 32-bit PAE-enabled 3.0.0-16 kernel. Unfortunately I don't have access to any system with an ATI/AMD card at the moment, but chances are I'll be able to lay my hands on one with an AMD A8 CPU with an integrated Radeon GPU. It would be interesting to check whether this bug affects ATI/AMD.
Thanks for the oprofile hint, but unfortunately I don't have any experience with it. I've also been thinking about trying APITrace, but I don't have any experience with it either, and I don't know whether it's compatible with non-OSS GPU drivers.
-- Best regards, Alexey Loukianov mailto:mooroon2@mail.ru System Engineer, Mob.:+7(926)218-1320 *nix Specialist
15.04.2012 21:50, Stefan Dösinger wrote:
your best bet would be using something like oprofile to find out which GL calls show performance changes.
Well, I've compiled and installed APITrace 3.0 and oprofile 0.9.7 on my system, but it seems that getting any useful info out of them will be "problematic" at the very least. APITrace 3.0 works fine with Wine, but the performance hit is huge and the resulting trace size is unmanageable. A trace file containing two frames of the game login screen is ~2GB in size, and unsurprisingly game performance is more like "one frame per twenty seconds".
With oprofile I hit another problem - it seems that this tool is unable to fetch symbols from libGL; at least, all I get in the reports for libGL are plain references to /usr/lib/libGL.so.295.40, /usr/lib/libnvidia-glcore.so.295.40 and /usr/lib/tls/libnvidia-tls.so.295.40, without any details on the symbols internal to these libs.
It might be that I'm misusing oprofile, though, as this is the first time I've ever tried to use it for profiling.
-- Best regards, Alexey Loukianov mailto:mooroon2@mail.ru System Engineer, Mob.:+7(926)218-1320 *nix Specialist
On 04/15/2012 04:44 PM, Alexey Loukianov wrote:
With oprofile I hit another problem - it seems that this tool is unable to fetch symbols from libGL
Of course it won't - they are binary blobs from Nvidia. Not much to see there. All you're really looking for is the time spent in that library.
Vitaliy.
16.04.2012 04:28, Vitaliy Margolen wrote:
Of course it won't - they are binary blobs from Nvidia. Not much to see there. All you're really looking for is the time spent in that library.
Vitaliy, I don't expect oprofile to find hidden COFF or DWARF 2 debug info inside the nVIDIA binary blob; it's obvious that there is nothing like that in there :-).
What I was expecting to find in the oprofile output was a breakdown of libGL.so by symbol, like what can be seen in the output of "objdump -T /usr/lib/libGL.so.295.40". Thinking a bit more about it makes it clear that my expectations were wrong, as the export table doesn't have all the info oprofile would need to use it as a poor man's substitute for a real symbol map file (for example, the actual procedure boundaries for a given exported symbol can't be determined for sure from the export table).
What might be useful for profiling in this case is a "proxy" wrapper for libGL that sits between Wine and the real libGL and collects call timing stats. A wrapper of that kind wouldn't be as devastating to FPS as apitrace, and it would provide a fine-grained picture of per-procedure timing stats, which are extremely helpful when one is trying to catch a GPU driver bug. I have no idea whether such a wrapper has already been implemented out in the wild.
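To make the proxy wrapper idea above more concrete, here is a minimal sketch of just the bookkeeping such a wrapper would need: a small table of per-entry-point counters and a macro that times one forwarded call. Everything named here (call_stats, stats_for, TIMED_CALL and so on) is hypothetical and not taken from Wine or any existing tool. A real shim would additionally have to export symbols with the same names as the GL functions, forward each one to the real library (for example via dlsym() with RTLD_NEXT when loaded through LD_PRELOAD), and also cover entry points that are handed out through glXGetProcAddressARB.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

#define MAX_TRACKED_CALLS 64

/* Accumulated timing for one wrapped entry point. */
struct call_stats {
    const char *name;   /* entry point name; the pointer is stored, not copied */
    uint64_t count;     /* number of calls seen */
    uint64_t total_ns;  /* sum of wall-clock time spent in the call */
    uint64_t max_ns;    /* slowest single call */
};

/* Global table; not thread-safe, which a real wrapper would have to fix. */
static struct call_stats stats_table[MAX_TRACKED_CALLS];
static size_t stats_used;

/* Current monotonic time in nanoseconds. */
uint64_t monotonic_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Find the slot for `name`, creating it on first use (linear search). */
struct call_stats *stats_for(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < stats_used; i++)
        if (strcmp(stats_table[i].name, name) == 0)
            return &stats_table[i];
    assert(stats_used < MAX_TRACKED_CALLS);
    stats_table[stats_used].name = name;
    return &stats_table[stats_used++];
}

/* Add one measured call to a slot. */
void stats_record(struct call_stats *s, uint64_t elapsed_ns)
{
    s->count++;
    s->total_ns += elapsed_ns;
    if (elapsed_ns > s->max_ns)
        s->max_ns = elapsed_ns;
}

/* Time one forwarded call and file it under `name`. */
#define TIMED_CALL(name, call_expr)                                        \
    do {                                                                   \
        uint64_t timed_start_ = monotonic_ns();                            \
        call_expr;                                                         \
        stats_record(stats_for(name), monotonic_ns() - timed_start_);      \
    } while (0)

/* Print one line per tracked entry point. */
void stats_dump(FILE *out)
{
    size_t i;

    for (i = 0; i < stats_used; i++)
        fprintf(out, "%-28s calls=%llu total=%.3f ms max=%.3f ms\n",
                stats_table[i].name,
                (unsigned long long)stats_table[i].count,
                stats_table[i].total_ns / 1e6,
                stats_table[i].max_ns / 1e6);
}
```

Inside a wrapped function the forwarding would then look like TIMED_CALL("glDrawArrays", real_glDrawArrays(mode, first, count)), where real_glDrawArrays stands for a function pointer to the real entry point, and stats_dump(stderr) could be called at exit. Note that this only measures CPU-side wall time per call; work the driver queues for the GPU may be charged to a later call that happens to block.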
-- Best regards, Alexey Loukianov mailto:mooroon2@mail.ru System Engineer, Mob.:+7(926)218-1320 *nix Specialist
On 4/16/12, Alexey Loukianov <mooroon2@mail.ru> wrote:
What might be useful for profiling in this case is a "proxy" wrapper for libGL that sits between Wine and the real libGL and collects call timing stats. A wrapper of that kind wouldn't be as devastating to FPS as apitrace, and it would provide a fine-grained picture of per-procedure timing stats, which are extremely helpful when one is trying to catch a GPU driver bug. I have no idea whether such a wrapper has already been implemented out in the wild.
Something equivalent exists and is in Wine. When compiling wined3d, build it with -DUSE_WIN32_OPENGL to route all GL calls through Wine's opengl32.dll. It may give some more clues. Just about all Nvidia driver calls end up in libGLcore.so, with libGL.so being a thin wrapper around it. You likely won't see much more, but it is easy to try.
Roderick
15.04.2012 07:22, Alexey Loukianov wrote:
Good day to all.
Henri, Stefan, I'm addressing this message to you first of all, as the main developers of the wined3d/OpenGL code... ---8<--- strip ---8<--- ... an app (a localized version of the "Perfect World" MMORPG game client from "Mail.Ru Games Corp") that seems to suffer from a huge FPS regression, which I believe to be a bug in the nVIDIA drivers rather than a regression in Wine...
So I'm still trying to investigate this issue and have some additional info to share.
Recently Mail.Ru updated the game client, so its version now almost matches the one available on the Perfect World International site. I was hoping that this update might make the issue go away, but that didn't work out. The good news is that, most probably, it should be possible to reproduce this issue using the PWI game client. My sincere hope is that Henri will be able to find some time to take a look at it.
Another hope was that the issue would be magically fixed in fresh nVIDIA drivers, like the 302.07 beta which was released yesterday. That turned out not to be the case.
Trying to pinpoint the cause using oprofile produced no valuable results: either I'm not able to use this wonderful profiler correctly, or the issue is of a kind that isn't easily tracked with oprofile.
My next bet was to start up each and every D3D app I have in windowed mode and check whether any of them experiences a similar problem. I tried running a bunch of demos from the D3D SDK and several well-known game titles (namely "UFO: ET", "Trine", "Portal", "King's Bounty: The Legend", "Osmos", "LIMBO" and "Braid"), but none of them suffered from the issue.
What I did notice, though, is a big difference in behavior that shows up when I move an app window. When I set up the Wine prefix to use virtual desktop mode, launch a "windowed" D3D app and try to move the app's window around inside the virtual desktop window, the only app whose window contents are updated while I'm dragging it is the PW game client. All the other apps tested - and they are not affected by this bug - do not update the contents of their window while I move it around. I don't know what this means per se, but it might well be a hint that would allow someone more clever than me to understand what's happening and why nVIDIA driver versions 280+ act as a trigger for this bug.
P.S. Sorry once again for making noise on the devel list, but I don't want to open a garbage bug report that would have a decent chance of being closed as "not a Wine bug, blame nVIDIA".
-- Best regards, Alexey Loukianov mailto:mooroon2@mail.ru System Engineer, Mob.:+7(926)218-1320 *nix Specialist
On 3 May 2012 07:17, Alexey Loukianov <mooroon2@mail.ru> wrote:
Trying to pinpoint the cause using oprofile produced no valuable results: either I'm not able to use this wonderful profiler correctly, or the issue is of a kind that isn't easily tracked with oprofile.
Personally I think perf is a bit nicer to use than oprofile. Nevertheless, two issues you're likely to run into are on the one hand that the nvidia drivers don't have any kind of useful debugging information, so you'll have a hard time profiling any time spent in the driver, and on the other hand time spent waiting for the GPU typically won't show up in the profile. This would happen for example if we tried to upload to a vertex buffer that the GPU is still drawing from. You can sometimes get useful information by measuring time for some more high level operations like draws, blits, clears, etc. by adding extra code to wined3d. In general tracking these kinds of things down is just a lot of work and fairly hard though.
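As an illustration of the kind of extra timing code meant here, the following is a minimal sketch and not actual wined3d code; all names in it are hypothetical. It brackets an operation with two clocks, wall-clock time and per-thread CPU time. If an operation takes far longer on the wall clock than it consumed in CPU time, the thread was most likely blocked, which is the sort of stall a sampling profiler would not attribute to anything. This assumes the driver sleeps while waiting; a driver that spins would show the wait as CPU time instead.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Start-of-operation timestamps on both clocks. */
struct op_timer {
    struct timespec wall_start;
    struct timespec cpu_start;
};

/* Result of timing one operation. */
struct op_sample {
    int64_t wall_ns;  /* elapsed real time */
    int64_t cpu_ns;   /* CPU time consumed by the calling thread */
};

static int64_t elapsed_ns(const struct timespec *start,
                          const struct timespec *end)
{
    return (int64_t)(end->tv_sec - start->tv_sec) * 1000000000LL
         + (int64_t)(end->tv_nsec - start->tv_nsec);
}

/* Call right before the operation to be measured. */
void op_timer_start(struct op_timer *t)
{
    assert(t != NULL);
    clock_gettime(CLOCK_MONOTONIC, &t->wall_start);
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &t->cpu_start);
}

/* Call right after the operation; fills in `out`. */
void op_timer_stop(const struct op_timer *t, struct op_sample *out)
{
    struct timespec wall_end, cpu_end;

    assert(t != NULL && out != NULL);
    /* Read the CPU clock first so the wall interval encloses the CPU one. */
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &cpu_end);
    clock_gettime(CLOCK_MONOTONIC, &wall_end);
    out->wall_ns = elapsed_ns(&t->wall_start, &wall_end);
    out->cpu_ns = elapsed_ns(&t->cpu_start, &cpu_end);
}

/* Time the thread spent off the CPU (blocked or preempted), clamped at 0. */
int64_t op_blocked_ns(const struct op_sample *s)
{
    int64_t blocked;

    assert(s != NULL);
    blocked = s->wall_ns - s->cpu_ns;
    return blocked > 0 ? blocked : 0;
}
```

One would call op_timer_start() before a draw, blit or buffer upload and op_timer_stop() after it, then log samples where op_blocked_ns() is large. Calling glFinish() before stopping the timer would charge the GPU work to the operation that issued it, at the cost of changing the performance characteristics being measured. On older glibc versions clock_gettime() lives in librt, so linking with -lrt may be needed.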