https://bugs.winehq.org/show_bug.cgi?id=39421
Bug ID: 39421
Summary: Majesty Gold HD runs very slowly
Product: Wine
Version: 1.7.52
Hardware: x86
OS: Mac OS X
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: -unknown
Assignee: wine-bugs@winehq.org
Reporter: jonas.bugzilla@gmail.com
Created attachment 52535 --> https://bugs.winehq.org/attachment.cgi?id=52535 Clamp the glFlush rate in the Mac driver to 60Hz
The game Majesty Gold HD (http://www.gog.com/game/majesty_gold_hd ) runs very slowly under Wine for me (Late 2013 iMac 21.5", quad 2.9GHz Core i5, GeForce GT 750M with 1GB GDDR5, OS X 10.9.5). There is an in-game slider to speed up the game, but it has no effect.
The game is a DirectDraw 7 game. There are also some reports on Windows regarding this problem, but most people can solve it by telling the game to use DirectDraw for blitting (set BlitMode=1 in the MajXPrefs, or use the -useddblit command line parameter -- see http://forum.paradoxplaza.com/forum/index.php?threads/bug-fixes.601217/#post... )
The game performs its drawing commands in the same thread that runs the game logic (I don't know whether anything else is even supported by DirectDraw 7). From a trace I can see that the game performs many small blits to update only the parts of the screen that have changed. Example from a trace:
trace:ddraw:ddraw_surface7_Blt iface 0x16c160, dst_rect (0,0)-(24,38), src_surface 0x16c9c8, src_rect (0,0)-(24,38), flags 0x1000000, fx 0x0.
[ repeats 8 times ]
trace:ddraw:ddraw_surface7_Blt iface 0x16c160, dst_rect (864,661)-(1009,803), src_surface 0x16c9c8, src_rect (864,661)-(1009,803), flags 0x1000000, fx 0x0.
trace:ddraw:ddraw_surface7_Blt iface 0x16c160, dst_rect (806,999)-(986,1080), src_surface 0x16c9c8, src_rect (806,999)-(986,1080), flags 0x1000000, fx 0x0.
trace:ddraw:ddraw_surface7_Blt iface 0x16c160, dst_rect (1097,734)-(1277,914), src_surface 0x16c9c8, src_rect (1097,734)-(1277,914), flags 0x1000000, fx 0x0.
trace:ddraw:ddraw_surface7_Blt iface 0x16c160, dst_rect (860,481)-(1040,661), src_surface 0x16c9c8, src_rect (860,481)-(1040,661), flags 0x1000000, fx 0x0.
trace:ddraw:ddraw_surface7_Blt iface 0x16c160, dst_rect (860,481)-(1040,661), src_surface 0x16c9c8, src_rect (860,481)-(1040,661), flags 0x1000000, fx 0x0.
trace:ddraw:ddraw_surface7_Blt iface 0x16c160, dst_rect (1354,797)-(1534,977), src_surface 0x16c9c8, src_rect (1354,797)-(1534,977), flags 0x1000000, fx 0x0.
...
I think Wine, however, always flushes a complete new screen via OpenGL, as the game spends almost all of its time waiting for glFlush() to finish. I have verified that this is the case by applying the attached patch to the Mac driver code to unconditionally clamp the glFlush rate to 60Hz. This solves the speed issue, even at 1920x1080 (without the patch, the game is glacial even at 800x600). Obviously, the patch cannot be integrated in wine, since it occasionally discards the "last" blit to screen after a static scene change, leaving old data visible and no longer updating.
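To illustrate why a plain rate clamp can lose the final update, here is a minimal sketch of such a limiter. It is not the attached patch: the names `flush_limiter` and `limiter_should_flush` are hypothetical, and the current time is passed in as microseconds so the logic can be followed without any OpenGL calls.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the attached patch. */
#define MIN_FLUSH_INTERVAL_US (1000000 / 60)  /* allow roughly 60 flushes per second */

struct flush_limiter
{
    int64_t last_flush_us;  /* time of the last flush that was let through */
    bool    has_flushed;    /* false until the first flush has happened */
    bool    dropped;        /* a request was discarded since the last real flush */
};

/* Returns true if the caller should really call glFlush() at time now_us. */
static bool limiter_should_flush(struct flush_limiter *l, int64_t now_us)
{
    if (l->has_flushed && now_us - l->last_flush_us < MIN_FLUSH_INTERVAL_US)
    {
        l->dropped = true;  /* too soon: this flush request is thrown away */
        return false;
    }
    l->last_flush_us = now_us;
    l->has_flushed = true;
    l->dropped = false;
    return true;
}
```

If the discarded request happened to follow the last blit of a scene change, `dropped` stays set and nothing ever pushes that blit to the screen, which matches the stale-image symptom described above.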
I previously posted about this issue (a long time ago) on wine-dev: https://www.winehq.org/pipermail/wine-devel/2015-February/106625.html . To answer some questions/suggestions from the end of that thread:
a) the SkipSingleBufferFlushes registry key does not help (my system supports the GL_APPLE_flush_render extension)
b) related to Henri Verbeet's suggestion that it may be related to wine possibly not implementing asynchronous ddraw blits: it's true that wine does not support this, but OTOH the game does not ask for them either: it calls ddraw_surface7_Blt rather than ddraw_surface7_BltFast (BltFast is documented on MSDN as always being asynchronous), and it does not set the DDBLT_ASYNC flag
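The flag value from the trace (`flags 0x1000000`) can be decoded by hand. The sketch below assumes the two constants have the values I recall from the DirectDraw header `ddraw.h`; please verify them against your own copy. The helper function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values quoted from memory of ddraw.h; treat them as an assumption to verify. */
#define DDBLT_ASYNC 0x00000200u
#define DDBLT_WAIT  0x01000000u

/* Hypothetical helpers that test individual bits of the Blt flags. */
static bool blt_requests_async(uint32_t flags)
{
    return (flags & DDBLT_ASYNC) != 0;
}

static bool blt_requests_wait(uint32_t flags)
{
    return (flags & DDBLT_WAIT) != 0;
}
```

Under that assumption, the traced value 0x1000000 has only the DDBLT_WAIT bit set and the DDBLT_ASYNC bit clear.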
The game offers a built-in ability to wait for vsync via its configuration file (set the VSync variable in $HOME/Documents/My Games/MajestyHD/MajXPrefs to "1"), but that only results in
trace:ddraw:ddraw7_WaitForVerticalBlank iface 0x146750, flags 0x1, event 0x0.
fixme:ddraw:ddraw7_WaitForVerticalBlank iface 0x146750, flags 0x1, event 0x0 stub!
I'm also not sure whether it would help, but looking at the ddraw log (which I will also attach -- it's a trace+ddraw,trace+d3d_draw log), I think it may since often there are several blits between the vsync waits.
Some possible approaches to tackle this issue may be:
* handle small direct rects more efficiently
* add support for waiting for VSync in ddraw
There is a demo of an older version of the game, but it can only run in 640x480 and there is no way to adjust the game speed via an in-game slider, and at least on my system the difference with and without my patch is not very visible there (http://www.cyberlore.com/Majesty/demo.htm )
--- Comment #1 from Jonas Maebe jonas.bugzilla@gmail.com --- Created attachment 52536 --> https://bugs.winehq.org/attachment.cgi?id=52536 trace+ddraw,trace+d3d_draw log
--- Comment #2 from Jonas Maebe jonas.bugzilla@gmail.com --- Oh, and I didn't test with 1.7.52 exactly, but with git HEAD (which is/was d548639d977dee847350b408aec9522d68aef813 at that point)
--- Comment #3 from Jonas Maebe jonas.bugzilla@gmail.com ---
> I'm also not sure whether it would help, but looking at the ddraw log (which I will also attach -- it's a trace+ddraw,trace+d3d_draw log), I think it may since often there are several blits between the vsync waits.
Well, at least if Wine is allowed to and would queue flushes until a request to wait for the vsync is fired...
Jonas Maebe jonas.bugzilla@gmail.com changed:
What     |Removed  |Added
----------------------------------------------------------------------------
Component|-unknown |directx-d3d
Stefan Dösinger stefan@codeweavers.com changed:
What |Removed |Added
----------------------------------------------------------------------------
CC   |        |stefan@codeweavers.com
--- Comment #4 from Stefan Dösinger stefan@codeweavers.com --- Does the game draw to the front buffer? We have to flush after a front buffer draw to make the changes show up on the screen. They're supposed to show up immediately, there is no "I am done drawing now, please show this" for blits to the front buffer. So we can't really queue things up and skip flushes if there are 300 / second, because we never know if there will be a draw number 301.
--- Comment #5 from Jonas Maebe jonas.bugzilla@gmail.com --- (In reply to Stefan Dösinger from comment #4)
> Does the game draw to the front buffer?
Yes, I think the game blits to the frontbuffer:
trace:ddraw:ddraw_surface7_GetSurfaceDesc iface 0x16c160, surface_desc 0x33fb94.
trace:ddraw:ddraw_surface7_GetSurfaceDesc Returning surface desc:
trace:ddraw:DDRAW_dump_members - DDSD_CAPS : DDSCAPS_COMPLEX DDSCAPS_FLIP *DDSCAPS_FRONTBUFFER* DDSCAPS_PRIMARYSURFACE DDSCAPS_3DDEVICE DDSCAPS_VIDEOMEMORY DDSCAPS_VISIBLE DDSCAPS_LOCALVIDMEM
And all blits go to that same interface:
trace:ddraw:ddraw_surface7_Blt iface 0x16c160 ...
So you would indeed get the same effect as with my glFlush() clamping patch if you queued them, unless you also associated a timer with them and flushed anyway if no new flush/blit has arrived within 1/18th of a second. The overhead of setting up and removing so many timers all the time may also be significant though, especially with such short timeouts (although if you don't mind some delayed updates occasionally, you could increase their timeouts).
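The queue-plus-timeout idea can be sketched without creating and destroying a timer per blit: keep one "pending" flag and the time of the most recent request, and check them from whatever periodic hook is available. This is only an illustration under that assumption; `deferred_flush`, its functions and the polling hook are hypothetical and do not exist in Wine.

```c
#include <stdbool.h>
#include <stdint.h>

/* 1/18th of a second, the timeout suggested in the text, in microseconds. */
#define IDLE_TIMEOUT_US (1000000 / 18)

struct deferred_flush
{
    bool    pending;          /* a flush was requested and not yet performed */
    int64_t last_request_us;  /* time of the most recent flush request */
};

/* Called instead of flushing immediately after a front buffer blit. */
static void deferred_flush_request(struct deferred_flush *d, int64_t now_us)
{
    d->pending = true;
    d->last_request_us = now_us;  /* each new request re-arms the timeout */
}

/* Called periodically; returns true when the queued flush should be issued. */
static bool deferred_flush_poll(struct deferred_flush *d, int64_t now_us)
{
    if (!d->pending || now_us - d->last_request_us < IDLE_TIMEOUT_US)
        return false;
    d->pending = false;
    return true;
}
```

On its own this only covers the "no further blit arrived" case; a steady stream of blits would still need the rate clamp to get anything on screen.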
The only other alternative solution I can think of would be to add dirty rects (not "direct rects" as I mistakenly wrote above) handling, so that not the entire screen is flushed every time when only parts have been updated.
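As a rough sketch of what dirty rect tracking could look like, the code below accumulates the bounding box of all blits since the last present. The types and function names are hypothetical, and whether the Mac driver could actually present only such a region is not established here.

```c
#include <stdbool.h>

/* Rectangle with exclusive right/bottom edges, like the rects in the trace. */
struct rect { int left, top, right, bottom; };

struct dirty_region
{
    bool        empty;   /* true when nothing has been touched yet */
    struct rect bounds;  /* bounding box of everything touched so far */
};

static void dirty_reset(struct dirty_region *d)
{
    d->empty = true;
}

/* Grow the bounding box so that it also covers r. */
static void dirty_add(struct dirty_region *d, struct rect r)
{
    if (r.left >= r.right || r.top >= r.bottom)
        return;  /* ignore empty rectangles */
    if (d->empty)
    {
        d->bounds = r;
        d->empty = false;
        return;
    }
    if (r.left   < d->bounds.left)   d->bounds.left   = r.left;
    if (r.top    < d->bounds.top)    d->bounds.top    = r.top;
    if (r.right  > d->bounds.right)  d->bounds.right  = r.right;
    if (r.bottom > d->bounds.bottom) d->bounds.bottom = r.bottom;
}
```

A single bounding box over-approximates when the blits are far apart; a list of rectangles would be tighter at the cost of more bookkeeping.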
--- Comment #6 from Stefan Dösinger stefan@codeweavers.com --- I don't see how we can do partial flushes. glFlush doesn't take any arguments.
One idea that has been floating around for a long time is to have a thread in wined3d that draws the front buffer to the screen every 1/60th of a second, and otherwise we don't do anything special about front buffer modifications. I guess this would fix this game here, but it isn't really trivial to implement.
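The hand-off between the drawing thread and such a presenter thread could, in its simplest form, be a single atomic flag. The sketch below shows only that hand-off using C11 atomics; the `presenter` type and its functions are hypothetical, and the hard parts (the thread itself, GL context sharing, the actual draw) are deliberately left out.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct presenter
{
    atomic_bool dirty;  /* set by the drawing thread, cleared by the presenter */
};

/* Drawing thread: called after every front buffer modification. */
static void presenter_mark_dirty(struct presenter *p)
{
    atomic_store(&p->dirty, true);
}

/* Presenter thread: called once per 1/60th of a second.
 * Returns true if the front buffer changed since the previous tick. */
static bool presenter_tick(struct presenter *p)
{
    return atomic_exchange(&p->dirty, false);
}
```

With this shape, 300 modifications between two ticks result in one present on the next tick, and a lone final modification is still picked up because the flag persists until a tick consumes it.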
--- Comment #7 from Jonas Maebe jonas.bugzilla@gmail.com --- (In reply to Stefan Dösinger from comment #6)
> I don't see how we can do partial flushes. glFlush doesn't take any arguments.
I guess it would boil down to implementing double buffering ourselves: always keep a copy of the previous image in VRAM, and then composite that one (on the card) with textures created from the new partial blits (which come from system memory and which only cover part of the screen).
There should be no need for transfers from VRAM back to system memory in this scenario, nor for any dirty rects testing by Wine, as the blits tell you exactly where the dirty rects are. Unless, of course, you have to take into account that the frontbuffer may also be modified by other means than ddraw blits. However, since forcing the game to always use ddraw blits resolves the speed issue on Windows, I guess this can somehow be detected/known.
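The principle of "previous frame plus partial blits" can be shown with a CPU-side model. This is only an analogy for the GPU compositing proposed above: the function name is hypothetical, pixels are plain 32-bit values, and source and destination rectangles are assumed to coincide, as they do in the trace.

```c
#include <stdint.h>
#include <string.h>

/* Copy a w x h block at (x, y) from src into the persistent frame.
 * Both buffers have `pitch` pixels per row; pixels outside the block
 * keep whatever the previous frame contained. */
static void apply_partial_blit(uint32_t *frame, const uint32_t *src,
                               int pitch, int x, int y, int w, int h)
{
    for (int row = 0; row < h; ++row)
    {
        size_t offset = (size_t)(y + row) * (size_t)pitch + (size_t)x;
        memcpy(frame + offset, src + offset, (size_t)w * sizeof(*frame));
    }
}
```

Only `w * h` pixels are touched per blit, so the cost scales with the size of the change and not with the size of the screen.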
> One idea that has been floating around for a long time is to have a thread in wined3d that draws the front buffer to the screen every 1/60th of a second, and otherwise we don't do anything special about front buffer modifications. I guess this would fix this game here, but it isn't really trivial to implement.
I can imagine.
--- Comment #8 from Jonas Maebe jonas.bugzilla@gmail.com --- (In reply to Jonas Maebe from comment #7)
> I guess it would boil down to implementing double buffering ourselves: always keep a copy of the previous image in VRAM, and then composite that one (on the card) with textures created from the new partial blits (which come from system memory and which only cover part of the screen).
As may be clear from the above, until now I know/knew very little about either DirectDraw or OpenGL (other than high level concepts). I've been reading up a bit on both, and disregarding the adage that "a little knowledge is worse than none", here's what I understood from it:
1) a ddraw surface can be updated in two ways: either you lock the surface manually and directly manipulate the pixel data, or you use the ddraw blit functions. In the former case, ddraw has no clue what exactly changed, in the latter case it knows exactly what changed.
2) you could have an OpenGL FBO, three texture images and a single render buffer image. Initially, you make texture image 1 all black. You assign texture image 2 via a texture object to one of the color attachments of the FBO, and the render buffer to the depth attachment. Then you use the following logic after every blit:
previous_frame_tex = texture_image_1;
blit_data_tex = texture_image_2;
new_frame_tex = texture_image_3;

/* render the new frame into a texture ... */
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, new_frame_tex, 0);
/* ... with the render buffer as its depth attachment */
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_RENDERBUFFER, render_buffer);
/* bind the textures with data from the previous frame and the new blit */
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, previous_frame_tex);
glActiveTexture(GL_TEXTURE1);
glBindTexture(GL_TEXTURE_2D, blit_data_tex);
...
/* render */
...
/* flip FBO to frontbuffer */
...
/* flush */
glFlush();

/* set newly rendered frame as "previous frame" for the next invocation */
texture_image_1 = new_frame_tex;
/* switch the texture ID for the next new frame */
texture_image_3 = previous_frame_tex;
Now,
a) I don't know what the conditions are under which you can be confident that the frontbuffer will never be changed by anything else but ddraw (so that you can in fact use the rendering outcome of the previous frame as basis for a new one). In the case of this game, we always take the wined3d_surface_blt() path in ddraw_surface_update_frontbuffer(), whose comment says "Nothing to do, we control the frontbuffer, or at least the parts we care about.", so that suggests to me that it may be okay (along with the fact that it runs quickly on native, and that it presumably does something similar there)
b) I don't know in what way the wined3d internals can conflict with this approach
c) I don't know what wined3d does exactly currently that makes it so slow. Is it just blitting a complete new screen after every (small) blit, or is it in addition also getting the data from the frontbuffer for every frame? I think it's the former, but I'm not sure.
d) probably other things I haven't thought of
I'll also attach the profiling data for the drawing thread. Some notes about this data. Note that it's sampling based, but the percentages *include* time spent blocked for kernel operations to complete. That's why the thread is recorded as using 8.5% of the total sampled time, as there are 12 "full time" threads in the program (I guess it's rounded up) and e.g. even threads that do almost nothing but spend time waiting for a select call to finish, are counted just as much as a thread that is constantly calculating.
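As a rough check of that figure, assuming the profiler splits the sampled time evenly over the 12 threads (an assumption about the tool, not something its output states), one thread's share would be:

```latex
\[
  \frac{100\%}{12} \approx 8.33\%
\]
```

That is close to the reported 8.5%; the small difference is consistent with the rounding guessed at above, though the tool's rounding behaviour is not confirmed.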
The reason I'm attaching this data, is that it nicely illustrates how the game is indeed constantly waiting for glFlush() to finish in the main drawing/game logic thread. It also shows which paths it takes through ddraw and wined3d.
--- Comment #9 from Jonas Maebe jonas.bugzilla@gmail.com --- Created attachment 52585 --> https://bugs.winehq.org/attachment.cgi?id=52585 Sampling profile trace of the ddraw/game logic thread
As mentioned in the comment referring to this profiling data: note that it's sampling based, but the percentages *include* time spent blocked for kernel operations to complete. That's why the thread is recorded as using 8.5% of the total sampled time, as there are 12 "full time" threads in the program (I guess it's rounded up) and e.g. even threads that do almost nothing but spend time waiting for a select call to finish, are counted just as much as a thread that is constantly calculating.
--- Comment #10 from Jonas Maebe jonas.bugzilla@gmail.com --- (In reply to Jonas Maebe from comment #9)
> Created attachment 52585 [details] Sampling profile trace of the ddraw/game logic thread
One more thing: this profiling information is without the patch to clamp the glFlush rate.
--- Comment #11 from Jonas Maebe jonas.bugzilla@gmail.com --- I looked a bit through the wined3d code this weekend, and it seems it already partly works the way I described above. The main difference is that it does get and convert the contents of the current VRAM frontbuffer into a texture before compositing it with the new blitted data, rather than keeping the result of the previous rendering into a texture.
Since the location of a surface is a bitmask, it seems that in theory it should be possible to simultaneously keep the frontbuffer's contents both in system memory (for use by the ddraw application in case it locks it manually) and in a texture (for use during the next rendering).
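A bitmask makes "valid in several places at once" cheap to express. The sketch below uses made-up flag names and a made-up `surface_model` type; it does not reproduce wined3d's actual constants or data structures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical location flags, one bit per place the pixels can live. */
#define LOC_SYSMEM   0x1u
#define LOC_TEXTURE  0x2u
#define LOC_DRAWABLE 0x4u

struct surface_model
{
    uint32_t valid_locations;  /* set of locations holding up-to-date data */
};

/* Mark loc as holding current data without touching the other locations. */
static void surface_validate(struct surface_model *s, uint32_t loc)
{
    s->valid_locations |= loc;
}

/* Mark loc as stale, e.g. after another location was written to. */
static void surface_invalidate(struct surface_model *s, uint32_t loc)
{
    s->valid_locations &= ~loc;
}

/* True if every location in loc is currently up to date. */
static bool surface_has(const struct surface_model *s, uint32_t loc)
{
    return (s->valid_locations & loc) == loc;
}
```

In this model, uploading the system memory copy to a texture would validate `LOC_TEXTURE` while leaving `LOC_SYSMEM` set, so both the application's lock path and the next render could use current data.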
--- Comment #12 from Stefan Dösinger stefan@codeweavers.com --- The way I read your previous findings is that the slowdown isn't caused by sysmem -> video memory image transfers, but by compositing the rendered image into the quartz display output. This is also confirmed by your winemac.drv based hack working. If you skip glFlush calls in winemac.drv you're in no way changing how wined3d transfers the image to OpenGL.
There are plenty of improvements in the way wined3d handles framebuffer maps that can be done that help some games (and potentially hurt others), but I doubt this is the real issue here.
You can try to use Quartz Debug (part of the graphics developer utilities) to force beam sync off and see if that improves rendering speed.
--- Comment #13 from Jonas Maebe jonas.bugzilla@gmail.com --- (In reply to Stefan Dösinger from comment #12)
> The way I read your previous findings is that the slowdown isn't caused by sysmem -> video memory image transfers, but by compositing the rendered image into the quartz display output. This is also confirmed by your winemac.drv based hack working. If you skip glFlush calls in winemac.drv you're in no way changing how wined3d transfers the image to OpenGL.
That's true. In fact, if I remove the calls to glFlush altogether, it's even smoother and faster. You do have to wait occasionally for an interface element to appear in that case though (e.g. for the menu when pressing escape), but while playing the game still updates in real time (I guess because it constantly sends new data that causes the graphics pipeline to be flushed all the time).
> You can try to use Quartz Debug (part of the graphics developer utilities) to force beam sync off and see if that improves rendering speed.
No effect.
--- Comment #14 from Jonas Maebe jonas.bugzilla@gmail.com --- One other thing I tried yesterday: call glFlush once a second, in order to have a fixed upper bound for those UI elements to appear. This had a (to me) quite unexpected effect: about every second, the screen contents suddenly changed to those from somewhere within the past second for a very short while, and then continued updating normally. I have no idea what could cause that.
--- Comment #15 from Jonas Maebe jonas.bugzilla@gmail.com --- Just wanted to add that 3beec95a092a261f3c265fd30a10e0c0ead524bc "winemac: Use CVDisplayLink to limit window redrawing to the display refresh rate. (try 2)" by Ken Thomases doesn't help (in fullscreen mode; in windowed mode, the screen stays black, but I believe that was already the case before that commit).
--- Comment #16 from Ken Thomases ken@codeweavers.com --- (In reply to Jonas Maebe from comment #15)
> Just wanted to add that 3beec95a092a261f3c265fd30a10e0c0ead524bc "winemac: Use CVDisplayLink to limit window redrawing to the display refresh rate. (try 2)" by Ken Thomases doesn't help (in fullscreen mode; in windowed mode, the screen stays black, but I believe that was already the case before that commit).
Right. That wasn't expected to help this. That one is still about GDI drawing, not GL/D3D drawing.
The display link might form the basis of a future timer-based flushing mechanism, though. However, that would require multi-thread GL access which in turn requires either synchronization or render-to-texture and shared contexts.
--- Comment #17 from Jonas Maebe jonas.bugzilla@gmail.com --- (In reply to Ken Thomases from comment #16)
> Right. That wasn't expected to help this. That one is still about GDI drawing, not GL/D3D drawing.
FWIW, with DirectDrawRenderer=gdi it also doesn't seem to have (much of) an effect (just mentioning this in case it might have affected that, as I don't know whether the D3D->GDI path uses the same flushing/updating functionality as plain GDI).
> The display link might form the basis of a future timer-based flushing mechanism, though. However, that would require multi-thread GL access which in turn requires either synchronization or render-to-texture and shared contexts.
Ok.
--- Comment #18 from Ken Thomases ken@codeweavers.com --- (In reply to Jonas Maebe from comment #17)
> (In reply to Ken Thomases from comment #16)
> > Right. That wasn't expected to help this. That one is still about GDI drawing, not GL/D3D drawing.
> FWIW, with DirectDrawRenderer=gdi it also doesn't seem to have (much of) an effect (just mentioning this in case it might have affected that, as I don't know whether the D3D->GDI path uses the same flushing/updating functionality as plain GDI).
What version of OS X are you running? I've recently found that OS X 10.11 (El Capitan) introduced an issue where programs which overdrive the Cocoa display mechanism have problems due to synchronizing with the window server's refresh cycle. That's largely what motivated the CVDisplayLink stuff.
So, I'd expect that the game would have encountered problems with DirectDrawRenderer=gdi on El Capitan prior to that commit and be back to normal with that commit.
Sergey Isakov isakov-sl@bk.ru changed:
What |Removed |Added
----------------------------------------------------------------------------
CC   |        |isakov-sl@bk.ru
--- Comment #19 from Sergey Isakov isakov-sl@bk.ru --- Tested the patch in OSX 10.7.5 and 10.9.5 with wine-1.8. I found it made things worse for most games: Tumblebugs 2, Severance Demo, NitroFamily Demo...
Stefan Dösinger stefandoesinger@gmx.at changed:
What |Removed                |Added
----------------------------------------------------------------------------
CC   |stefan@codeweavers.com |stefandoesinger@gmx.at
joaopa jeremielapuree@yahoo.fr changed:
What |Removed |Added
----------------------------------------------------------------------------
CC   |        |jeremielapuree@yahoo.fr
--- Comment #20 from joaopa jeremielapuree@yahoo.fr --- Is this still a bug in current wine (3.11)?
--- Comment #21 from joaopa jeremielapuree@yahoo.fr --- No news from the reporter for 6 years. No download available. Can an administrator close this bug as ABANDONED?
--- Comment #22 from joaopa jeremielapuree@yahoo.fr --- No one interested in closing this bug as ABANDONED?