https://bugs.winehq.org/show_bug.cgi?id=44863
Bug ID: 44863
Summary: Performance regression in Prince of Persia 3D
Product: Wine
Version: unspecified
Hardware: x86
OS: Linux
Status: NEW
Severity: normal
Priority: P2
Component: directx-d3d
Assignee: wine-bugs@winehq.org
Reporter: stefan@codeweavers.com
Distribution: ---
Prince of Persia 3D's performance went from perfectly smooth to about 0.5 fps. I suspect 0b92a6fba7a6e60c6ff1a3729a3b21019c2df0ce is to blame, but I have not run a regression test yet.
The problem is that the game creates a rather large (2MB) D3DVBCAPS_SYSTEMMEMORY vertex buffer, maps it (the entire buffer, due to API limitations), writes a handful of vertices and draws a handful of vertices. Currently wined3d uploads the entire 2MB, evicts the sysmem copy and downloads it from the GPU again on every map / unmap / draw cycle.
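To put rough numbers on that cycle, here is a minimal sketch in C that only counts bytes moved. It is not wined3d code: the struct, the function names, the 100-cycle count and the 4 x 32 byte dirty range are all assumptions made for illustration. The second function models a hypothetical alternative in which the sysmem copy is kept and only the written range is uploaded.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative model only; none of these names are wined3d identifiers. */
struct transfer_cost
{
    size_t upload_bytes;   /* sysmem -> GPU */
    size_t download_bytes; /* GPU -> sysmem */
};

/* Behaviour described in the report: every map / unmap / draw cycle
 * uploads the whole buffer, and because the sysmem copy was evicted,
 * the next map has to download the whole buffer again. */
static struct transfer_cost cost_full_cycle(size_t buffer_size, size_t cycles)
{
    struct transfer_cost c;

    c.upload_bytes = buffer_size * cycles;
    c.download_bytes = buffer_size * cycles;
    return c;
}

/* Hypothetical alternative: the sysmem copy stays valid, so nothing is
 * downloaded, and only the range the application wrote is uploaded. */
static struct transfer_cost cost_dirty_range(size_t buffer_size,
        size_t dirty_bytes, size_t cycles)
{
    struct transfer_cost c;

    assert(dirty_bytes <= buffer_size);
    c.upload_bytes = dirty_bytes * cycles;
    c.download_bytes = 0;
    return c;
}
```

With a 2MB buffer and 100 cycles, the first model moves 200MB in each direction, while the second uploads 12800 bytes in total if each cycle touches four 32-byte vertices (an assumed vertex size).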
The most obvious performance fix is not to create a VBO. Doing this restores the performance, but questions remain.
On startup, the game writes "NetImmerse D3DDriver Info: Hardware supports system memory textures" and "NetImmerse D3DDriver Info: No AGP support detected". The first message seems wrong, so it is possible that the game enters a codepath it does not choose on Windows.
Not creating a VBO is not an option on Core Contexts, so I investigated what's going wrong with the PBO codepath on. First of all, evicting the sysmem copy seems like a bad choice. It happens because ddraw buffers are not marked dynamic. We may want to change this. The game uses d3d3, so there's no DDLOCK_DISCARDCONTENTS. The game passes DDLOCK_WAIT | DDLOCK_WRITEONLY to IDirect3DVertexBuffer::Lock.
Commenting out the eviction call improves performance quite a bit, but it is still noticeably slow. wined3d_buffer_map maps through heap_memory instead of glMapBuffer because of the "(flags & WINED3D_MAP_WRITE) && !(flags & (WINED3D_MAP_NOOVERWRITE | WINED3D_MAP_DISCARD))" condition.
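The quoted condition can be read as a small predicate. The sketch below restates it in standalone C; the SKETCH_MAP_* bit values are placeholders chosen for this example (the real WINED3D_MAP_* definitions live in wined3d's headers and may differ), and the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values, for illustration only. They stand in for
 * WINED3D_MAP_WRITE, WINED3D_MAP_NOOVERWRITE and WINED3D_MAP_DISCARD;
 * the real values are defined by wined3d and are not reproduced here. */
#define SKETCH_MAP_WRITE       0x1u
#define SKETCH_MAP_NOOVERWRITE 0x2u
#define SKETCH_MAP_DISCARD     0x4u

/* Hypothetical helper: true when the quoted condition holds, i.e. the
 * map is a write without either the NOOVERWRITE or the DISCARD hint. */
static bool sketch_maps_through_sysmem(uint32_t flags)
{
    return (flags & SKETCH_MAP_WRITE)
            && !(flags & (SKETCH_MAP_NOOVERWRITE | SKETCH_MAP_DISCARD));
}
```

A plain write map satisfies the predicate, which matches the report: the game's Lock call carries neither hint, so the map goes through heap memory. Adding either hint makes the predicate false.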
With this condition removed, the map goes through glMapBuffer, but performance does not improve. It seems the large glMapBuffer is still slow, at least on OSX with legacy contexts.
So there are a few questions that need to be answered:
*) Is the game using a broken codepath?
*) Write tests for sysmem buffers
*) Consider making all d3d3 buffers dynamic
*) Test if the glMapBuffer path is fast on Linux
*) Investigate if Core Contexts + GL_ARB_buffer_storage help on OSX.