Hi, Now that the new ddraw code in my tree is almost feature-complete, I decided to take a look at opengl-accelerated DirectDraw. I have already implemented Blts from and to render targets for Direct3D games like Tomb Raider 3 and Swat3, so I thought it would work. Palettized surfaces (d3d demos) and color keying (Moto Racer 2) are also in place.
Well, I started Anno 1602 with opengl surfaces, got a black screen and looked at the logs. The basic problem is that DirectDraw games tend to do a lot of 'nasty' things. (The same applies to d3d games which do ddraw rendering.)
One of these nasty things is GetDC on the render target. Age of Empires 2 and Settlers 3 do that to draw text. This is going to be unplayably slow, I'm afraid. Wine will need a DIB engine for that to work.
Another issue is 'stupid' rendering. I tested Tibia 3 (d3d with a ddraw main menu). It creates a single-buffered, non-d3d primary and an offscreen d3d surface. Now it blts to the offscreen device (which is the wined3d back buffer, no problem), reads everything to a non-d3d offscreen surface and blts the result to the front buffer. This can work, but it needs some pretty good opengl magic to work at a usable speed. (At the moment it just produces a black screen.)
Swat3 showed another problem: It renders the main menu, mouse pointer and the in-game HUD with ddraw blts, which basically works just great. In opengl the main menu renders at ~500 fps, instead of ~40 fps with the old gdi code. But whenever I move the mouse pointer over a control, it starts locking all its surfaces in read-write mode, which requires them to be reloaded to gl and takes performance down to 2-3 fps. This is made worse by the surface conversion for color keying emulation.
I expect opengl-accelerated DirectDraw to take some more time, and there will be some games (the GetDC ones) which are unlikely to work well at all. I will make some modifications to the WineD3D texture loading code to only upload changed areas, which should ease these problems a bit.
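To illustrate the "only upload changed areas" idea, here is a minimal sketch in C of tracking a dirty bounding rectangle per surface. All names (`surface`, `dirty_rect`, `surface_add_dirty` and so on) are made up for this example and are not WineD3D's actual structures. The assumption is that each write lock reports the rectangle it covered, and that the accumulated rectangle would later be handed to something like glTexSubImage2D instead of re-uploading the whole texture.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration only, not WineD3D's real layout. */
typedef struct {
    int left, top, right, bottom;   /* right and bottom are exclusive */
    int valid;                      /* 0 while nothing is dirty */
} dirty_rect;

typedef struct {
    int width, height;
    int bytes_per_pixel;
    dirty_rect dirty;
} surface;

/* Grow the dirty region so it also covers a rectangle locked for writing. */
static void surface_add_dirty(surface *s, int left, int top, int right, int bottom)
{
    assert(left >= 0 && top >= 0 && right <= s->width && bottom <= s->height);
    assert(left < right && top < bottom);

    if (!s->dirty.valid) {
        s->dirty.left = left;
        s->dirty.top = top;
        s->dirty.right = right;
        s->dirty.bottom = bottom;
        s->dirty.valid = 1;
        return;
    }
    /* Keep a single bounding box; simple, but may over-estimate the area. */
    if (left < s->dirty.left) s->dirty.left = left;
    if (top < s->dirty.top) s->dirty.top = top;
    if (right > s->dirty.right) s->dirty.right = right;
    if (bottom > s->dirty.bottom) s->dirty.bottom = bottom;
}

/* Bytes a sub-rectangle upload would transfer; 0 if the surface is clean. */
static size_t surface_upload_size(const surface *s)
{
    if (!s->dirty.valid)
        return 0;
    return (size_t)(s->dirty.right - s->dirty.left)
         * (size_t)(s->dirty.bottom - s->dirty.top)
         * (size_t)s->bytes_per_pixel;
}

/* Call after the dirty region has been uploaded to gl. */
static void surface_mark_clean(surface *s)
{
    s->dirty.valid = 0;
}
```

A single bounding box is the simplest choice; two small writes in opposite corners would make it cover nearly the whole surface, so a real implementation might want a short list of rectangles instead.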
Stefan
Stefan Dösinger wrote:
> One of these nasty things is GetDC on the render target. Age of Empires 2 and Settlers 3 do that to draw text. This is going to be unplayably slow, I'm afraid. Wine will need a DIB engine for that to work.
I'm a bit unfamiliar with win32 but I know opengl; what exactly is happening in this case? Does this cause it to revert to GDI for font drawing or ... ?
> Swat3 showed another problem: It renders the main menu, mouse pointer and the in-game HUD with ddraw blts, which basically works just great. In opengl the main menu renders at ~500 fps, instead of ~40 fps with the old gdi code. But whenever I move the mouse pointer over a control, it starts locking all its surfaces in read-write mode, which requires them to be reloaded to gl and takes performance down to 2-3 fps. This is made worse by the surface conversion for color keying emulation.
Why does the locking require reuploading the texture? Have you tried always having textures uploaded the way you do for read-write surfaces? The performance hit may be small.
> I expect opengl-accelerated DirectDraw to take some more time, and there will be some games (the GetDC ones) which are unlikely to work well at all. I will make some modifications to the WineD3D texture loading code to only upload changed areas, which should ease these problems a bit.
How are you planning to do that without keeping the texture in memory after it's loaded into the card?
On Saturday, 6 May 2006 19:57, Joseph Garvin wrote:
> Stefan Dösinger wrote:
>> One of these nasty things is GetDC on the render target. Age of Empires 2 and Settlers 3 do that to draw text. This is going to be unplayably slow, I'm afraid. Wine will need a DIB engine for that to work.
> I'm a bit unfamiliar with win32 but I know opengl; what exactly is happening in this case? Does this cause it to revert to GDI for font drawing or ... ?
When GetDC is called, the surface has to be copied into a DIB section in main memory, then a DC is created for it. On UnlockRect, when the app is done with the DC, the contents of the DIB section are written back to gl.
For render targets this means that the front / back buffer has to be read with glReadPixels, and later written back with glDrawPixels or by drawing a textured quad.
Without a DIB engine, the basic GetDC problem applies: The surface has to be converted into the X server's color depth and copied into a server-side bitmap. After the drawing operations, it is copied back to be uploaded to gl.
>> Swat3 showed another problem: It renders the main menu, mouse pointer and the in-game HUD with ddraw blts, which basically works just great. In opengl the main menu renders at ~500 fps, instead of ~40 fps with the old gdi code. But whenever I move the mouse pointer over a control, it starts locking all its surfaces in read-write mode, which requires them to be reloaded to gl and takes performance down to 2-3 fps. This is made worse by the surface conversion for color keying emulation.
> Why does the locking require reuploading the texture? Have you tried always having textures uploaded the way you do for read-write surfaces? The performance hit may be small.
When the texture is locked read-write by the app, it is marked dirty, and the next time it is used, the opengl texture is updated. Texture uploads take some time, and usually they are done only once while the game loads. However, Swat3 locks the surfaces it uses for blts on every frame, but without changing anything.
This requires not only a glTexSubImage call for each texture modification; I also have to do some texture conversion to emulate color keying (converting the 565 rgb texture to a 1555 argb texture).
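As a rough illustration of that conversion step, here is a small C sketch that turns RGB565 pixels into ARGB1555, clearing the alpha bit for pixels that match the color key. The function names and the exact bit handling are assumptions made for this example, not the code in my tree; in particular it simply drops the lowest green bit, so the conversion is slightly lossy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert one RGB565 pixel to ARGB1555 (hypothetical helper).
 * Pixels equal to color_key get alpha 0, everything else alpha 1. */
static uint16_t rgb565_to_argb1555(uint16_t px, uint16_t color_key)
{
    uint16_t r = (px >> 11) & 0x1f;
    uint16_t g = (px >> 6) & 0x1f;   /* top 5 of the 6 green bits */
    uint16_t b = px & 0x1f;
    uint16_t a = (px == color_key) ? 0 : 1;

    return (uint16_t)((a << 15) | (r << 10) | (g << 5) | b);
}

/* Convert a whole surface of `count` pixels from src into dst. */
static void convert_colorkeyed_surface(const uint16_t *src, uint16_t *dst,
                                       size_t count, uint16_t color_key)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < count; i++)
        dst[i] = rgb565_to_argb1555(src[i], color_key);
}
```

With alpha test or blending enabled, the keyed pixels then drop out when the texture is drawn. Touching every pixel like this on each upload is what makes the per-frame re-upload so expensive.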
>> I expect opengl-accelerated DirectDraw to take some more time, and there will be some games (the GetDC ones) which are unlikely to work well at all. I will make some modifications to the WineD3D texture loading code to only upload changed areas, which should ease these problems a bit.
> How are you planning to do that without keeping the texture in memory after it's loaded into the card?
I will keep some counters in the surface, counting
 * how often the surface is locked
 * how often the surface is actually changed

If a surface isn't locked often (e.g. fewer than 5 times), its memory can be freed. If it is locked more often, the memory won't be freed, which saves the time spent reading the opengl surface back. To find out whether a surface was really changed, I can either keep a copy of the old content around or use some hashing method. A hash or memcmp takes time too, so if it turns out that the surface is locked very often without being changed, the comparison can be skipped and the surface isn't uploaded. Of course it still has to be checked now and then, in case it really is modified at some point. On the other hand, if the surface is changed on every lock, the comparison can be skipped too and the new surface is just uploaded to gl.
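The counter scheme above can be sketched roughly like this in C. This is only an illustration under assumptions: the struct, the function names, the thresholds (3 unchanged checks, then 5 unchecked locks) and the use of an FNV-1a hash are all picked for the example and are not the values or code from my tree. The "changed on every lock" shortcut is left out to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UNCHANGED_BEFORE_SKIP 3  /* assumed threshold, for illustration */
#define LOCKS_TO_SKIP         5  /* assumed number of unchecked locks */
#define FREE_MEMORY_BELOW     5  /* "locked fewer than 5 times" */

/* Hypothetical per-surface bookkeeping. */
typedef struct {
    unsigned lock_count;        /* how often the surface was locked */
    unsigned change_count;      /* how often a check found a real change */
    unsigned unchanged_streak;  /* consecutive checks without a change */
    unsigned skip_left;         /* locks still to pass without a check */
    uint32_t last_hash;         /* hash of the content at the last check */
} lock_stats;

/* 32-bit FNV-1a hash over the surface memory. */
static uint32_t hash_surface(const uint8_t *data, size_t size)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < size; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/* Call when a read-write lock ends. Returns 1 if the texture has to be
 * uploaded to gl again, 0 if the surface is treated as clean. */
static int surface_needs_upload(lock_stats *st, const uint8_t *data, size_t size)
{
    assert(st != NULL && data != NULL);
    st->lock_count++;

    /* Locked often without changes: trust it for a while, skip the check. */
    if (st->skip_left > 0) {
        st->skip_left--;
        return 0;
    }

    uint32_t h = hash_surface(data, size);
    if (st->lock_count == 1 || h != st->last_hash) {
        st->last_hash = h;
        st->change_count++;
        st->unchanged_streak = 0;
        return 1;
    }

    if (++st->unchanged_streak >= UNCHANGED_BEFORE_SKIP)
        st->skip_left = LOCKS_TO_SKIP;
    return 0;
}

/* Rarely locked surfaces can give up their system memory copy. */
static int surface_may_free_memory(const lock_stats *st)
{
    return st->lock_count < FREE_MEMORY_BELOW;
}
```

Note that a modification made while checks are being skipped is only noticed at the next check, so the upload can lag a few locks behind the actual change.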
I have a quick and dirty implementation of that, and it seems to work fine. The Swat3 main menu is faster by a factor of 100 (100-500 fps instead of 1-5) when Swat3 does extensive surface locking. The videotex sdk demo is approx. 130 fps faster (940 instead of 810). It hits the case where the surface is locked often enough to be marked clean without being checked for 5-6 locks, and on the next check it turns out that it was modified. The movie in the cube still plays fine; the only negative effect is a slight delay before the texture update is recognised (well, when it's running at 940 fps, a delay of 5 frames shouldn't be a problem).
Stefan