Hi,
As you all might know, 2D games tend to be slow on Wine. For a lot of games the main bottleneck is depth conversion, which happens when the color depth requested by the game and that of the X desktop are not the same.
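To illustrate what that conversion costs, here is a hypothetical sketch (not Wine's actual code) of the per-frame CPU work: every pixel has to be expanded by hand, e.g. from a 16bpp RGB565 game surface to a 32bpp X visual:

    /* Hypothetical sketch, not Wine's actual conversion code:
     * expand a 16bpp RGB565 surface to a 32bpp (X888) X visual. */
    static void convert_565_to_x888(const unsigned short *src,
                                    unsigned int *dst, int pixel_count)
    {
        int i;
        for (i = 0; i < pixel_count; i++)
        {
            unsigned short p = src[i];
            unsigned int r = (p >> 11) & 0x1f;  /* 5 bits red   */
            unsigned int g = (p >> 5)  & 0x3f;  /* 6 bits green */
            unsigned int b =  p        & 0x1f;  /* 5 bits blue  */
            dst[i] = (r << 19) | (g << 10) | (b << 3);
        }
    }

Doing this on the CPU for every frame at high resolutions is exactly the work the video card can do for free.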
As a way to speed up 2D, Lionel assisted me with hacking Wine's ddraw to let it use parts of the Direct3D backend. The final rendering is then done using OpenGL, and the end result is that the video card does the color conversion for us. The patch greatly improves the performance of 2D games which don't use GetDC/ReleaseDC a lot.
While the patch fixes the conversion bottleneck for various games, it doesn't handle 8-bit paletted modes, which are used by games like StarCraft, as OpenGL doesn't support them by default. The second patch, which I attached as well, adds support for this. On cards that support the OpenGL paletted texture extension (at least all NVIDIA cards from the GeForce 1 to the FX), this extension is used. It makes StarCraft very fast, at least on my Athlon XP 2000 system with a GeForce FX, where the game was slow before. As not all cards support paletted textures, I emulated them using a simple fragment shader (a 1D texture containing the palette is used as a lookup table).
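To give a rough idea of the two paths (names and details are illustrative, not the actual patch code):

    #define GL_GLEXT_PROTOTYPES
    #include <GL/gl.h>
    #include <GL/glext.h>

    /* Path 1: GL_EXT_paletted_texture, the driver expands the indices. */
    static void upload_8bit_with_extension(const GLubyte *indices, int w, int h,
                                           const GLubyte palette[256 * 4])
    {
        glColorTableEXT(GL_TEXTURE_2D, GL_RGBA8, 256, GL_RGBA,
                        GL_UNSIGNED_BYTE, palette);
        glTexImage2D(GL_TEXTURE_2D, 0, GL_COLOR_INDEX8_EXT, w, h, 0,
                     GL_COLOR_INDEX, GL_UNSIGNED_BYTE, indices);
    }

    /* Path 2: emulation; the indices go into a luminance texture and a
     * fragment shader uses a 256-entry 1D palette texture as the lookup
     * table (nearest filtering assumed on both textures). */
    static const char *palette_fragment_shader =
        "uniform sampler2D index_tex;   /* 8-bit indices as luminance */\n"
        "uniform sampler1D palette_tex; /* 256-entry RGBA palette     */\n"
        "void main(void)\n"
        "{\n"
        "    float i = texture2D(index_tex, gl_TexCoord[0].xy).r;\n"
        "    gl_FragColor = texture1D(palette_tex, i);\n"
        "}\n";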
The attached patches are still experimental and likely contain bugs, so please test them. After applying the patches, set 'HKCU\Software\Wine\DirectDraw\UseDDrawOverD3D' to 'Y', else they won't do anything :)
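For example, a .reg file like this (import it with Wine's regedit) turns the code on:

    REGEDIT4

    [HKEY_CURRENT_USER\Software\Wine\DirectDraw]
    "UseDDrawOverD3D"="Y"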
Further, note that lots of games like to use multiple threads for graphics... with the patch, games use 'd3d' (or actually OpenGL), which adds more multithreaded d3d games :) Games like Command & Conquer, Red Alert, Total Annihilation and lots of others became multithreaded (they crash quite quickly due to some critical section issue in x11drv).
Have fun with the patches, and please report any issues that appear so that I can fix them and submit them to wine-patches,
Roderick Colenbrander
On 12/4/05, Roderick Colenbrander thunderbird2k@gmx.net wrote:
Hi,
As you all might know, 2D games tend to be slow on Wine. For a lot of games the main bottleneck is depth conversion, which happens when the color depth requested by the game and that of the X desktop are not the same.
As a way to speed up 2D, Lionel assisted me with hacking Wine's ddraw to let it use parts of the Direct3D backend. The final rendering is then done using OpenGL, and the end result is that the video card does the color conversion for us. The patch greatly improves the performance of 2D games which don't use GetDC/ReleaseDC a lot.
While the patch fixes the conversion bottleneck for various games, it doesn't handle 8-bit paletted modes, which are used by games like StarCraft, as OpenGL doesn't support them by default. The second patch, which I attached as well, adds support for this. On cards that support the OpenGL paletted texture extension (at least all NVIDIA cards from the GeForce 1 to the FX), this extension is used. It makes StarCraft very fast, at least on my Athlon XP 2000 system with a GeForce FX, where the game was slow before. As not all cards support paletted textures, I emulated them using a simple fragment shader (a 1D texture containing the palette is used as a lookup table).
Is Starcraft really that slow? How does this compare with using DGA? I'm not too sure because its speed varies. I've been testing Starcraft this weekend and it has been plenty speedy. But I do remember trying to play it multiplayer a few months ago and being burned when it ran slow. In fact it slowed *everyone* down. Not fun.
This patch seems similar to glSDL, which wraps SDL's 2D API to OpenGL. The good thing about this is that it can provide acceleration without requiring root like DGA does. The bad thing with this idea is that it can't be used on older video cards, or even some newer ones that lack proper direct rendering. Am I correct that even when just doing depth conversions, without direct rendering it will still be slow?
Jesse
Is Starcraft really that slow? How does this compare with using DGA? I'm not too sure because its speed varies. I've been testing Starcraft this weekend and it has been plenty speedy. But I do remember trying to play it multiplayer a few months ago and being burned when it ran slow. In fact it slowed *everyone* down. Not fun.
I fired up Total Annihilation just yesterday with Wine 0.9.2 and it was very slow. TA uses 8-bit color and I'm running in 24-bit at 3200x1200 (2x 19" CRTs) on a Pentium II 450 MHz with a GeForce FX 5900XT -- a bit of an odd combination... I know. If you'd like a speed comparison, perhaps with associated CPU usage, I may be able to oblige.
--tim
The patch can make TA a lot faster; the problem is that the game crashes because it becomes multithreaded. Command & Conquer (which crashes when you move the mouse) felt a lot faster, and StarCraft is a lot faster too. Once the multithreading issue is solved, TA will most likely be playable on your system.
Roderick
I fired up Total Annihilation just yesterday with Wine 0.9.2 and it was very slow. TA uses 8-bit color and I'm running in 24-bit at 3200x1200 (2x 19" CRTs) on a Pentium II 450 MHz with a GeForce FX 5900XT -- a bit of an odd combination... I know. If you'd like a speed comparison, perhaps with associated CPU usage, I may be able to oblige.
--tim
On Monday, 5 December 2005 07:48, Roderick Colenbrander wrote:
The patch can make TA a lot faster; the problem is that the game crashes because it becomes multithreaded. Command & Conquer (which crashes when you move the mouse) felt a lot faster, and StarCraft is a lot faster too. Once the multithreading issue is solved, TA will most likely be playable on your system.
I am working on a patch which makes Direct3D7 run over WineD3D. When WineD3D has multithreading support one day and my patch is in place, these problems should be solved for D3D7 games like Empire Earth and for DDraw games running with your patch.
I hope I can come up with a patch for testing this week, then we can have a look ;)
On Mon, Dec 05, 2005 at 08:17:38AM +0100, Stefan Dösinger wrote:
I hope I can come up with a patch for testing this week, then we can have a look ;)
I think that to merge Roderick's and your patch, the best approach would be to directly hook WineD3D even at the 2D level, and not have DDraw hook DDraw's D3D which then goes into WineD3D.
This way we would have a unified DDraw.
Of course, then the problem remains of what to do with older cards :-)
Lionel
On Monday, 5 December 2005 21:10, Lionel Ulmer wrote:
On Mon, Dec 05, 2005 at 08:17:38AM +0100, Stefan Dösinger wrote:
I hope I can come up with a patch for testing this week, then we can have a look ;)
I think that to merge Roderick's and your patch, the best approach would be to directly hook WineD3D even at the 2D level, and not have DDraw hook DDraw's D3D which then goes into WineD3D.
I thought of the same thing, as DDraw -> D3D7 -> WineD3D -> OpenGL is quite a long chain. I abandoned it as too much work at first, but it's surely worth considering.
This way we would have a unified DDraw.
Of course, then the problem remains of what to do with older cards :-)
How about moving the current 2D code to WineD3D and making DDraw run over WineD3D in any case? Then WineD3D could decide whether to use plain X11, DGA or OpenGL for 2D rendering. :)
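Just to sketch the idea (purely hypothetical names, nothing like this exists in the code): WineD3D would probe the available backends and pick the best one:

    /* Hypothetical sketch of 2D backend selection; nothing like this
     * exists in the code.  Each backend knows whether it is usable. */
    struct blit_backend
    {
        const char *name;                 /* "opengl", "dga", "x11" */
        int  (*probe)(void);              /* non-zero if usable */
        void (*present)(const void *src, int w, int h);
    };

    /* Backends ordered fastest first; plain X11 always succeeds. */
    static const struct blit_backend *select_backend(
            const struct blit_backend *backends, int count)
    {
        int i;
        for (i = 0; i < count; i++)
            if (backends[i].probe()) return &backends[i];
        return 0;
    }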
Maybe we should have a close look at the details of such a thing. I do not really recommend an ad-hoc attempt, as D3D7 -> WineD3D was nearly too much.
Stefan
How about moving the current 2D code to WineD3D and making DDraw run over WineD3D in any case? Then WineD3D could decide whether to use plain X11, DGA or OpenGL for 2D rendering. :)
That basically boils down to having WineD3D completely replace the 'HAL' architecture which was introduced by the TG merge.
Which is kinda an idea that I like :-)
Maybe we should have a close look at the details of such a thing. I do not really recommend an ad-hoc attempt, as D3D7 -> WineD3D was nearly too much.
Well, we can first stabilize D3D7 => WineD3D (which will already be quite a lot of work to first do and then polish) and then see if we move the 2D part too.
And then we will need to rename it to something else ... if not merge it integrally into the X11DRV so that other DLLs (like the new Vista ones :-) ) would be able to do 3D acceleration :-)
Lionel
Well, we can first stabilize D3D7 => WineD3D (which will already be quite a lot of work to first do and then polish) and then see if we move the 2D part too.
Moving both 2D and D3D7 at the same time would make the handling of D3D7 surfaces easier. At the moment, I have overrides for IDirectDrawSurface->Lock / Unlock and IDirectDrawSurface->Blt which access the WineD3DSurfaces' memory instead of the DDraw surfaces'. If the whole DDraw functionality is moved, this could be solved more cleanly.
And then we will need to rename it to something else ... if not merge it integrally into the X11DRV so that other DLLs (like the new Vista ones :-) ) would be able to do 3D acceleration :-)
Perhaps we should do that now, instead of adding extra work later. I think it's easier to make such changes 9 months before the Vista release instead of sometime later, when users demand this functionality.
Anyway, what are these DLLs? I don't know much about Vista; I've only heard about 3D desktop functionality that applies to whole windows or window decorations. Is there anything new in the GDI area, or is there a replacement for GDI (I think I've heard so)?
And what about .NET and Mono? I've just started reading an article about managed Direct3D. I guess Mono could use Wine's Direct3D too.
I'll show my current D3D7 -> WineD3D patch soon (today / tomorrow). It's far from complete and hardly working, but I'd like to ask for comments on the direction it's going.
Stefan
Hi,
Is Starcraft really that slow? How does this compare with using DGA? I'm not too sure because its speed varies. I've been testing Starcraft this weekend and it has been plenty speedy. But I do remember trying to play it multiplayer a few months ago and being burned when it ran slow. In fact it slowed *everyone* down. Not fun.
At least on my system StarCraft was really unplayable, while the game is supposed to run on a 20x slower system (a Pentium 90). Perhaps the problem is video driver related, as I think the issues are less severe for ATI users (I thought I heard this, but I'm not sure).
This patch seems similar to glSDL, which wraps SDL's 2D API to OpenGL. The good thing about this is that it can provide acceleration without requiring root like DGA does. The bad thing with this idea is that it can't be used on older video cards, or even some newer ones that lack proper direct rendering. Am I correct that even when just doing depth conversions, without direct rendering it will still be slow?
The code won't work for cards without hardware-accelerated OpenGL, but note that most cards these days have OpenGL support. A simple Riva TNT is already quite old but is still supported on Linux, and it should do a fine job; only it can't handle the palette conversion. I haven't tested performance much using Mesa; I could try that soon using some games, but it doesn't have to be slower than the current code, as Mesa uses Xlib too in the indirect case. In general, about all cards these days have some form of OpenGL support (Intel, VIA, ATI, NVIDIA, ...); perhaps only the latest S3 GPUs and perhaps some SiS ones (though some have drivers) lack support. Note that cards don't need much functionality at all, as OpenGL is only used for uploading textures and rendering them.
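Roughly, the per-frame work asked of the card is no more than this (an illustrative sketch, not the patch code):

    #include <GL/gl.h>
    #include <GL/glext.h>

    /* Illustrative sketch: re-upload the dirty surface data to an
     * existing texture and draw a single textured quad; the driver
     * performs any format/depth conversion during the upload. */
    static void present_surface(GLuint tex, const void *pixels, int w, int h)
    {
        glBindTexture(GL_TEXTURE_2D, tex);
        glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, w, h,
                        GL_BGRA, GL_UNSIGNED_BYTE, pixels);

        glBegin(GL_QUADS);
            glTexCoord2f(0.0f, 0.0f); glVertex2f(-1.0f,  1.0f);
            glTexCoord2f(1.0f, 0.0f); glVertex2f( 1.0f,  1.0f);
            glTexCoord2f(1.0f, 1.0f); glVertex2f( 1.0f, -1.0f);
            glTexCoord2f(0.0f, 1.0f); glVertex2f(-1.0f, -1.0f);
        glEnd();
    }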
I think the patch is a reasonable solution to work around various depth conversion problems. For sure it is the fastest way to do the conversion, as the video card basically does it for free. On my system StarCraft and the Command & Conquer series (although they crash quite quickly due to threading issues) felt a lot faster; I think the speed is close to that of DGA.
Roderick
On 12/5/05, Roderick Colenbrander thunderbird2k@gmx.net wrote:
At least on my system StarCraft was really unplayable, while the game is supposed to run on a 20x slower system (a Pentium 90). Perhaps the problem is video driver related, as I think the issues are less severe for ATI users (I thought I heard this, but I'm not sure).
Well, I am using a 1.9 GHz Athlon XP Barton 2600+ with an ATI 9200 with DRI. I don't know if it makes a difference.
The code won't work for cards without hardware-accelerated OpenGL, but note that most cards these days have OpenGL support. A simple Riva TNT is already quite old but is still supported on Linux, and it should do a fine job; only it can't handle the palette conversion. I haven't tested performance much using Mesa; I could try that soon using some games, but it doesn't have to be slower than the current code, as Mesa uses Xlib too in the indirect case. In general, about all cards these days have some form of OpenGL support (Intel, VIA, ATI, NVIDIA, ...); perhaps only the latest S3 GPUs and perhaps some SiS ones (though some have drivers) lack support. Note that cards don't need much functionality at all, as OpenGL is only used for uploading textures and rendering them.
I think the patch is a reasonable solution to work around various depth conversion problems. For sure it is the fastest way to do the conversion, as the video card basically does it for free. On my system StarCraft and the Command & Conquer series (although they crash quite quickly due to threading issues) felt a lot faster; I think the speed is close to that of DGA.
Roderick
Yes, it is a reasonable solution for today's machines, which should all have some kind of OpenGL acceleration support.
On Sun, Dec 04, 2005 at 07:27:29PM -0700, Jesse Allen wrote:
Is Starcraft really that slow? How does this compare with using DGA?
Nothing can beat DGA (actually DGA2, as you would need a depth change to go to 8-bit colours to run StarCraft) in raw speed, as it is the method which copies the least memory over.
This patch seems similar to glSDL, which wraps SDL's 2D API to OpenGL. The good thing about this is that it can provide acceleration without requiring root like DGA does. The bad thing with this idea is that it can't be used on older video cards, or even some newer ones that lack proper direct rendering.
The problem is not direct rendering, it's more 'basic' OpenGL acceleration.
Am I correct that even when just doing depth conversions, without direct rendering it will still be slow?
If you have accelerated OpenGL rendering but without direct access, I think that it should still be faster than the current method (as you would 1) move less data over the GPU bus and 2) do less CPU computation on a large amount of data).
Lionel
Lionel Ulmer wrote:
On Sun, Dec 04, 2005 at 07:27:29PM -0700, Jesse Allen wrote:
This patch seems similar to glSDL, which wraps SDL's 2D API to OpenGL. The good thing about this is that it can provide acceleration without requiring root like DGA does. The bad thing with this idea is that it can't be used on older video cards, or even some newer ones that lack proper direct rendering.
The problem is not direct rendering, it's more 'basic' OpenGL acceleration.
Am I correct that even when just doing depth conversions, without direct rendering it will still be slow?
If you have accelerated OpenGL rendering but without direct access, I think that it should still be faster than the current method (as you would 1) move less data over the GPU bus and 2) do less CPU computation on a large amount of data).
Isn't indirect rendering always unaccelerated, i.e. done in software?
On Mon, Dec 05, 2005 at 10:22:23PM +0100, Peter Beutner wrote:
Isn't indirect rendering always unaccelerated, i.e. done in software?
Nope. Indirect means only that all your OpenGL commands are encapsulated into GLX commands and then serialized over the X network link (which could be a local Unix socket) and deserialized at the other side before being sent to the graphics card.
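The direct/indirect choice is made per GLX context; a small illustrative example:

    #include <stdio.h>
    #include <GL/glx.h>

    /* Illustrative: explicitly request an indirect context (last
     * argument False) and check what the implementation gave us. */
    static GLXContext create_indirect_context(Display *dpy, XVisualInfo *vis)
    {
        GLXContext ctx = glXCreateContext(dpy, vis, NULL, False);
        if (ctx)
            printf("direct rendering: %s\n",
                   glXIsDirect(dpy, ctx) ? "yes" : "no");
        return ctx;
    }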
It may well be that no current Linux GL drivers support accelerated indirect rendering (although I think that the NVIDIA drivers support it -- I would verify it if I had GL installed on my head-less server box), and you may be right about the current situation, if not in the terms used to describe it :-)
Lionel
Lionel Ulmer wrote:
On Mon, Dec 05, 2005 at 10:22:23PM +0100, Peter Beutner wrote:
Isn't indirect rendering always unaccelerated, i.e. done in software?
Nope. Indirect means only that all your OpenGL commands are encapsulated into GLX commands and then serialized over the X network link (which could be a local Unix socket) and deserialized at the other side before being sent to the graphics card.
It may well be that no current Linux GL drivers support accelerated indirect rendering (although I think that the NVIDIA drivers support it -- I would verify it if I had GL installed on my head-less server box), and you may be right about the current situation, if not in the terms used to describe it :-)
Yup, you're right, that's how it is supposed to work ;) But IIRC it can't work on current Xorg, because the server-side libglx can't load the DRI driver: the server-side GLX <-> libGL interface differs from the one on the client side. See http://www.cs.pdx.edu/~idr/publications/ddc-2005.pdf. So all GL commands will go through software rendering via Mesa when doing indirect rendering.
I don't know if that applies as well to the closed-source drivers from NVIDIA/ATI, which ship their own complete GL stack.
Peter