Hello,
The way opengl states are managed in wined3d at the moment is a bit messy and inefficient. Basically it is a mixture of modifying the opengl settings when the application modifies the d3d settings and brute-force applying the rest in drawprim. The situation has got worse with the addition of the DirectDraw blitting code and it is time to clean that up and make it more efficient.
Basically during primitive drawing the opengl setup must equal the one requested by the application. Some things can be translated 1:1, e.g. WINED3DRS_LIGHTING to glEnable(GL_LIGHTING). Other things like fog or texture stage states are more complex. Some operations in wined3d, for example unlocking a render target or doing a 2D blt, require special opengl settings.
At the moment we have the following situation:
* Some things are applied when the app requests a change
* Some parts are applied by brute force in drawPrimitive
* BltOverride and unlockRect read the old gl states, modify them the way they need and restore the old states afterwards.
This has some problems:
* Setting some things during each drawprim call wastes quite a few resources, because of the loops and gl calls involved.
* If an application performs only blts (e.g. a 2D only app) or only locks the render target (playing a movie), storing and restoring the gl state is a waste of resources
* Brute force applying opengl states is likely to re-set the old state unnecessarily. Opengl does not assure that redundant calls are cheap.
* If an application sets and resets a state for some reason between 2 drawprim calls, that involves 2 not really necessary opengl state switches
* Anything else I didn't think of?
I can think of a few ways to solve this:
* Do not do any opengl changes in SetRenderState, and apply all states in drawprim. This way UnlockRect and Blt don't have to care about resetting the things they changed, and redundant changes can be caught nicely
* Apply everything when the app requests to do so, and take as much as possible out of drawprim. This keeps drawprim small and efficient.
* Use a mixed style like it is done with transformed vs untransformed vertex drawing with last_was_rhw. That makes it easy to find out if reapplying anything is needed and frees other functions from resetting everything.
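To make the first option concrete, here is a minimal sketch, not wined3d code. All names (sketch_device, sketch_set_render_state and so on) are made up for illustration, the two boolean states stand in for the real render states, and a counter stands in for glEnable/glDisable. It assumes the opengl state starts out disabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state indices; the real device tracks many more states. */
enum sketch_state { SKETCH_LIGHTING, SKETCH_FOG, SKETCH_STATE_COUNT };

struct sketch_device {
    bool requested[SKETCH_STATE_COUNT]; /* what the application asked for (d3d side) */
    bool applied[SKETCH_STATE_COUNT];   /* what was last sent to opengl (assumed off at start) */
    bool dirty[SKETCH_STATE_COUNT];     /* touched since the last draw */
    unsigned gl_calls;                  /* stand-in counter for glEnable/glDisable */
};

/* Stand-in for glEnable()/glDisable(); it only counts calls. */
static void sketch_gl_set(struct sketch_device *dev, enum sketch_state s, bool on)
{
    (void)s;
    (void)on;
    dev->gl_calls++;
}

/* SetRenderState analogue: record the value, touch no opengl state. */
static void sketch_set_render_state(struct sketch_device *dev, enum sketch_state s, bool on)
{
    assert(s < SKETCH_STATE_COUNT);
    dev->requested[s] = on;
    dev->dirty[s] = true;
}

/* drawprim analogue: visit dirty states, apply only those whose value changed. */
static void sketch_draw_primitive(struct sketch_device *dev)
{
    for (size_t i = 0; i < SKETCH_STATE_COUNT; i++) {
        if (!dev->dirty[i])
            continue;
        dev->dirty[i] = false;
        if (dev->applied[i] == dev->requested[i])
            continue; /* set and reset between two draws: nothing to do */
        sketch_gl_set(dev, (enum sketch_state)i, dev->requested[i]);
        dev->applied[i] = dev->requested[i];
    }
}
```

With this layout, an application that disables and re-enables lighting between two draws causes no opengl call at all, and a blit path could simply overwrite the applied values (or mark states dirty) instead of reading back and restoring gl state.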
To avoid re-setting an old setting, the current opengl state could be stored in the d3ddevice. I think this should be done in any case, and depending on the nature of each parameter either the opengl state or the d3d state (or both) should be kept there. So we shouldn't use a d3d stateblock for this, but our own structure where we can add the stuff we need.
I also think we should try to get rid of as many things as possible in drawprim. I do not mean removing them entirely, but having a simple flag which tells if a bigger number of states needs attention, like last_was_rhw does. I am thinking about adding a last_was_blit flag which is set in BltOverride and UnlockRect.
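A rough sketch of how such a last_was_blit flag could behave, under the assumption that the blit path sets up its 2D state without restoring anything and drawprim does the expensive reapply only when the flag is set. The struct and function names are hypothetical and the counters stand in for the actual gl work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct blit_sketch_device {
    bool last_was_blit;      /* set by the blit/unlock paths, cleared by drawprim */
    unsigned blit_setups;    /* counts switches into the 2D setup */
    unsigned full_reapplies; /* counts switches back to the 3D setup */
};

/* BltOverride/UnlockRect analogue: set up 2D state once, never restore. */
static void blit_sketch_blt(struct blit_sketch_device *dev)
{
    assert(dev != NULL);
    if (!dev->last_was_blit) {
        dev->blit_setups++;      /* would load the special blit gl setup here */
        dev->last_was_blit = true;
    }
    /* ... the blit itself would happen here ... */
}

/* drawprim analogue: reapply the 3D setup only after a blit touched it. */
static void blit_sketch_draw_primitive(struct blit_sketch_device *dev)
{
    assert(dev != NULL);
    if (dev->last_was_blit) {
        dev->full_reapplies++;   /* would reapply the application's states here */
        dev->last_was_blit = false;
    }
    /* ... the draw itself would happen here ... */
}
```

An application that only blits pays for one 2D setup and never for a restore, and an application that only draws never enters the reapply branch.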
I'm not sure about the texture stage states and sampler states. In d3d the settings seem to be per-stage, while in opengl I think they are per texture. If so, we could store the last d3d settings that the texture was used with in the texture, to see if we have to reapply them.
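As an illustration of the per-texture idea, here is a small sketch that assumes filter and address modes end up as per-texture-object parameters on the opengl side (as glTexParameteri does). The names are invented, the values are plain ints, and a counter stands in for the gl calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of sampler settings that map to per-texture gl parameters. */
enum { SAMP_MIN_FILTER, SAMP_MAG_FILTER, SAMP_ADDRESS_U, SAMP_ADDRESS_V, SAMP_COUNT };

struct samp_texture {
    int applied[SAMP_COUNT]; /* settings this texture object last received */
    bool valid;              /* false until the texture has been set up once */
    unsigned calls;          /* stand-in counter for glTexParameteri calls */
};

/* Called when the texture is bound to a stage: push only settings that differ. */
static void samp_apply(struct samp_texture *tex, const int stage[SAMP_COUNT])
{
    assert(tex != NULL && stage != NULL);
    for (size_t i = 0; i < SAMP_COUNT; i++) {
        if (tex->valid && tex->applied[i] == stage[i])
            continue;
        tex->calls++; /* would call glTexParameteri for this setting */
        tex->applied[i] = stage[i];
    }
    tex->valid = true;
}
```

Rebinding the same texture with unchanged stage settings then costs only the comparisons, and moving a texture to a stage with one different setting costs one call.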
A partially related side note about drawStridedSlow: it has a loop which iterates through the vertices to draw, and in this loop there are if statements checking which data the vertex contains. So if some data isn't there, this is at worst a comparison + a jump. As a test, I had a look at the vertex data 3DMark2000 uses and removed the checks and handling for the things it didn't use. 3DMark2000 doesn't need drawStridedSlow a lot, yet the score increased from 4879 to 5013 3DMarks. As a comparison, forcing drawStridedFast gets about 5500 3DMarks. The fps in the low detail helicopter test increased from 95.6 to 99.2 (105 with drawStridedFast).
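The kind of change described here can be sketched as moving the "is this data present" test out of the per-vertex loop. This is only an illustration with invented names; summing floats stands in for the glNormal3f/glVertex3f calls, and whether a real compiler or driver benefits is not something the sketch can show.

```c
#include <assert.h>
#include <stddef.h>

struct strided_sketch {
    const float *position; /* always present, 3 floats per vertex */
    const float *normal;   /* NULL when the vertex format has no normals */
    size_t count;          /* number of vertices */
};

/* Per-vertex check, as in the loop described above. */
static float feed_checked(const struct strided_sketch *sd)
{
    assert(sd != NULL && sd->position != NULL);
    float acc = 0.0f;
    for (size_t i = 0; i < sd->count; i++) {
        if (sd->normal)
            acc += sd->normal[3 * i]; /* stand-in for glNormal3f */
        acc += sd->position[3 * i];   /* stand-in for glVertex3f */
    }
    return acc;
}

/* Same work, but the check is done once and a specialised loop is chosen. */
static float feed_hoisted(const struct strided_sketch *sd)
{
    assert(sd != NULL && sd->position != NULL);
    float acc = 0.0f;
    if (sd->normal) {
        for (size_t i = 0; i < sd->count; i++) {
            acc += sd->normal[3 * i];
            acc += sd->position[3 * i];
        }
    } else {
        for (size_t i = 0; i < sd->count; i++)
            acc += sd->position[3 * i];
    }
    return acc;
}
```

Both functions produce the same result; the second one trades a little code duplication for a loop body without branches on the vertex format.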
Cheers, Stefan
On 6/25/06, Stefan Dösinger stefan@codeweavers.com wrote:
> The way opengl states are managed in wined3d at the moment is a bit messy and inefficient.
Agreed. :-)
> - Brute force applying opengl states is likely to re-set the old state unnecessarily. Opengl does not assure that redundant calls are cheap.
> - if an application sets and resets a state for some reason between 2 drawprim calls that involves 2 not really necessary opengl state switches
I have this problem with Fog settings in the case of shaders. If I set & restore the fog states on each drawPrimitive, it will drop the fps from 1250 to about 1000 in the Dolphin demo compared to just setting it once and leaving it. I've been holding off on a patch (pretty much taken from Roderick Colenbrander) for this reason. At the moment in current git, fog is broken when used with shaders (Dolphin demo and Tomb Raider Legend are affected).
> I can think of a few ways to solve this:
> - Do not do any opengl changes in SetRenderState, and apply all states in drawprim. This way UnlockRect and Blt don't have to care for resetting the things they changed and redundant changes can be caught nicely
This style is probably the best, since we (hopefully) don't have to change much between drawPrim calls. Just have a function which compares the current GL state to the desired (cached) state, and change only what's necessary. However, some things (as you mentioned) can be changed safely during SetRenderState, and that's outside of the drawPrim loop, so hopefully it's just once per scene. Though, every game acts a bit differently, so it's tough to say how one method will work over another. But the caching method seems the cleanest to me.
> To avoid re-setting an old setting again the current opengl state could be stored in the d3ddevice. I think this should be done in any case, and depending on the nature of some parameters the opengl state or the d3d state should be kept in there (or both).
Agreed.
> I also think we should try to get rid of as many things as possible in drawprim. I do not mean entirely removing it, but having a simple flag which tells if a bigger number of states needs attention, like last_was_rhw does. I think about adding a last_was_blit flag which is set in BltOverride and UnlockRect.
Yep.
So, let's get a final game plan together and I can start the process with the fog patch, then we can start cleaning up everything else (which will be a fairly large undertaking but can be done in small chunks).
Jason