State management in D3D... is currently kind of a mess. Stefan D. posted a message about this some time ago, but I can't find it right now...
- some states are applied immediately (via Set* calls)
- some are applied at draw time (like shaders, textures, transforms, ...)
- some are recorded into a stateblock structure, and applied when the app says apply().
- the GL code is all over the place, tightly coupled to the d3d code.
===============================
There's a number of projects going on at the moment, from what I understand:
- Roderick is working on making wined3d -> wgl
- Stefan D. has expressed concern about multithreaded d3d (where the D3D device state is shared, but not necessarily the GL one - or can GL contexts be shared between multiple threads?)
- I'm trying to get FBOs to work, where the FBO needs to be bound once both the render and stencil target have been assigned (that's 2 calls from the app in any order) - i.e. I need those states to be applied in a delayed fashion, before draw
- Henri Verbeet wants multiple render target support added to that (meaning the FBO needs to be bound once *all* render targets are assigned in addition to the depth/stencil one).
================================
All of the above have a common theme - better state management is needed, with more encapsulation, and better separation between D3D and GL. So, let's come up with a plan, and try to implement it. How about we redesign the stateblock object like this:
- remove deltas, i.e. if a SetLightEnable() command is sent, fetch the light, enable it, then save the light back - store only state, and no deltas. This should make the recording stateblock (updateStateBlock) the same as the initial device stateblock, which stores states.
- provide a uniform internal interface for accessing states inside the stateblock - like:
    SetState(stateblock, ID_XYZ, (void*) state_data);
    GetState(stateblock, ID_XYZ, (void**) state_data);
    CaptureState(stateblock, ID_XYZ);
    ApplyState(stateblock, ID_XYZ);
ID_XYZ could be an individual state, or a "trigger" keyword, which will refer to a whole group of states. Those would be private functions in addition to the standard interface.
- provide a fn pointer table inside the stateblock [ which can be directed to OGL or WGL or AGL ], which, for each ID_XYZ, maps a get and set function using the same interface
- move all device.c GL code into those functions
- now all device.c Get* and Set* requests will do is:
    - error checking
    - recording into updateStateBlock
    - or writing to stateBlock (which may be applied later, at our discretion).
- apply() would just loop through all the IDs and call the corresponding function pointer if the states are marked dirty (we'll keep the 'dirty' field)
- capture() would do the same in the get* direction.
- the new object can be instantiated per device, or per device per thread to address multithreading. It could have an associated glContext, and can be locked as necessary.
- it would use ideas from the d3d9 test framework for stateblock, except of course more competently written, and cleaner :)
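To make the list above a bit more concrete, here is a minimal sketch of a stateblock with a per-ID function pointer table and dirty flags. All names (state_id, state_ops, set_state, apply_stateblock, the fake_* backend) are hypothetical and not taken from wined3d, and the "backend" is just an array standing in for real GL calls.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical state IDs; a real table would cover render states, samplers, ... */
enum state_id { ID_FOG_ENABLE, ID_ALPHA_TEST, ID_COUNT };

struct stateblock;

/* One backend entry per state ID: how to push it to / pull it from the API. */
struct state_ops {
    void (*apply)(struct stateblock *sb, enum state_id id);
    void (*capture)(struct stateblock *sb, enum state_id id);
};

struct stateblock {
    unsigned int value[ID_COUNT];    /* tracked D3D-side state, no deltas */
    unsigned char dirty[ID_COUNT];   /* set when value was changed but not applied */
    const struct state_ops *ops;     /* backend table (could be OGL, WGL, AGL) */
    unsigned int backend[ID_COUNT];  /* stand-in for the real GL state */
};

static void fake_apply(struct stateblock *sb, enum state_id id)
{
    sb->backend[id] = sb->value[id]; /* a real backend would call GL here */
}

static void fake_capture(struct stateblock *sb, enum state_id id)
{
    sb->value[id] = sb->backend[id];
}

static const struct state_ops fake_ops[ID_COUNT] = {
    [ID_FOG_ENABLE] = { fake_apply, fake_capture },
    [ID_ALPHA_TEST] = { fake_apply, fake_capture },
};

static void stateblock_init(struct stateblock *sb, const struct state_ops *ops)
{
    memset(sb, 0, sizeof(*sb));
    sb->ops = ops;
}

/* Set* only records the value; nothing touches the backend until apply time. */
static void set_state(struct stateblock *sb, enum state_id id, unsigned int v)
{
    assert(id < ID_COUNT);
    sb->value[id] = v;
    sb->dirty[id] = 1;
}

/* Walk all IDs and apply the dirty ones; returns how many were applied. */
static unsigned int apply_stateblock(struct stateblock *sb)
{
    unsigned int applied = 0;
    for (int id = 0; id < ID_COUNT; id++) {
        if (!sb->dirty[id])
            continue;
        sb->ops[id].apply(sb, (enum state_id)id);
        sb->dirty[id] = 0;
        applied++;
    }
    return applied;
}
```

With this shape, a device-level Set* call would only do error checking plus set_state(), and the draw path would call apply_stateblock() once before issuing the draw.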
I guess that seems like a large undertaking, and those are all doomed to failure.. but it doesn't have to be.
The key idea that I care about seems to be to move GL code from device.c into the data structure object, and figure out a way to apply a set of delayed states at draw time. We don't have to replace everything right now - we could have 2 coexisting data structures and slowly move things from one to the other, but I wanted to see if people agree with that idea.
I don't like the way things are done right now - Set* functions can do one of two things - record to a stateblock, or apply state. Then the stateblock calls the Set* functions itself when it's applied - seems very ugly to me [ and also in certain places we're forced to disable recording to get a state applied immediately using a Set* function ].
On 11/09/06, Ivan Gyurdiev ivg231@gmail.com wrote:
> I guess that seems like a large undertaking, and those are all doomed to failure.. but it doesn't have to be.
> The key idea that I care about seems to be to move GL code from device.c into the data structure object, and figure out a way to apply a set of delayed states at draw time. We don't have to replace everything right now - we could have 2 coexisting data structures and slowly move things from one to the other, but I wanted to see if people agree with that idea.
Ok, so the main idea is to separate the applying of GL state from the tracking of D3D state. Looks like a good idea. What I would like to add to that is something BBrox mentioned on IRC a while back... grouping related states together and marking that group dirty / clean. That way we would get a tree like structure for the states, which would make checking what states changed and need to be applied somewhat faster. While it would be possible to add that afterwards, I think it would be easier to just take it into account when designing the new stateblock structure.
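One possible reading of the grouping idea is a two-level dirty structure: one bit per group, plus one bit per state inside each group, so clean groups are skipped with a single test. This is only a sketch under that assumption; the names (dirty_tree, mark_dirty, flush_dirty) and the sizes are made up for illustration.

```c
#include <assert.h>

enum { NUM_GROUPS = 4, STATES_PER_GROUP = 32 };  /* hypothetical sizes */

struct dirty_tree {
    unsigned int group_dirty;              /* bit g set: group g has dirty states */
    unsigned int state_dirty[NUM_GROUPS];  /* bit s set: state s of that group is dirty */
};

static void mark_dirty(struct dirty_tree *t, unsigned int group, unsigned int state)
{
    assert(group < NUM_GROUPS && state < STATES_PER_GROUP);
    t->state_dirty[group] |= 1u << state;
    t->group_dirty |= 1u << group;
}

/* Visit only dirty states, skipping clean groups entirely.
 * Returns the number of states visited and clears all flags. */
static unsigned int flush_dirty(struct dirty_tree *t)
{
    unsigned int visited = 0;
    for (unsigned int g = 0; g < NUM_GROUPS; g++) {
        if (!(t->group_dirty & (1u << g)))
            continue;
        for (unsigned int s = 0; s < STATES_PER_GROUP; s++) {
            if (t->state_dirty[g] & (1u << s))
                visited++;  /* a real version would apply the state here */
        }
        t->state_dirty[g] = 0;
    }
    t->group_dirty = 0;
    return visited;
}
```

Marking the same state twice is harmless here, since it only sets bits that are already set.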
> I don't like the way things are done right now - Set* functions can do one of two things - record to a stateblock, or apply state. Then the stateblock calls the Set* functions itself when it's applied - seems very ugly to me [ and also in certain places we're forced to disable recording to get a state applied immediately using a Set* function ].
You always record to a stateblock, be it the main device stateblock or the update stateblock, but yes, it's pretty ugly.
Hi,
> Ok, so the main idea is to separate the applying of GL state from the tracking of D3D state. Looks like a good idea.
Fully agreed
> What I would like to add to that is something BBrox mentioned on IRC a while back... grouping related states together and marking that group dirty / clean. That way we would get a tree like structure for the states, which would make checking what states changed and need to be applied somewhat faster. While it would be possible to add that afterwards, I think it would be easier to just take it into account when designing the new stateblock structure.
I think we should do the change quickly, even if we risk regressions. I do not think that we should add comments stating "if you add new gl stuff add it to <new file.c>". But I think none of you wants that :-) What we can do for sure is to move render states, sampler states, matrices and bound shaders separately, which we should do to keep patches small :-)
> > I don't like the way things are done right now - Set* functions can do one of two things - record to a stateblock, or apply state. Then the stateblock calls the Set* functions itself when it's applied - seems very ugly to me [ and also in certain places we're forced to disable recording to get a state applied immediately using a Set* function ].
> You always record to a stateblock, be it the main device stateblock or the update stateblock, but yes, it's pretty ugly.
Let me illustrate my idea:
* Move out the GL calls from Set*State. Set*State writes the values to the update stateblock and updates the refcounts (maybe we should kick internal refcounting from wined3d altogether).
* Keep the stateblock and update stateblock structure as they are now. I think for recording stateblocks the idea is quite good
* Keep a list of dirty states for each gl context in use: We don't need something as fancy as trees for that, a little array can do the job, like this (example for render states, but can be used for all other stuff too):

    WINED3DRENDERSTATETYPE updatedStates[WINEHIGHEST_RENDER_STATE];
    DWORD numDirtyStates;
SetRenderState(device, state, newValue) sets updatedStates[numDirtyStates] = state for each context and increments numDirtyStates. It doesn't store the value of the state.
In drawprim we have a loop

    for (i = 0; i < numDirtyStates; i++) {
        set_render_state(updatedStates[i]);
    }
    numDirtyStates = 0;
set_render_state does the opengl stuff. We can put that function into drawprim.c or a new file, e.g. opengl_utils.c like in old ddraw.
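The flow described above can be sketched roughly as follows, for a single context. This is an illustration only: struct device, the sizes, and draw_prim are hypothetical, and gl_value stands in for whatever set_render_state would do in GL. Note that the list holds state IDs only; the value is read from the stateblock at apply time, so the last value set wins.

```c
#include <assert.h>

enum { NUM_STATES = 16, MAX_DIRTY = 64 };  /* hypothetical sizes */

struct device {
    unsigned int value[NUM_STATES];         /* values as stored in the stateblock */
    unsigned int gl_value[NUM_STATES];      /* stand-in for the applied GL state */
    unsigned int updated_states[MAX_DIRTY]; /* IDs of changed states, in order */
    unsigned int num_dirty;
};

/* Records the value and queues the state ID; does not touch "GL". */
static void set_render_state(struct device *d, unsigned int state, unsigned int v)
{
    assert(state < NUM_STATES);
    /* Without a duplicate check, repeated sets of one state can fill the list. */
    assert(d->num_dirty < MAX_DIRTY);
    d->value[state] = v;
    d->updated_states[d->num_dirty++] = state;
}

/* Applies every queued state, then empties the list.
 * Returns the number of apply calls that were made. */
static unsigned int draw_prim(struct device *d)
{
    unsigned int applied = d->num_dirty;
    for (unsigned int i = 0; i < d->num_dirty; i++) {
        unsigned int state = d->updated_states[i];
        d->gl_value[state] = d->value[state];  /* the GL call would go here */
    }
    d->num_dirty = 0;
    return applied;
}
```

In this plain form, setting the same state twice before a draw queues it twice and applies it twice, both times with the latest value.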
This concept can be optimized a bit. To group common states we can do this:

    static const WINED3DRENDERSTATETYPE stategroup[] = {
        /*0*/                          0,
        /*WINED3DRS_TEXTUREHANDLE*/    0,
        /*WINED3DRS_ANTIALIAS*/        WINED3DRS_ANTIALIAS,
        ...
        /*WINED3DRS_TEXTUREMAPBLEND*/  0,
        ...
        /*WINED3DRS_FOGENABLE*/        WINED3DRS_FOGENABLE,
        ...
        /*WINED3DRS_FOGCOLOR*/         WINED3DRS_FOGCOLOR,
        /*WINED3DRS_FOGTABLEMODE*/     WINED3DRS_FOGTABLEMODE,
        /*WINED3DRS_FOGSTART*/         WINED3DRS_FOGTABLEMODE,
        /*WINED3DRS_FOGEND*/           WINED3DRS_FOGTABLEMODE,
        /*WINED3DRS_FOGDENSITY*/       WINED3DRS_FOGDENSITY,
        ...
        /*WINED3DRS_FOGVERTEXMODE*/    WINED3DRS_FOGTABLEMODE,
        ...
        /*WINED3DRS_BLENDOPALPHA*/     WINED3DRS_BLENDOPALPHA
    };

The current code applies FOGVERTEXMODE and FOGTABLEMODE in the same code, because the resulting gl values depend on both states. Also the applied fog range is important for this, even if we do not have it in the same group right now. (How come? looks buggy to me. Well, I was the one who did that).

WINED3DRS_TEXTUREHANDLE and WINED3DRS_TEXTUREMAPBLEND are legacy states which are wrapped to SetTexture and SetTextureStageState in ddraw.dll, so wined3d doesn't have to deal with them. We set them to 0 and cry bloody murder if such a state is applied.
With this modification SetRenderState would set updatedStates[numDirtyStates] = stategroup[state]; The little drawback is that we have 209 DWORDs hanging around, with some of them being plain useless. Well, that is 836 bytes, and we save some of them because set_render_state needs fewer case WINED3DRS_FOO: labels.
The above optimization doesn't bring much yet. Instead of applying 4 different fog states we apply FOGTABLEMODE 4 times. But we can change the .changed field for each state in the stateblock to a changed[num contexts] array (dynamically allocated preferably). It is set to TRUE when the state is changed the first time, and set to FALSE when set_render_state applies the gl state. When it is already TRUE for the context, SetRenderState doesn't have to put the state onto the change list again. This way we can limit the size of the changed state array to WINEHIGHEST_RENDER_STATE too :-)
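A small sketch of the group table combined with the changed flag, for one context and a handful of fog states. The RS_* enum, struct context, mark_state and flush_states are hypothetical stand-ins, not the wined3d definitions; the point is only that several states of one group end up as a single list entry.

```c
#include <assert.h>

/* Hypothetical, shortened state set; 0 is reserved like in the table above. */
enum {
    RS_NONE, RS_FOGENABLE, RS_FOGTABLEMODE, RS_FOGSTART, RS_FOGEND,
    RS_FOGVERTEXMODE, RS_COUNT
};

/* Maps each state to the representative state of its group. */
static const unsigned int stategroup[RS_COUNT] = {
    [RS_NONE]          = RS_NONE,
    [RS_FOGENABLE]     = RS_FOGENABLE,
    [RS_FOGTABLEMODE]  = RS_FOGTABLEMODE,
    [RS_FOGSTART]      = RS_FOGTABLEMODE,
    [RS_FOGEND]        = RS_FOGTABLEMODE,
    [RS_FOGVERTEXMODE] = RS_FOGTABLEMODE,
};

struct context {
    unsigned int updated[RS_COUNT];    /* group representatives waiting to be applied */
    unsigned int num_dirty;
    unsigned char changed[RS_COUNT];   /* indexed by group representative */
};

/* Queue the group of a changed state, at most once per flush. */
static void mark_state(struct context *c, unsigned int state)
{
    unsigned int rep;
    assert(state > RS_NONE && state < RS_COUNT);
    rep = stategroup[state];
    if (c->changed[rep])
        return;                        /* group is already on the list */
    c->changed[rep] = 1;
    c->updated[c->num_dirty++] = rep;
}

/* Apply all queued groups and reset their flags; returns the count applied. */
static unsigned int flush_states(struct context *c)
{
    unsigned int applied = c->num_dirty;
    for (unsigned int i = 0; i < c->num_dirty; i++)
        c->changed[c->updated[i]] = 0; /* the GL work for the group would go here */
    c->num_dirty = 0;
    return applied;
}
```

Because each representative can be queued only once between flushes, the list can never hold more than RS_COUNT entries.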
On 11/09/06, Stefan Dösinger stefandoesinger@gmx.at wrote:
> - Keep a list of dirty states for each gl context in use: We don't need
> something as fancy as trees for that, a little array can do the job, like this(example for render states, but can be used for all other stuff too):
You would at least need to use a proper list. Consider an application that sets the same state multiple times. Note that what you're proposing is pretty similar (in basis) to the way we currently handle shader constants loading.
On Monday, 11 September 2006 18:41, H. Verbeet wrote:
> On 11/09/06, Stefan Dösinger stefandoesinger@gmx.at wrote:
> > - Keep a list of dirty states for each gl context in use: We don't need
> > something as fancy as trees for that, a little array can do the job, like this(example for render states, but can be used for all other stuff too):
> You would at least need to use a proper list. Consider an application that sets the same state multiple times. Note that what you're proposing is pretty similar (in basis) to the way we currently handle shader constants loading.
That's what I'd use the state.changed field for. Set it to TRUE when the state is first modified and to FALSE when it is applied to gl. Do not add the state to the list of changes when state.changed == TRUE.
On 11/09/06, Stefan Dösinger stefandoesinger@gmx.at wrote:
> That's what I'd use the state.changed field for. Set it to TRUE when the state is first modified and to FALSE when it is applied to gl. Do not add the state to the list of changes when state.changed == TRUE.
Well, sure, that's what the constants loading code does as well, but I still like a list better :-)
On Monday, 11 September 2006 19:56, H. Verbeet wrote:
> On 11/09/06, Stefan Dösinger stefandoesinger@gmx.at wrote:
> > That's what I'd use the state.changed field for. Set it to TRUE when the state is first modified and to FALSE when it is applied to gl. Do not add the state to the list of changes when state.changed == TRUE.
> Well, sure, that's what the constants loading code does as well, but I still like a list better :-)
What would the list look like? Lionel was talking about some tree.
How would the complexity of the various operations compare? With an array and the changed marker we have constant complexity for adding an element, determining if the list is empty, finding an element (the changed marker can store the position + 1) and emptying the list. That is, I think, everything we need. We can't cheaply remove a single state from the dirty list, but I don't think we need this.
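As a sketch of the "position + 1" marker: the array below keeps the dirty state IDs, and marker[state] is 0 for a clean state or the state's list index plus one. Names and sizes are hypothetical. In this version the markers are reset while the list is walked at apply time, so emptying costs nothing beyond the apply loop itself.

```c
#include <assert.h>

enum { NUM_STATES = 16 };  /* hypothetical number of states */

struct dirty_list {
    unsigned int list[NUM_STATES];    /* dirty state IDs, in insertion order */
    unsigned int count;
    unsigned int marker[NUM_STATES];  /* 0 = clean, otherwise list index + 1 */
};

static int dirty_is_empty(const struct dirty_list *l)
{
    return l->count == 0;
}

/* Returns the list index of the state, or -1 if it is not on the list. */
static int dirty_find(const struct dirty_list *l, unsigned int state)
{
    assert(state < NUM_STATES);
    return (int)l->marker[state] - 1;
}

/* Constant-time add; a state that is already dirty is not queued again. */
static void dirty_add(struct dirty_list *l, unsigned int state)
{
    assert(state < NUM_STATES);
    if (l->marker[state])
        return;
    l->list[l->count] = state;
    l->marker[state] = ++l->count;
}

/* Apply loop: resets each marker as the state is handled, then empties the list. */
static unsigned int dirty_flush(struct dirty_list *l)
{
    unsigned int applied = l->count;
    for (unsigned int i = 0; i < l->count; i++)
        l->marker[l->list[i]] = 0;    /* the GL apply would go here */
    l->count = 0;
    return applied;
}
```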
On 11/09/06, Stefan Dösinger stefandoesinger@gmx.at wrote:
> On Monday, 11 September 2006 19:56, H. Verbeet wrote:
> > On 11/09/06, Stefan Dösinger stefandoesinger@gmx.at wrote:
> > > That's what I'd use the state.changed field for. Set it to TRUE when the state is first modified and to FALSE when it is applied to gl. Do not add the state to the list of changes when state.changed == TRUE.
> > Well, sure, that's what the constants loading code does as well, but I still like a list better :-)
> What would the list look like? Lionel was talking about some tree.
That's not related to the trees thing, but your proposal with a list instead of a fixed size array.
> How would the complexity of the various operations compare? With an array and the changed marker we have constant complexity for adding an element, determining if the list is empty, finding an element (the changed marker can store the position + 1) and emptying the list. That is, I think, everything we need. We can't cheaply remove a single state from the dirty list, but I don't think we need this.
Instead of a boolean dirty flag, you could store a pointer to the list element :-)
On Monday, 11 September 2006 23:36, H. Verbeet wrote:
> On 11/09/06, Stefan Dösinger stefandoesinger@gmx.at wrote:
> > On Monday, 11 September 2006 19:56, H. Verbeet wrote:
> > > On 11/09/06, Stefan Dösinger stefandoesinger@gmx.at wrote:
> > > > That's what I'd use the state.changed field for. Set it to TRUE when the state is first modified and to FALSE when it is applied to gl. Do not add the state to the list of changes when state.changed == TRUE.
> > > Well, sure, that's what the constants loading code does as well, but I still like a list better :-)
> > What would the list look like? Lionel was talking about some tree.
> That's not related to the trees thing, but your proposal with a list instead of a fixed size array.
> > (the changed marker can store the position + 1)
> Instead of a boolean dirty flag, you could store a pointer to the list element :-)
Yeah, but we still can't remove the entry. Or wait, set it to 0 and implement state 0 as a nop-apply :-)
On 12/09/06, Stefan Dösinger stefandoesinger@gmx.at wrote:
> Yeah, but we still can't remove the entry. Or wait, set it to 0 and implement state 0 as a nop-apply :-)
Why wouldn't you be able to remove an entry from a list?
On Tuesday, 12 September 2006 18:13, H. Verbeet wrote:
> On 12/09/06, Stefan Dösinger stefandoesinger@gmx.at wrote:
> > Yeah, but we still can't remove the entry. Or wait, set it to 0 and implement state 0 as a nop-apply :-)
> Why wouldn't you be able to remove an entry from a list?
With the array, we can only truly remove a single element by moving all other entries by one, reducing the total number of entries and adjusting all values that specify a list index. This is possible, but the amount of work needed grows linearly with the number of elements in the list.
However, we can set the value to be deleted to 0, and if our apply function hits the state 0 to apply, it just continues with the next state :-) It is not truly removed then, but this works too and is much cheaper.
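The "set it to 0" removal can be sketched like this, assuming state 0 is reserved as a no-op and the marker stores the list position + 1. All names and sizes are hypothetical. One consequence worth noting: a removed entry still occupies its slot until the next flush, so the array needs some headroom (MAX_DIRTY here) if states are removed and re-added between draws.

```c
#include <assert.h>

enum { NUM_STATES = 16, MAX_DIRTY = 64 };  /* hypothetical sizes; state 0 is a no-op */

struct dirty_list {
    unsigned int list[MAX_DIRTY];     /* dirty state IDs; 0 marks a removed slot */
    unsigned int count;
    unsigned int marker[NUM_STATES];  /* 0 = clean, otherwise list index + 1 */
};

static void dirty_add(struct dirty_list *l, unsigned int state)
{
    assert(state > 0 && state < NUM_STATES);
    if (l->marker[state])
        return;
    assert(l->count < MAX_DIRTY);     /* removed slots are not reused before a flush */
    l->list[l->count] = state;
    l->marker[state] = ++l->count;
}

/* Constant-time "removal": overwrite the slot with the no-op state 0. */
static void dirty_remove(struct dirty_list *l, unsigned int state)
{
    unsigned int m;
    assert(state > 0 && state < NUM_STATES);
    m = l->marker[state];
    if (!m)
        return;
    l->list[m - 1] = 0;
    l->marker[state] = 0;
}

/* Apply loop: skips slots holding 0; returns how many real states were applied. */
static unsigned int dirty_flush(struct dirty_list *l)
{
    unsigned int applied = 0;
    for (unsigned int i = 0; i < l->count; i++) {
        unsigned int state = l->list[i];
        if (state == 0)
            continue;                 /* removed entry, nothing to apply */
        l->marker[state] = 0;         /* the GL apply would go here */
        applied++;
    }
    l->count = 0;
    return applied;
}
```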
On 12/09/06, Stefan Dösinger stefandoesinger@gmx.at wrote:
> > Why wouldn't you be able to remove an entry from a list?
> With the array, we can only truly remove a single element by moving all other entries by one, reducing the total number of entries and adjusting all values that specify a list index. This is possible, but the amount of work needed grows linearly with the number of elements in the list.
An array is not the same as a list... :-)
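For comparison, here is a rough sketch of what a "proper list" could look like: a doubly linked list with one preallocated node per state, where a non-NULL link plays the role of the dirty flag. This is only one possible interpretation of the suggestion; the names (node, dirty_list, dirty_add, dirty_remove, ...) are hypothetical. Removal is constant-time because each node knows its neighbours.

```c
#include <assert.h>
#include <stddef.h>

enum { NUM_STATES = 16 };  /* hypothetical number of states */

struct node { struct node *prev, *next; };

struct dirty_list {
    struct node head;               /* sentinel; list is empty when it points to itself */
    struct node entry[NUM_STATES];  /* one node per state; NULL links mean "clean" */
};

static void dirty_init(struct dirty_list *l)
{
    l->head.prev = l->head.next = &l->head;
    for (int i = 0; i < NUM_STATES; i++)
        l->entry[i].prev = l->entry[i].next = NULL;
}

static int dirty_contains(const struct dirty_list *l, unsigned int state)
{
    assert(state < NUM_STATES);
    return l->entry[state].next != NULL;
}

/* Append at the tail unless the state is already linked. */
static void dirty_add(struct dirty_list *l, unsigned int state)
{
    struct node *n;
    if (dirty_contains(l, state))
        return;
    n = &l->entry[state];
    n->prev = l->head.prev;
    n->next = &l->head;
    l->head.prev->next = n;
    l->head.prev = n;
}

/* Constant-time unlink; no other entries move and no indices need adjusting. */
static void dirty_remove(struct dirty_list *l, unsigned int state)
{
    struct node *n;
    if (!dirty_contains(l, state))
        return;
    n = &l->entry[state];
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = NULL;
}

static unsigned int dirty_count(const struct dirty_list *l)
{
    unsigned int c = 0;
    for (const struct node *n = l->head.next; n != &l->head; n = n->next)
        c++;
    return c;
}

/* The state ID is recovered from the node's position in the entry array. */
static unsigned int dirty_first(const struct dirty_list *l)
{
    assert(l->head.next != &l->head);
    return (unsigned int)(l->head.next - l->entry);
}
```

The trade-off against the array is two pointers per state instead of one index, in exchange for true removal without tombstones.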