Am Mittwoch, 25. März 2009 14:52:15 schrieb Tobias Jakobi:
A few comments on all the patches. I am a bit busy at university right now, so I didn't have a chance to answer earlier:
From patch 4/4:
if (!use_ps(stateblock)) {
TRACE("Non power two matrix multiply fixup\n");
glMultMatrixf(((IWineD3DTextureImpl *)
stateblock->textures[texUnit])->baseTexture.pow2Matrix);
}
I think a comment would be helpful why this isn't done when pixel shaders are used, as this can't be seen from the context of this code.
if (GL_TEXTURE_RECTANGLE_ARB ==
IWineD3DBaseTexture_GetTextureDimensions(stateblock->textures[sampler]) &&
use_ps(stateblock)) {
Maybe it would be helpful to add a flag that says if the fixup matrix is identity. WineD3D can emulate np2 textures with padded GL_TEXTURE_2D textures as well. Now I can't think of any hardware that has shaders, but not GL_ARB_texture_rectangle, but checking for GL_TEXTURE_RECTANGLE_ARB to find out if we need a fixup isn't that nice(We might do it in other places too, though)
From patch 1:
- /* Bitmap for texture rect coord fixups (32 samplers max currently).
* Increase size when MAX_FRAGMENT_SAMPLERS grows. */
- UINT texrect_fixup;
I think there's no point in supporting more than 16 samplers. That's the limit in d3d9 and will never change, and d3d10 makes unconditional NP2 support mandatory. The ps_compile_args structure is memcmp'ed every time the shader is changed, so the size matters. I also recommend to move it to the bottom of the structure, since it will be 0x0000 in most cases, so it prolongs the number of bytes the memcmp has to check before it finds a difference.(ok: In which direction does memcmp search?)
General: I also think that you could merge the 4 patches into one or two. I appreciate the effort to make the patches small, but that shouldn't be a self-sufficient purpose. Patches 1 and 2 can certainly be merged, as patch 1 alone doesn't do anything. Patch 4 could be merged into patch 3 because it is required for the code in patch 3 to work correctly.
The patches leave out one corner case(Subject for a separate patch though): The use of a vertex shader with fixed function fragment processing. In this case the vertex shader has to take care of the fixup. In GLSL that can conveniently be done in the pixelshader<->vertexshader glue function. It could read the same uniform and multiply the written texture coordinate.
Once all the cases work in GLSL, we could experiment with the R and RG float formats from GL_NV_float_buffer, since they're supported on RECT textures only, but may come handy on pre-geforce 8 hardware which doesn't support the ARB_RG extension.