Well, I've fixed the speed issues with using GLSL (it's now just as fast as ARB shaders for the ones that actually work correctly) and gotten a few other things to work correctly. However, I'm still having issues with certain pixel/vertex shader combinations. Ohsix caught on that it may have to do with sampler limits on graphics cards and how sampling/textures are handled different in GLSL that with ARB. I'm not sure at the moment and could use some help.
http://cmhousing.net/wine/glsl_hack6.diff
is the diff against the most recent git tree. There are still some ugly hacks lying around in there. Note that I've separated out most of the GLSL functions into a new file "glsl_shader.c". Once again, to use GLSL instead of ARB shaders, you'll have to apply this patch, have a graphics card that's capable of GLSL, and add the following case senstive registry key:
HKCU\Software\Wine\Direct3D\UseGLSL = "enabled" (String value)
Hi Jason,
Can you just clarify for my why you are changing the D3DVS_VERSION() and D3DPS_VERSION() to 3.0 for SHADER_ARB in the caps section of directx.c?
SHADER_ARB refers to ARB_vertex_program and ARB_fragment_program which do not support features greater than VS 1.1 and PS 1.4 respectively. This code does not affect SHADER_GLSL at all.
Thanks, Phil C
Jason Green wrote:
Well, I've fixed the speed issues with using GLSL (it's now just as fast as ARB shaders for the ones that actually work correctly) and gotten a few other things to work correctly. However, I'm still having issues with certain pixel/vertex shader combinations. Ohsix caught on that it may have to do with sampler limits on graphics cards and how sampling/textures are handled different in GLSL that with ARB. I'm not sure at the moment and could use some help.
http://cmhousing.net/wine/glsl_hack6.diff
is the diff against the most recent git tree. There are still some ugly hacks lying around in there. Note that I've separated out most of the GLSL functions into a new file "glsl_shader.c". Once again, to use GLSL instead of ARB shaders, you'll have to apply this patch, have a graphics card that's capable of GLSL, and add the following case senstive registry key:
HKCU\Software\Wine\Direct3D\UseGLSL = "enabled" (String value)
To clarify - It doesn't matter that you're changing the version of the future software pixel shader support capability but the vertex shader change is problematic because apps may see the 3.0 VS capability and supply a 3.0 shader to a 1.4 capable GL implementation.
Phil
Phil Costin wrote:
Hi Jason,
Can you just clarify for my why you are changing the D3DVS_VERSION() and D3DPS_VERSION() to 3.0 for SHADER_ARB in the caps section of directx.c?
SHADER_ARB refers to ARB_vertex_program and ARB_fragment_program which do not support features greater than VS 1.1 and PS 1.4 respectively. This code does not affect SHADER_GLSL at all.
Thanks, Phil C
On 6/3/06, Phil Costin [email protected] wrote:
Hi Jason,
Can you just clarify for my why you are changing the D3DVS_VERSION() and D3DPS_VERSION() to 3.0 for SHADER_ARB in the caps section of directx.c?
Sorry, that snuck in. There are some 2.0 shaders that can be handled by ARB already, so I was just telling the app that it supported them as a test. That part shouldn't have been there. :-)
+ /* Declare textures samplers */ + for(i = 0; i < /*This->baseShader.limits.texture */ 4; i++) { +// if (reg_maps->texcoord & (1 << i)) + shader_addline(buffer,"uniform sampler2D mytex%lu;\n", i); + }
What's the problem here?
+ if (wined3d_settings.shader_mode == SHADER_GLSL) + shader_glsl_add_instruction_modifiers(&hw_arg);
If you choose to pull modifier handling out of the per-opcode function, this should be done for SHADER_ARB as well, imho.
+ for (i=0; i<max_constants; ++i) { ... - for (i = 0; i <= WINED3D_VSHADER_MAX_CONSTANTS; ++i) {
Is this fixing off-by-one bug in the old code in the process (or introducing one)?
+ loadConstants(iface, GL_VERTEX_PROGRAM_ARB, constants, NULL, + vshader->baseShader.limits.constant_float, programId, TRUE);
What's the purpose of the "skip_type_check" parameter - TRUE here. Doesn't the current code only work for float constants anyway?
+ /* Update the constants */ + loadConstants(iface, GL_FRAGMENT_PROGRAM_ARB, This->stateBlock->pixelShaderConstantF, + This->stateBlock->pixelShaderConstantT, pshader->baseShader.limits.constant_float, + programId, FALSE);
Those limits aren't checked against GL capabilities yet. The current 3.0 limit is 224 float constants - does this work properly?
+void shader_glsl_add_instruction_modifiers(SHADER_OPCODE_ARG* arg
Doesn't seem to handle multiple modifiers - I think I changed this elsewhere recently.
+ FIXME("Unrecognized modifier(0x%#lx)\n", mask >> D3DSP_DSTMOD_SHIFT);
0x and # are redundant.
+ SHADER_PARAM_FCT_T param_fct;
Please don't add any more fields that do not need to persist into base shader. Why are two different param functions needed?
+extern void shader_glsl_add_cast( + int numSwizzles, + char *hwLine);
Where is this defined?
+void pshader_glsl_def(SHADER_OPCODE_ARG* arg) { ... + shader->constants[reg & 0xFF] = VS_CONSTANT_CONSTANT;
That's not what the ARBfp code does - pixel shaders are different from vertex shader (and I don't see why, conceptually this is all the same thing, only the limit is slightly different).
+ /* Set constants for the temporary argument */
I'm not sure I like how the mnxn functions use a temporary argument structure and re-invoke the gen. functions. Do they initialize every field necessary in the arg? I might have added new fields and missed the initialization part in there - maybe this arg should be 0-ed first with memcpy.
+ GLhandleARB objList[4]; /* There should never be more than 2 objects attached + (one pixel shader and one vertex shader), but we'll be safe */
This doesn't make sense to me - either you know that there are 2 objects attached exactly, or you don't. In either case, it's wrong to cover up a potential bug with a 4-element array.
+ * Note: The downside to this method is that we may end up compiling and linking + * programs that will not be used. Linking is expensive, but it's typically better to + * create more programs than necessary and just bind them when needed than it is + * to create a new program every time you set the shaders + */
This is an elaborate scheme with linked lists in each shader - is that really necessary? I'm very confused after reading the code (but maybe that's just me :) Maybe a test case would be useful?
If I call:
X = CreatePixelShader(dev); Y = CreateVertexShader(dev); Z = CreateVertexShader(dev); SetPixelShader(dev, X); SetVertexShader(dev, Y); SetVertexShader(dev, Z);
How many programs are being created? What objects are attached to each? Which linked lists does each program go into, and What's the sequence of things being attached/detached, programs created/destroyed?
On 6/4/06, Ivan Gyurdiev [email protected] wrote:
- /* Declare textures samplers */
- for(i = 0; i < /*This->baseShader.limits.texture */ 4; i++) {
+// if (reg_maps->texcoord & (1 << i))
shader_addline(buffer,"uniform sampler2D mytex%lu;\n", i);
- }
What's the problem here?
A lot of this is just from debugging. The code is still in the proof of concept phase.
if (wined3d_settings.shader_mode == SHADER_GLSL)
shader_glsl_add_instruction_modifiers(&hw_arg);
If you choose to pull modifier handling out of the per-opcode function, this should be done for SHADER_ARB as well, imho.
All of the ARB modifiers are run prior to the command. Whereas, (at least so far), the GLSL instruction modifiers are necessary to run after the command.
for (i=0; i<max_constants; ++i) {
...
for (i = 0; i <= WINED3D_VSHADER_MAX_CONSTANTS; ++i) {
Is this fixing off-by-one bug in the old code in the process (or introducing one)?
Possibly introducing - nice catch.
loadConstants(iface, GL_VERTEX_PROGRAM_ARB, constants,
NULL,
vshader->baseShader.limits.constant_float, programId, TRUE);
What's the purpose of the "skip_type_check" parameter - TRUE here. Doesn't the current code only work for float constants anyway?
It's passing NULL as the type array (because VertexDeclaration constants are always floats and they don't have one).
/* Update the constants */
loadConstants(iface, GL_FRAGMENT_PROGRAM_ARB,
This->stateBlock->pixelShaderConstantF,
This->stateBlock->pixelShaderConstantT,
pshader->baseShader.limits.constant_float,
programId, FALSE);
Those limits aren't checked against GL capabilities yet. The current 3.0 limit is 224 float constants - does this work properly?
The limits should be checked against the caps in the functions that set all of the limits based on the shader version. We should probably do something like set all of the limits as they "should" be, then do a min() against the GL caps.
+void shader_glsl_add_instruction_modifiers(SHADER_OPCODE_ARG* arg
Doesn't seem to handle multiple modifiers - I think I changed this elsewhere recently.
Possibly not. I just haven't hit (or noticed) that test case yet.
- SHADER_PARAM_FCT_T param_fct;
Please don't add any more fields that do not need to persist into base shader. Why are two different param functions needed?
This method avoids the base class having to know about its subclasses. I'm sure there's a better way to do it, still, though.
- /* Set constants for the temporary argument */
I'm not sure I like how the mnxn functions use a temporary argument structure and re-invoke the gen. functions. Do they initialize every field necessary in the arg? I might have added new fields and missed the initialization part in there - maybe this arg should be 0-ed first with memcpy.
Agreed.
GLhandleARB objList[4]; /* There should never be more than 2
objects attached
(one pixel shader and one vertex
shader), but we'll be safe */
This doesn't make sense to me - either you know that there are 2 objects attached exactly, or you don't. In either case, it's wrong to cover up a potential bug with a 4-element array.
Yeah, I must have been tired. :-)
- Note: The downside to this method is that we may end up compiling
and linking
- programs that will not be used. Linking is expensive, but it's
typically better to
- create more programs than necessary and just bind them when needed
than it is
- to create a new program every time you set the shaders
- */
This is an elaborate scheme with linked lists in each shader - is that really necessary? I'm very confused after reading the code (but maybe that's just me :)
If we link the programs together every time the Set__Shader() is called, it brings performance to a halt (that what my glsl_hack4.diff did). Everything was roughly 100 times slower that way. So, this method stores the program for every combination which is used in the app. That way, when the shaders are set, we already have a linked program to use. Linking is a very expensive operation. Keeping a list of them seems like the best method that we could come up with.
If I call:
X = CreatePixelShader(dev); Y = CreateVertexShader(dev); Z = CreateVertexShader(dev); SetPixelShader(dev, X); SetVertexShader(dev, Y); SetVertexShader(dev, Z);
How many programs are being created?
3
What objects are attached to each?
Program: Pixel Shader: Vertex Shader: 1 X NULL 2 X Y 3 X Z
Which linked lists does each program go into, and What's the sequence of things being attached/detached, programs created/destroyed?
If you run the WINEDEBUG=+d3d_shader trace, you'll get detailed information about all of the steps involed. Currently, the only time that the list doesn't get de-allocated is when the app doesn't properly clean up after itself. So, that will need to be fixed before this code can be submitted.
On a separate topic, we've learned that the reason that some of these pixel shaders fail is because I've hard-coded texture2D() and just assumed that the textures would be 2-dimensional (which is what the ARB programs do now, too). However, in ARB, only the instructions that reference the 3D texture are failing. In GLSL, the entire shader fails if one of the textures is wrong. So, this will have to be thought out some more, since prior to PS 2.0, the app doesn't have to declare if it's a 2D or a 3D texture ahead of time.
if (wined3d_settings.shader_mode == SHADER_GLSL)
shader_glsl_add_instruction_modifiers(&hw_arg);
If you choose to pull modifier handling out of the per-opcode function, this should be done for SHADER_ARB as well, imho.
All of the ARB modifiers are run prior to the command. Whereas, (at least so far), the GLSL instruction modifiers are necessary to run after the command.
Well, _SAT needs to go on the same opcode, but shifts for example go "after" the command. I have no idea how to do centroid right now. Anyway, it just seemed to me that they should be in the same place - either both in the generation fn, or both out of it, for consistency. There's a lot of SHADER_ARB opcodes right now that don't handle modifiers, while they need to do that, so we should think about centralized modifier processing [ or at least add modifier processing on the opcodes where it is missing ].
Those limits aren't checked against GL capabilities yet. The current 3.0 limit is 224 float constants - does this work properly?
The limits should be checked against the caps in the functions that set all of the limits based on the shader version. We should probably do something like set all of the limits as they "should" be, then do a min() against the GL caps.
No, I think we need to do error reporting as well - a min would hide the bugs. If the app uses too many constants, we should probably abort and report that to the application somehow. Right now our shader function doesn't even return a status code - maybe it should ?
+void shader_glsl_add_instruction_modifiers(SHADER_OPCODE_ARG* arg
Doesn't seem to handle multiple modifiers - I think I changed this elsewhere recently.
Possibly not. I just haven't hit (or noticed) that test case yet.
I see _centroid with _pp sometimes, so since we ignore _pp that's not really an issue. Anyway, the spec says multiple output modifiers are allowed, so they should be supported.
- SHADER_PARAM_FCT_T param_fct;
Please don't add any more fields that do not need to persist into base shader. Why are two different param functions needed?
This method avoids the base class having to know about its subclasses.
If the param functions aren't too different they should probably be merged together somehow. Also, fields like these would better go into the SHADER_OPCODE_ARG, not into the shader class - they're only used during generation. There's really no problem adding new fields into the arg structure - some of those don't need to be reset on every instruction, so they should just be set at the beginning when you start to generate code.
This is an elaborate scheme with linked lists in each shader - is that really necessary? I'm very confused after reading the code (but maybe that's just me :)
If we link the programs together every time the Set__Shader() is called, it brings performance to a halt (that what my glsl_hack4.diff did). Everything was roughly 100 times slower that way. So, this method stores the program for every combination which is used in the app.
Well, it seems to store a bit more than that. If I create pshaders A,B, and vshaders C,D, and do: set_pixel(A); set_vertex(C); draw_stuff(); set_pixel(B); set_vertex(D); draw_stuff(); set_pixel(NULL), set_vertex(NULL)
Won't this create...5 different programs, all kept in memory for the entire lifetime of the program, while we only need 2 [1 active]? (A, NULL), (A,C), (B,C), (B,D), (NULL, D)
Also, is it necessary to store link pair info in both shaders - isn't this redundant somehow?
That way, when the shaders are set, we already have a linked program to use. Linking is a very expensive operation. Keeping a list of them seems like the best method that we could come up with.
Ok... I guess maybe it's necessary.
On a separate topic, we've learned that the reason that some of these pixel shaders fail is because I've hard-coded texture2D() and just assumed that the textures would be 2-dimensional (which is what the ARB programs do now, too). However, in ARB, only the instructions that reference the 3D texture are failing. In GLSL, the entire shader fails if one of the textures is wrong. So, this will have to be thought out some more, since prior to PS 2.0, the app doesn't have to declare if it's a 2D or a 3D texture ahead of time.
Yes, I'm not really sure how this is handled prior to ps2, but surely there's tutorials online that explain this. Maybe the shader does its own per-type processing using those texreg2ar, and other instructions (tex matrix instructions?). For 2.0+ I can send you a patch, which probably works, but needs more testing. It stores the sampler types in the SHADER_OPCODE_ARG - I just changed the hw_tex function a little bit.
On 04/06/06, Ivan Gyurdiev [email protected] wrote:
Well, it seems to store a bit more than that. If I create pshaders A,B, and vshaders C,D, and do: set_pixel(A); set_vertex(C); draw_stuff(); set_pixel(B); set_vertex(D); draw_stuff(); set_pixel(NULL), set_vertex(NULL)
Won't this create...5 different programs, all kept in memory for the entire lifetime of the program, while we only need 2 [1 active]? (A, NULL), (A,C), (B,C), (B,D), (NULL, D)
I think it should be possible to change the code in such a way that the programs are only created once they're actually used.
Also, is it necessary to store link pair info in both shaders - isn't this redundant somehow?
Well, yes. When a shader is released, the programs it's linked to should be destroyed. However, it's probably not neccesary to point to the other shader. We could probably just link to a program struct that contains the program id and a list of linked shaders. It doubt it will make much of a difference wrt amount of memory used though, at least not as long as there are only two shaders involved. On the other hand, for d3d10 we'll also need geometry / primitive shaders.
On Sun, 04 Jun 2006 12:04:21 -0400, Jason Green wrote:
This is an elaborate scheme with linked lists in each shader - is that really necessary? I'm very confused after reading the code (but maybe that's just me :)
If we link the programs together every time the Set__Shader() is called, it brings performance to a halt (that what my glsl_hack4.diff did).
Re: the recent discussion on code commenting and documentation, it'd be nice if the patch explained this somewhere. It'd be even nicer still if at some point one of you guys could take a break from the codeface and write a chapter for the Wine Developer Guide on shaders and how all this works ... it seems incredibly opaque and the last thing we want is a re-run of the DCOM situation where the only people who understood it left or hacked on other things for a while and we lost the knowledge of how it worked.
thanks -mike
On 6/6/06, Mike Hearn [email protected] wrote:
On Sun, 04 Jun 2006 12:04:21 -0400, Jason Green wrote: Re: the recent discussion on code commenting and documentation, it'd be nice if the patch explained this somewhere. It'd be even nicer still if at some point one of you guys could take a break from the codeface and write a chapter for the Wine Developer Guide on shaders and how all this works
Yep, that is all planned before I submit any of the new real work to wine-patches. I'm still trying to figure out exactly where some of this stuff needs to go and what pitfalls we'll be up against. Should make it easier for Alexandre so he doesn't have to apply 800 patches / month of us moving stuff around after the fact. :-)
I'll add a section to the DirectX-Shaders wiki page with an explanation soon, as well as make the comments in the source code more verbose (and/or link to the wiki where applicable).
On 6/6/06, Jason Green [email protected] wrote:
I'll add a section to the DirectX-Shaders wiki page with an explanation soon, as well as make the comments in the source code more verbose (and/or link to the wiki where applicable).
FYI - I just added a bunch to this page to get us started:
On Tue, 06 Jun 2006 13:55:15 -0400, Jason Green wrote:
FYI - I just added a bunch to this page to get us started:
That's awesome, exactly the sort of thing I was looking for. This sort of high level overview is very useful for new hackers! :)
thanks -mike
On Sunday 11 June 2006 22:24, Mike Hearn wrote:
On Tue, 06 Jun 2006 13:55:15 -0400, Jason Green wrote:
FYI - I just added a bunch to this page to get us started:
That's awesome, exactly the sort of thing I was looking for. This sort of high level overview is very useful for new hackers! :)
:)
Well it will not be suficient to begin hacking on wined3d (as only wined3d shaders architecture is explained)
Anyway Most of code is splitted in "logical" sources ... and documented when needed:
for shaders/declaration: baseshader.c, glsl_shader.c, pixelshader.c, vertexshader.c, vertexdeclaration.c
for textures: basetexture.c, cubetexture.c, texture.c, volumetexture.c
for surfaces: surface.c, surface_gdi.c,
others who have d3d compliant names: :) device.c, directx.c, drawprim.c, indexbuffer.c, palette.c, query.c, resource.c, stateblock.c, swapchain.c, utils.c, vertexbuffer.c, volume.c, wined3d_main.c
Only one thing to new hackers: never begin with device.c :)
Regards, Raphael
On 13/06/06, Raphael [email protected] wrote:
Well it will not be suficient to begin hacking on wined3d (as only wined3d shaders architecture is explained)
The rest is easy ;-)