This patch is causing trouble for me. The current GLSL cmp implementation is wrong - here's 4 reasons why, only the first of which is mentioned in the source:
- it ignores destination write mask - it ignores source swizzle - it ignores other source modifiers. - it works incorrectly src0 = 0
So, I replaced it with something that's supposed to fix all the issues above - based on per-component lessThan operation followed by a mix(), which blends the two things we're selecting between, but the blend factor is always 0 or 1 exactly, so it's always a select operation, not a blend.
Patch improved: Hair, TransparentShadowMapping Patch broke: 3Dc, PhongLighting, ASCII, SelfShadowBump
I still think it's correct - tested behavior separately, and it seems to work... but it breaks other demos. Any suggestions? All the demos above can be found on http://www.humus.ca/index.php?page=3D. They require UsingGLSL to be enabled in the registry, and a GLSL-capable card.
========= Note that all of the above demos used replicate swizzle in at least one argument, which the previous implementation did not take into account. A lot of them did this:
cmp r1.a, r1.a, 0, 1 mad r0, r0, r1.a, something_else
So, that looks like z-buffer test. Note that with the previous implementation, the comparison line would ignore the write mask and write to all 4 channels, which should cause a result that's too bright in all channels. I think that's exactly what happened in a number of these demos, PhongLighting in particular. If I use the new code, and simply swap the two arguments in the cmp test, the demo will work for the most part, a lot more is shown than black screen, and the color balance looks much closer to what's shown in the picture online.
...but.. I shouldn't have to swap the two arguments in cmp, they are correct as is, imho, and the previous implementation worked by accident.
--- dlls/wined3d/glsl_shader.c | 11 +++++------ 1 files changed, 5 insertions(+), 6 deletions(-)
diff --git a/dlls/wined3d/glsl_shader.c b/dlls/wined3d/glsl_shader.c index 223a8e0..23f944c 100644 --- a/dlls/wined3d/glsl_shader.c +++ b/dlls/wined3d/glsl_shader.c @@ -987,20 +987,19 @@ void shader_glsl_compare(SHADER_OPCODE_A /** Process CMP instruction in GLSL (dst = src0.x > 0.0 ? src1.x : src2.x), per channel */ void shader_glsl_cmp(SHADER_OPCODE_ARG* arg) {
+ char tmpLine[256]; char dst_str[100], src0_str[100], src1_str[100], src2_str[100]; char dst_reg[50], src0_reg[50], src1_reg[50], src2_reg[50]; char dst_mask[6], src0_mask[6], src1_mask[6], src2_mask[6]; - + shader_glsl_add_param(arg, arg->dst, 0, FALSE, dst_reg, dst_mask, dst_str); shader_glsl_add_param(arg, arg->src[0], arg->src_addr[0], TRUE, src0_reg, src0_mask, src0_str); shader_glsl_add_param(arg, arg->src[1], arg->src_addr[1], TRUE, src1_reg, src1_mask, src1_str); shader_glsl_add_param(arg, arg->src[2], arg->src_addr[2], TRUE, src2_reg, src2_mask, src2_str);
- /* FIXME: This isn't correct - doesn't take the dst's swizzle into account. */ - shader_addline(arg->buffer, "%s.x = (%s.x > 0.0) ? %s.x : %s.x;\n", dst_reg, src0_reg, src1_reg, src2_reg); - shader_addline(arg->buffer, "%s.y = (%s.y > 0.0) ? %s.y : %s.y;\n", dst_reg, src0_reg, src1_reg, src2_reg); - shader_addline(arg->buffer, "%s.z = (%s.z > 0.0) ? %s.z : %s.z;\n", dst_reg, src0_reg, src1_reg, src2_reg); - shader_addline(arg->buffer, "%s.w = (%s.w > 0.0) ? %s.w : %s.w;\n", dst_reg, src0_reg, src1_reg, src2_reg); + shader_glsl_add_dst(arg->dst, dst_reg, dst_mask, tmpLine); + shader_addline(arg->buffer, "%smix(vec4(%s), vec4(%s), vec4(lessThan(vec4(%s), vec4(0.0)))))%s;\n", + tmpLine, src1_str, src2_str, src0_str, dst_mask); }
/** Process the CND opcode in GLSL (dst = (src0 < 0.5) ? src1 : src2) */
...but.. I shouldn't have to swap the two arguments in cmp, they are correct as is, imho, and the previous implementation worked by accident.
In fact, swapping them will cause artifacts to appear in the Hair demo.
- shader_addline(arg->buffer, "%s.w = (%s.w > 0.0) ? %s.w : %s.w;\n", dst_reg, src0_reg, src1_reg, src2_reg);
- shader_glsl_add_dst(arg->dst, dst_reg, dst_mask, tmpLine);
- shader_addline(arg->buffer, "%smix(vec4(%s), vec4(%s), vec4(lessThan(vec4(%s), vec4(0.0)))))%s;\n",
tmpLine, src1_str, src2_str, src0_str, dst_mask);
}
To give an idea as to what this is doing, see the following translation, taken from the Hair demo:
Original Shader: cmp r1.rg, r0.r, abs(v0).brba, abs(v0).rbba
New code translates to: R1.xy = vec4(mix(vec4(abs(IN0.zxzw)), vec4(abs(IN0.xzzw)), vec4(lessThan(vec4(R0.x), vec4(0.0))))).xy;