SPIR-V already handled DSX/DSY, so only D3DBC/TPF needed new case blocks.
You'll notice that there's no test for this one - in addition to being a pretty straightforward translation for all possible formats, this feature uses the render target width/height and I wasn't sure if there was a good way to ensure that the test would always make sense.
Instead, I did the test manually, and it's what you'd expect (EDIT: Previously the test used a uniform which always optimized to 0, new test uses VPOS instead):
HLSL: ``` float4 main(float4 pos : sv_position) : sv_target { float4 x = ddx(pos.x); float4 y = ddy(pos.y); return x + y; } ```
D3DBC: ``` ps_3_0 dcl_position0 vPos mov r0.xyzw, vPos.xyzw mov r1.x, r0.x dsx r1.x, r1.x mov r0.x, r0.yxxx dsy r0.x, r0.x mov r1.xyzw, r1.x mov r0.xyzw, r0.x add r0.xyzw, r1.xyzw, r0.xyzw mov oC0.xyzw, r0.xyzw ```
DXBC-TPF: ``` ps_4_0 dcl_input_ps_siv linear v0.xyzw, position dcl_output o0.xyzw dcl_temps 2 mov r0.xyzw, v0.xyzw mov r1.x, r0.x dsx r1.x, r1.x mov r0.x, r0.yxxx dsy r0.x, r0.x mov r1.xyzw, r1.x mov r0.xyzw, r0.x add r0.xyzw, r1.xyzw, r0.xyzw mov o0.xyzw, r0.xyzw ret ```
Fixes https://bugs.winehq.org/show_bug.cgi?id=54827
-- v3: tests: Add test for ddx(), ddy() intrinsics
From: Ethan Lee flibitijibibo@gmail.com
SPIR-V already handled DSX/DSY, so only D3DBC/TPF needed new case blocks.
Signed-off-by: Ethan Lee flibitijibibo@gmail.com --- libs/vkd3d-shader/d3dbc.c | 8 ++++++++ libs/vkd3d-shader/hlsl.y | 24 ++++++++++++++++++++++++ libs/vkd3d-shader/tpf.c | 10 ++++++++++ 3 files changed, 42 insertions(+)
diff --git a/libs/vkd3d-shader/d3dbc.c b/libs/vkd3d-shader/d3dbc.c index 14268440..aa45dc2d 100644 --- a/libs/vkd3d-shader/d3dbc.c +++ b/libs/vkd3d-shader/d3dbc.c @@ -1696,6 +1696,14 @@ static void write_sm1_expr(struct hlsl_ctx *ctx, struct vkd3d_bytecode_buffer *b write_sm1_unary_op(ctx, buffer, D3DSIO_ABS, &instr->reg, &arg1->reg, 0, 0); break;
+ case HLSL_OP1_DSX: + write_sm1_unary_op(ctx, buffer, D3DSIO_DSX, &instr->reg, &arg1->reg, 0, 0); + break; + + case HLSL_OP1_DSY: + write_sm1_unary_op(ctx, buffer, D3DSIO_DSY, &instr->reg, &arg1->reg, 0, 0); + break; + case HLSL_OP1_EXP2: write_sm1_per_component_unary_op(ctx, buffer, instr, D3DSIO_EXP); break; diff --git a/libs/vkd3d-shader/hlsl.y b/libs/vkd3d-shader/hlsl.y index 194d21f4..219c0a94 100644 --- a/libs/vkd3d-shader/hlsl.y +++ b/libs/vkd3d-shader/hlsl.y @@ -2648,6 +2648,28 @@ static bool intrinsic_cross(struct hlsl_ctx *ctx, return !!add_binary_arithmetic_expr(ctx, params->instrs, HLSL_OP2_ADD, mul2, mul1_neg, loc); }
+static bool intrinsic_ddx(struct hlsl_ctx *ctx, + const struct parse_initializer *params, const struct vkd3d_shader_location *loc) +{ + struct hlsl_ir_node *arg; + + if (!(arg = intrinsic_float_convert_arg(ctx, params, params->args[0], loc))) + return false; + + return !!add_unary_arithmetic_expr(ctx, params->instrs, HLSL_OP1_DSX, arg, loc); +} + +static bool intrinsic_ddy(struct hlsl_ctx *ctx, + const struct parse_initializer *params, const struct vkd3d_shader_location *loc) +{ + struct hlsl_ir_node *arg; + + if (!(arg = intrinsic_float_convert_arg(ctx, params, params->args[0], loc))) + return false; + + return !!add_unary_arithmetic_expr(ctx, params->instrs, HLSL_OP1_DSY, arg, loc); +} + static bool intrinsic_distance(struct hlsl_ctx *ctx, const struct parse_initializer *params, const struct vkd3d_shader_location *loc) { @@ -3380,6 +3402,8 @@ intrinsic_functions[] = {"clamp", 3, true, intrinsic_clamp}, {"cos", 1, true, intrinsic_cos}, {"cross", 2, true, intrinsic_cross}, + {"ddx", 1, true, intrinsic_ddx}, + {"ddy", 1, true, intrinsic_ddy}, {"distance", 2, true, intrinsic_distance}, {"dot", 2, true, intrinsic_dot}, {"exp", 1, true, intrinsic_exp}, diff --git a/libs/vkd3d-shader/tpf.c b/libs/vkd3d-shader/tpf.c index e76cf8c9..d84216bf 100644 --- a/libs/vkd3d-shader/tpf.c +++ b/libs/vkd3d-shader/tpf.c @@ -3926,6 +3926,16 @@ static void write_sm4_expr(struct hlsl_ctx *ctx, write_sm4_unary_op_with_two_destinations(buffer, VKD3D_SM4_OP_SINCOS, &expr->node, 1, arg1); break;
+ case HLSL_OP1_DSX: + assert(type_is_float(dst_type)); + write_sm4_unary_op(buffer, VKD3D_SM4_OP_DERIV_RTX, &expr->node, arg1, 0); + break; + + case HLSL_OP1_DSY: + assert(type_is_float(dst_type)); + write_sm4_unary_op(buffer, VKD3D_SM4_OP_DERIV_RTY, &expr->node, arg1, 0); + break; + case HLSL_OP1_EXP2: assert(type_is_float(dst_type)); write_sm4_unary_op(buffer, VKD3D_SM4_OP_EXP, &expr->node, arg1, 0);
From: Ethan Lee flibitijibibo@gmail.com
Signed-off-by: Ethan Lee flibitijibibo@gmail.com --- Makefile.am | 1 + tests/ddxddy.shader_test | 11 +++++++++++ 2 files changed, 12 insertions(+) create mode 100644 tests/ddxddy.shader_test
diff --git a/Makefile.am b/Makefile.am index 549354b4..226da5d5 100644 --- a/Makefile.am +++ b/Makefile.am @@ -63,6 +63,7 @@ vkd3d_shader_tests = \ tests/cbuffer.shader_test \ tests/compute.shader_test \ tests/conditional.shader_test \ + tests/ddxddy.shader_test \ tests/distance.shader_test \ tests/entry-point-semantics.shader_test \ tests/exp.shader_test \ diff --git a/tests/ddxddy.shader_test b/tests/ddxddy.shader_test new file mode 100644 index 00000000..835bcbd7 --- /dev/null +++ b/tests/ddxddy.shader_test @@ -0,0 +1,11 @@ +[pixel shader] +float4 main(float4 pos : sv_position) : sv_target +{ + float dsx = ddx(pos.x); + float dsy = ddy(pos.y); + return dsx + dsy; +} + +[test] +draw quad +probe all rgba (2.0, 2.0, 2.0, 2.0)
On Wed Apr 26 14:31:19 2023 +0000, Giovanni Mascellani wrote:
I think it would be preferable to have to test in the repository, even if it hardcodes the render target size.
Added a test called ddxddy, which replicates the OP's test. It's sort of an unremarkable test, because I'm not too familiar with this feature and am not sure what a better test would look like.
Added a test called ddxddy, which replicates the OP's test. It's sort of an unremarkable test, because I'm not too familiar with this feature and am not sure what a better test would look like.
We'd probably want to use some trivial but non-linear function like e.g. a quadratic (or perhaps e.g. a sine or cosine) over sv_position as input to ddx()/ddy(), and then probe the output in some number (e.g. 16) of places.
On Wed Apr 26 15:14:09 2023 +0000, Henri Verbeet wrote:
Added a test called ddxddy, which replicates the OP's test. It's sort
of an unremarkable test, because I'm not too familiar with this feature and am not sure what a better test would look like. We'd probably want to use some trivial but non-linear function like e.g. a quadratic (or perhaps e.g. a sine or cosine) over sv_position as input to ddx()/ddy(), and then probe the output in some number (e.g. 16) of places.
Yeah, there is some room to make the test more specific. One other way is to output `float4(dsx, dsy, 0.0, 0.0)` instead of the sum. Also, given the peculiar way derivatives are computed in shading languages, it makes sense to test both even and odd pixel coordinates.
But at least exercising the code path is already much better than nothing, and good enough, for me, to approve (can't speak for Henri, though! :-) ).
This merge request was approved by Giovanni Mascellani.