This patch series includes an implementation of the long-pending `transpose` intrinsic and the `smoothstep` intrinsic.
While implementing `smoothstep` I realized that some intrinsics follow different rules than expressions for the allowed data types:
- Vectors and matrices cannot be mixed, regardless of their dimensions, even if they have the same number of components.
- Any combination of matrices is allowed, even when neither matrix fits inside the other, e.g. `float2x3` is compatible with `float3x2`, resulting in `float2x2`. The common data type takes the minimum of each dimension.
This is the case for `max`, `pow`, `ldexp`, `clamp` and `smoothstep`, which suggests that it is the case for all intrinsics where the operation is applied element-wise. This series corrects the argument conversion for those intrinsics accordingly.
A minor fix in `pow`'s type conversion is also included.
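The common-type rule described above can be sketched in standalone C. This is only an illustration under stated assumptions: `struct shape`, `enum shape_class` and `elementwise_common_shape()` are hypothetical names invented for this sketch and are not part of vkd3d. Following the convention in the patches, `dimx` counts columns and `dimy` counts rows, so `float2x3` has `dimx = 3` and `dimy = 2`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the compiler's type classes; not the vkd3d API. */
enum shape_class { SHAPE_SCALAR, SHAPE_VECTOR, SHAPE_MATRIX };

struct shape
{
    enum shape_class cls;
    unsigned int dimx, dimy; /* columns, rows; dimy is 1 for scalars and vectors */
};

static unsigned int min_uint(unsigned int a, unsigned int b)
{
    return a < b ? a : b;
}

/* Computes the common shape of "count" arguments of an element-wise intrinsic.
 * Returns false when vectors and matrices are mixed. */
static bool elementwise_common_shape(const struct shape *args, unsigned int count, struct shape *out)
{
    bool vectors = false, matrices = false;
    unsigned int dimx = 4, dimy = 4, i;

    assert(count > 0);

    for (i = 0; i < count; ++i)
    {
        if (args[i].cls == SHAPE_VECTOR)
        {
            vectors = true;
            dimx = min_uint(dimx, args[i].dimx);
        }
        else if (args[i].cls == SHAPE_MATRIX)
        {
            matrices = true;
            dimx = min_uint(dimx, args[i].dimx);
            dimy = min_uint(dimy, args[i].dimy);
        }
        /* Scalars do not constrain the shape. */
    }

    if (matrices && vectors)
        return false;

    if (matrices)
        *out = (struct shape){SHAPE_MATRIX, dimx, dimy};
    else if (vectors)
        *out = (struct shape){SHAPE_VECTOR, dimx, 1};
    else
        *out = (struct shape){SHAPE_SCALAR, 1, 1};
    return true;
}
```

With these definitions, a `float2x3` and a `float3x2` argument yield a 2x2 matrix shape, while a `float2x2` combined with a `float4` is rejected.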
-- v3:
vkd3d-shader/hlsl: Use add_unary_arithmetic_expr() in intrinsic_pow().
vkd3d-shader/hlsl: Allow elementwise_intrinsic_convert_args() to also convert args to float.
vkd3d-shader/hlsl: Convert elementwise intrinsics args to the proper common type.
tests: Test for common type conversion for element-wise intrinsics.
vkd3d-shader/hlsl: Support smoothstep() intrinsic.
vkd3d-shader/hlsl: Support transpose() intrinsic.
From: Francisco Casas fcasas@codeweavers.com
---
 Makefile.am                      |  1 +
 libs/vkd3d-shader/hlsl.y         | 59 +++++++++++++++++++++++++
 tests/hlsl-transpose.shader_test | 75 ++++++++++++++++++++++++++++++++
 3 files changed, 135 insertions(+)
 create mode 100644 tests/hlsl-transpose.shader_test
diff --git a/Makefile.am b/Makefile.am index 85cd4642..d1f6ec6b 100644 --- a/Makefile.am +++ b/Makefile.am @@ -117,6 +117,7 @@ vkd3d_shader_tests = \ tests/hlsl-struct-array.shader_test \ tests/hlsl-struct-assignment.shader_test \ tests/hlsl-struct-semantics.shader_test \ + tests/hlsl-transpose.shader_test \ tests/hlsl-vector-indexing.shader_test \ tests/hlsl-vector-indexing-uniform.shader_test \ tests/logic-operations.shader_test \ diff --git a/libs/vkd3d-shader/hlsl.y b/libs/vkd3d-shader/hlsl.y index eedc85bd..a18394a6 100644 --- a/libs/vkd3d-shader/hlsl.y +++ b/libs/vkd3d-shader/hlsl.y @@ -2596,6 +2596,64 @@ static bool intrinsic_saturate(struct hlsl_ctx *ctx, return !!add_unary_arithmetic_expr(ctx, params->instrs, HLSL_OP1_SAT, arg, loc); }
+static bool intrinsic_transpose(struct hlsl_ctx *ctx,
+        const struct parse_initializer *params, const struct vkd3d_shader_location *loc)
+{
+    struct hlsl_ir_node *arg = params->args[0];
+    struct hlsl_type *arg_type = arg->data_type;
+    struct hlsl_deref var_deref;
+    struct hlsl_type *mat_type;
+    struct hlsl_ir_load *load;
+    struct hlsl_ir_var *var;
+    unsigned int i, j;
+
+    if (arg_type->type != HLSL_CLASS_SCALAR && arg_type->type != HLSL_CLASS_MATRIX)
+    {
+        struct vkd3d_string_buffer *string;
+
+        if ((string = hlsl_type_to_string(ctx, arg_type)))
+            hlsl_error(ctx, &arg->loc, VKD3D_SHADER_ERROR_HLSL_INVALID_TYPE,
+                    "Wrong type for argument 1 of transpose(): expected a matrix or scalar type, but got '%s'.\n",
+                    string->buffer);
+        hlsl_release_string_buffer(ctx, string);
+        return false;
+    }
+
+    if (arg_type->type == HLSL_CLASS_SCALAR)
+    {
+        list_add_tail(params->instrs, &arg->entry);
+        return true;
+    }
+
+    mat_type = hlsl_get_matrix_type(ctx, arg_type->base_type, arg_type->dimy, arg_type->dimx);
+
+    if (!(var = hlsl_new_synthetic_var(ctx, "transpose", mat_type, loc)))
+        return false;
+    hlsl_init_simple_deref_from_var(&var_deref, var);
+
+    for (i = 0; i < arg_type->dimx; ++i)
+    {
+        for (j = 0; j < arg_type->dimy; ++j)
+        {
+            struct hlsl_ir_store *store;
+            struct hlsl_block block;
+
+            if (!(load = add_load_component(ctx, params->instrs, arg, j * arg->data_type->dimx + i, loc)))
+                return false;
+
+            if (!(store = hlsl_new_store_component(ctx, &block, &var_deref, i * var->data_type->dimx + j, &load->node)))
+                return false;
+            list_move_tail(params->instrs, &block.instrs);
+        }
+    }
+
+    if (!(load = hlsl_new_var_load(ctx, var, *loc)))
+        return false;
+    list_add_tail(params->instrs, &load->node.entry);
+
+    return true;
+}
+
 static const struct intrinsic_function
 {
     const char *name;
@@ -2623,6 +2681,7 @@ intrinsic_functions[] =
     {"pow", 2, true, intrinsic_pow},
     {"round", 1, true, intrinsic_round},
     {"saturate", 1, true, intrinsic_saturate},
+    {"transpose", 1, true, intrinsic_transpose},
 };
static int intrinsic_function_name_compare(const void *a, const void *b) diff --git a/tests/hlsl-transpose.shader_test b/tests/hlsl-transpose.shader_test new file mode 100644 index 00000000..83852fa1 --- /dev/null +++ b/tests/hlsl-transpose.shader_test @@ -0,0 +1,75 @@ +[pixel shader] +float4 main() : sv_target +{ + return transpose(5); +} + +[test] +draw quad +probe all rgba (5.0, 5.0, 5.0, 5.0) + + +[pixel shader] +float4 main() : sv_target +{ + float1x1 x = 5; + + return transpose(x); +} + +[test] +draw quad +probe all rgba (5.0, 5.0, 5.0, 5.0) + + +[pixel shader fail] +float4 main() : sv_target +{ + float4 x = float4(1, 2, 3, 4); + + return transpose(x); +} + +[pixel shader] +float4 main() : sv_target +{ + float1x4 x = float1x4(1.0, 2.0, 3.0, 4.0); + + return transpose(x); +} + +[test] +draw quad +probe all rgba (1.0, 2.0, 3.0, 4.0) + + +[pixel shader] +float4 main() : sv_target +{ + float4x3 m = float4x3(1.0, 2.0, 3.0, + 4.0, 5.0, 6.0, + 7.0, 8.0, 9.0, + 10.0, 11.0, 12.0); + + return transpose(m)[1]; +} + +[test] +draw quad +probe all rgba (2.0, 5.0, 8.0, 11.0) + + +[pixel shader] +float4 main() : sv_target +{ + row_major float4x3 m = float4x3(1.0, 2.0, 3.0, + 4.0, 5.0, 6.0, + 7.0, 8.0, 9.0, + 10.0, 11.0, 12.0); + + return transpose(m)[1]; +} + +[test] +draw quad +probe all rgba (2.0, 5.0, 8.0, 11.0)
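
The component mapping used by intrinsic_transpose() above can be checked with a small standalone C sketch. It assumes that component indices enumerate the matrix row by row, which is what the `j * dimx + i` indexing in the patch suggests; `transpose_components()` is a hypothetical helper written for this illustration, not a vkd3d function.

```c
#include <assert.h>

/* Transposes a row-major matrix with "rows" rows and "cols" columns.
 * src holds rows * cols components; dst receives cols * rows components. */
static void transpose_components(const float *src, float *dst, unsigned int rows, unsigned int cols)
{
    unsigned int i, j;

    assert(src != dst);

    for (i = 0; i < cols; ++i)
    {
        for (j = 0; j < rows; ++j)
        {
            /* Source component (row j, column i) becomes destination
             * component (row i, column j). The destination has "rows" columns. */
            dst[i * rows + j] = src[j * cols + i];
        }
    }
}
```

Applied to the `float4x3` matrix holding 1 to 12 from the tests, row 1 of the result is (2, 5, 8, 11), matching the expected probe values.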
From: Francisco Casas fcasas@codeweavers.com
---
 Makefile.am                       |   1 +
 libs/vkd3d-shader/hlsl.y          |  77 ++++++++++++++
 tests/hlsl-smoothstep.shader_test | 166 ++++++++++++++++++++++++++++++
 3 files changed, 244 insertions(+)
 create mode 100644 tests/hlsl-smoothstep.shader_test
diff --git a/Makefile.am b/Makefile.am index d1f6ec6b..57cb76ed 100644 --- a/Makefile.am +++ b/Makefile.am @@ -111,6 +111,7 @@ vkd3d_shader_tests = \ tests/hlsl-return-void.shader_test \ tests/hlsl-shape.shader_test \ tests/hlsl-single-numeric-initializer.shader_test \ + tests/hlsl-smoothstep.shader_test \ tests/hlsl-state-block-syntax.shader_test \ tests/hlsl-static-initializer.shader_test \ tests/hlsl-storage-qualifiers.shader_test \ diff --git a/libs/vkd3d-shader/hlsl.y b/libs/vkd3d-shader/hlsl.y index a18394a6..82a9711b 100644 --- a/libs/vkd3d-shader/hlsl.y +++ b/libs/vkd3d-shader/hlsl.y @@ -2596,6 +2596,82 @@ static bool intrinsic_saturate(struct hlsl_ctx *ctx, return !!add_unary_arithmetic_expr(ctx, params->instrs, HLSL_OP1_SAT, arg, loc); }
+/* smoothstep(a, b, x) = p^2 (3 - 2p), where p = saturate((x - a)/(b - a)) */
+static bool intrinsic_smoothstep(struct hlsl_ctx *ctx,
+        const struct parse_initializer *params, const struct vkd3d_shader_location *loc)
+{
+    struct hlsl_ir_node *min_arg, *max_arg, *x_arg, *p, *p_num, *p_denom, *res;
+    struct hlsl_ir_constant *one, *minus_two, *three;
+    enum hlsl_type_class common_class;
+    struct hlsl_type *common_type;
+    unsigned int dimx, dimy;
+
+    min_arg = params->args[0];
+    max_arg = params->args[1];
+    x_arg = params->args[2];
+
+    if (!expr_common_shape(ctx, min_arg->data_type, max_arg->data_type, loc, &common_class, &dimx, &dimy))
+        return false;
+    common_type = hlsl_get_numeric_type(ctx, common_class, HLSL_TYPE_FLOAT, dimx, dimy);
+
+    if (!expr_common_shape(ctx, common_type, x_arg->data_type, loc, &common_class, &dimx, &dimy))
+        return false;
+    common_type = hlsl_get_numeric_type(ctx, common_class, HLSL_TYPE_FLOAT, dimx, dimy);
+
+    if (!(min_arg = add_implicit_conversion(ctx, params->instrs, min_arg, common_type, loc)))
+        return false;
+
+    if (!(max_arg = add_implicit_conversion(ctx, params->instrs, max_arg, common_type, loc)))
+        return false;
+
+    if (!(x_arg = add_implicit_conversion(ctx, params->instrs, x_arg, common_type, loc)))
+        return false;
+
+    if (!(min_arg = add_unary_arithmetic_expr(ctx, params->instrs, HLSL_OP1_NEG, min_arg, loc)))
+        return false;
+
+    if (!(p_num = add_binary_arithmetic_expr(ctx, params->instrs, HLSL_OP2_ADD, x_arg, min_arg, loc)))
+        return false;
+
+    if (!(p_denom = add_binary_arithmetic_expr(ctx, params->instrs, HLSL_OP2_ADD, max_arg, min_arg, loc)))
+        return false;
+
+    if (!(one = hlsl_new_float_constant(ctx, 1.0, loc)))
+        return false;
+    list_add_tail(params->instrs, &one->node.entry);
+
+    if (!(p_denom = add_binary_arithmetic_expr(ctx, params->instrs, HLSL_OP2_DIV, &one->node, p_denom, loc)))
+        return false;
+
+    if (!(p = add_binary_arithmetic_expr(ctx, params->instrs, HLSL_OP2_MUL, p_num, p_denom, loc)))
+        return false;
+
+    if (!(p = add_unary_arithmetic_expr(ctx, params->instrs, HLSL_OP1_SAT, p, loc)))
+        return false;
+
+    if (!(minus_two = hlsl_new_float_constant(ctx, -2.0, loc)))
+        return false;
+    list_add_tail(params->instrs, &minus_two->node.entry);
+
+    if (!(three = hlsl_new_float_constant(ctx, 3.0, loc)))
+        return false;
+    list_add_tail(params->instrs, &three->node.entry);
+
+    if (!(res = add_binary_arithmetic_expr(ctx, params->instrs, HLSL_OP2_MUL, &minus_two->node, p, loc)))
+        return false;
+
+    if (!(res = add_binary_arithmetic_expr(ctx, params->instrs, HLSL_OP2_ADD, &three->node, res, loc)))
+        return false;
+
+    if (!(p = add_binary_arithmetic_expr(ctx, params->instrs, HLSL_OP2_MUL, p, p, loc)))
+        return false;
+
+    if (!(res = add_binary_arithmetic_expr(ctx, params->instrs, HLSL_OP2_MUL, p, res, loc)))
+        return false;
+
+    return true;
+}
+
 static bool intrinsic_transpose(struct hlsl_ctx *ctx,
         const struct parse_initializer *params, const struct vkd3d_shader_location *loc)
 {
@@ -2681,6 +2757,7 @@ intrinsic_functions[] =
     {"pow", 2, true, intrinsic_pow},
     {"round", 1, true, intrinsic_round},
     {"saturate", 1, true, intrinsic_saturate},
+    {"smoothstep", 3, true, intrinsic_smoothstep},
     {"transpose", 1, true, intrinsic_transpose},
 };
diff --git a/tests/hlsl-smoothstep.shader_test b/tests/hlsl-smoothstep.shader_test new file mode 100644 index 00000000..fc1c856a --- /dev/null +++ b/tests/hlsl-smoothstep.shader_test @@ -0,0 +1,166 @@ + + +[pixel shader] +float4 main() : sv_target +{ + float4 a = {1, -1, -1, 10}; + float4 b = {2, 1, 1, 20}; + float4 x = {0.3, 0.4, 2, 15.4}; + + return smoothstep(a, b, x); +} + +[test] +draw quad +probe all rgba (0, 0.784, 1.0, 0.559872) 1 + + +[pixel shader] +float4 main() : sv_target +{ + float a = 1; + float b = 2; + float4 x = {0.9, 1.2, 1.8, 2.1}; + + return smoothstep(a, b, x); +} + +[test] +draw quad +probe all rgba (0, 0.104, 0.896, 1.000000) 5 + + +[pixel shader] +float4 main() : sv_target +{ + float4 a = {1, 10, 100, 1000}; + float4 b = {2, 20, 200, 2000}; + float x = 14; + + return smoothstep(a, b, x); +} + +[test] +draw quad +probe all rgba (1.0, 0.352, 0, 0) 1 + + +[pixel shader] +float4 main() : sv_target +{ + float2 a = {1, 10}; + float3 b = {2, 20, 200}; + float4 x = {1.4, 14, 140, 1400}; + + float2 res = smoothstep(a, b, x); + return float4(res, 0, 0); +} + +[test] +draw quad +probe all rgba (0.352, 0.352, 0, 0) 1 + + +[pixel shader] +float4 main() : sv_target +{ + float3 a = {1, 10, 100}; + float2 b = {2, 20}; + float4 x = {1.4, 14, 140, 1400}; + + float2 res = smoothstep(a, b, x); + return float4(res, 0, 0); +} + +[test] +draw quad +probe all rgba (0.352, 0.352, 0, 0) 1 + + +[pixel shader] +float4 main() : sv_target +{ + float4 a = {1, 10, 100, 1000}; + float4 b = {2, 20, 200, 2000}; + float2 x = {14, 140}; + + float2 res = smoothstep(a, b, x); + return float4(res, 0, 0); +} + +[test] +draw quad +probe all rgba (1.0, 1.0, 0, 0) 1 + + +[pixel shader todo] +float4 main() : sv_target +{ + float2x3 a = {1, 1, 1, 1, 1, 1}; + float3x2 b = {2, 2, 2, 2, 2, 2}; + float4x2 x = {1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8}; + + float2x2 r = smoothstep(a, b, x); + return r; +} + +[test] +todo draw quad +todo probe all rgba (0.028, 0.104, 0.216, 0.352) 1 + + 
+[pixel shader] +// 4 division by zero warnings. +// Only test compilation because result is implementation-dependent. +float4 main() : sv_target +{ + float4 a = {0, 0, 0, 0}; + float4 b = {-1, -1, 0, 0}; + float4 x = {0, -0.25, 0, 1}; + + return smoothstep(a, b, x); +} + + +[pixel shader] +float4 main() : sv_target +{ + float4x1 a = {0.0, 0.0, 0.0, 0.0}; + float b = 1.0; + float3x1 x = {0.5, 0.5, 0.5}; + + float3x1 r = smoothstep(a, b, x); + return float4(r, 0); +} + +[test] +draw quad +probe all rgba (0.5, 0.5, 0.5, 0.0) + + +[pixel shader todo] +float4 main() : sv_target +{ + float4x1 a = {0.0, 0.0, 0.0, 0.0}; + float2x2 b = {1.0, 1.0, 1.0, 1.0}; + float3x1 x = {0.5, 0.5, 0.5}; + + float2x1 r = smoothstep(a, b, x); + return float4(r, r); +} + +[test] +todo draw quad +todo probe all rgba (0.5, 0.5, 0.5, 0.5) + + +[pixel shader fail todo] +float4 main() : sv_target +{ + float2x2 a = {0.0, 0.0, 0.0, 0.0}; + float4 b = 1.0; + float2x2 x = {0.5, 0.5, 0.5, 0.5}; + + smoothstep(a, b, x); + return 0; +}
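
For reference, the formula in the comment of intrinsic_smoothstep() can be evaluated on scalars with the C sketch below. It mirrors the operation order of the patch (negate `a`, add, multiply by the reciprocal, saturate, then combine), but `saturate_ref()` and `smoothstep_ref()` are hypothetical names for this illustration only, and the sketch says nothing about how a particular GPU rounds the result.

```c
/* Clamps v to the [0, 1] range, like the SAT operation. */
static float saturate_ref(float v)
{
    return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
}

/* smoothstep(a, b, x) = p^2 (3 - 2p), where p = saturate((x - a) / (b - a)). */
static float smoothstep_ref(float a, float b, float x)
{
    float neg_a = -a;
    float p_num = x + neg_a;
    float p_denom = 1.0f / (b + neg_a); /* undefined for a == b, as the tests note */
    float p = saturate_ref(p_num * p_denom);

    return (p * p) * (3.0f + -2.0f * p);
}
```

For example, `smoothstep_ref(1, 2, 1.2)` gives approximately 0.104 and `smoothstep_ref(10, 20, 14)` approximately 0.352, which agree with the probe values in the tests above.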
From: Francisco Casas fcasas@codeweavers.com
Some intrinsics follow different rules than expressions for the allowed data types:
- Vectors and matrices cannot be mixed, regardless of their dimensions, even if they have the same number of components.
- Any combination of matrices is allowed, even when neither matrix fits inside the other, e.g. float2x3 is compatible with float3x2, resulting in float2x2. The common data type takes the minimum of each dimension.
This is the case for max, pow, ldexp, clamp and smoothstep, which suggests that it is the case for all intrinsics where the operation is applied element-wise.
Tests for mul() are also added as a counter-example where the operation is not element-wise.
---
 tests/hlsl-clamp.shader_test | 28 ++++++++++++++++++++++++++++
 tests/hlsl-ldexp.shader_test | 26 ++++++++++++++++++++++++++
 tests/hlsl-lerp.shader_test  | 28 ++++++++++++++++++++++++++++
 tests/hlsl-mul.shader_test   | 30 ++++++++++++++++++++++++++++++
 tests/max.shader_test        | 27 +++++++++++++++++++++++++++
 tests/pow.shader_test        | 26 ++++++++++++++++++++++++++
 6 files changed, 165 insertions(+)
diff --git a/tests/hlsl-clamp.shader_test b/tests/hlsl-clamp.shader_test index 8e26270c..cc198735 100644 --- a/tests/hlsl-clamp.shader_test +++ b/tests/hlsl-clamp.shader_test @@ -8,3 +8,31 @@ float4 main(uniform float3 u) : sv_target uniform 0 float4 -0.3 -0.1 0.7 0.0 draw quad probe all rgba (-0.1, 0.7, -0.3, 0.3) + + +[pixel shader todo] +float4 main() : sv_target +{ + float3x2 a = {6, 5, 4, 3, 2, 1}; + float2x3 b = {1, 2, 3, 4.2, 5.2, 6.2}; + float3x4 c = 5.5; + + float2x2 r = clamp(a, b, c); + return float4(r); +} + +[test] +todo draw quad +todo probe all rgba (5.5, 5.0, 4.2, 5.2) + + +[pixel shader fail todo] +float4 main() : sv_target +{ + float2x2 a = {3.1, 3.1, 3.1, 3.1}; + float2x2 b = {1, 2, 3, 4}; + float4 c = {5.5, 4.5, 3.5, 2.5}; + + clamp(a, b, c); + return 0; +} diff --git a/tests/hlsl-ldexp.shader_test b/tests/hlsl-ldexp.shader_test index 0873fc9e..bea97953 100644 --- a/tests/hlsl-ldexp.shader_test +++ b/tests/hlsl-ldexp.shader_test @@ -30,3 +30,29 @@ uniform 0 int4 2 3 4 5 uniform 4 int4 0 -10 10 100 draw quad probe all rgba (2.0, 0.00292968750, 4096.0, 6.33825300e+030) + + +[pixel shader todo] +float4 main() : sv_target +{ + float2x3 a = {1, 2, 3, 4, 5, 6}; + float3x2 b = {6, 5, 4, 3, 2, 1}; + + float2x2 r = ldexp(a, b); + return float4(r); +} + +[test] +todo draw quad +todo probe all rgba (64.0, 64.0, 64.0, 40.0) + + +[pixel shader fail todo] +float4 main() : sv_target +{ + float2x2 a = {1, 2, 3, 4}; + float1 b = {2}; + + ldexp(a, b); + return 0; +} diff --git a/tests/hlsl-lerp.shader_test b/tests/hlsl-lerp.shader_test index 3f93b02d..3cd10ec1 100644 --- a/tests/hlsl-lerp.shader_test +++ b/tests/hlsl-lerp.shader_test @@ -34,3 +34,31 @@ uniform 4 int4 0 -10 10 1000000 uniform 8 int4 0 1 -1 1000000 draw quad probe all rgba (2.0, -10.0, -2.0, 1e12) + + +[pixel shader todo] +float4 main() : sv_target +{ + float3x2 a = {6, 5, 4, 3, 2, 1}; + float2x3 b = {1, 2, 3, 4.2, 5.2, 6.2}; + float3x4 c = 2.4; + + float2x2 r = lerp(a, b, c); + return float4(r); 
+} + +[test] +todo draw quad +todo probe all rgba (-6.0, -2.2, 4.48, 8.28) + + +[pixel shader fail todo] +float4 main() : sv_target +{ + float2x2 a = {0, 1, 2, 3}; + float2x2 b = {1, 2, 3, 4}; + float4 c = {0.5, 0.5, 0.5, 0.5}; + + lerp(a, b, c); + return 0; +} diff --git a/tests/hlsl-mul.shader_test b/tests/hlsl-mul.shader_test index 7b453187..cb104a9e 100644 --- a/tests/hlsl-mul.shader_test +++ b/tests/hlsl-mul.shader_test @@ -288,3 +288,33 @@ float4 main(float4 pos : sv_position) : sv_target [test] draw quad probe all rgba (78.0, 96.0, 114.0, 0.0) + + +[pixel shader] +float4 main() : sv_target +{ + float2x3 a = float2x3(1, 2, 3, 4, 5, 6); + float3x2 b = float3x2(6, 5, 4, 3, 2, 1); + + float2x2 r = mul(a, b); + return float4(r); +} + +[test] +draw quad +probe all rgba (20.0, 14.0, 56.0, 41.0) + + +[pixel shader] +float4 main() : sv_target +{ + float2x2 a = float2x2(1, 2, 3, 4); + float2 b = float2(1, 2); + + float2 r = mul(a, b); + return float4(r, 0, 0); +} + +[test] +draw quad +probe all rgba (5.0, 11.0, 0.0, 0.0) diff --git a/tests/max.shader_test b/tests/max.shader_test index 50083f33..7a917ec5 100644 --- a/tests/max.shader_test +++ b/tests/max.shader_test @@ -9,6 +9,7 @@ uniform 0 float4 0.7 -0.1 0.0 0.0 draw quad probe all rgba (0.7, 2.1, 2.0, -1.0)
+ [pixel shader] float4 main(uniform float4 u) : sv_target { @@ -21,3 +22,29 @@ float4 main(uniform float4 u) : sv_target uniform 0 float4 0.7 -0.1 0.4 0.8 draw quad probe all rgba (0.7, 0.8, 0.7, 0.2) + + +[pixel shader todo] +float4 main() : sv_target +{ + float2x3 a = {1, 2, 3, 4, 5, 6}; + float3x2 b = {6, 5, 4, 3, 2, 1}; + + float2x2 r = max(a, b); + return float4(r); +} + +[test] +todo draw quad +todo probe all rgba (6.0, 5.0, 4.0, 5.0) + + +[pixel shader fail todo] +float4 main() : sv_target +{ + float2x2 a = {1, 2, 3, 4}; + float4 b = {4, 3, 2, 1}; + + max(a, b); + return 0; +} diff --git a/tests/pow.shader_test b/tests/pow.shader_test index 6470494e..6f2b2741 100644 --- a/tests/pow.shader_test +++ b/tests/pow.shader_test @@ -8,3 +8,29 @@ float4 main(uniform float4 u) : sv_target uniform 0 float4 0.4 0.8 2.5 2.0 draw quad probe all rgba (0.512, 0.101192884, 0.64, 0.25) 4 + + +[pixel shader todo] +float4 main() : sv_target +{ + float2x3 a = {1, 2, 3, 4, 5, 6}; + float3x2 b = {6, 5, 4, 3, 2, 1}; + + float2x2 r = pow(a, b); + return float4(r); +} + +[test] +todo draw quad +todo probe all rgba (1.0, 32.0, 256.0, 125.0) + + +[pixel shader fail todo] +float4 main() : sv_target +{ + float2x2 a = {1, 2, 3, 4}; + float4 b = {1, 2, 3, 4}; + + pow(a, b); + return 0; +}
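
The contrast between the element-wise tests and the mul() counter-example can be reproduced with a short C sketch. It assumes that converting a matrix to the smaller common type keeps its upper-left block, which is consistent with the expected values in the tests above; `max_2x3_3x2()` and `mul_2x3_3x2()` are hypothetical helpers for this illustration.

```c
/* Element-wise max of a float2x3 and a float3x2: both arguments are
 * reduced to their upper-left 2x2 block before the operation. */
static void max_2x3_3x2(float a[2][3], float b[3][2], float r[2][2])
{
    unsigned int i, j;

    for (i = 0; i < 2; ++i)
    {
        for (j = 0; j < 2; ++j)
            r[i][j] = a[i][j] > b[i][j] ? a[i][j] : b[i][j];
    }
}

/* Matrix product of a float2x3 and a float3x2: every component of both
 * arguments takes part, so no truncation happens. */
static void mul_2x3_3x2(float a[2][3], float b[3][2], float r[2][2])
{
    unsigned int i, j, k;

    for (i = 0; i < 2; ++i)
    {
        for (j = 0; j < 2; ++j)
        {
            r[i][j] = 0.0f;
            for (k = 0; k < 3; ++k)
                r[i][j] += a[i][k] * b[k][j];
        }
    }
}
```

With `a = {1, 2, 3, 4, 5, 6}` and `b = {6, 5, 4, 3, 2, 1}`, the first helper produces (6, 5, 4, 5) and the second (20, 14, 56, 41), the same values the max and mul tests probe for.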
From: Francisco Casas fcasas@codeweavers.com
---
 libs/vkd3d-shader/hlsl.y          | 92 ++++++++++++++++++++++++++++---
 tests/hlsl-clamp.shader_test      |  8 +--
 tests/hlsl-ldexp.shader_test      |  8 +--
 tests/hlsl-lerp.shader_test       |  8 +--
 tests/hlsl-smoothstep.shader_test | 14 ++---
 tests/max.shader_test             |  8 +--
 tests/pow.shader_test             |  2 +-
 7 files changed, 107 insertions(+), 33 deletions(-)
diff --git a/libs/vkd3d-shader/hlsl.y b/libs/vkd3d-shader/hlsl.y index 82a9711b..7d9d8a18 100644 --- a/libs/vkd3d-shader/hlsl.y +++ b/libs/vkd3d-shader/hlsl.y @@ -2225,6 +2225,65 @@ static struct hlsl_ir_node *intrinsic_float_convert_arg(struct hlsl_ctx *ctx, return add_implicit_conversion(ctx, params->instrs, arg, type, loc); }
+static bool elementwise_intrinsic_convert_args(struct hlsl_ctx *ctx,
+        const struct parse_initializer *params, const struct vkd3d_shader_location *loc)
+{
+    enum hlsl_base_type base = params->args[0]->data_type->base_type;
+    bool vectors = false, matrices = false;
+    unsigned int dimx = 4, dimy = 4;
+    struct hlsl_type *common_type;
+    unsigned int i;
+
+    for (i = 0; i < params->args_count; ++i)
+    {
+        struct hlsl_type *arg_type = params->args[i]->data_type;
+
+        base = expr_common_base_type(base, arg_type->base_type);
+
+        if (arg_type->type == HLSL_CLASS_VECTOR)
+        {
+            vectors = true;
+            dimx = min(dimx, arg_type->dimx);
+        }
+        else if (arg_type->type == HLSL_CLASS_MATRIX)
+        {
+            matrices = true;
+            dimx = min(dimx, arg_type->dimx);
+            dimy = min(dimy, arg_type->dimy);
+        }
+    }
+
+    if (matrices && vectors)
+    {
+        hlsl_error(ctx, loc, VKD3D_SHADER_ERROR_HLSL_INVALID_TYPE,
+                "Cannot use both matrices and vectors in an elementwise intrinsic.");
+        return false;
+    }
+    else if (matrices)
+    {
+        common_type = hlsl_get_matrix_type(ctx, base, dimx, dimy);
+    }
+    else if (vectors)
+    {
+        common_type = hlsl_get_vector_type(ctx, base, dimx);
+    }
+    else
+    {
+        common_type = hlsl_get_scalar_type(ctx, base);
+    }
+
+    for (i = 0; i < params->args_count; ++i)
+    {
+        struct hlsl_ir_node *new_arg;
+
+        if (!(new_arg = add_implicit_conversion(ctx, params->instrs, params->args[i], common_type, loc)))
+            return false;
+        params->args[i] = new_arg;
+    }
+
+    return true;
+}
+
 static bool intrinsic_abs(struct hlsl_ctx *ctx,
         const struct parse_initializer *params, const struct vkd3d_shader_location *loc)
 {
@@ -2280,6 +2339,9 @@ static bool intrinsic_clamp(struct hlsl_ctx *ctx,
 {
     struct hlsl_ir_node *max;
+ if (!elementwise_intrinsic_convert_args(ctx, params, loc)) + return false; + if (!(max = add_binary_arithmetic_expr(ctx, params->instrs, HLSL_OP2_MAX, params->args[0], params->args[1], loc))) return false;
@@ -2361,6 +2423,9 @@ static bool intrinsic_ldexp(struct hlsl_ctx *ctx, { struct hlsl_ir_node *arg;
+ if (!elementwise_intrinsic_convert_args(ctx, params, loc)) + return false; + if (!(arg = intrinsic_float_convert_arg(ctx, params, params->args[1], loc))) return false;
@@ -2400,6 +2465,9 @@ static bool intrinsic_lerp(struct hlsl_ctx *ctx, { struct hlsl_ir_node *arg, *neg, *add, *mul;
+ if (!elementwise_intrinsic_convert_args(ctx, params, loc)) + return false; + if (!(arg = intrinsic_float_convert_arg(ctx, params, params->args[0], loc))) return false;
@@ -2418,12 +2486,18 @@ static bool intrinsic_lerp(struct hlsl_ctx *ctx, static bool intrinsic_max(struct hlsl_ctx *ctx, const struct parse_initializer *params, const struct vkd3d_shader_location *loc) { + if (!elementwise_intrinsic_convert_args(ctx, params, loc)) + return false; + return !!add_binary_arithmetic_expr(ctx, params->instrs, HLSL_OP2_MAX, params->args[0], params->args[1], loc); }
static bool intrinsic_min(struct hlsl_ctx *ctx, const struct parse_initializer *params, const struct vkd3d_shader_location *loc) { + if (!elementwise_intrinsic_convert_args(ctx, params, loc)) + return false; + return !!add_binary_arithmetic_expr(ctx, params->instrs, HLSL_OP2_MIN, params->args[0], params->args[1], loc); }
@@ -2558,6 +2632,9 @@ static bool intrinsic_pow(struct hlsl_ctx *ctx, { struct hlsl_ir_node *log, *exp, *arg, *mul;
+ if (!elementwise_intrinsic_convert_args(ctx, params, loc)) + return false; + if (!(arg = intrinsic_float_convert_arg(ctx, params, params->args[0], loc))) return false;
@@ -2602,21 +2679,18 @@ static bool intrinsic_smoothstep(struct hlsl_ctx *ctx, { struct hlsl_ir_node *min_arg, *max_arg, *x_arg, *p, *p_num, *p_denom, *res; struct hlsl_ir_constant *one, *minus_two, *three; - enum hlsl_type_class common_class; struct hlsl_type *common_type; - unsigned int dimx, dimy; + + if (!elementwise_intrinsic_convert_args(ctx, params, loc)) + return false;
min_arg = params->args[0]; max_arg = params->args[1]; x_arg = params->args[2];
- if (!expr_common_shape(ctx, min_arg->data_type, max_arg->data_type, loc, &common_class, &dimx, &dimy)) - return false; - common_type = hlsl_get_numeric_type(ctx, common_class, HLSL_TYPE_FLOAT, dimx, dimy); - - if (!expr_common_shape(ctx, common_type, x_arg->data_type, loc, &common_class, &dimx, &dimy)) - return false; - common_type = hlsl_get_numeric_type(ctx, common_class, HLSL_TYPE_FLOAT, dimx, dimy); + common_type = x_arg->data_type; + common_type = hlsl_get_numeric_type(ctx, common_type->type, HLSL_TYPE_FLOAT, common_type->dimx, + common_type->dimy);
if (!(min_arg = add_implicit_conversion(ctx, params->instrs, min_arg, common_type, loc))) return false; diff --git a/tests/hlsl-clamp.shader_test b/tests/hlsl-clamp.shader_test index cc198735..1320c3dd 100644 --- a/tests/hlsl-clamp.shader_test +++ b/tests/hlsl-clamp.shader_test @@ -10,7 +10,7 @@ draw quad probe all rgba (-0.1, 0.7, -0.3, 0.3)
-[pixel shader todo] +[pixel shader] float4 main() : sv_target { float3x2 a = {6, 5, 4, 3, 2, 1}; @@ -22,11 +22,11 @@ float4 main() : sv_target }
[test] -todo draw quad -todo probe all rgba (5.5, 5.0, 4.2, 5.2) +draw quad +probe all rgba (5.5, 5.0, 4.2, 5.2)
-[pixel shader fail todo] +[pixel shader fail] float4 main() : sv_target { float2x2 a = {3.1, 3.1, 3.1, 3.1}; diff --git a/tests/hlsl-ldexp.shader_test b/tests/hlsl-ldexp.shader_test index bea97953..92988d37 100644 --- a/tests/hlsl-ldexp.shader_test +++ b/tests/hlsl-ldexp.shader_test @@ -32,7 +32,7 @@ draw quad probe all rgba (2.0, 0.00292968750, 4096.0, 6.33825300e+030)
-[pixel shader todo] +[pixel shader] float4 main() : sv_target { float2x3 a = {1, 2, 3, 4, 5, 6}; @@ -43,11 +43,11 @@ float4 main() : sv_target }
[test] -todo draw quad -todo probe all rgba (64.0, 64.0, 64.0, 40.0) +draw quad +probe all rgba (64.0, 64.0, 64.0, 40.0)
-[pixel shader fail todo] +[pixel shader fail] float4 main() : sv_target { float2x2 a = {1, 2, 3, 4}; diff --git a/tests/hlsl-lerp.shader_test b/tests/hlsl-lerp.shader_test index 3cd10ec1..15e90cef 100644 --- a/tests/hlsl-lerp.shader_test +++ b/tests/hlsl-lerp.shader_test @@ -36,7 +36,7 @@ draw quad probe all rgba (2.0, -10.0, -2.0, 1e12)
-[pixel shader todo] +[pixel shader] float4 main() : sv_target { float3x2 a = {6, 5, 4, 3, 2, 1}; @@ -48,11 +48,11 @@ float4 main() : sv_target }
[test] -todo draw quad -todo probe all rgba (-6.0, -2.2, 4.48, 8.28) +draw quad +probe all rgba (-6.0, -2.2, 4.48, 8.28) 1
-[pixel shader fail todo] +[pixel shader fail] float4 main() : sv_target { float2x2 a = {0, 1, 2, 3}; diff --git a/tests/hlsl-smoothstep.shader_test b/tests/hlsl-smoothstep.shader_test index fc1c856a..63755b08 100644 --- a/tests/hlsl-smoothstep.shader_test +++ b/tests/hlsl-smoothstep.shader_test @@ -93,7 +93,7 @@ draw quad probe all rgba (1.0, 1.0, 0, 0) 1
-[pixel shader todo] +[pixel shader] float4 main() : sv_target { float2x3 a = {1, 1, 1, 1, 1, 1}; @@ -105,8 +105,8 @@ float4 main() : sv_target }
[test] -todo draw quad -todo probe all rgba (0.028, 0.104, 0.216, 0.352) 1 +draw quad +probe all rgba (0.028, 0.104, 0.216, 0.352) 6
[pixel shader] @@ -138,7 +138,7 @@ draw quad probe all rgba (0.5, 0.5, 0.5, 0.0)
-[pixel shader todo] +[pixel shader] float4 main() : sv_target { float4x1 a = {0.0, 0.0, 0.0, 0.0}; @@ -150,11 +150,11 @@ float4 main() : sv_target }
[test] -todo draw quad -todo probe all rgba (0.5, 0.5, 0.5, 0.5) +draw quad +probe all rgba (0.5, 0.5, 0.5, 0.5)
-[pixel shader fail todo] +[pixel shader fail] float4 main() : sv_target { float2x2 a = {0.0, 0.0, 0.0, 0.0}; diff --git a/tests/max.shader_test b/tests/max.shader_test index 7a917ec5..3a5c3125 100644 --- a/tests/max.shader_test +++ b/tests/max.shader_test @@ -24,7 +24,7 @@ draw quad probe all rgba (0.7, 0.8, 0.7, 0.2)
-[pixel shader todo] +[pixel shader] float4 main() : sv_target { float2x3 a = {1, 2, 3, 4, 5, 6}; @@ -35,11 +35,11 @@ float4 main() : sv_target }
[test] -todo draw quad -todo probe all rgba (6.0, 5.0, 4.0, 5.0) +draw quad +probe all rgba (6.0, 5.0, 4.0, 5.0)
-[pixel shader fail todo] +[pixel shader fail] float4 main() : sv_target { float2x2 a = {1, 2, 3, 4}; diff --git a/tests/pow.shader_test b/tests/pow.shader_test index 6f2b2741..0c1d7de3 100644 --- a/tests/pow.shader_test +++ b/tests/pow.shader_test @@ -25,7 +25,7 @@ todo draw quad todo probe all rgba (1.0, 32.0, 256.0, 125.0)
-[pixel shader fail todo] +[pixel shader fail] float4 main() : sv_target { float2x2 a = {1, 2, 3, 4};
From: Francisco Casas fcasas@codeweavers.com
---
 libs/vkd3d-shader/hlsl.y | 55 +++++++++++++---------------------------
 1 file changed, 18 insertions(+), 37 deletions(-)
diff --git a/libs/vkd3d-shader/hlsl.y b/libs/vkd3d-shader/hlsl.y index 7d9d8a18..91daa482 100644 --- a/libs/vkd3d-shader/hlsl.y +++ b/libs/vkd3d-shader/hlsl.y @@ -2226,7 +2226,8 @@ static struct hlsl_ir_node *intrinsic_float_convert_arg(struct hlsl_ctx *ctx, }
static bool elementwise_intrinsic_convert_args(struct hlsl_ctx *ctx, - const struct parse_initializer *params, const struct vkd3d_shader_location *loc) + const struct parse_initializer *params, bool convert_to_float, + const struct vkd3d_shader_location *loc) { enum hlsl_base_type base = params->args[0]->data_type->base_type; bool vectors = false, matrices = false; @@ -2253,6 +2254,9 @@ static bool elementwise_intrinsic_convert_args(struct hlsl_ctx *ctx, } }
+ if (convert_to_float) + base = HLSL_TYPE_FLOAT; + if (matrices && vectors) { hlsl_error(ctx, loc, VKD3D_SHADER_ERROR_HLSL_INVALID_TYPE, @@ -2339,7 +2343,7 @@ static bool intrinsic_clamp(struct hlsl_ctx *ctx, { struct hlsl_ir_node *max;
- if (!elementwise_intrinsic_convert_args(ctx, params, loc)) + if (!elementwise_intrinsic_convert_args(ctx, params, false, loc)) return false;
if (!(max = add_binary_arithmetic_expr(ctx, params->instrs, HLSL_OP2_MAX, params->args[0], params->args[1], loc))) @@ -2423,13 +2427,10 @@ static bool intrinsic_ldexp(struct hlsl_ctx *ctx, { struct hlsl_ir_node *arg;
- if (!elementwise_intrinsic_convert_args(ctx, params, loc)) - return false; - - if (!(arg = intrinsic_float_convert_arg(ctx, params, params->args[1], loc))) + if (!elementwise_intrinsic_convert_args(ctx, params, true, loc)) return false;
- if (!(arg = add_unary_arithmetic_expr(ctx, params->instrs, HLSL_OP1_EXP2, arg, loc))) + if (!(arg = add_unary_arithmetic_expr(ctx, params->instrs, HLSL_OP1_EXP2, params->args[1], loc))) return false;
return !!add_binary_arithmetic_expr(ctx, params->instrs, HLSL_OP2_MUL, params->args[0], arg, loc); @@ -2463,15 +2464,12 @@ static bool intrinsic_length(struct hlsl_ctx *ctx, static bool intrinsic_lerp(struct hlsl_ctx *ctx, const struct parse_initializer *params, const struct vkd3d_shader_location *loc) { - struct hlsl_ir_node *arg, *neg, *add, *mul; + struct hlsl_ir_node *neg, *add, *mul;
-    if (!elementwise_intrinsic_convert_args(ctx, params, loc))
+    if (!elementwise_intrinsic_convert_args(ctx, params, true, loc))
         return false;

-    if (!(arg = intrinsic_float_convert_arg(ctx, params, params->args[0], loc)))
-        return false;
-
-    if (!(neg = add_unary_arithmetic_expr(ctx, params->instrs, HLSL_OP1_NEG, arg, loc)))
+    if (!(neg = add_unary_arithmetic_expr(ctx, params->instrs, HLSL_OP1_NEG, params->args[0], loc)))
         return false;

     if (!(add = add_binary_arithmetic_expr(ctx, params->instrs, HLSL_OP2_ADD, params->args[1], neg, loc)))
@@ -2480,13 +2478,13 @@ static bool intrinsic_lerp(struct hlsl_ctx *ctx,
     if (!(mul = add_binary_arithmetic_expr(ctx, params->instrs, HLSL_OP2_MUL, params->args[2], add, loc)))
         return false;

-    return !!add_binary_arithmetic_expr(ctx, params->instrs, HLSL_OP2_ADD, arg, mul, loc);
+    return !!add_binary_arithmetic_expr(ctx, params->instrs, HLSL_OP2_ADD, params->args[0], mul, loc);
 }

 static bool intrinsic_max(struct hlsl_ctx *ctx,
         const struct parse_initializer *params, const struct vkd3d_shader_location *loc)
 {
-    if (!elementwise_intrinsic_convert_args(ctx, params, loc))
+    if (!elementwise_intrinsic_convert_args(ctx, params, false, loc))
         return false;

     return !!add_binary_arithmetic_expr(ctx, params->instrs, HLSL_OP2_MAX, params->args[0], params->args[1], loc);
@@ -2495,7 +2493,7 @@ static bool intrinsic_max(struct hlsl_ctx *ctx,
 static bool intrinsic_min(struct hlsl_ctx *ctx,
         const struct parse_initializer *params, const struct vkd3d_shader_location *loc)
 {
-    if (!elementwise_intrinsic_convert_args(ctx, params, loc))
+    if (!elementwise_intrinsic_convert_args(ctx, params, false, loc))
         return false;

     return !!add_binary_arithmetic_expr(ctx, params->instrs, HLSL_OP2_MIN, params->args[0], params->args[1], loc);
@@ -2630,15 +2628,12 @@ static bool intrinsic_normalize(struct hlsl_ctx *ctx,
 static bool intrinsic_pow(struct hlsl_ctx *ctx,
         const struct parse_initializer *params, const struct vkd3d_shader_location *loc)
 {
-    struct hlsl_ir_node *log, *exp, *arg, *mul;
-
-    if (!elementwise_intrinsic_convert_args(ctx, params, loc))
-        return false;
+    struct hlsl_ir_node *log, *exp, *mul;

-    if (!(arg = intrinsic_float_convert_arg(ctx, params, params->args[0], loc)))
+    if (!elementwise_intrinsic_convert_args(ctx, params, true, loc))
         return false;

-    if (!(log = hlsl_new_unary_expr(ctx, HLSL_OP1_LOG2, arg, *loc)))
+    if (!(log = hlsl_new_unary_expr(ctx, HLSL_OP1_LOG2, params->args[0], *loc)))
         return false;
     list_add_tail(params->instrs, &log->entry);
@@ -2679,28 +2674,14 @@ static bool intrinsic_smoothstep(struct hlsl_ctx *ctx,
 {
     struct hlsl_ir_node *min_arg, *max_arg, *x_arg, *p, *p_num, *p_denom, *res;
     struct hlsl_ir_constant *one, *minus_two, *three;
-    struct hlsl_type *common_type;

-    if (!elementwise_intrinsic_convert_args(ctx, params, loc))
+    if (!elementwise_intrinsic_convert_args(ctx, params, true, loc))
         return false;

     min_arg = params->args[0];
     max_arg = params->args[1];
     x_arg = params->args[2];

-    common_type = x_arg->data_type;
-    common_type = hlsl_get_numeric_type(ctx, common_type->type, HLSL_TYPE_FLOAT, common_type->dimx,
-            common_type->dimy);
-
-    if (!(min_arg = add_implicit_conversion(ctx, params->instrs, min_arg, common_type, loc)))
-        return false;
-
-    if (!(max_arg = add_implicit_conversion(ctx, params->instrs, max_arg, common_type, loc)))
-        return false;
-
-    if (!(x_arg = add_implicit_conversion(ctx, params->instrs, x_arg, common_type, loc)))
-        return false;
-
     if (!(min_arg = add_unary_arithmetic_expr(ctx, params->instrs, HLSL_OP1_NEG, min_arg, loc)))
         return false;
From: Francisco Casas fcasas@codeweavers.com
Using add_unary_arithmetic_expr() instead of hlsl_new_unary_expr() allows the intrinsic to work with matrices.
Otherwise we get:
E5017: Aborting due to not yet implemented feature: Copying from unsupported node type.
because an HLSL_IR_EXPR reaches split_matrix_copies().
---
 libs/vkd3d-shader/hlsl.y | 7 +++----
 tests/pow.shader_test    | 6 +++---
 2 files changed, 6 insertions(+), 7 deletions(-)
diff --git a/libs/vkd3d-shader/hlsl.y b/libs/vkd3d-shader/hlsl.y
index 91daa482..8f3b1c90 100644
--- a/libs/vkd3d-shader/hlsl.y
+++ b/libs/vkd3d-shader/hlsl.y
@@ -2633,16 +2633,15 @@ static bool intrinsic_pow(struct hlsl_ctx *ctx,
     if (!elementwise_intrinsic_convert_args(ctx, params, true, loc))
         return false;

-    if (!(log = hlsl_new_unary_expr(ctx, HLSL_OP1_LOG2, params->args[0], *loc)))
+    if (!(log = add_unary_arithmetic_expr(ctx, params->instrs, HLSL_OP1_LOG2, params->args[0], loc)))
         return false;
-    list_add_tail(params->instrs, &log->entry);

     if (!(mul = add_binary_arithmetic_expr(ctx, params->instrs, HLSL_OP2_MUL, params->args[1], log, loc)))
         return false;

-    if (!(exp = hlsl_new_unary_expr(ctx, HLSL_OP1_EXP2, mul, *loc)))
+    if (!(exp = add_unary_arithmetic_expr(ctx, params->instrs, HLSL_OP1_EXP2, mul, loc)))
         return false;
-    list_add_tail(params->instrs, &exp->entry);
+
     return true;
 }
diff --git a/tests/pow.shader_test b/tests/pow.shader_test index 0c1d7de3..1bb3bd94 100644 --- a/tests/pow.shader_test +++ b/tests/pow.shader_test @@ -10,7 +10,7 @@ draw quad probe all rgba (0.512, 0.101192884, 0.64, 0.25) 4
-[pixel shader todo] +[pixel shader] float4 main() : sv_target { float2x3 a = {1, 2, 3, 4, 5, 6}; @@ -21,8 +21,8 @@ float4 main() : sv_target }
[test] -todo draw quad -todo probe all rgba (1.0, 32.0, 256.0, 125.0) +draw quad +probe all rgba (1.0, 32.0, 256.0, 125.0) 2
[pixel shader fail]
On Tue Dec 13 23:44:40 2022 +0000, Francisco Casas wrote:
changed this line in [version 3 of the diff](/wine/vkd3d/-/merge_requests/53/diffs?diff_id=24727&start_sha=ca9b09ece63a38caf890e49c9e2c4cd74b2973ee#9155b9453b4ec8ea0b9b025dfb55c061bd931610_2707_2684)
I hate to bikeshed this more, but I feel like this could be prettier. Partly because elementwise_intrinsic_convert_args() has a bool argument that's not maximally intuitive, but also because now it's not as modular as it could be. How about something like this?
```
void convert_args(ctx, params, type, loc)
{
    for (...)
        params->args[i] = add_implicit_conversion(...);
}

struct hlsl_type *elementwise_intrinsic_get_common_type(ctx, params, loc)
{
    ...
}

void elementwise_intrinsic_convert_args(ctx, params, loc)
{
    type = elementwise_intrinsic_get_common_type(ctx, params, loc);

    return convert_args(ctx, params, type);
}

void elementwise_intrinsic_float_convert_args(ctx, params, loc)
{
    type = elementwise_intrinsic_get_common_type(ctx, params, loc);

    type = hlsl_get_numeric_type(ctx, type->type, HLSL_TYPE_FLOAT, type->dimx, type->dimy);

    return convert_args(ctx, params, type);
}
```
Not dissimilar from the way parts of the current code are organized, in particular intrinsic_float_convert_arg().
On Mon Dec 19 00:33:24 2022 +0000, Zebediah Figura wrote:
I hate to bikeshed this more, but I feel like this could be prettier. Partly because elementwise_intrinsic_convert_args() has a bool argument that's not maximally intuitive, but also because now it's not as modular as it could be. How about something like this?
    void convert_args(ctx, params, type, loc)
    {
        for (...)
            params->args[i] = add_implicit_conversion(...);
    }

    struct hlsl_type *elementwise_intrinsic_get_common_type(ctx, params, loc)
    {
        ...
    }

    void elementwise_intrinsic_convert_args(ctx, params, loc)
    {
        type = elementwise_intrinsic_get_common_type(ctx, params, loc);

        return convert_args(ctx, params, type);
    }

    void elementwise_intrinsic_float_convert_args(ctx, params, loc)
    {
        type = elementwise_intrinsic_get_common_type(ctx, params, loc);

        type = hlsl_get_numeric_type(ctx, type->type, HLSL_TYPE_FLOAT, type->dimx, type->dimy);

        return convert_args(ctx, params, type);
    }
Not dissimilar from the way parts of the current code are organized, in particular intrinsic_float_convert_arg().
I agree this is better :-)
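For readers following the thread, the split proposed above can be modelled as a small standalone C sketch. It is only an illustration of the shape of the refactoring, not the vkd3d-shader code: `struct arg_type`, the two enums, `get_common_type()`, `convert_args()` and both wrappers are hypothetical stand-ins, and "converting" an argument just overwrites its type descriptor instead of emitting an implicit conversion. The matrix rule (minimum on each dimension) and the vector/matrix rejection come from the cover letter; scalar broadcasting and the base-type ordering are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are NOT the vkd3d-shader types. */
enum type_class { CLASS_SCALAR, CLASS_VECTOR, CLASS_MATRIX };
/* Assumption: when base types are mixed, the larger enumerator wins. */
enum base_type { BASE_BOOL, BASE_INT, BASE_UINT, BASE_HALF, BASE_FLOAT };

struct arg_type
{
    enum type_class cls;
    enum base_type base;
    unsigned int dimx, dimy;
};

static unsigned int min_uint(unsigned int a, unsigned int b)
{
    return a < b ? a : b;
}

/* Step 1: compute the common type. Fails if vectors and matrices are mixed. */
static bool get_common_type(const struct arg_type *args, size_t count, struct arg_type *common)
{
    bool vectors = false, matrices = false;
    unsigned int dimx = 4, dimy = 4;
    enum base_type base = BASE_BOOL;
    size_t i;

    assert(count > 0);

    for (i = 0; i < count; ++i)
    {
        if (args[i].base > base)
            base = args[i].base;

        /* Assumption: scalars broadcast and do not constrain the dimensions. */
        if (args[i].cls == CLASS_SCALAR)
            continue;

        if (args[i].cls == CLASS_VECTOR)
            vectors = true;
        else
            matrices = true;

        /* The common type is the minimum on each dimension. */
        dimx = min_uint(dimx, args[i].dimx);
        dimy = min_uint(dimy, args[i].dimy);
    }

    if (vectors && matrices)
        return false;

    common->base = base;
    common->cls = matrices ? CLASS_MATRIX : vectors ? CLASS_VECTOR : CLASS_SCALAR;
    common->dimx = (matrices || vectors) ? dimx : 1;
    common->dimy = matrices ? dimy : 1;
    return true;
}

/* Step 2: convert every argument to the given type (modelled as an overwrite). */
static void convert_args(struct arg_type *args, size_t count, const struct arg_type *type)
{
    size_t i;

    for (i = 0; i < count; ++i)
        args[i] = *type;
}

/* Wrapper for intrinsics that keep the common base type. */
static bool elementwise_convert_args(struct arg_type *args, size_t count)
{
    struct arg_type type;

    if (!get_common_type(args, count, &type))
        return false;
    convert_args(args, count, &type);
    return true;
}

/* Wrapper for intrinsics that additionally force the base type to float. */
static bool elementwise_float_convert_args(struct arg_type *args, size_t count)
{
    struct arg_type type;

    if (!get_common_type(args, count, &type))
        return false;
    type.base = BASE_FLOAT;
    convert_args(args, count, &type);
    return true;
}
```

With this shape, a caller picks the wrapper by name instead of passing a bool, which is the readability point raised in the review. In the sketch, passing a `float2x3`-like and a `float3x2`-like descriptor to either wrapper leaves both arguments as 2x2, while mixing a vector with a matrix makes the wrapper return false.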