Reinterpret min16float, min10float, min16int, min12int, and min16uint as their regular counterparts: float, float, int, int, uint, respectively.
A proper implementation would require adding minimum precision indicators to all the dxbc-tpf instructions that use these types. Consider the output of fxc 10.1 with the following shader:
```hlsl uniform int i;
float4 main() : sv_target { min16float4 a = {0, 1, 2, i}; min16int2 b = {4, i}; min10float3 c = {6.4, 7, i}; min12int d = 9.4; min16uint4x2 e = {14.4, 15, 16, 17, 18, 19, 20, i};
return mul(e, b) + a + c.xyzx + d; } ```
``` ps_5_0 dcl_globalFlags refactoringAllowed | enableMinimumPrecision dcl_constantbuffer CB0[1], immediateIndexed dcl_output o0.xyzw dcl_temps 3 imad r0.xyz {min16u}, l(15, 17, 19, 0) {def32 as min16u}, cb0[0].xxxx {def32 as min16u}, l(56, 64, 72, 0) {def32 as min16u} utof r0.xyz {min16f}, r0.xyzx {min16u} imad r1.x {min16u}, cb0[0].x {def32 as min16u}, cb0[0].x {def32 as min16u}, l(80) {def32 as min16u} utof r0.w {min16f}, r1.x {min16u} mov r1.xyz {min16f}, l(0,1.000000,2.000000,0) {def32 as min16f} itof r1.w {min16f}, cb0[0].x add r0.xyzw {min16f}, r0.xyzw {min16f}, r1.xyzw {min16f} mov r2.y {min2_8f}, l(7.000000) {def32 as min2_8f} itof r2.z {min2_8f}, cb0[0].x add r0.yz {min16f}, r0.yyzy {min16f}, r2.yyzy {min2_8f as min16f} add r0.xyzw {min16f}, r0.xyzw {min16f}, l(15.400000, 9.000000, 9.000000, 15.400000) {def32 as min16f} mov o0.xyzw, r0.xyzw {min16f as def32} ret
```
However, if the graphics driver doesn't have minimum precision support, it ignores the minimum precision indicators and runs at 32-bit precision, which is equivalent as working with regular types.
-- v3: vkd3d-shader/hlsl: Reinterpret minimum precision types as their regular counterparts. tests: Test minimum-precision numeric types.
From: Francisco Casas fcasas@codeweavers.com
--- Makefile.am | 1 + tests/minimum-precision.shader_test | 21 +++++++++++++++++++++ 2 files changed, 22 insertions(+) create mode 100644 tests/minimum-precision.shader_test
diff --git a/Makefile.am b/Makefile.am index 85cd4642..e56f696e 100644 --- a/Makefile.am +++ b/Makefile.am @@ -123,6 +123,7 @@ vkd3d_shader_tests = \ tests/math.shader_test \ tests/matrix-semantics.shader_test \ tests/max.shader_test \ + tests/minimum-precision.shader_test \ tests/multiple-rt.shader_test \ tests/nointerpolation.shader_test \ tests/object-references.shader_test \ diff --git a/tests/minimum-precision.shader_test b/tests/minimum-precision.shader_test new file mode 100644 index 00000000..a9b43d74 --- /dev/null +++ b/tests/minimum-precision.shader_test @@ -0,0 +1,21 @@ +[require] +shader model >= 4.0 + + +[pixel shader todo] +float4 main() : sv_target +{ + min16float4 a = {0, 1, 2, 3}; + min10float2 b = {4, 5}; + min16int3 c = {6.4, 7, 8}; + min12int d = 9.4; + min16uint4x2 e = {14.4, 15, 16, 17, 18, 19, 20, 21}; + vector<min10float, 3> g = {22, 23, 24}; + matrix<min16uint, 3, 2> h = {25.4, 26, 27, 28, 29, 30}; + + return mul(e, b) + a + c.xyzx + d + g.xxyz + h[2].xyxy; +} + +[test] +todo draw quad +todo probe all rgba (197.0, 218.0, 238.0, 257.0)
From: Francisco Casas fcasas@codeweavers.com
Reinterpret min16float, min10float, min16int, min12int, and min16uint as their regular counterparts: float, float, int, int, uint, respectively.
A proper implementation would require adding minimum precision indicators to all the dxbc-tpf instructions that use these types. Consider the output of fxc 10.1 with the following shader:
uniform int i;
float4 main() : sv_target { min16float4 a = {0, 1, 2, i}; min16int2 b = {4, i}; min10float3 c = {6.4, 7, i}; min12int d = 9.4; min16uint4x2 e = {14.4, 15, 16, 17, 18, 19, 20, i};
return mul(e, b) + a + c.xyzx + d; }
However, if the graphics driver doesn't have minimum precision support, it ignores the minimum precision indicators and runs at 32-bit precision, which is equivalent as working with regular types. --- libs/vkd3d-shader/hlsl.c | 66 ++++++++++++++++++++++++++++- libs/vkd3d-shader/hlsl.h | 2 + libs/vkd3d-shader/hlsl.y | 18 +++++++- tests/minimum-precision.shader_test | 6 +-- 4 files changed, 85 insertions(+), 7 deletions(-)
diff --git a/libs/vkd3d-shader/hlsl.c b/libs/vkd3d-shader/hlsl.c index 8aa289ac..28346e9c 100644 --- a/libs/vkd3d-shader/hlsl.c +++ b/libs/vkd3d-shader/hlsl.c @@ -680,6 +680,7 @@ struct hlsl_type *hlsl_type_clone(struct hlsl_ctx *ctx, struct hlsl_type *old, if (!(type->modifiers & HLSL_MODIFIERS_MAJORITY_MASK)) type->modifiers |= default_majority; type->sampler_dim = old->sampler_dim; + type->is_minimum_precision = old->is_minimum_precision; switch (old->type) { case HLSL_CLASS_ARRAY: @@ -2349,7 +2350,7 @@ static int compare_function_rb(const void *key, const struct rb_entry *entry)
static void declare_predefined_types(struct hlsl_ctx *ctx) { - unsigned int x, y, bt, i; + unsigned int x, y, bt, i, v; struct hlsl_type *type;
static const char * const names[] = @@ -2361,7 +2362,11 @@ static void declare_predefined_types(struct hlsl_ctx *ctx) "uint", "bool", }; - char name[10]; + char name[15]; + + static const char *const variants_float[] = {"min10float", "min16float"}; + static const char *const variants_int[] = {"min12int", "min16int"}; + static const char *const variants_uint[] = {"min16uint"};
static const char *const sampler_names[] = { @@ -2421,6 +2426,63 @@ static void declare_predefined_types(struct hlsl_ctx *ctx) } }
+ for (bt = 0; bt <= HLSL_TYPE_LAST_SCALAR; ++bt) + { + unsigned int n_variants = 0; + const char *const *variants; + + switch (bt) + { + case HLSL_TYPE_FLOAT: + variants = variants_float; + n_variants = ARRAY_SIZE(variants_float); + break; + + case HLSL_TYPE_INT: + variants = variants_int; + n_variants = ARRAY_SIZE(variants_int); + break; + + case HLSL_TYPE_UINT: + variants = variants_uint; + n_variants = ARRAY_SIZE(variants_uint); + break; + + default: + break; + } + + for (v = 0; v < n_variants; ++v) + { + for (y = 1; y <= 4; ++y) + { + for (x = 1; x <= 4; ++x) + { + sprintf(name, "%s%ux%u", variants[v], y, x); + type = hlsl_new_type(ctx, name, HLSL_CLASS_MATRIX, bt, x, y); + type->is_minimum_precision = 1; + hlsl_scope_add_type(ctx->globals, type); + + if (y == 1) + { + sprintf(name, "%s%u", variants[v], x); + type = hlsl_new_type(ctx, name, HLSL_CLASS_VECTOR, bt, x, y); + type->is_minimum_precision = 1; + hlsl_scope_add_type(ctx->globals, type); + + if (x == 1) + { + sprintf(name, "%s", variants[v]); + type = hlsl_new_type(ctx, name, HLSL_CLASS_SCALAR, bt, x, y); + type->is_minimum_precision = 1; + hlsl_scope_add_type(ctx->globals, type); + } + } + } + } + } + } + for (bt = 0; bt <= HLSL_SAMPLER_DIM_LAST_SAMPLER; ++bt) { type = hlsl_new_type(ctx, sampler_names[bt], HLSL_CLASS_OBJECT, HLSL_TYPE_SAMPLER, 1, 1); diff --git a/libs/vkd3d-shader/hlsl.h b/libs/vkd3d-shader/hlsl.h index b6a593ca..9664dc7a 100644 --- a/libs/vkd3d-shader/hlsl.h +++ b/libs/vkd3d-shader/hlsl.h @@ -142,6 +142,8 @@ struct hlsl_type
unsigned int reg_size; size_t bytecode_offset; + + uint32_t is_minimum_precision : 1; };
struct hlsl_semantic diff --git a/libs/vkd3d-shader/hlsl.y b/libs/vkd3d-shader/hlsl.y index eedc85bd..a538bafe 100644 --- a/libs/vkd3d-shader/hlsl.y +++ b/libs/vkd3d-shader/hlsl.y @@ -3810,7 +3810,8 @@ type: YYABORT; }
- $$ = hlsl_get_vector_type(ctx, $3->base_type, $5); + $$ = hlsl_type_clone(ctx, hlsl_get_vector_type(ctx, $3->base_type, $5), 0, 0); + $$->is_minimum_precision = $3->is_minimum_precision; } | KW_VECTOR { @@ -3842,7 +3843,8 @@ type: YYABORT; }
- $$ = hlsl_get_matrix_type(ctx, $3->base_type, $7, $5); + $$ = hlsl_type_clone(ctx, hlsl_get_matrix_type(ctx, $3->base_type, $7, $5), 0, 0); + $$->is_minimum_precision = $3->is_minimum_precision; } | KW_MATRIX { @@ -3911,6 +3913,18 @@ type: | TYPE_IDENTIFIER { $$ = hlsl_get_type(ctx->cur_scope, $1, true); + if ($$->is_minimum_precision) + { + if (ctx->profile->major_version < 4) + { + hlsl_error(ctx, &@1, VKD3D_SHADER_ERROR_HLSL_INVALID_TYPE, + "Target profile doesn't support minimum-precision types."); + } + else + { + FIXME("Reinterpreting type %s.\n", $$->name); + } + } vkd3d_free($1); } | KW_STRUCT TYPE_IDENTIFIER diff --git a/tests/minimum-precision.shader_test b/tests/minimum-precision.shader_test index a9b43d74..e5053e3d 100644 --- a/tests/minimum-precision.shader_test +++ b/tests/minimum-precision.shader_test @@ -2,7 +2,7 @@ shader model >= 4.0
-[pixel shader todo] +[pixel shader] float4 main() : sv_target { min16float4 a = {0, 1, 2, 3}; @@ -17,5 +17,5 @@ float4 main() : sv_target }
[test] -todo draw quad -todo probe all rgba (197.0, 218.0, 238.0, 257.0) +draw quad +probe all rgba (197.0, 218.0, 238.0, 257.0)
On Thu Jan 5 15:53:53 2023 +0000, Zebediah Figura wrote:
It'll also need to get preserved when cloning the type.
Whops, at first I thought that the "Reinterpreting type" FIXME wasn't being issued for types like `matrix<min16float, 4, 4>` and `vector<min16float, 4>`, but now I realize it is (well, at least for the scalar type within the declaration).
Issuing this warning is the main purpose of the `is_minimum_precision` flag for now. To also keep it for types declared in this way, it was necessary to use `hlsl_type_clone()`.
This merge request was approved by Giovanni Mascellani.
I'm not necessarily opposed to this, but like Zeb, this does make me a bit uneasy. (And hence the delay in review.) Part of the consideration is that HLSL compiler output may be stored and is potentially long-lived. The main questions I have at this point are the following:
- Concretely, how much time do you estimate implementing this properly would take? - Technically, wouldn't we need to work out casting rules anyway? This would affect what is considered valid HLSL input in the first place, not just what bytecode we end up producing for valid inputs.
On Mon Jan 16 23:48:54 2023 +0000, Henri Verbeet wrote:
I'm not necessarily opposed to this, but like Zeb, this does make me a bit uneasy. (And hence the delay in review.) Part of the consideration is that HLSL compiler output may be stored and is potentially long-lived. The main questions I have at this point are the following:
- Concretely, how much time do you estimate implementing this properly
would take?
- Technically, wouldn't we need to work out casting rules anyway? This
would affect what is considered valid HLSL input in the first place, not just what bytecode we end up producing for valid inputs.
2. My understanding is that, for purposes of type checking, minimum precision types are exchangeable with their regular counterparts.
So, if this assessment is correct, we won't be accepting incorrect HLSL input. We would be missing adding the minimum-precision markers on the instruction registers in the SM4 output though.
The problem I mentioned regarding casting rules is to determine what is the result type of operations between minimum precision types and regular types, or which operations, if any, discard the minimum precision information. As far as I know, the only way of checking this, for now, would be manual inspection.
Another potential problem with this workaround, is that we are also using regular types on the buffer definitions. Although, I don't see the native compiler using different offsets or other sizes for the variables declared in the buffer.
1. We may separate the problem in several steps:
- Attempt to find general rules for the resulting types of operations with minimum-precision types. - Add all the types to `enum hlsl_base_type` and go through **all** the checks for hlsl_type->base_type in the codebase to see if and how they should be included, This includes particularly sensitive functions like `expr_common_base_type()` and `expr_common_shape()`. - We also may want to consider other approaches like keeping the bits of precision on `hlsl_type`, but there will be many parts where `ctx->builtin_types` are being used that will have to be modified. - Analyze the bit patterns of the minimum-precision markers in the output bytecode and replicate them when writing SM4, in particular `sm4_dst_from_node()` and `sm4_src_from_node()`. - Create a series of test that involve both minimum precision types and regular types, and have a quick way of comparing how the minimum precision markers propagate through the code in both the native compiler and ours.
I think it may take at least 4 weeks without counting for reviews and surprises, and the rebases. Although I would want to hear the opinion of somebody else too.
Well, let's give it a chance. This too will need a rebase though.