For temporary registers, SM1-SM3 integer types are internally represented as floating point, so a cast from int to float needs nothing more than a MOV.
By the same token, casts from floats to ints can be implemented as a FLOOR + MOV, where the FLOOR is then lowered by the lower_floor() pass.
For constant integer registers "iN" there is no operation for casting from a floating point register to them. For the address registers "aN" and the loop counter register "aL", vertex shaders have the "mova" operation, but we haven't used these registers in any way yet.
We would probably want to introduce these as synthetic variables allocated in a special register set. In that case we have to remember to use MOVA instead of MOV in the store operations, but they shouldn't be the src or dst of CAST operations.
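As a rough illustration of the cast behaviour described above for temporary registers, the following C sketch models an SM1-SM3 "int" as a float that happens to hold an integral value. All function names are hypothetical, and the "x - frc(x)" form of the floor lowering is an assumption about what lower_floor() emits, not something this series states.

```c
#include <assert.h>

/* Reference floor used only to model the native "frc" instruction.
 * Valid while x fits in a long. */
static float sm1_floor(float x)
{
    float truncated = (float)(long)x;

    return truncated > x ? truncated - 1.0f : truncated;
}

/* Model of the "frc" instruction: fractional part of x, in [0, 1). */
static float sm1_frc(float x)
{
    return x - sm1_floor(x);
}

/* int -> float: the value already lives in a float register, so the
 * cast is a plain MOV. */
float sm1_cast_int_to_float(float int_as_float)
{
    return int_as_float;
}

/* float -> int as described above: FLOOR followed by MOV.
 * ASSUMPTION: FLOOR is lowered to x - frc(x). */
float sm1_cast_float_to_int(float x)
{
    float floored;

    /* Keep the model inside the range where sm1_floor() is valid. */
    assert(x > -1.0e9f && x < 1.0e9f);
    floored = x - sm1_frc(x);
    return floored; /* The MOV. */
}
```

Note that this follows the FLOOR-based description literally, so a negative input such as -3.6 yields -4 in this model.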
Regarding constant integer registers, in some shaders, constants are expected to be received formatted as an integer, such as:

    int m;

    float4 main() : sv_target
    {
        float4 res = {0, 0, 0, 0};

        for (int k = 0; k < m; ++k)
            res += k;
        return res;
    }

which compiles as:

    // Registers:
    //
    //   Name         Reg   Size
    //   ------------ ----- ----
    //   m            i0       1
    //

    ps_3_0
    def c0, 0, 1, 0, 0
    mov r0, c0.x
    mov r1.x, c0.x
    rep i0
      add r0, r0, r1.x
      add r1.x, r1.x, c0.y
    endrep
    mov oC0, r0

but this only happens if the integer constant is used directly in an instruction that needs it and, as noted above, there is no instruction that converts these registers to a float representation.
Notice how a more complex shader, that performs operations with this integer variable "m":

    int m;

    float4 main() : sv_target
    {
        float4 res = {0, 0, 0, 0};

        for (int k = 0; k < m * m; ++k)
            res += k;
        return res;
    }

gives the following output:

    // Registers:
    //
    //   Name         Reg   Size
    //   ------------ ----- ----
    //   m            c0       1
    //

    ps_3_0
    def c1, 0, 0, 1, 0
    defi i0, 255, 0, 0, 0
    mul r0.x, c0.x, c0.x
    mov r1, c1.y
    mov r0.y, c1.y
    rep i0
      mov r0.z, r0.x
      break_ge r0.y, r0.z
      add r1, r0.y, r1
      add r0.y, r0.y, c1.z
    endrep
    mov oC0, r1

This means that the uniform "m" is simply stored as a floating point value in "c0", the constant integer register "i0" is set to 255 using "defi" (hoping that is a high enough iteration count), and a "break_ge" that compares the counter against the product computed from c0 is used to break out of the loop.
We could potentially use this approach to implement loops for SM3 without requiring the variables to be passed in constant integer registers.
According to the D3D documentation, for SM1-SM3 constant integer registers are only used by the 'loop' and 'rep' instructions.
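
To make the control flow of the second listing easier to follow, here is a small C model of it. It is only a sketch: the function name and the SM3_REP_COUNT macro are made up for illustration, and only one component of the result is tracked.

```c
#include <assert.h>

/* Iteration count baked into i0 by "defi i0, 255, 0, 0, 0". */
#define SM3_REP_COUNT 255

/* Hypothetical model of the compiled shader, one component only. */
float sum_below_m_squared(float m)
{
    float bound = m * m;   /* mul r0.x, c0.x, c0.x */
    float res = 0.0f;      /* mov r1, c1.y */
    float k = 0.0f;        /* mov r0.y, c1.y */
    int rep;

    /* "m" is an int uniform, so its float encoding holds an integer. */
    assert(m == (float)(int)m);

    for (rep = 0; rep < SM3_REP_COUNT; ++rep)   /* rep i0 */
    {
        if (k >= bound)    /* break_ge r0.y, r0.z */
            break;
        res += k;          /* add r1, r0.y, r1 */
        k += 1.0f;         /* add r0.y, r0.y, c1.z */
    }
    return res;
}
```

With m = 3 the loop exits through the break after nine iterations. With m = 20 the bound is 400, so the loop is cut off by the fixed count of 255 instead, which is the "hoping it is a high enough value" caveat in practice.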
--
v2: vkd3d-shader/hlsl: Lower casts to int for SM1.
    tests: Add simple test for implicit cast to int.
    vkd3d-shader/d3dbc: Implement casts from ints to floats as a MOV.
    tests: Remove [require] directives for tests that use int and bool uniforms.
    tests/shader-runner: Pass bool uniforms as IEEE 754 floats to SM1 profiles.
    tests/shader-runner: Pass int uniforms as IEEE 754 floats to SM1 profiles.
    tests/shader-runner: Introduce "only" qualifier.
    tests: Don't ignore SM1 on a non-const-indexing.shader_test test.
From: Francisco Casas fcasas@codeweavers.com
The previous [require] block makes us skip the test for SM4.
---
 tests/hlsl/non-const-indexing.shader_test | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/tests/hlsl/non-const-indexing.shader_test b/tests/hlsl/non-const-indexing.shader_test
index b0452148e..ba60367fe 100644
--- a/tests/hlsl/non-const-indexing.shader_test
+++ b/tests/hlsl/non-const-indexing.shader_test
@@ -216,7 +216,11 @@ draw quad
 probe all rgba (1, 5, 3, 4)


-[pixel shader]
+[require]
+% reset requirements
+
+
+[pixel shader todo(sm<4)]
 uniform float4 f[4];
 uniform uint4 u;
 uniform uint4 v;
From: Francisco Casas fcasas@codeweavers.com
When the "only" qualifier is added to a directive, the directive is skipped if runner->minimum_shader_model is not included in the given range.

This can be used on the "probe" directives for tests that have different expected results on different shader models, without having to resort to [require] blocks.
---
 tests/hlsl/arithmetic-int.shader_test         |  8 ++---
 tests/hlsl/duplicate-modifiers.shader_test    |  7 ++--
 .../initializer-implicit-array.shader_test    |  9 ++---
 tests/hlsl/initializer-numeric.shader_test    |  6 ++--
 tests/hlsl/non-const-indexing.shader_test     | 36 +++----------------
 tests/shader_runner.c                         | 36 +++++++++++++++++--
 6 files changed, 47 insertions(+), 55 deletions(-)
diff --git a/tests/hlsl/arithmetic-int.shader_test b/tests/hlsl/arithmetic-int.shader_test index 46b641811..2e0dc5661 100644 --- a/tests/hlsl/arithmetic-int.shader_test +++ b/tests/hlsl/arithmetic-int.shader_test @@ -104,10 +104,9 @@ float4 main() : SV_TARGET draw quad probe all rgba (0.0, 0.0, 0.0, 0.0)
+ [require] shader model >= 4.0 -% dxcompiler performs this calculation on unsigned values and emits zero. -shader model < 6.0
[pixel shader] float4 main() : SV_TARGET @@ -120,10 +119,9 @@ float4 main() : SV_TARGET
[test] draw quad -probe all rgba (-2147483648.0, -2147483648.0, -2147483648.0, -2147483648.0) +only(sm<6) probe all rgba (-2147483648.0, -2147483648.0, -2147483648.0, -2147483648.0) +only(sm>=6) probe all rgba (0.0, 0.0, 0.0, 0.0)
-[require] -shader model >= 4.0
[pixel shader] float4 main() : sv_target diff --git a/tests/hlsl/duplicate-modifiers.shader_test b/tests/hlsl/duplicate-modifiers.shader_test index bf1d9c1b8..812c0ddd5 100644 --- a/tests/hlsl/duplicate-modifiers.shader_test +++ b/tests/hlsl/duplicate-modifiers.shader_test @@ -1,7 +1,3 @@ -% Returns (0.1, 0.3, 0.2, 0.4) with dxcompiler -[require] -shader model < 6.0 - [pixel shader] typedef const precise row_major float2x2 mat_t; float4 main() : sv_target @@ -12,4 +8,5 @@ float4 main() : sv_target
[test] draw quad -probe all rgba (0.1, 0.2, 0.3, 0.4) +only(sm<6) probe all rgba (0.1, 0.2, 0.3, 0.4) +only(sm>=6) probe all rgba (0.1, 0.3, 0.2, 0.4) diff --git a/tests/hlsl/initializer-implicit-array.shader_test b/tests/hlsl/initializer-implicit-array.shader_test index 25cd15644..f220fe607 100644 --- a/tests/hlsl/initializer-implicit-array.shader_test +++ b/tests/hlsl/initializer-implicit-array.shader_test @@ -11,10 +11,6 @@ draw quad probe all rgba (50, 60, 70, 80)
-% dxcompiler emits a nop shader which returns immediately. -[require] -shader model < 6.0 - [pixel shader] float4 main() : sv_target { @@ -26,10 +22,9 @@ float4 main() : sv_target
[test] draw quad -probe all rgba (5.0, 6.0, 7.0, 8.0) +% dxcompiler emits a nop shader which returns immediately. +only(sm<6) probe all rgba (5.0, 6.0, 7.0, 8.0)
-[require] -% reset requirements
[pixel shader] float4 main() : sv_target diff --git a/tests/hlsl/initializer-numeric.shader_test b/tests/hlsl/initializer-numeric.shader_test index 617b67405..ab112a546 100644 --- a/tests/hlsl/initializer-numeric.shader_test +++ b/tests/hlsl/initializer-numeric.shader_test @@ -60,9 +60,6 @@ draw quad probe all rgba (3.0, 250.0, 16.0, 4.2949673e+009) 4
-[require] -shader model < 6.0 - [pixel shader] float4 main() : sv_target { @@ -73,4 +70,5 @@ float4 main() : sv_target
[test] draw quad -probe all rgba (-1294967296.0, 3000000000.0, 0.0, 0.0) 4 +only(sm<6) probe all rgba (-1294967296.0, 3000000000.0, 0.0, 0.0) 4 +only(sm>=6) probe all rgba (3000000000.0, 3000000000.0, 0.0, 0.0) 4 diff --git a/tests/hlsl/non-const-indexing.shader_test b/tests/hlsl/non-const-indexing.shader_test index ba60367fe..e788caeeb 100644 --- a/tests/hlsl/non-const-indexing.shader_test +++ b/tests/hlsl/non-const-indexing.shader_test @@ -236,23 +236,6 @@ float4 main() : sv_target }
% FXC is incapable of compiling this correctly, but results differ for SM1-3 vs SM4-5. -[require] -shader model < 4.0 - -[test] -uniform 0 float 1.0 -uniform 4 float 2.0 -uniform 8 float 3.0 -uniform 12 float 4.0 -uniform 16 uint4 3 1 0 2 -uniform 20 uint4 0 3 1 2 -todo draw quad -todo(sm<4) probe all rgba (1.0, 1.0, 1.0, 1.0) - -[require] -shader model >= 4.0 -shader model < 6.0 - [test] uniform 0 float 1.0 uniform 4 float 2.0 @@ -260,18 +243,7 @@ uniform 8 float 3.0 uniform 12 float 4.0 uniform 16 uint4 3 1 0 2 uniform 20 uint4 0 3 1 2 -draw quad -todo probe all rgba (4.0, 4.0, 4.0, 4.0) - -[require] -shader model >= 6.0 - -[test] -uniform 0 float 1.0 -uniform 4 float 2.0 -uniform 8 float 3.0 -uniform 12 float 4.0 -uniform 16 uint4 3 1 0 2 -uniform 20 uint4 0 3 1 2 -draw quad -probe all rgba (4.0, 3.0, 2.0, 1.0) +todo(sm<4) draw quad +only(sm<4) todo probe all rgba (1.0, 1.0, 1.0, 1.0) +only(sm>=4 & sm<6) todo probe all rgba (4.0, 4.0, 4.0, 4.0) +only(sm>=6) probe all rgba (4.0, 3.0, 2.0, 1.0) diff --git a/tests/shader_runner.c b/tests/shader_runner.c index 6c5c1dba3..23a6ddd3e 100644 --- a/tests/shader_runner.c +++ b/tests/shader_runner.c @@ -642,14 +642,46 @@ static void read_uint64_t2(const char **line, struct u64vec2 *v)
 static void parse_test_directive(struct shader_runner *runner, const char *line)
 {
+    bool skip_directive = false;
+    const char *line_ini;
+    bool match = true;
     char *rest;
     int ret;

     runner->is_todo = false;

-    if (match_string_with_args(line, "todo", &line, runner->minimum_shader_model))
+    while (match)
     {
-        runner->is_todo = true;
+        match = false;
+
+        if (match_string_with_args(line, "todo", &line, runner->minimum_shader_model))
+        {
+            runner->is_todo = true;
+            match = true;
+        }
+
+        line_ini = line;
+        if (match_string_with_args(line, "only", &line, runner->minimum_shader_model))
+        {
+            match = true;
+        }
+        else if (line != line_ini)
+        {
+            /* Matched "only" but for other shader models. */
+            skip_directive = true;
+            match = true;
+        }
+    }
+
+    if (skip_directive)
+    {
+        const char *new_line;
+
+        if ((new_line = strchr(line, '\n')))
+            line = new_line + 1;
+        else
+            line += strlen(line);
+        return;
     }

     if (match_string(line, "dispatch", &line))
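
For readers unfamiliar with the qualifier syntax, the following self-contained C sketch shows one way a range such as "sm>=4 & sm<6" could be evaluated against a shader model. It is not the shader runner's implementation: only_applies() and eval_comparison() are hypothetical names, the shader model is reduced to its major number, and only the '&' operator is handled (the '|' form that appears in some tests is left out).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: evaluates one comparison such as "sm<4" or "sm>=6".
 * Returns 1 or 0 for the result, or -1 on a parse error. On success *s is
 * advanced past the comparison and any trailing spaces. */
static int eval_comparison(const char **s, int sm_major)
{
    const char *p = *s;
    char *end;
    long value;
    bool ge;

    while (*p == ' ')
        ++p;
    if (strncmp(p, "sm", 2))
        return -1;
    p += 2;

    if (!strncmp(p, ">=", 2))
    {
        ge = true;
        p += 2;
    }
    else if (*p == '<')
    {
        ge = false;
        ++p;
    }
    else
    {
        return -1;
    }

    value = strtol(p, &end, 10);
    if (end == p)
        return -1;
    p = end;
    while (*p == ' ')
        ++p;

    *s = p;
    return ge ? sm_major >= value : sm_major < value;
}

/* Hypothetical: true if every '&'-joined comparison in args holds. */
bool only_applies(const char *args, int sm_major)
{
    bool result = true;
    int r;

    assert(args != NULL);
    for (;;)
    {
        if ((r = eval_comparison(&args, sm_major)) < 0)
            return false;
        result = result && r;
        if (*args != '&')
            break;
        ++args;
    }
    return result && *args == '\0';
}
```

A directive prefixed with only(sm>=4 & sm<6) would then be executed for shader model 4 or 5 and skipped otherwise, matching the behaviour described in the commit message.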
From: Francisco Casas fcasas@codeweavers.com
NOTE: regarding non-const-indexing.shader_test,
We were probably getting (1.0, 1.0, 1.0, 1.0) on native because we were passing:

    u: 0x00000003 0x00000001 0x00000000 0x00000002
    v: 0x00000000 0x00000003 0x00000001 0x00000002

to the backend. But Direct3D expects ints in float format. Interpreted as floats, these bit patterns are really small values close or equal to zero, so when they are used for indexing it is as if they were 0.

Once we start passing the ints correctly formatted, the result becomes (4.0, 3.0, 2.0, 1.0), same as SM6.
---
 tests/hlsl/non-const-indexing.shader_test |  2 +-
 tests/shader_runner.c                     | 26 +++++++++++++++++++++++
 2 files changed, 27 insertions(+), 1 deletion(-)
diff --git a/tests/hlsl/non-const-indexing.shader_test b/tests/hlsl/non-const-indexing.shader_test index e788caeeb..85a5d140c 100644 --- a/tests/hlsl/non-const-indexing.shader_test +++ b/tests/hlsl/non-const-indexing.shader_test @@ -244,6 +244,6 @@ uniform 12 float 4.0 uniform 16 uint4 3 1 0 2 uniform 20 uint4 0 3 1 2 todo(sm<4) draw quad -only(sm<4) todo probe all rgba (1.0, 1.0, 1.0, 1.0) +only(sm<4) todo probe all rgba (4.0, 3.0, 2.0, 1.0) only(sm>=4 & sm<6) todo probe all rgba (4.0, 4.0, 4.0, 4.0) only(sm>=6) probe all rgba (4.0, 3.0, 2.0, 1.0) diff --git a/tests/shader_runner.c b/tests/shader_runner.c index 23a6ddd3e..8cffe20f4 100644 --- a/tests/shader_runner.c +++ b/tests/shader_runner.c @@ -888,6 +888,7 @@ static void parse_test_directive(struct shader_runner *runner, const char *line) } else if (match_string(line, "uniform", &line)) { + bool is_d3dbc = runner->minimum_shader_model < SHADER_MODEL_4_0; unsigned int offset;
if (!sscanf(line, "%u", &offset)) @@ -921,30 +922,55 @@ static void parse_test_directive(struct shader_runner *runner, const char *line) else if (match_string(line, "int4", &line)) { struct ivec4 v; + struct vec4 f;
read_int4(&line, &v); + set_uniforms(runner, offset, 4, &v); + if (is_d3dbc) + { + f = (struct vec4){v.x, v.y, v.z, v.w}; + set_uniforms(runner, offset, 4, &f); + } } else if (match_string(line, "uint4", &line)) { struct uvec4 v; + struct vec4 f;
read_uint4(&line, &v); set_uniforms(runner, offset, 4, &v); + if (is_d3dbc) + { + f = (struct vec4){v.x, v.y, v.z, v.w}; + set_uniforms(runner, offset, 4, &f); + } } else if (match_string(line, "int", &line)) { + float f; int i;
read_int(&line, &i); set_uniforms(runner, offset, 1, &i); + if (is_d3dbc) + { + f = i; + set_uniforms(runner, offset, 1, &f); + } } else if (match_string(line, "uint", &line)) { unsigned int u; + float f;
read_uint(&line, &u); set_uniforms(runner, offset, 1, &u); + if (is_d3dbc) + { + f = u; + set_uniforms(runner, offset, 1, &f); + } } else if (match_string(line, "int64_t2", &line)) {
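
The difference between passing the raw integer bits and passing a converted float, as discussed in the commit message above, can be checked with a few lines of C. This is only an illustrative sketch; both function names are hypothetical and are not part of the shader runner.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* What a float register would hold if the raw integer bits were
 * uploaded unchanged. */
float reinterpret_bits_as_float(uint32_t bits)
{
    float f;

    assert(sizeof(f) == sizeof(bits));
    memcpy(&f, &bits, sizeof(f));
    return f;
}

/* What the patch uploads for SM1-SM3 instead: the numeric value
 * converted to a float. */
float convert_uint_for_d3dbc(uint32_t u)
{
    return (float)u;
}
```

Reinterpreting 0x00000003 gives a denormal on the order of 1e-45, which is effectively zero when used as an index, whereas the converted value is exactly 3.0f.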
From: Francisco Casas fcasas@codeweavers.com
d3dbc expects bools to be passed as either 1.0f (true) or 0.0f (false).
---
 tests/hlsl/cast-to-float.shader_test |  2 +-
 tests/hlsl/cast-to-half.shader_test  |  2 +-
 tests/hlsl/cast-to-int.shader_test   |  2 +-
 tests/hlsl/cast-to-uint.shader_test  |  2 +-
 tests/shader_runner.c                | 38 ++++++++++++++++++++++++++++
 5 files changed, 42 insertions(+), 4 deletions(-)
diff --git a/tests/hlsl/cast-to-float.shader_test b/tests/hlsl/cast-to-float.shader_test index caaf98c02..010c722a3 100644 --- a/tests/hlsl/cast-to-float.shader_test +++ b/tests/hlsl/cast-to-float.shader_test @@ -15,7 +15,7 @@ float4 main() : sv_target [test] uniform 0 int -1 uniform 1 uint 3 -uniform 2 int -2 +uniform 2 bool true uniform 3 float 0.5 draw quad probe all rgba (0.5, 0.5, 0.5, 0.5) diff --git a/tests/hlsl/cast-to-half.shader_test b/tests/hlsl/cast-to-half.shader_test index b8feb6760..44f610502 100644 --- a/tests/hlsl/cast-to-half.shader_test +++ b/tests/hlsl/cast-to-half.shader_test @@ -15,7 +15,7 @@ float4 main() : sv_target [test] uniform 0 int -1 uniform 1 uint 3 -uniform 2 int -2 +uniform 2 bool true uniform 3 float 0.5 draw quad probe all rgba (0.5, 0.5, 0.5, 0.5) diff --git a/tests/hlsl/cast-to-int.shader_test b/tests/hlsl/cast-to-int.shader_test index 3e850fb5b..7d362128b 100644 --- a/tests/hlsl/cast-to-int.shader_test +++ b/tests/hlsl/cast-to-int.shader_test @@ -21,7 +21,7 @@ float4 main() : sv_target [test] uniform 0 float 2.6 uniform 1 int -2 -uniform 2 int -2 +uniform 2 bool true uniform 3 float -3.6 draw quad probe all rgba (0.5, 0.5, 0.5, 0.5) diff --git a/tests/hlsl/cast-to-uint.shader_test b/tests/hlsl/cast-to-uint.shader_test index 07479984a..e7ad30677 100644 --- a/tests/hlsl/cast-to-uint.shader_test +++ b/tests/hlsl/cast-to-uint.shader_test @@ -21,7 +21,7 @@ float4 main() : sv_target [test] uniform 0 float 2.6 uniform 1 int 2 -uniform 2 int -2 +uniform 2 bool true uniform 3 float -3.6 draw quad probe all rgba (0.5, 0.5, 0.5, 0.5) diff --git a/tests/shader_runner.c b/tests/shader_runner.c index 8cffe20f4..873f10447 100644 --- a/tests/shader_runner.c +++ b/tests/shader_runner.c @@ -546,6 +546,20 @@ static void set_uniforms(struct shader_runner *runner, size_t offset, size_t cou memcpy(runner->uniforms + offset, uniforms, count * sizeof(*runner->uniforms)); }
+static void read_bool(const char **line, bool *b)
+{
+    const char *rest;
+
+    if (match_string(*line, "true", &rest))
+        *b = true;
+    else if (match_string(*line, "false", &rest))
+        *b = false;
+    else
+        fatal_error("Malformed bool constant '%s'.\n", *line);
+
+    *line = rest;
+}
+
 static void read_int(const char **line, int *i)
 {
     char *rest;
@@ -972,6 +986,30 @@ static void parse_test_directive(struct shader_runner *runner, const char *line)
                 set_uniforms(runner, offset, 1, &f);
             }
         }
+        else if (match_string(line, "bool4", &line))
+        {
+            unsigned int k;
+            float f;
+            bool b;
+
+            for (k = 0; k < 4; ++k)
+            {
+                read_bool(&line, &b);
+                /* SM1-SM3 expects true to be 1.0f, while SM4-SM5 allows any non-zero value. */
+                f = b;
+                set_uniforms(runner, offset + k, 1, &f);
+            }
+        }
+        else if (match_string(line, "bool", &line))
+        {
+            float f;
+            bool b;
+
+            read_bool(&line, &b);
+            /* SM1-SM3 expects true to be 1.0f, while SM4-SM5 allows any non-zero value. */
+            f = b;
+            set_uniforms(runner, offset, 1, &f);
+        }
         else if (match_string(line, "int64_t2", &line))
         {
             struct i64vec2 v;
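
The bool handling above boils down to mapping the tokens "true" and "false" to 1.0f and 0.0f. A standalone sketch of that mapping, using a hypothetical helper that returns -1.0f for malformed input instead of aborting the way the runner's fatal_error() does, could look like this:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: float encoding of a bool uniform token.
 * Returns 1.0f for "true", 0.0f for "false", and -1.0f for anything
 * else (the real runner aborts on malformed input instead). */
float bool_token_to_float(const char *token)
{
    assert(token != NULL);

    if (!strcmp(token, "true"))
        return 1.0f;
    if (!strcmp(token, "false"))
        return 0.0f;
    return -1.0f;
}
```

Using exactly 1.0f for true matters on SM1-SM3, as the comment in the patch notes. SM4-SM5 would accept any non-zero value.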
From: Francisco Casas fcasas@codeweavers.com
These tests should actually compile and run, which is possible now that we are passing the int and uint uniforms in the expected IEEE 754 float format for SM1 shaders.
Naturally, adding some todo(sm<4) qualifiers is required.
---
 tests/hlsl/any.shader_test           | 22 ++++----
 tests/hlsl/bool-cast.shader_test     | 19 ++++---
 tests/hlsl/cast-to-float.shader_test |  7 ++-
 tests/hlsl/cast-to-half.shader_test  |  7 ++-
 tests/hlsl/cast-to-int.shader_test   |  9 ++-
 tests/hlsl/cast-to-uint.shader_test  |  8 ++-
 tests/hlsl/ceil.shader_test          |  7 +--
 tests/hlsl/floor.shader_test         |  6 +-
 tests/hlsl/function-cast.shader_test |  2 -
 tests/hlsl/ldexp.shader_test         |  6 +-
 tests/hlsl/lerp.shader_test          |  6 +-
 tests/hlsl/sign.shader_test          | 17 +++---
 tests/hlsl/switch.shader_test        | 84 +++++++++++++++-------------
 tests/hlsl/trunc.shader_test         |  6 +-
 14 files changed, 109 insertions(+), 97 deletions(-)
diff --git a/tests/hlsl/any.shader_test b/tests/hlsl/any.shader_test index afaf81fac..f2d6533c3 100644 --- a/tests/hlsl/any.shader_test +++ b/tests/hlsl/any.shader_test @@ -48,10 +48,8 @@ uniform 0 float4 -1.0 0.0 0.0 0.0 todo(sm<4) draw quad probe all rgba (1.0, 1.0, 1.0, 1.0)
-[require] -shader model >= 4.0
-[pixel shader] +[pixel shader todo(sm<4)] uniform uint4 b;
float4 main() : sv_target @@ -61,25 +59,25 @@ float4 main() : sv_target
[test] uniform 0 uint4 1 1 1 1 -draw quad +todo(sm<4) draw quad probe all rgba (1.0, 1.0, 1.0, 1.0) uniform 0 uint4 1 0 0 0 -draw quad +todo(sm<4) draw quad probe all rgba (1.0, 1.0, 1.0, 1.0) uniform 0 uint4 0 1 0 0 -draw quad +todo(sm<4) draw quad probe all rgba (1.0, 1.0, 1.0, 1.0) uniform 0 uint4 0 0 1 0 -draw quad +todo(sm<4) draw quad probe all rgba (1.0, 1.0, 1.0, 1.0) uniform 0 uint4 0 0 0 1 -draw quad +todo(sm<4) draw quad probe all rgba (1.0, 1.0, 1.0, 1.0) uniform 0 uint4 0 0 0 0 -draw quad +todo(sm<4) draw quad probe all rgba (0.0, 0.0, 0.0, 0.0)
-[pixel shader] +[pixel shader todo(sm<4)] uniform uint b;
float4 main() : sv_target @@ -89,8 +87,8 @@ float4 main() : sv_target
[test] uniform 0 uint4 1 0 0 0 -draw quad +todo(sm<4) draw quad probe all rgba (1.0, 1.0, 1.0, 1.0) uniform 0 uint4 0 0 0 0 -draw quad +todo(sm<4) draw quad probe all rgba (0.0, 0.0, 0.0, 0.0) diff --git a/tests/hlsl/bool-cast.shader_test b/tests/hlsl/bool-cast.shader_test index 09ca12e2b..9b2e96df6 100644 --- a/tests/hlsl/bool-cast.shader_test +++ b/tests/hlsl/bool-cast.shader_test @@ -14,11 +14,7 @@ draw quad probe all rgba (0.0, 0.0, 1.0, 1.0)
-[require] -shader model >= 4.0 - - -[pixel shader] +[pixel shader todo(sm<4)] uniform float4 x; uniform int4 y;
@@ -30,11 +26,11 @@ float4 main() : SV_TARGET [test] uniform 0 float4 0.0 0.0 2.0 4.0 uniform 4 int4 0 1 0 10 -draw quad +todo(sm<4) draw quad probe all rgba (0.0, 10.0, 1.0, 11.0)
-[pixel shader] +[pixel shader todo(sm<4)] uniform bool4 b;
float4 main() : sv_target @@ -43,6 +39,11 @@ float4 main() : sv_target }
[test] +uniform 0 bool4 true false true false +todo(sm<4) draw quad +probe all rgba (2.0, 0.0, 2.0, 0.0) + +% true cannot be passed as an arbitrary value in SM1. uniform 0 uint4 0x00000001 0x00000002 0x80000000 0x00000000 -draw quad -probe all rgba (2.0, 2.0, 2.0, 0.0) +only(sm>=4) draw quad +only(sm>=4) probe all rgba (2.0, 2.0, 2.0, 0.0) diff --git a/tests/hlsl/cast-to-float.shader_test b/tests/hlsl/cast-to-float.shader_test index 010c722a3..6f1d9d2aa 100644 --- a/tests/hlsl/cast-to-float.shader_test +++ b/tests/hlsl/cast-to-float.shader_test @@ -1,4 +1,5 @@ [require] +% The following test doesn't work on SM1 because, on it, each uniform has the whole register. shader model >= 4.0
[pixel shader] @@ -20,8 +21,12 @@ uniform 3 float 0.5 draw quad probe all rgba (0.5, 0.5, 0.5, 0.5)
-[pixel shader]
+[require] +% reset requirements + + +[pixel shader] float4 main() : sv_target { int i = -1; diff --git a/tests/hlsl/cast-to-half.shader_test b/tests/hlsl/cast-to-half.shader_test index 44f610502..76e9c680a 100644 --- a/tests/hlsl/cast-to-half.shader_test +++ b/tests/hlsl/cast-to-half.shader_test @@ -1,4 +1,5 @@ [require] +% The following test doesn't work on SM1 because, on it, each uniform has the whole register. shader model >= 4.0
[pixel shader] @@ -20,8 +21,12 @@ uniform 3 float 0.5 draw quad probe all rgba (0.5, 0.5, 0.5, 0.5)
-[pixel shader]
+[require] +% reset requirements + + +[pixel shader] float4 main() : sv_target { int i = -1; diff --git a/tests/hlsl/cast-to-int.shader_test b/tests/hlsl/cast-to-int.shader_test index 7d362128b..302919b22 100644 --- a/tests/hlsl/cast-to-int.shader_test +++ b/tests/hlsl/cast-to-int.shader_test @@ -1,4 +1,5 @@ [require] +% The following test doesn't work on SM1 because, on it, each uniform has the whole register. shader model >= 4.0
[pixel shader] @@ -26,8 +27,11 @@ uniform 3 float -3.6 draw quad probe all rgba (0.5, 0.5, 0.5, 0.5)
-[pixel shader]
+[require] +% reset requirements + +[pixel shader] float4 main() : sv_target { float f = 2.6; @@ -45,4 +49,5 @@ float4 main() : sv_target
[test] draw quad -probe all rgba (0.5, 0.5, 0.5, 0.5) +only(sm<4) todo probe all rgba (0.5, 4.2949673e+009, 0.5, 0.5) +only(sm>=4) probe all rgba (0.5, 0.5, 0.5, 0.5) diff --git a/tests/hlsl/cast-to-uint.shader_test b/tests/hlsl/cast-to-uint.shader_test index e7ad30677..66752b4d9 100644 --- a/tests/hlsl/cast-to-uint.shader_test +++ b/tests/hlsl/cast-to-uint.shader_test @@ -1,6 +1,8 @@ +% On SM1, uints can only be used with known-positive values. [require] shader model >= 4.0
+ [pixel shader] uniform float f; uniform int i; @@ -26,8 +28,12 @@ uniform 3 float -3.6 draw quad probe all rgba (0.5, 0.5, 0.5, 0.5)
-[pixel shader]
+[require] +% reset requirements + + +[pixel shader] float4 main() : sv_target { float f = 2.6; diff --git a/tests/hlsl/ceil.shader_test b/tests/hlsl/ceil.shader_test index 46414a92b..0082c8f4e 100644 --- a/tests/hlsl/ceil.shader_test +++ b/tests/hlsl/ceil.shader_test @@ -37,10 +37,7 @@ uniform 0 float4 -0.5 6.5 7.5 3.4 todo(sm<4) draw quad probe all rgba (7.0, 8.0, 0.0, 4.0) 4
-[require] -shader model >= 4.0 - -[pixel shader] +[pixel shader todo(sm<4)] uniform int4 u;
float4 main() : sv_target @@ -53,5 +50,5 @@ float4 main() : sv_target
[test] uniform 0 int4 -1 6 7 3 -draw quad +todo(sm<4) draw quad probe all rgba (6.0, 7.0, -1.0, 3.0) 4 diff --git a/tests/hlsl/floor.shader_test b/tests/hlsl/floor.shader_test index 89e1f12ef..dc9e31f79 100644 --- a/tests/hlsl/floor.shader_test +++ b/tests/hlsl/floor.shader_test @@ -37,10 +37,8 @@ uniform 0 float4 -0.5 6.5 7.5 3.4 todo(sm<4) draw quad probe all rgba (6.0, 7.0, -1.0, 3.0) 4
-[require] -shader model >= 4.0
-[pixel shader] +[pixel shader todo(sm<4)] uniform int4 u;
float4 main() : sv_target @@ -53,5 +51,5 @@ float4 main() : sv_target
[test] uniform 0 int4 -1 6 7 3 -draw quad +todo(sm<4) draw quad probe all rgba (6.0, 7.0, -1.0, 3.0) 4 diff --git a/tests/hlsl/function-cast.shader_test b/tests/hlsl/function-cast.shader_test index c92289863..620e27d69 100644 --- a/tests/hlsl/function-cast.shader_test +++ b/tests/hlsl/function-cast.shader_test @@ -69,8 +69,6 @@ uniform 0 float4 -1.9 -1.0 2.9 4.0 todo draw quad probe all rgba (-1.0, -1.0, 2.0, 4.0)
-[require] -shader model >= 4.0
[pixel shader todo] uniform int4 i; diff --git a/tests/hlsl/ldexp.shader_test b/tests/hlsl/ldexp.shader_test index f8ad40d8e..3becf5f60 100644 --- a/tests/hlsl/ldexp.shader_test +++ b/tests/hlsl/ldexp.shader_test @@ -13,10 +13,8 @@ uniform 4 float4 0.0 -10.0 10.0 100.0 draw quad probe all rgba (2.0, 0.00292968750, 4096.0, 6.33825300e+030) 2
-[require] -shader model >= 4.0
-[pixel shader] +[pixel shader todo(sm<4)] uniform int4 x; uniform int4 y;
@@ -28,7 +26,7 @@ float4 main() : SV_TARGET [test] uniform 0 int4 2 3 4 5 uniform 4 int4 0 -10 10 100 -draw quad +todo(sm<4) draw quad probe all rgba (2.0, 0.00292968750, 4096.0, 6.33825300e+030) 2
diff --git a/tests/hlsl/lerp.shader_test b/tests/hlsl/lerp.shader_test index 15e90cef9..e3dbc1e89 100644 --- a/tests/hlsl/lerp.shader_test +++ b/tests/hlsl/lerp.shader_test @@ -15,10 +15,8 @@ uniform 8 float4 0.0 1.0 -1.0 0.75 draw quad probe all rgba (2.0, -10.0, -2.0, 76.25)
-[require] -shader model >= 4.0
-[pixel shader] +[pixel shader todo(sm<4)] uniform int4 x; uniform int4 y; uniform int4 s; @@ -32,7 +30,7 @@ float4 main() : SV_TARGET uniform 0 int4 2 3 4 0 uniform 4 int4 0 -10 10 1000000 uniform 8 int4 0 1 -1 1000000 -draw quad +todo(sm<4) draw quad probe all rgba (2.0, -10.0, -2.0, 1e12)
diff --git a/tests/hlsl/sign.shader_test b/tests/hlsl/sign.shader_test index 6ec5a571d..b72ec27c8 100644 --- a/tests/hlsl/sign.shader_test +++ b/tests/hlsl/sign.shader_test @@ -17,6 +17,7 @@ uniform 0 float4 0.0 0.0 0.0 0.0 todo(sm<4) draw quad probe all rgba (0.0, 0.0, 0.0, 0.0)
+ [pixel shader todo(sm<4)] uniform float4 f;
@@ -30,6 +31,7 @@ uniform 0 float4 1.0 2.0 3.0 4.0 todo(sm<4) draw quad probe all rgba (1.0, 1.0, 1.0, 1.0)
+ [pixel shader todo(sm<4)] uniform float2x2 f;
@@ -44,9 +46,6 @@ uniform 4 float4 3.0 4.0 0.0 0.0 todo(sm<4) draw quad probe all rgba (1.0, 1.0, 1.0, 1.0)
-[require] -% SM1-3 doesn't support integral types -shader model >= 4.0
[pixel shader todo(sm<4)] uniform int f; @@ -58,15 +57,16 @@ float4 main() : sv_target
[test] uniform 0 int4 1 0 0 0 -draw quad +todo(sm<4) draw quad probe all rgba (1, 1, 1, 1) uniform 0 int4 -1 0 0 0 -draw quad +todo(sm<4) draw quad probe all rgba (-1, -1, -1, -1) uniform 0 int4 0 0 0 0 -draw quad +todo(sm<4) draw quad probe all rgba (0, 0, 0, 0)
+ [pixel shader todo(sm<4)] uniform int4 f;
@@ -77,9 +77,10 @@ float4 main() : sv_target
[test] uniform 0 int4 1 2 3 4 -draw quad +todo(sm<4) draw quad probe all rgba (1, 1, 1, 1)
+ [pixel shader todo(sm<4)] uniform int2x2 f;
@@ -91,5 +92,5 @@ float4 main() : sv_target [test] uniform 0 int4 1 2 0 0 uniform 4 int4 3 4 0 0 -draw quad +todo(sm<4) draw quad probe all rgba (1, 1, 1, 1) diff --git a/tests/hlsl/switch.shader_test b/tests/hlsl/switch.shader_test index 01624f97c..7b988cd7a 100644 --- a/tests/hlsl/switch.shader_test +++ b/tests/hlsl/switch.shader_test @@ -1,7 +1,4 @@ -[require] -shader model >= 4.0 - -[pixel shader] +[pixel shader todo(sm<4)] uint4 v;
float4 main() : sv_target @@ -19,17 +16,18 @@ float4 main() : sv_target
[test] uniform 0 uint4 3 0 0 0 -draw quad +todo(sm<4) draw quad probe all rgba (5.0, 5.0, 5.0, 5.0) uniform 0 uint4 1 0 0 0 -draw quad +todo(sm<4) draw quad probe all rgba (4.0, 4.0, 4.0, 4.0) uniform 0 uint4 0 0 0 0 -draw quad +todo(sm<4) draw quad probe all rgba (3.0, 3.0, 3.0, 3.0)
+ % just a default case -[pixel shader] +[pixel shader todo(sm<4)] uint4 v;
float4 main() : sv_target @@ -43,15 +41,16 @@ float4 main() : sv_target
[test] uniform 0 uint4 3 0 0 0 -draw quad +todo(sm<4) draw quad probe all rgba (5.0, 5.0, 5.0, 5.0) uniform 0 uint4 1 0 0 0 -draw quad +todo(sm<4) draw quad probe all rgba (5.0, 5.0, 5.0, 5.0) uniform 0 uint4 0 0 0 0 -draw quad +todo(sm<4) draw quad probe all rgba (5.0, 5.0, 5.0, 5.0)
+ % completely empty [pixel shader fail] uint4 v; @@ -63,8 +62,9 @@ float4 main() : sv_target } }
+ % falling through is only supported for empty case statements -[pixel shader] +[pixel shader todo(sm<4)] uint4 v;
float4 main() : sv_target @@ -84,17 +84,18 @@ float4 main() : sv_target
[test] uniform 0 uint4 2 0 0 0 -draw quad +todo(sm<4) draw quad probe all rgba (1.0, 2.0, 3.0, 4.0) uniform 0 uint4 1 0 0 0 -draw quad +todo(sm<4) draw quad probe all rgba (1.1, 2.0, 3.0, 4.0) uniform 0 uint4 0 0 0 0 -draw quad +todo(sm<4) draw quad probe all rgba (1.1, 2.0, 3.0, 4.0)
+ % case value evaluation -[pixel shader] +[pixel shader todo(sm<4)] uint4 v;
float4 main() : sv_target @@ -116,14 +117,15 @@ float4 main() : sv_target
[test] uniform 0 uint4 2 0 0 0 -todo(sm>=6) draw quad +todo(sm<4 | sm>=6) draw quad probe all rgba (1.1, 2.1, 3.1, 4.1) uniform 0 uint4 1 0 0 0 -todo(sm>=6) draw quad +todo(sm<4 | sm>=6) draw quad probe all rgba (1.0, 2.0, 3.0, 4.0)
+ % floats are accepted -[pixel shader fail(sm>=6)] +[pixel shader fail(sm>=6) todo(sm<4)] uint4 v;
float4 main() : sv_target @@ -145,13 +147,13 @@ float4 main() : sv_target
[test] uniform 0 uint4 2 0 0 0 -todo(sm>=6) draw quad +todo(sm<4 | sm>=6) draw quad probe all rgba (1.1, 2.1, 3.1, 4.1) uniform 0 uint4 1 0 0 0 -todo(sm>=6) draw quad +todo(sm<4 | sm>=6) draw quad probe all rgba (1.0, 2.0, 3.0, 4.0)
-[pixel shader fail(sm>=6)] +[pixel shader fail(sm<4 | sm>=6) todo(sm<4)] float4 v;
float4 main() : sv_target @@ -173,10 +175,10 @@ float4 main() : sv_target
[test] uniform 0 float4 2.0 0.0 0.0 0.0 -todo(sm>=6) draw quad +todo(sm<4 | sm>=6) draw quad probe all rgba (1.1, 2.1, 3.1, 4.1) uniform 0 float4 1.0 0.0 0.0 0.0 -todo(sm>=6) draw quad +todo(sm<4 | sm>=6) draw quad probe all rgba (1.0, 2.0, 3.0, 4.0)
[pixel shader fail] @@ -347,7 +349,7 @@ float4 main() : sv_target }
% more complicated breaks -[pixel shader] +[pixel shader todo(sm<4)] uint4 v;
float4 main() : sv_target @@ -374,17 +376,17 @@ float4 main() : sv_target
[test] uniform 0 uint4 2 0 0 0 -todo(sm>=6) draw quad +todo(sm<4 | sm>=6) draw quad probe all rgba (1.1, 2.1, 3.1, 4.1) uniform 0 uint4 1 0 0 0 -todo(sm>=6) draw quad +todo(sm<4 | sm>=6) draw quad probe all rgba (1.2, 2.2, 3.2, 4.2) uniform 0 uint4 0 0 0 0 -todo(sm>=6) draw quad +todo(sm<4 | sm>=6) draw quad probe all rgba (1.0, 2.0, 3.0, 4.0)
% switch breaks within a loop -[pixel shader] +[pixel shader todo(sm<4)] uint4 v;
float4 main() : sv_target @@ -412,11 +414,12 @@ float4 main() : sv_target
[test] uniform 0 uint4 2 0 0 0 -todo(sm>=6) draw quad +todo(sm<4 | sm>=6) draw quad probe all rgba (5.0, 6.0, 7.0, 8.0)
+ % default case placement -[pixel shader] +[pixel shader todo(sm<4)] uint4 v;
float4 main() : sv_target @@ -443,16 +446,17 @@ float4 main() : sv_target
[test] uniform 0 uint4 0 0 0 0 -todo(sm>=6) draw quad +todo(sm<4 | sm>=6) draw quad probe all rgba (4.0, 5.0, 6.0, 7.0) uniform 0 uint4 2 0 0 0 -todo(sm>=6) draw quad +todo(sm<4 | sm>=6) draw quad probe all rgba (2.0, 3.0, 4.0, 5.0) uniform 0 uint4 3 0 0 0 -todo(sm>=6) draw quad +todo(sm<4 | sm>=6) draw quad probe all rgba (4.0, 5.0, 6.0, 7.0)
-[pixel shader] + +[pixel shader todo(sm<4)] uint4 v;
float4 main() : sv_target @@ -480,13 +484,13 @@ float4 main() : sv_target
[test] uniform 0 uint4 3 0 0 0 -todo(sm>=6) draw quad +todo(sm<4 | sm>=6) draw quad probe all rgba (1.0, 2.0, 3.0, 4.0) uniform 0 uint4 0 0 0 0 -todo(sm>=6) draw quad +todo(sm<4 | sm>=6) draw quad probe all rgba (4.0, 5.0, 6.0, 7.0) uniform 0 uint4 5 0 0 0 -todo(sm>=6) draw quad +todo(sm<4 | sm>=6) draw quad probe all rgba (1.0, 2.0, 3.0, 4.0)
% 'continue' is not supported in switches @@ -515,7 +519,7 @@ float4 main() : sv_target return c; }
-[pixel shader] +[pixel shader fail(sm<4) todo(sm<4)] uint4 v;
float4 main() : sv_target @@ -553,7 +557,7 @@ todo(sm>=6) draw quad probe all rgba (7.0, 8.0, 9.0, 10.0)
% return from a switch nested in a loop -[pixel shader] +[pixel shader fail(sm<4) todo(sm<4)] uint4 v;
float4 main() : sv_target diff --git a/tests/hlsl/trunc.shader_test b/tests/hlsl/trunc.shader_test index f1d23bf82..8ea43f67b 100644 --- a/tests/hlsl/trunc.shader_test +++ b/tests/hlsl/trunc.shader_test @@ -30,10 +30,8 @@ uniform 0 float4 -0.5 6.5 7.5 3.4 todo(sm<4) draw quad probe all rgba (6.0, 7.0, 0.0, 3.0)
-[require] -shader model >= 4.0
-[pixel shader] +[pixel shader todo(sm<4)] uniform int4 u;
float4 main() : sv_target @@ -46,5 +44,5 @@ float4 main() : sv_target
[test] uniform 0 int4 -1 6 7 3 -draw quad +todo(sm<4) draw quad probe all rgba (6.0, 7.0, -1.0, 3.0)
From: Francisco Casas fcasas@codeweavers.com
For temporary registers, SM1-SM3 integer types are internally represented as floating point, so a cast from int to float needs nothing more than a MOV.
For constant integer registers "iN" there is no operation for casting from a floating point register to them. For the address registers "aN" and the loop counter register "aL", vertex shaders have the "mova" operation, but we haven't used these registers in any way yet.
We would probably want to introduce these as synthetic variables allocated in a special register set. In that case we have to remember to use MOVA instead of MOV in the store operations, but they shouldn't be the src or dst of CAST operations.
Regarding constant integer registers, in some shaders, constants are expected to be received formatted as an integer, such as:
    int m;

    float4 main() : sv_target
    {
        float4 res = {0, 0, 0, 0};

        for (int k = 0; k < m; ++k)
            res += k;
        return res;
    }
which compiles as:
    // Registers:
    //
    //   Name         Reg   Size
    //   ------------ ----- ----
    //   m            i0       1
    //

    ps_3_0
    def c0, 0, 1, 0, 0
    mov r0, c0.x
    mov r1.x, c0.x
    rep i0
      add r0, r0, r1.x
      add r1.x, r1.x, c0.y
    endrep
    mov oC0, r0
but this only happens if the integer constant is used directly in an instruction that requires an integer register, and, as noted above, no instruction can convert such a register to a float representation.
Notice how a more complex shader, that performs operations with this integer variable "m":
    int m;

    float4 main() : sv_target
    {
        float4 res = {0, 0, 0, 0};

        for (int k = 0; k < m * m; ++k)
            res += k;
        return res;
    }
gives the following output:
    // Registers:
    //
    //   Name         Reg   Size
    //   ------------ ----- ----
    //   m            c0       1
    //

    ps_3_0
    def c1, 0, 0, 1, 0
    defi i0, 255, 0, 0, 0
    mul r0.x, c0.x, c0.x
    mov r1, c1.y
    mov r0.y, c1.y
    rep i0
      mov r0.z, r0.x
      break_ge r0.y, r0.z
      add r1, r0.y, r1
      add r0.y, r0.y, c1.z
    endrep
    mov oC0, r1
This means that the uniform "m" is simply stored as a floating point value in "c0", the constant integer register "i0" is set to 255 with "defi" (in the hope that this is a high enough iteration count), and a "break_ge" that compares the loop counter against the value computed from "c0" is used to leave the loop.
We could potentially use this approach to implement SM3 loops without requiring the variables to be passed in constant integer registers.
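To make the control flow of the second listing easier to follow, here is a small CPU-side model in C. It is only a sketch: `emulate_loop()` is a hypothetical helper and not part of vkd3d. Each statement is annotated with the instruction it is meant to mirror, and only one component of the result is tracked because all four are identical.

```c
/* Hypothetical CPU-side model of the second listing; not vkd3d code. */
static float emulate_loop(float m)
{
    const int rep_count = 255;  /* defi i0, 255, 0, 0, 0 */
    float limit = m * m;        /* mul r0.x, c0.x, c0.x */
    float res = 0.0f;           /* mov r1, c1.y */
    float k = 0.0f;             /* mov r0.y, c1.y */
    int i;

    for (i = 0; i < rep_count; ++i)     /* rep i0 */
    {
        if (k >= limit)         /* mov r0.z, r0.x; break_ge r0.y, r0.z */
            break;
        res += k;               /* add r1, r0.y, r1 */
        k += 1.0f;              /* add r0.y, r0.y, c1.z */
    }                           /* endrep */
    return res;
}
```

In this model, m = 3 leaves through the break after 9 iterations, whereas m = 20 would need 400 iterations and is silently cut off at 255, which is the risk behind "hoping it is a high enough value".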
According to the D3D documentation, for SM1-SM3 constant integer registers are only used by the 'loop' and 'rep' instructions.
---
 libs/vkd3d-shader/d3dbc.c       | 82 +++++++++++++++++++++++++++++++++
 tests/hlsl/distance.shader_test |  2 +-
 tests/hlsl/half.shader_test     |  4 +-
 tests/hlsl/ldexp.shader_test    |  4 +-
 tests/hlsl/lerp.shader_test     |  4 +-
 5 files changed, 89 insertions(+), 7 deletions(-)

diff --git a/libs/vkd3d-shader/d3dbc.c b/libs/vkd3d-shader/d3dbc.c
index a4ca7aa21..b131e768b 100644
--- a/libs/vkd3d-shader/d3dbc.c
+++ b/libs/vkd3d-shader/d3dbc.c
@@ -1957,6 +1957,84 @@ static void write_sm1_unary_op(struct hlsl_ctx *ctx, struct vkd3d_bytecode_buffe
     write_sm1_instruction(ctx, buffer, &instr);
 }

+static void write_sm1_cast(struct hlsl_ctx *ctx, struct vkd3d_bytecode_buffer *buffer,
+        const struct hlsl_ir_node *instr)
+{
+    struct hlsl_ir_expr *expr = hlsl_ir_expr(instr);
+    const struct hlsl_ir_node *arg1 = expr->operands[0].node;
+    const struct hlsl_type *dst_type = expr->node.data_type;
+    const struct hlsl_type *src_type = arg1->data_type;
+
+    /* Narrowing casts were already lowered. */
+    assert(src_type->dimx == dst_type->dimx);
+
+    switch (dst_type->base_type)
+    {
+        case HLSL_TYPE_HALF:
+        case HLSL_TYPE_FLOAT:
+            switch (src_type->base_type)
+            {
+                case HLSL_TYPE_INT:
+                case HLSL_TYPE_UINT:
+                    /* Integers are internally represented as floats, so no change is necessary.*/
+                case HLSL_TYPE_HALF:
+                case HLSL_TYPE_FLOAT:
+                    write_sm1_unary_op(ctx, buffer, D3DSIO_MOV, &instr->reg, &arg1->reg, 0, 0);
+                    break;
+
+                case HLSL_TYPE_BOOL:
+                    hlsl_fixme(ctx, &instr->loc, "SM1 cast from bool to float.");
+                    break;
+
+                case HLSL_TYPE_DOUBLE:
+                    hlsl_fixme(ctx, &instr->loc, "SM1 cast from double to float.");
+                    break;
+
+                default:
+                    vkd3d_unreachable();
+            }
+            break;
+
+        case HLSL_TYPE_INT:
+        case HLSL_TYPE_UINT:
+            switch(src_type->base_type)
+            {
+                case HLSL_TYPE_INT:
+                case HLSL_TYPE_UINT:
+                    write_sm1_unary_op(ctx, buffer, D3DSIO_MOV, &instr->reg, &arg1->reg, 0, 0);
+                    break;
+
+                case HLSL_TYPE_HALF:
+                case HLSL_TYPE_FLOAT:
+                    hlsl_fixme(ctx, &instr->loc, "SM1 cast from float to integer.");
+                    break;
+
+                case HLSL_TYPE_BOOL:
+                    hlsl_fixme(ctx, &instr->loc, "SM1 cast from bool to integer.");
+                    break;
+
+                case HLSL_TYPE_DOUBLE:
+                    hlsl_fixme(ctx, &instr->loc, "SM1 cast from double to integer.");
+                    break;
+
+                default:
+                    vkd3d_unreachable();
+            }
+            break;
+
+        case HLSL_TYPE_DOUBLE:
+            hlsl_fixme(ctx, &instr->loc, "SM1 cast to double.");
+            break;
+
+        case HLSL_TYPE_BOOL:
+            /* Casts to bool should have already been lowered. */
+        default:
+            hlsl_fixme(ctx, &expr->node.loc, "SM1 cast from %s to %s.\n",
+                debug_hlsl_type(ctx, src_type), debug_hlsl_type(ctx, dst_type));
+            break;
+    }
+}
+
 static void write_sm1_constant_defs(struct hlsl_ctx *ctx, struct vkd3d_bytecode_buffer *buffer)
 {
     unsigned int i, x;
@@ -2178,6 +2256,10 @@ static void write_sm1_expr(struct hlsl_ctx *ctx, struct vkd3d_bytecode_buffer *b
             write_sm1_unary_op(ctx, buffer, D3DSIO_ABS, &instr->reg, &arg1->reg, 0, 0);
             break;

+ case HLSL_OP1_CAST: + write_sm1_cast(ctx, buffer, instr); + break; + case HLSL_OP1_DSX: write_sm1_unary_op(ctx, buffer, D3DSIO_DSX, &instr->reg, &arg1->reg, 0, 0); break; diff --git a/tests/hlsl/distance.shader_test b/tests/hlsl/distance.shader_test index bf2423c7a..3f5446451 100644 --- a/tests/hlsl/distance.shader_test +++ b/tests/hlsl/distance.shader_test @@ -13,7 +13,7 @@ uniform 4 float4 2.0 -1.0 4.0 5.0 draw quad probe all rgba (7.483983, 7.483983, 7.483983, 7.483983) 1
-[pixel shader todo(sm<4)] +[pixel shader] uniform int4 x; uniform int4 y;
diff --git a/tests/hlsl/half.shader_test b/tests/hlsl/half.shader_test index 8cf7a756f..fe7074e45 100644 --- a/tests/hlsl/half.shader_test +++ b/tests/hlsl/half.shader_test @@ -9,7 +9,7 @@ float4 main() : sv_target [require] options: backcompat
-[pixel shader todo(sm<4)] +[pixel shader] uniform half h;
float4 main() : sv_target @@ -19,5 +19,5 @@ float4 main() : sv_target
[test] uniform 0 float 10.0 -todo(sm<4) draw quad +draw quad probe all rgba (10.0, 10.0, 10.0, 10.0) diff --git a/tests/hlsl/ldexp.shader_test b/tests/hlsl/ldexp.shader_test index 3becf5f60..d7275d0c5 100644 --- a/tests/hlsl/ldexp.shader_test +++ b/tests/hlsl/ldexp.shader_test @@ -14,7 +14,7 @@ draw quad probe all rgba (2.0, 0.00292968750, 4096.0, 6.33825300e+030) 2
-[pixel shader todo(sm<4)] +[pixel shader] uniform int4 x; uniform int4 y;
@@ -26,7 +26,7 @@ float4 main() : SV_TARGET [test] uniform 0 int4 2 3 4 5 uniform 4 int4 0 -10 10 100 -todo(sm<4) draw quad +draw quad probe all rgba (2.0, 0.00292968750, 4096.0, 6.33825300e+030) 2
diff --git a/tests/hlsl/lerp.shader_test b/tests/hlsl/lerp.shader_test index e3dbc1e89..23921adc1 100644 --- a/tests/hlsl/lerp.shader_test +++ b/tests/hlsl/lerp.shader_test @@ -16,7 +16,7 @@ draw quad probe all rgba (2.0, -10.0, -2.0, 76.25)
-[pixel shader todo(sm<4)] +[pixel shader] uniform int4 x; uniform int4 y; uniform int4 s; @@ -30,7 +30,7 @@ float4 main() : SV_TARGET uniform 0 int4 2 3 4 0 uniform 4 int4 0 -10 10 1000000 uniform 8 int4 0 1 -1 1000000 -todo(sm<4) draw quad +draw quad probe all rgba (2.0, -10.0, -2.0, 1e12)
From: Francisco Casas <fcasas@codeweavers.com>
---
 tests/hlsl/cast-to-int.shader_test | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/tests/hlsl/cast-to-int.shader_test b/tests/hlsl/cast-to-int.shader_test
index 302919b22..417c3dbed 100644
--- a/tests/hlsl/cast-to-int.shader_test
+++ b/tests/hlsl/cast-to-int.shader_test
@@ -1,3 +1,19 @@
+[pixel shader todo(sm<4)]
+uniform float3 f;
+
+float4 main() : sv_target
+{
+    int3 r = f;
+
+    return float4(r, 0);
+}
+
+[test]
+uniform 0 float4 10.3 11.5 12.8 13.1
+todo(sm<4) draw quad
+probe all rgba (10, 11, 12, 0)
+
+
 [require]
 % The following test doesn't work on SM1 because, on it, each uniform has the whole register.
 shader model >= 4.0

From: Francisco Casas <fcasas@codeweavers.com>
---
 libs/vkd3d-shader/d3dbc.c          | 18 ++++++++--------
 libs/vkd3d-shader/hlsl_codegen.c   | 34 ++++++++++++++++++++++++++++++
 tests/hlsl/cast-to-int.shader_test |  4 ++--
 tests/hlsl/ceil.shader_test        |  8 +++----
 tests/hlsl/floor.shader_test       |  8 +++----
 tests/hlsl/round.shader_test       |  8 +++----
 tests/hlsl/saturate.shader_test    |  4 ++--
 7 files changed, 59 insertions(+), 25 deletions(-)

diff --git a/libs/vkd3d-shader/d3dbc.c b/libs/vkd3d-shader/d3dbc.c
index b131e768b..ce56392f7 100644
--- a/libs/vkd3d-shader/d3dbc.c
+++ b/libs/vkd3d-shader/d3dbc.c
@@ -1999,16 +1999,14 @@ static void write_sm1_cast(struct hlsl_ctx *ctx, struct vkd3d_bytecode_buffer *b
         case HLSL_TYPE_UINT:
             switch(src_type->base_type)
             {
+                case HLSL_TYPE_HALF:
+                case HLSL_TYPE_FLOAT:
+                    /* A compilation pass applies a FLOOR operation to casts to int, so no change is necessary. */
                 case HLSL_TYPE_INT:
                 case HLSL_TYPE_UINT:
                     write_sm1_unary_op(ctx, buffer, D3DSIO_MOV, &instr->reg, &arg1->reg, 0, 0);
                     break;

-                case HLSL_TYPE_HALF:
-                case HLSL_TYPE_FLOAT:
-                    hlsl_fixme(ctx, &instr->loc, "SM1 cast from float to integer.");
-                    break;
-
                 case HLSL_TYPE_BOOL:
                     hlsl_fixme(ctx, &instr->loc, "SM1 cast from bool to integer.");
                     break;
@@ -2243,6 +2241,12 @@ static void write_sm1_expr(struct hlsl_ctx *ctx, struct vkd3d_bytecode_buffer *b

     assert(instr->reg.allocated);

+    if (expr->op == HLSL_OP1_CAST)
+    {
+        write_sm1_cast(ctx, buffer, instr);
+        return;
+    }
+
     if (instr->data_type->base_type != HLSL_TYPE_FLOAT)
     {
         /* These need to be lowered. */
@@ -2256,10 +2260,6 @@ static void write_sm1_expr(struct hlsl_ctx *ctx, struct vkd3d_bytecode_buffer *b
             write_sm1_unary_op(ctx, buffer, D3DSIO_ABS, &instr->reg, &arg1->reg, 0, 0);
             break;

-        case HLSL_OP1_CAST:
-            write_sm1_cast(ctx, buffer, instr);
-            break;
-
         case HLSL_OP1_DSX:
             write_sm1_unary_op(ctx, buffer, D3DSIO_DSX, &instr->reg, &arg1->reg, 0, 0);
             break;
diff --git a/libs/vkd3d-shader/hlsl_codegen.c b/libs/vkd3d-shader/hlsl_codegen.c
index 6ad60e4c6..4121fadf3 100644
--- a/libs/vkd3d-shader/hlsl_codegen.c
+++ b/libs/vkd3d-shader/hlsl_codegen.c
@@ -2647,6 +2647,39 @@ static bool sort_synthetic_separated_samplers_first(struct hlsl_ctx *ctx)
     return false;
 }

+/* Append a FLOOR before a CAST to int or uint (which is written as a mere MOV). */
+static bool lower_casts_to_int(struct hlsl_ctx *ctx, struct hlsl_ir_node *instr, struct hlsl_block *block)
+{
+    struct hlsl_ir_node *arg, *floor, *cast2;
+    struct hlsl_ir_expr *expr;
+
+    if (instr->type != HLSL_IR_EXPR)
+        return false;
+    expr = hlsl_ir_expr(instr);
+    if (expr->op != HLSL_OP1_CAST)
+        return false;
+
+    arg = expr->operands[0].node;
+    if (instr->data_type->base_type != HLSL_TYPE_INT && instr->data_type->base_type != HLSL_TYPE_UINT)
+        return false;
+    if (arg->data_type->base_type != HLSL_TYPE_FLOAT && arg->data_type->base_type != HLSL_TYPE_HALF)
+        return false;
+
+    /* Check that the argument is not already a FLOOR */
+    if (arg->type == HLSL_IR_EXPR && hlsl_ir_expr(arg)->op == HLSL_OP1_FLOOR)
+        return false;
+
+    if (!(floor = hlsl_new_unary_expr(ctx, HLSL_OP1_FLOOR, arg, &instr->loc)))
+        return false;
+    hlsl_block_add_instr(block, floor);
+
+    if (!(cast2 = hlsl_new_cast(ctx, floor, instr->data_type, &instr->loc)))
+        return false;
+    hlsl_block_add_instr(block, cast2);
+
+    return true;
+}
+
 /* Lower DIV to RCP + MUL. */
 static bool lower_division(struct hlsl_ctx *ctx, struct hlsl_ir_node *instr, struct hlsl_block *block)
 {
@@ -5060,6 +5093,7 @@ int hlsl_emit_bytecode(struct hlsl_ctx *ctx, struct hlsl_ir_function_decl *entry
     lower_ir(ctx, lower_ternary, body);
     if (profile->major_version < 4)
     {
+        lower_ir(ctx, lower_casts_to_int, body);
         lower_ir(ctx, lower_division, body);
         lower_ir(ctx, lower_sqrt, body);
         lower_ir(ctx, lower_dot, body);
diff --git a/tests/hlsl/cast-to-int.shader_test b/tests/hlsl/cast-to-int.shader_test
index 417c3dbed..f2d823e6a 100644
--- a/tests/hlsl/cast-to-int.shader_test
+++ b/tests/hlsl/cast-to-int.shader_test
@@ -1,4 +1,4 @@
-[pixel shader todo(sm<4)]
+[pixel shader]
 uniform float3 f;

float4 main() : sv_target @@ -10,7 +10,7 @@ float4 main() : sv_target
[test] uniform 0 float4 10.3 11.5 12.8 13.1 -todo(sm<4) draw quad +draw quad probe all rgba (10, 11, 12, 0)
diff --git a/tests/hlsl/ceil.shader_test b/tests/hlsl/ceil.shader_test index 0082c8f4e..73ba77250 100644 --- a/tests/hlsl/ceil.shader_test +++ b/tests/hlsl/ceil.shader_test @@ -21,7 +21,7 @@ uniform 0 float4 -0.5 6.5 7.5 3.4 draw quad probe all rgba (0.0, 7.0, 8.0, 4.0) 4
-[pixel shader todo(sm<4)] +[pixel shader] uniform float4 u;
float4 main() : sv_target @@ -34,10 +34,10 @@ float4 main() : sv_target
[test] uniform 0 float4 -0.5 6.5 7.5 3.4 -todo(sm<4) draw quad +draw quad probe all rgba (7.0, 8.0, 0.0, 4.0) 4
-[pixel shader todo(sm<4)] +[pixel shader] uniform int4 u;
float4 main() : sv_target @@ -50,5 +50,5 @@ float4 main() : sv_target
[test] uniform 0 int4 -1 6 7 3 -todo(sm<4) draw quad +draw quad probe all rgba (6.0, 7.0, -1.0, 3.0) 4 diff --git a/tests/hlsl/floor.shader_test b/tests/hlsl/floor.shader_test index dc9e31f79..85abb7f6b 100644 --- a/tests/hlsl/floor.shader_test +++ b/tests/hlsl/floor.shader_test @@ -21,7 +21,7 @@ uniform 0 float4 -0.5 6.5 7.5 3.4 draw quad probe all rgba (-1.0, 6.0, 7.0, 3.0) 4
-[pixel shader todo(sm<4)] +[pixel shader] uniform float4 u;
float4 main() : sv_target @@ -34,11 +34,11 @@ float4 main() : sv_target
[test] uniform 0 float4 -0.5 6.5 7.5 3.4 -todo(sm<4) draw quad +draw quad probe all rgba (6.0, 7.0, -1.0, 3.0) 4
-[pixel shader todo(sm<4)] +[pixel shader] uniform int4 u;
float4 main() : sv_target @@ -51,5 +51,5 @@ float4 main() : sv_target
[test] uniform 0 int4 -1 6 7 3 -todo(sm<4) draw quad +draw quad probe all rgba (6.0, 7.0, -1.0, 3.0) 4 diff --git a/tests/hlsl/round.shader_test b/tests/hlsl/round.shader_test index 7b4c68cb7..b9234b010 100644 --- a/tests/hlsl/round.shader_test +++ b/tests/hlsl/round.shader_test @@ -13,7 +13,7 @@ probe all rgba (0.0, -7.0, 8.0, 3.0) 4
-[pixel shader todo(sm<4)] +[pixel shader] uniform float4 u;
float4 main() : sv_target @@ -26,12 +26,12 @@ float4 main() : sv_target
[test] uniform 0 float4 -0.4 -6.6 7.6 3.4 -todo(sm<4) draw quad +draw quad probe all rgba (-7.0, 8.0, 0.0, 3.0) 4
-[pixel shader todo(sm<4)] +[pixel shader] uniform float4 u;
float4 main() : sv_target @@ -42,5 +42,5 @@ float4 main() : sv_target
[test] uniform 0 float4 -1 0 2 10 -todo(sm<4) draw quad +draw quad probe all rgba (-1.0, 0.0, 2.0, 10.0) 4 diff --git a/tests/hlsl/saturate.shader_test b/tests/hlsl/saturate.shader_test index 6852015b2..2ed83cf66 100644 --- a/tests/hlsl/saturate.shader_test +++ b/tests/hlsl/saturate.shader_test @@ -11,7 +11,7 @@ uniform 0 float4 0.7 -0.1 0.0 0.0 todo(sm>=6) draw quad probe all rgba (0.7, 0.0, 1.0, 0.0)
-[pixel shader todo(sm<4)] +[pixel shader] uniform float4 u;
float4 main() : sv_target @@ -22,5 +22,5 @@ float4 main() : sv_target
[test] uniform 0 float4 -2 0 2 -1 -todo(sm<4 | sm>=6) draw quad +todo(sm>=6) draw quad probe all rgba (0.0, 0.0, 1.0, 0.0)