From: Francisco Casas fcasas@codeweavers.com
Signed-off-by: Francisco Casas fcasas@codeweavers.com
Signed-off-by: Giovanni Mascellani gmascellani@codeweavers.com
---
v3:
- Simplified tests that are expected to fail.
- Added a test for implicit arrays in a typedef.
- Added a test for an array of complex elements.
- Moved all tests not expected to fail to the top.
- Removed the 'Implicit array of elements of size 0 with broadcast initialization' test; replaced it with a simple 'Broadcast to an implicit array' test from zf.
- Now referring to 'implicit arrays' as 'implicit size arrays'.
---
 ...lsl-initializer-implicit-array.shader_test | 223 +++++++++++++++++-
 1 file changed, 222 insertions(+), 1 deletion(-)
diff --git a/tests/hlsl-initializer-implicit-array.shader_test b/tests/hlsl-initializer-implicit-array.shader_test index d2b94da4..648825eb 100644 --- a/tests/hlsl-initializer-implicit-array.shader_test +++ b/tests/hlsl-initializer-implicit-array.shader_test @@ -2,6 +2,7 @@ float4 main() : SV_TARGET { float4 arr[] = {10, 20, 30, 40, 50, 60, 70, 80}; + return arr[1]; }
@@ -10,9 +11,229 @@ todo draw quad todo probe all rgba (50, 60, 70, 80)
+[pixel shader] +float4 main() : sv_target +{ + float4 arr1[2] = {1, 2, 3, 4, 5, 6, 7, 8}; + float4 arr2[] = arr1; + + return arr2[1]; +} + +[test] +todo draw quad +todo probe all rgba (5.0, 6.0, 7.0, 8.0) + + +[pixel shader] +float4 main() : sv_target +{ + float2 arr[][3] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}; + + return float4(arr[1][0], arr[1][1]); +} + +[test] +todo draw quad +todo probe all rgba (7.0, 8.0, 9.0, 10.0) + + +[pixel shader] +struct foo +{ + float3 foo1; + float4 foo2; +}; + +struct bar +{ + struct foo aa; + int2 bb; + float4 cc[2]; +}; + +float4 main() : SV_TARGET +{ + struct bar p[] = { + 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, + 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, + }; + return p[0].aa.foo2 + p[1].cc[1]; +} + +[test] +todo draw quad +todo probe all rgba (318.0, 320.0, 322.0, 324.0) + + [pixel shader fail] +// Incompatible number of arguments in implicit size array initializer float4 main() : SV_TARGET { float4 arr[] = {10, 20, 30, 40, 50, 60, 70}; - return arr[0]; + + return 0.0; +} + + +[pixel shader fail] +// Incompatible number of arguments in implicit size array initializer +float4 main() : SV_TARGET +{ + float4 arr[] = {10, 20, 30, 40, 50, 60, 70, 80, 90}; + + return 0.0; +} + +[pixel shader fail] +// Implicit size inner array +float4 main() : sv_target +{ + float2 arr[3][] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}; + + return 0.0; +} + + +[pixel shader fail] +// Implicit size array without initializer +float4 main() : sv_target +{ + float4 arr[]; + + return 0.0; +} + + +[pixel shader fail] +// Implicit size array as struct member +struct foobar +{ + int a; + float4 arr[]; +}; + +float4 main() : sv_target +{ + struct foobar s; + + return 0.0; +} + + +[pixel shader fail] +// Implicit size array as function argument +float4 fun(float4 arr[]) +{ + return 1.0; +} +float4 main() : sv_target +{ + float4 arr[3] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 
12}; + return fun(arr); +} + + +[pixel shader fail] +// Implicit size array as function return type +// Note: explicit size arrays are not allowed either. +float4[] fun() +{ + float4 ret[2] = {1, 2, 3, 4, 5, 6, 7, 8}; + + return ret; +} + +float4 main() : sv_target +{ fun(); + return 0.0; +} + + +[pixel shader fail] +// Implicit size array as a cast +float4 main() : sv_target +{ + float2 arr1[4] = {1, 2, 3, 4, 5, 6, 7, 8}; + float4 arr2[2] = (float4 []) arr1; + + return 0.0; +} + + +[pixel shader fail] +// Implicit size array as a typedef +typedef float4 arrtype[]; + +float4 main() : sv_target +{ + arrtype arr1 = {1, 2, 3, 4, 5, 6, 7, 8}; + + return 0.0; +} + + +[pixel shader fail] +// Implicit size array of elements of size 0 +struct emp +{ +}; + +float4 main() : sv_target +{ + struct emp arr[] = {1, 2, 3, 4}; + + return 0.0; +} + + +[pixel shader fail] +// Implicit size array of elements of size 0, without initializer +struct emp +{ +}; + +float4 main() : sv_target +{ + struct emp arr[]; + + return 0.0; +} + + +[pixel shader fail] +// Broadcast to an implicit size array +float4 main() : sv_target +{ + int a[4] = (int[]) 0; + + return 0.0; +} + + +[pixel shader fail] +// Implicit size array with an initializer of size 0 +struct emp +{ +}; + +float4 main() : sv_target +{ + float4 arr[] = (struct emp) 42; + + return 0.0; +} + + +[pixel shader fail] +// Implicit size array of elements of size 0 with initializer of size 0 +struct emp +{ +}; + +float4 main() : sv_target +{ + struct emp arr[] = (struct emp) 42; + + return 0.0; }
From: Francisco Casas fcasas@codeweavers.com
Signed-off-by: Giovanni Mascellani gmascellani@codeweavers.com
---
Francisco, I made a couple of minor changes:
* in the patch subject, "Textures" -> "textures";
* added a 1 ULP tolerance in one test, to have it accepted on my machine.
Notice that, after a patch by Zeb, on master you don't need to mark the probe directive as todo if the draw directive is todo too. --- Makefile.am | 1 + tests/hlsl-initializer-objects.shader_test | 109 +++++++++++++++++++++ 2 files changed, 110 insertions(+) create mode 100644 tests/hlsl-initializer-objects.shader_test
diff --git a/Makefile.am b/Makefile.am index 2742e77c..dd3ce1ff 100644 --- a/Makefile.am +++ b/Makefile.am @@ -79,6 +79,7 @@ vkd3d_shader_tests = \ tests/hlsl-initializer-invalid-arg-count.shader_test \ tests/hlsl-initializer-implicit-array.shader_test \ tests/hlsl-initializer-local-array.shader_test \ + tests/hlsl-initializer-objects.shader_test \ tests/hlsl-initializer-nested.shader_test \ tests/hlsl-initializer-numeric.shader_test \ tests/hlsl-initializer-matrix.shader_test \ diff --git a/tests/hlsl-initializer-objects.shader_test b/tests/hlsl-initializer-objects.shader_test new file mode 100644 index 00000000..2306d07f --- /dev/null +++ b/tests/hlsl-initializer-objects.shader_test @@ -0,0 +1,109 @@ +[require] +shader model >= 4.0 + +[texture 0] +size (3, 3) +0.1 0.1 0.1 0.1 0.2 0.2 0.2 0.2 0.3 0.3 0.3 0.3 +0.4 0.4 0.4 0.4 0.5 0.5 0.5 0.5 0.6 0.6 0.6 0.6 +0.7 0.7 0.7 0.7 0.8 0.8 0.8 0.8 0.9 0.9 0.9 0.9 + +[texture 1] +size (2, 2) +0.1 0.1 0.1 0.0 0.2 0.2 0.2 0.0 +0.4 0.4 0.4 0.0 0.5 0.5 0.5 0.0 + + +[pixel shader] +Texture2D tex1; +Texture2D tex2; + +float4 main() : sv_target +{ + Texture2D q[2] = {tex1, tex2}; + + return q[0].Load(int3(0, 0, 0)) + q[1].Load(int3(0, 0, 0)); +} + +[test] +todo draw quad +todo probe all rgba (0.2, 0.2, 0.2, 0.1) + + +[pixel shader] +Texture2D tex; + +struct foo +{ + float2 aa; + Texture2D bb; + Texture2D cc; + float4 dd; +}; + +float4 main() : sv_target +{ + struct foo q = {10, 20, tex, tex, 30, 40, 50, 60}; + + return q.bb.Load(int3(2, 0, 0)) + q.cc.Load(int3(1, 2, 0)) + q.dd; +} + +[test] +todo draw quad +todo probe all rgba (31.1, 41.1, 51.1, 61.1) 1 + + +[pixel shader] +Texture2D tex1; +Texture2D tex2; + +struct foo +{ + float2 aa; + Texture2D bb[2]; // NOTE: this cannot be initialized in shader model >= 5.1 + float4 cc; +}; + +float4 main() : sv_target +{ + struct foo q = {10, 20, tex1, tex2, 30, 40, 50, 60}; + + return q.bb[0].Load(int3(0, 0, 0)) + q.bb[1].Load(int3(1, 1, 0)) + q.cc; +} + + +[pixel shader fail todo] 
+Texture2D tex; + +struct foo +{ + float2 aa; + Texture2D bb; + Texture2D cc; + float4 dd; +}; + +float4 main() : sv_target +{ + struct foo q = {10, 20, tex, 30, 40, 50, 60}; + + return 0.0; +} + + +[pixel shader fail todo] +Texture2D tex; + +struct foo +{ + float2 aa; + Texture2D bb; + Texture2D cc; + float4 dd; +}; + +float4 main() : sv_target +{ + struct foo q = {10, 20, tex, tex, tex, 30, 40, 50, 60}; + + return 0.0; +}
Signed-off-by: Zebediah Figura zfigura@codeweavers.com
Signed-off-by: Henri Verbeet hverbeet@codeweavers.com
From: Francisco Casas fcasas@codeweavers.com
--- libs/vkd3d-shader/hlsl.c | 17 ++++++++++------- tests/hlsl-initializer-objects.shader_test | 10 +++++----- 2 files changed, 15 insertions(+), 12 deletions(-)
diff --git a/libs/vkd3d-shader/hlsl.c b/libs/vkd3d-shader/hlsl.c index 7239b183..4bdd770a 100644 --- a/libs/vkd3d-shader/hlsl.c +++ b/libs/vkd3d-shader/hlsl.c @@ -412,17 +412,20 @@ unsigned int hlsl_type_component_count(struct hlsl_type *type) { return hlsl_type_component_count(type->e.array.type) * type->e.array.elements_count; } - if (type->type != HLSL_CLASS_STRUCT) + if (type->type == HLSL_CLASS_OBJECT) { - ERR("Unexpected data type %#x.\n", type->type); - return 0; + return 1; } - - LIST_FOR_EACH_ENTRY(field, type->e.elements, struct hlsl_struct_field, entry) + if (type->type == HLSL_CLASS_STRUCT) { - count += hlsl_type_component_count(field->type); + LIST_FOR_EACH_ENTRY(field, type->e.elements, struct hlsl_struct_field, entry) + { + count += hlsl_type_component_count(field->type); + } + return count; } - return count; + assert(0); + return 0; }
bool hlsl_types_are_equal(const struct hlsl_type *t1, const struct hlsl_type *t2) diff --git a/tests/hlsl-initializer-objects.shader_test b/tests/hlsl-initializer-objects.shader_test index 2306d07f..5261aa05 100644 --- a/tests/hlsl-initializer-objects.shader_test +++ b/tests/hlsl-initializer-objects.shader_test @@ -25,8 +25,8 @@ float4 main() : sv_target }
[test] -todo draw quad -todo probe all rgba (0.2, 0.2, 0.2, 0.1) +draw quad +probe all rgba (0.2, 0.2, 0.2, 0.1)
[pixel shader] @@ -49,7 +49,7 @@ float4 main() : sv_target
[test] todo draw quad -todo probe all rgba (31.1, 41.1, 51.1, 61.1) 1 +probe all rgba (31.1, 41.1, 51.1, 61.1) 1
[pixel shader] @@ -71,7 +71,7 @@ float4 main() : sv_target }
-[pixel shader fail todo] +[pixel shader fail] Texture2D tex;
struct foo @@ -90,7 +90,7 @@ float4 main() : sv_target }
-[pixel shader fail todo] +[pixel shader fail] Texture2D tex;
struct foo
Hi,
sorry, I made a mess with the commit message and some of it remained in the subject line.
At any rate, I think this patch still requires some thought: changing hlsl_type_component_count(), while making sense in itself, impacts hlsl_compute_component_offset().
More generally, I think there are two different concepts that are still a bit too conflated in the code:
* the "logical" component count, i.e., how many things you have to put into an initializer;
* the "physical" register count, i.e., how many registers that type will require.
Most base types take one component and one register, which is probably why we're conflating them right now. But objects don't: they take one component, but no registers, because they are pure entities, they don't have a value. I think doubles wouldn't either, if we supported them: they take one component, but two registers.
Also, components are always packed (as far as I know), while registers have padding we have to take into account (for example, a matrix row is padded to a whole four registers; same for an array item; but these rules depend on the SM).
Therefore I think we should distinguish more clearly which of these two concepts is used each time, and make sure that we're correctly computing both offsets and sizes based on either of them.
Consider for example:
struct foo { Texture2D t; float x; };
With this patch the field x would go to register offset 1, which doesn't look correct.
BTW, I just made up the words "component" and "register", and I think they're wrong already, because I guess that a register is technically four things of what I called "register" in this email. Maybe HLSL/DXBC gurus have better word proposals.
Giovanni.
On 5/10/22 08:31, Giovanni Mascellani wrote:
Hi,
sorry, I made a mess with the commit message and some of it remained in the subject line.
At any rate, I think this patch still requires some thought: changing hlsl_type_component_count(), while making sense in itself, impacts hlsl_compute_component_offset().
More generally, I think there are two different concepts that are still a bit too conflated in the code:
- the "logical" component count, i.e., how many things you have to put into an initializer;
- the "physical" register count, i.e., how many registers that type will require.
Most base types take one component and one register, which is probably why we're conflating them right now. But objects don't: they take one component, but no registers, because they are pure entities, they don't have a value. I think doubles wouldn't either, if we supported them: they take one component, but two registers.
Also, components are always packed (as far as I know), while registers have padding we have to take into account (for example, a matrix row is padded to a whole four registers; same for an array item; but these rules depend on the SM).
Therefore I think we should distinguish more clearly which of these two concepts is used each time, and make sure that we're correctly computing both offsets and sizes based on either of them.
Consider for example:
struct foo { Texture2D t; float x; };
With this patch the field x would go to register offset 1, which doesn't look correct.
BTW, I just made up the words "component" and "register", and I think they're wrong already, because I guess that a register is technically four things of what I called "register" in this email. Maybe HLSL/DXBC gurus have better word proposals.
Yes, quite. Without trying to hold things up, it's probably past time we fixed this in a better way. Adding hlsl_fixmes for object arrays is feasible but has proven somewhat awkward.
The above summary is about accurate, although I will nitpick one thing—objects *do* have registers; they're just allocated in a different namespace and according to different rules.
There was some private discussion between myself and Matteo about this, but I think the correct thing to do is to deal in HLSL *only* in terms of components.
In fact, I would go so far as to say that reg_size and reg_offset should go away (or perhaps be accessed only from hlsl_sm4.c and hlsl_sm1.c—but better just to invent a new helper to calculate them from scratch, and stop storing them in the hlsl_type structure, I think).
I would also go so far as to say that hlsl.c should stop caring about matrix majority, and just use the same component ordering for both row-major and column-major matrices.
There are a few interesting consequences here:
* Doing the translation from "component-based" offsets to "register-based" offsets (i.e. getting the alignment and namespaces right) is actually relatively trivial; as Francisco pointed out, we basically just need to use hlsl_compute_component_offset() in hlsl_offset_from_deref().
* One thing we should do, though, is make a point of watching for non-contiguous loads in hlsl_offset_from_deref(). Because of field alignment, I don't think we'll ever generate any, but that fact doesn't inhere in the IR, and accordingly I'd rather not even assert() that it's the case. (In fact, if we ignore matrix majority, it may end up being quite easy to "accidentally" generate non-contiguous loads.)
* In order to generate less terrible code, we are going to want to do vectorization passes, and probably other passes as well, on sm1/sm4 IR (henceforth "LLIR"). This doesn't necessarily need to happen yet, since we can generate correct code without such optimizations, but it is another reason that introducing a "proper" LLIR will need to happen, as has been talked about often.
From: Francisco Casas fcasas@codeweavers.com
HLSL_ARRAY_ELEMENTS_COUNT_IMPLICIT (zero) is used as a temporary value for the elements_count of implicit size arrays. It should be replaced by the correct value after parsing the initializer.
If the implicit size array is not initialized correctly, hlsl_error() is called but the array size is kept at 0, so the rest of the code must handle these cases.
Signed-off-by: Francisco Casas fcasas@codeweavers.com
---
v2:
- Detection of incorrect use of implicit arrays was moved from the parser rules to the respective add_* functions.
v3:
- Replaced 'if' with 'elif' in hlsl_type_calculate_reg_size().
- Updated changes to the new implicit array tests.
- Removed incorrect empty line.
- Fixed typo in error.
- Removed field_size assertion in hlsl_type_calculate_reg_size().
v4:
- Drop variables and struct fields declared as unbounded resource arrays.
v5:
- Renamed 'implicit arrays' to 'implicit size arrays'.
v6:
- Use 'true' instead of '1'.
- Reordered some error detection checks.
- Checking for shader model 5.1.
---
 libs/vkd3d-shader/hlsl.c                      |  14 ++-
 libs/vkd3d-shader/hlsl.h                      |   2 +
 libs/vkd3d-shader/hlsl.y                      | 119 ++++++++++++++++++
 ...lsl-initializer-implicit-array.shader_test |  16 +--
 4 files changed, 138 insertions(+), 13 deletions(-)
diff --git a/libs/vkd3d-shader/hlsl.c b/libs/vkd3d-shader/hlsl.c index 4bdd770a..c78b387c 100644 --- a/libs/vkd3d-shader/hlsl.c +++ b/libs/vkd3d-shader/hlsl.c @@ -165,8 +165,9 @@ static void hlsl_type_calculate_reg_size(struct hlsl_ctx *ctx, struct hlsl_type { unsigned int element_size = type->e.array.type->reg_size;
- assert(element_size); - if (is_sm4) + if (type->e.array.elements_count == HLSL_ARRAY_ELEMENTS_COUNT_IMPLICIT) + type->reg_size = 0; + else if (is_sm4) type->reg_size = (type->e.array.elements_count - 1) * align(element_size, 4) + element_size; else type->reg_size = type->e.array.elements_count * element_size; @@ -184,8 +185,6 @@ static void hlsl_type_calculate_reg_size(struct hlsl_ctx *ctx, struct hlsl_type { unsigned int field_size = field->type->reg_size;
- assert(field_size); - type->reg_size = hlsl_type_get_sm4_offset(field->type, type->reg_size); field->reg_offset = type->reg_size; type->reg_size += field_size; @@ -1026,7 +1025,12 @@ struct vkd3d_string_buffer *hlsl_type_to_string(struct hlsl_ctx *ctx, const stru }
for (t = type; t->type == HLSL_CLASS_ARRAY; t = t->e.array.type) - vkd3d_string_buffer_printf(string, "[%u]", t->e.array.elements_count); + { + if (t->e.array.elements_count == HLSL_ARRAY_ELEMENTS_COUNT_IMPLICIT) + vkd3d_string_buffer_printf(string, "[]"); + else + vkd3d_string_buffer_printf(string, "[%u]", t->e.array.elements_count); + } return string; }
diff --git a/libs/vkd3d-shader/hlsl.h b/libs/vkd3d-shader/hlsl.h index 28b2ff1b..a142fa48 100644 --- a/libs/vkd3d-shader/hlsl.h +++ b/libs/vkd3d-shader/hlsl.h @@ -227,6 +227,8 @@ struct hlsl_src
#define HLSL_MODIFIERS_MAJORITY_MASK (HLSL_MODIFIER_ROW_MAJOR | HLSL_MODIFIER_COLUMN_MAJOR)
+#define HLSL_ARRAY_ELEMENTS_COUNT_IMPLICIT 0 + struct hlsl_reg_reservation { char type; diff --git a/libs/vkd3d-shader/hlsl.y b/libs/vkd3d-shader/hlsl.y index 44e4964f..698169d8 100644 --- a/libs/vkd3d-shader/hlsl.y +++ b/libs/vkd3d-shader/hlsl.y @@ -765,6 +765,11 @@ static void free_parse_variable_def(struct parse_variable_def *v) vkd3d_free(v); }
+static bool shader_is_sm_5_1(const struct hlsl_ctx *ctx) +{ + return ctx->profile->major_version == 5 && ctx->profile->minor_version >= 1; +} + static struct list *gen_struct_fields(struct hlsl_ctx *ctx, struct hlsl_type *type, unsigned int modifiers, struct list *fields) { @@ -779,6 +784,7 @@ static struct list *gen_struct_fields(struct hlsl_ctx *ctx, return NULL; LIST_FOR_EACH_ENTRY_SAFE(v, v_next, fields, struct parse_variable_def, entry) { + bool unbounded_res_array = false; unsigned int i;
if (!(field = hlsl_alloc(ctx, sizeof(*field)))) @@ -789,7 +795,36 @@ static struct list *gen_struct_fields(struct hlsl_ctx *ctx,
field->type = type; for (i = 0; i < v->arrays.count; ++i) + { + if (v->arrays.sizes[i] == HLSL_ARRAY_ELEMENTS_COUNT_IMPLICIT) + { + if (type->type == HLSL_CLASS_OBJECT && shader_is_sm_5_1(ctx)) + { + if (i < v->arrays.count - 1) + { + hlsl_error(ctx, &v->loc, VKD3D_SHADER_ERROR_HLSL_INVALID_TYPE, + "Inner array size cannot be implicit."); + } + else + { + unbounded_res_array = true; + } + } + else + { + hlsl_error(ctx, &v->loc, VKD3D_SHADER_ERROR_HLSL_INVALID_TYPE, + "Implicit size arrays not allowed in struct fields."); + } + } field->type = hlsl_new_array_type(ctx, field->type, v->arrays.sizes[i]); + } + if (unbounded_res_array) + { + hlsl_fixme(ctx, &v->loc, "Unbounded resource arrays as struct fields."); + free_parse_variable_def(v); + vkd3d_free(field); + continue; + } vkd3d_free(v->arrays.sizes); field->loc = v->loc; field->name = v->name; @@ -832,6 +867,12 @@ static bool add_typedef(struct hlsl_ctx *ctx, DWORD modifiers, struct hlsl_type ret = true; for (i = 0; i < v->arrays.count; ++i) { + if (v->arrays.sizes[i] == HLSL_ARRAY_ELEMENTS_COUNT_IMPLICIT) + { + hlsl_error(ctx, &v->loc, VKD3D_SHADER_ERROR_HLSL_INVALID_TYPE, + "Implicit size arrays not allowed in typedefs."); + } + if (!(type = hlsl_new_array_type(ctx, type, v->arrays.sizes[i]))) { free_parse_variable_def(v); @@ -1601,11 +1642,67 @@ static struct list *declare_vars(struct hlsl_ctx *ctx, struct hlsl_type *basic_t
LIST_FOR_EACH_ENTRY_SAFE(v, v_next, var_list, struct parse_variable_def, entry) { + bool unbounded_res_array = false; unsigned int i;
type = basic_type; + for (i = 0; i < v->arrays.count; ++i) + { + if (v->arrays.sizes[i] == HLSL_ARRAY_ELEMENTS_COUNT_IMPLICIT) + { + unsigned int size = initializer_size(&v->initializer); + unsigned int elem_components = hlsl_type_component_count(type); + + if (i < v->arrays.count - 1) + { + hlsl_error(ctx, &v->loc, VKD3D_SHADER_ERROR_HLSL_INVALID_TYPE, + "Inner array size cannot be implicit."); + free_parse_initializer(&v->initializer); + v->initializer.args_count = 0; + } + else if (elem_components == 0) + { + hlsl_error(ctx, &v->loc, VKD3D_SHADER_ERROR_HLSL_INVALID_TYPE, + "Cannot declare an implicit size array of a size 0 type."); + free_parse_initializer(&v->initializer); + v->initializer.args_count = 0; + } + else if (size == 0) + { + if (type->type == HLSL_CLASS_OBJECT && shader_is_sm_5_1(ctx)) + { + unbounded_res_array = true; + } + else + { + hlsl_error(ctx, &v->loc, VKD3D_SHADER_ERROR_HLSL_INVALID_TYPE, + "Implicit size arrays need to be initialized."); + free_parse_initializer(&v->initializer); + v->initializer.args_count = 0; + } + } + else if (size % elem_components != 0) + { + hlsl_error(ctx, &v->loc, VKD3D_SHADER_ERROR_HLSL_WRONG_PARAMETER_COUNT, + "Cannot initialize implicit size array with %u components, expected a multiple of %u.", + size, elem_components); + free_parse_initializer(&v->initializer); + v->initializer.args_count = 0; + } + else + { + v->arrays.sizes[i] = size / elem_components; + } + } type = hlsl_new_array_type(ctx, type, v->arrays.sizes[i]); + } + if (unbounded_res_array) + { + hlsl_fixme(ctx, &v->loc, "Unbounded resource arrays."); + free_parse_variable_def(v); + continue; + } vkd3d_free(v->arrays.sizes);
if (type->type != HLSL_CLASS_MATRIX) @@ -3186,6 +3283,21 @@ arrays: $$.sizes = new_array; $$.sizes[$$.count++] = size; } + | '[' ']' arrays + { + uint32_t *new_array; + + $$ = $3; + + if (!(new_array = hlsl_realloc(ctx, $$.sizes, ($$.count + 1) * sizeof(*new_array)))) + { + vkd3d_free($$.sizes); + YYABORT; + } + + $$.sizes = new_array; + $$.sizes[$$.count++] = HLSL_ARRAY_ELEMENTS_COUNT_IMPLICIT; + }
var_modifiers: %empty @@ -3726,7 +3838,14 @@ unary_expr:
dst_type = $3; for (i = 0; i < $4.count; ++i) + { + if ($4.sizes[i] == HLSL_ARRAY_ELEMENTS_COUNT_IMPLICIT) + { + hlsl_error(ctx, &@3, VKD3D_SHADER_ERROR_HLSL_INVALID_TYPE, + "Implicit size arrays not allowed in casts."); + } dst_type = hlsl_new_array_type(ctx, dst_type, $4.sizes[i]); + }
if (!compatible_data_types(src_type, dst_type)) { diff --git a/tests/hlsl-initializer-implicit-array.shader_test b/tests/hlsl-initializer-implicit-array.shader_test index 648825eb..38c8234c 100644 --- a/tests/hlsl-initializer-implicit-array.shader_test +++ b/tests/hlsl-initializer-implicit-array.shader_test @@ -7,8 +7,8 @@ float4 main() : SV_TARGET }
[test] -todo draw quad -todo probe all rgba (50, 60, 70, 80) +draw quad +probe all rgba (50, 60, 70, 80)
[pixel shader] @@ -21,8 +21,8 @@ float4 main() : sv_target }
[test] -todo draw quad -todo probe all rgba (5.0, 6.0, 7.0, 8.0) +draw quad +probe all rgba (5.0, 6.0, 7.0, 8.0)
[pixel shader] @@ -34,8 +34,8 @@ float4 main() : sv_target }
[test] -todo draw quad -todo probe all rgba (7.0, 8.0, 9.0, 10.0) +draw quad +probe all rgba (7.0, 8.0, 9.0, 10.0)
[pixel shader] @@ -62,8 +62,8 @@ float4 main() : SV_TARGET }
[test] -todo draw quad -todo probe all rgba (318.0, 320.0, 322.0, 324.0) +draw quad +probe all rgba (318.0, 320.0, 322.0, 324.0)
[pixel shader fail]
Hi,
I think this patch could go as it is now, but given that there will have to be another round...
On 10/05/22 15:08, Giovanni Mascellani wrote:
    for (i = 0; i < v->arrays.count; ++i)
    {
        if (v->arrays.sizes[i] == HLSL_ARRAY_ELEMENTS_COUNT_IMPLICIT)
        {
            if (type->type == HLSL_CLASS_OBJECT && shader_is_sm_5_1(ctx))
            {
                if (i < v->arrays.count - 1)
                {
                    hlsl_error(ctx, &v->loc, VKD3D_SHADER_ERROR_HLSL_INVALID_TYPE,
                            "Inner array size cannot be implicit.");
                }
                else
                {
                    unbounded_res_array = true;
                }
            }
            else
            {
                hlsl_error(ctx, &v->loc, VKD3D_SHADER_ERROR_HLSL_INVALID_TYPE,
                        "Implicit size arrays not allowed in struct fields.");
            }
        }
        field->type = hlsl_new_array_type(ctx, field->type, v->arrays.sizes[i]);
    }
    if (unbounded_res_array)
    {
        hlsl_fixme(ctx, &v->loc, "Unbounded resource arrays as struct fields.");
        free_parse_variable_def(v);
        vkd3d_free(field);
        continue;
    }
I forgot to ask a question: why are you introducing this "unbounded_res_array" variable instead of calling hlsl_fixme() directly in the for loop?
Giovanni.
Signed-off-by: Giovanni Mascellani gmascellani@codeweavers.com --- tests/d3d12_test_utils.h | 19 +++++++++++++------ tests/shader_runner.c | 32 +++++++++++++++++++++++--------- tests/shader_runner.h | 3 ++- tests/shader_runner_d3d11.c | 20 ++++++++++++++------ tests/shader_runner_d3d12.c | 4 ++-- tests/shader_runner_d3d9.c | 20 ++++++++++++++------ tests/shader_runner_vulkan.c | 20 ++++++++++++++------ tests/utils.h | 9 +++++++++ 8 files changed, 91 insertions(+), 36 deletions(-)
diff --git a/tests/d3d12_test_utils.h b/tests/d3d12_test_utils.h index bb298082..9fdc4bb3 100644 --- a/tests/d3d12_test_utils.h +++ b/tests/d3d12_test_utils.h @@ -569,9 +569,9 @@ static void check_readback_data_uint_(unsigned int line, struct resource_readbac ok_(line)(all_match, "Got 0x%08x, expected 0x%08x at (%u, %u, %u).\n", got, expected, x, y, z); }
-#define check_readback_data_vec4(a, b, c, d) check_readback_data_vec4_(__LINE__, a, b, c, d) +#define check_readback_data_vec4(a, b, c, d, e) check_readback_data_vec4_(__LINE__, a, b, c, d, e) static void check_readback_data_vec4_(unsigned int line, struct resource_readback *rb, - const RECT *rect, const struct vec4 *expected, unsigned int max_diff) + const RECT *rect, const struct vec4 *expected, unsigned int max_diff, unsigned int comp_count) { RECT r = {0, 0, rb->width, rb->height}; unsigned int x = 0, y = 0; @@ -586,7 +586,7 @@ static void check_readback_data_vec4_(unsigned int line, struct resource_readbac for (x = r.left; x < r.right; ++x) { got = *get_readback_vec4(rb, x, y); - if (!compare_vec4(&got, expected, max_diff)) + if (!compare_vec4_count(&got, expected, max_diff, comp_count)) { all_match = false; break; @@ -595,8 +595,15 @@ static void check_readback_data_vec4_(unsigned int line, struct resource_readbac if (!all_match) break; } - ok_(line)(all_match, "Got {%.8e, %.8e, %.8e, %.8e}, expected {%.8e, %.8e, %.8e, %.8e} at (%u, %u).\n", - got.x, got.y, got.z, got.w, expected->x, expected->y, expected->z, expected->w, x, y); + if (comp_count == 4) + { + ok_(line)(all_match, "Got {%.8e, %.8e, %.8e, %.8e}, expected {%.8e, %.8e, %.8e, %.8e} at (%u, %u).\n", + got.x, got.y, got.z, got.w, expected->x, expected->y, expected->z, expected->w, x, y); + } + else + { + ok_(line)(all_match, "Got %.8e, expected %.8e at (%u, %u).\n", got.x, expected->x, x, y); + } }
#define check_sub_resource_uint(a, b, c, d, e, f) check_sub_resource_uint_(__LINE__, a, b, c, d, e, f) @@ -619,7 +626,7 @@ static inline void check_sub_resource_vec4_(unsigned int line, ID3D12Resource *t struct resource_readback rb;
get_texture_readback_with_command_list(texture, sub_resource_idx, &rb, queue, command_list); - check_readback_data_vec4_(line, &rb, NULL, expected, max_diff); + check_readback_data_vec4_(line, &rb, NULL, expected, max_diff, 4); release_resource_readback(&rb); }
diff --git a/tests/shader_runner.c b/tests/shader_runner.c index a820810a..f8fd6e3e 100644 --- a/tests/shader_runner.c +++ b/tests/shader_runner.c @@ -411,7 +411,7 @@ static void parse_test_directive(struct shader_runner *runner, const char *line) } else if (match_string(line, "probe", &line)) { - unsigned int left, top, right, bottom, ulps; + unsigned int left, top, right, bottom, ulps, comp_count; struct vec4 v; int ret, len; RECT rect; @@ -434,16 +434,30 @@ static void parse_test_directive(struct shader_runner *runner, const char *line) line += len; }
- if (!match_string(line, "rgba", &line)) - fatal_error("Malformed probe arguments '%s'.\n", line); - - ret = sscanf(line, "( %f , %f , %f , %f ) %u", &v.x, &v.y, &v.z, &v.w, &ulps); - if (ret < 4) + if (match_string(line, "rgba", &line)) + { + comp_count = 4; + ret = sscanf(line, "( %f , %f , %f , %f ) %u", &v.x, &v.y, &v.z, &v.w, &ulps); + if (ret < 4) + fatal_error("Malformed probe arguments '%s'.\n", line); + if (ret < 5) + ulps = 0; + } + else if (match_string(line, "r", &line)) + { + comp_count = 1; + ret = sscanf(line, "%f %u", &v.x, &ulps); + if (ret < 1) + fatal_error("Malformed probe arguments '%s'.\n", line); + if (ret < 2) + ulps = 0; + } + else + { fatal_error("Malformed probe arguments '%s'.\n", line); - if (ret < 5) - ulps = 0; + }
- runner->ops->probe_vec4(runner, &rect, &v, ulps); + runner->ops->probe_vec4(runner, &rect, &v, ulps, comp_count); } else if (match_string(line, "uniform", &line)) { diff --git a/tests/shader_runner.h b/tests/shader_runner.h index 9685eaa3..a85b84e2 100644 --- a/tests/shader_runner.h +++ b/tests/shader_runner.h @@ -123,7 +123,8 @@ struct shader_runner_ops struct resource *(*create_resource)(struct shader_runner *runner, const struct resource_params *params); void (*destroy_resource)(struct shader_runner *runner, struct resource *resource); bool (*draw)(struct shader_runner *runner, D3D_PRIMITIVE_TOPOLOGY primitive_topology, unsigned int vertex_count); - void (*probe_vec4)(struct shader_runner *runner, const RECT *rect, const struct vec4 *v, unsigned int ulps); + void (*probe_vec4)(struct shader_runner *runner, const RECT *rect, const struct vec4 *v, + unsigned int ulps, unsigned int comp_count); };
void fatal_error(const char *format, ...) VKD3D_NORETURN VKD3D_PRINTF_FUNC(1, 2); diff --git a/tests/shader_runner_d3d11.c b/tests/shader_runner_d3d11.c index 9d1ed09b..5a67f501 100644 --- a/tests/shader_runner_d3d11.c +++ b/tests/shader_runner_d3d11.c @@ -576,7 +576,7 @@ static const struct vec4 *get_readback_vec4(struct resource_readback *rb, unsign }
static void check_readback_data_vec4(struct resource_readback *rb, - const RECT *rect, const struct vec4 *expected, unsigned int max_diff) + const RECT *rect, const struct vec4 *expected, unsigned int max_diff, unsigned int comp_count) { unsigned int x = 0, y = 0; struct vec4 got = {0}; @@ -587,7 +587,7 @@ static void check_readback_data_vec4(struct resource_readback *rb, for (x = rect->left; x < rect->right; ++x) { got = *get_readback_vec4(rb, x, y); - if (!compare_vec4(&got, expected, max_diff)) + if (!compare_vec4_count(&got, expected, max_diff, comp_count)) { all_match = false; break; @@ -596,17 +596,25 @@ static void check_readback_data_vec4(struct resource_readback *rb, if (!all_match) break; } - ok(all_match, "Got {%.8e, %.8e, %.8e, %.8e}, expected {%.8e, %.8e, %.8e, %.8e} at (%u, %u).\n", - got.x, got.y, got.z, got.w, expected->x, expected->y, expected->z, expected->w, x, y); + if (comp_count == 4) + { + ok(all_match, "Got {%.8e, %.8e, %.8e, %.8e}, expected {%.8e, %.8e, %.8e, %.8e} at (%u, %u).\n", + got.x, got.y, got.z, got.w, expected->x, expected->y, expected->z, expected->w, x, y); + } + else + { + ok(all_match, "Got %.8e, expected %.8e at (%u, %u).\n", got.x, expected->x, x, y); + } }
-static void d3d11_runner_probe_vec4(struct shader_runner *r, const RECT *rect, const struct vec4 *v, unsigned int ulps) +static void d3d11_runner_probe_vec4(struct shader_runner *r, const RECT *rect, const struct vec4 *v, + unsigned int ulps, unsigned int comp_count) { struct d3d11_shader_runner *runner = d3d11_shader_runner(r); struct resource_readback rb;
init_readback(runner, &rb); - check_readback_data_vec4(&rb, rect, v, ulps); + check_readback_data_vec4(&rb, rect, v, ulps, comp_count); release_readback(runner, &rb); }
diff --git a/tests/shader_runner_d3d12.c b/tests/shader_runner_d3d12.c index 1fa2a461..1822029d 100644 --- a/tests/shader_runner_d3d12.c +++ b/tests/shader_runner_d3d12.c @@ -310,7 +310,7 @@ static bool d3d12_runner_draw(struct shader_runner *r, }
static void d3d12_runner_probe_vec4(struct shader_runner *r, - const RECT *rect, const struct vec4 *v, unsigned int ulps) + const RECT *rect, const struct vec4 *v, unsigned int ulps, unsigned int comp_count) { struct d3d12_shader_runner *runner = d3d12_shader_runner(r); struct test_context *test_context = &runner->test_context; @@ -320,7 +320,7 @@ static void d3d12_runner_probe_vec4(struct shader_runner *r, D3D12_RESOURCE_STATE_RENDER_TARGET, D3D12_RESOURCE_STATE_COPY_SOURCE); get_texture_readback_with_command_list(test_context->render_target, 0, &rb, test_context->queue, test_context->list); - todo_if (runner->r.is_todo) check_readback_data_vec4(&rb, rect, v, ulps); + todo_if (runner->r.is_todo) check_readback_data_vec4(&rb, rect, v, ulps, comp_count); release_resource_readback(&rb); reset_command_list(test_context->list, test_context->allocator); transition_resource_state(test_context->list, test_context->render_target, diff --git a/tests/shader_runner_d3d9.c b/tests/shader_runner_d3d9.c index f2875d42..0b2f71ad 100644 --- a/tests/shader_runner_d3d9.c +++ b/tests/shader_runner_d3d9.c @@ -479,7 +479,7 @@ static const struct vec4 *get_readback_vec4(const struct resource_readback *rb, }
static void check_readback_data_vec4(struct resource_readback *rb, - const RECT *rect, const struct vec4 *expected, unsigned int max_diff) + const RECT *rect, const struct vec4 *expected, unsigned int max_diff, unsigned int comp_count) { unsigned int x = 0, y = 0; struct vec4 got = {0}; @@ -490,7 +490,7 @@ static void check_readback_data_vec4(struct resource_readback *rb, for (x = rect->left; x < rect->right; ++x) { got = *get_readback_vec4(rb, x, y); - if (!compare_vec4(&got, expected, max_diff)) + if (!compare_vec4_count(&got, expected, max_diff, comp_count)) { all_match = false; break; @@ -499,8 +499,15 @@ static void check_readback_data_vec4(struct resource_readback *rb, if (!all_match) break; } - ok(all_match, "Got {%.8e, %.8e, %.8e, %.8e}, expected {%.8e, %.8e, %.8e, %.8e} at (%u, %u).\n", - got.x, got.y, got.z, got.w, expected->x, expected->y, expected->z, expected->w, x, y); + if (comp_count == 4) + { + ok(all_match, "Got {%.8e, %.8e, %.8e, %.8e}, expected {%.8e, %.8e, %.8e, %.8e} at (%u, %u).\n", + got.x, got.y, got.z, got.w, expected->x, expected->y, expected->z, expected->w, x, y); + } + else + { + ok(all_match, "Got %.8e, expected %.8e at (%u, %u).\n", got.x, expected->x, x, y); + } }
static void release_readback(struct resource_readback *rb) @@ -509,13 +516,14 @@ static void release_readback(struct resource_readback *rb) IDirect3DSurface9_Release(rb->surface); }
-static void d3d9_runner_probe_vec4(struct shader_runner *r, const RECT *rect, const struct vec4 *v, unsigned int ulps) +static void d3d9_runner_probe_vec4(struct shader_runner *r, const RECT *rect, const struct vec4 *v, + unsigned int ulps, unsigned int comp_count) { struct d3d9_shader_runner *runner = d3d9_shader_runner(r); struct resource_readback rb;
init_readback(runner, &rb); - check_readback_data_vec4(&rb, rect, v, ulps); + check_readback_data_vec4(&rb, rect, v, ulps, comp_count); release_readback(&rb); }
diff --git a/tests/shader_runner_vulkan.c b/tests/shader_runner_vulkan.c index 5d4b65cf..8b461993 100644 --- a/tests/shader_runner_vulkan.c +++ b/tests/shader_runner_vulkan.c @@ -831,7 +831,7 @@ static const struct vec4 *get_readback_vec4(const uint8_t *data, unsigned int ro }
static void check_readback_data_vec4(const uint8_t *data, unsigned int row_pitch, - const RECT *rect, const struct vec4 *expected, unsigned int max_diff) + const RECT *rect, const struct vec4 *expected, unsigned int max_diff, unsigned int comp_count) { unsigned int x = 0, y = 0; struct vec4 got = {0}; @@ -842,7 +842,7 @@ static void check_readback_data_vec4(const uint8_t *data, unsigned int row_pitch for (x = rect->left; x < rect->right; ++x) { got = *get_readback_vec4(data, row_pitch, x, y); - if (!compare_vec4(&got, expected, max_diff)) + if (!compare_vec4_count(&got, expected, max_diff, comp_count)) { all_match = false; break; @@ -851,11 +851,19 @@ static void check_readback_data_vec4(const uint8_t *data, unsigned int row_pitch if (!all_match) break; } - ok(all_match, "Got {%.8e, %.8e, %.8e, %.8e}, expected {%.8e, %.8e, %.8e, %.8e} at (%u, %u).\n", - got.x, got.y, got.z, got.w, expected->x, expected->y, expected->z, expected->w, x, y); + if (comp_count == 4) + { + ok(all_match, "Got {%.8e, %.8e, %.8e, %.8e}, expected {%.8e, %.8e, %.8e, %.8e} at (%u, %u).\n", + got.x, got.y, got.z, got.w, expected->x, expected->y, expected->z, expected->w, x, y); + } + else + { + ok(all_match, "Got %.8e, expected %.8e at (%u, %u).\n", got.x, expected->x, x, y); + } }
-static void vulkan_runner_probe_vec4(struct shader_runner *r, const RECT *rect, const struct vec4 *v, unsigned int ulps) +static void vulkan_runner_probe_vec4(struct shader_runner *r, const RECT *rect, const struct vec4 *v, + unsigned int ulps, unsigned int comp_count) { struct vulkan_shader_runner *runner = vulkan_shader_runner(r); VkDevice device = runner->device; @@ -890,7 +898,7 @@ static void vulkan_runner_probe_vec4(struct shader_runner *r, const RECT *rect, end_command_buffer(runner);
VK_CALL(vkMapMemory(device, memory, 0, VK_WHOLE_SIZE, 0, &data)); - todo_if (runner->r.is_todo) check_readback_data_vec4(data, row_pitch, rect, v, ulps); + todo_if (runner->r.is_todo) check_readback_data_vec4(data, row_pitch, rect, v, ulps, comp_count); VK_CALL(vkUnmapMemory(device, memory));
VK_CALL(vkFreeMemory(device, memory, NULL)); diff --git a/tests/utils.h b/tests/utils.h index 563f0b0f..8adfd5cb 100644 --- a/tests/utils.h +++ b/tests/utils.h @@ -94,6 +94,15 @@ static inline bool compare_vec4(const struct vec4 *v1, const struct vec4 *v2, un && compare_float(v1->w, v2->w, ulps); }
+static inline bool compare_vec4_count(const struct vec4 *v1, const struct vec4 *v2, unsigned int ulps, + unsigned int comp_count) +{ + return (comp_count < 1 || compare_float(v1->x, v2->x, ulps)) + && (comp_count < 2 || compare_float(v1->y, v2->y, ulps)) + && (comp_count < 3 || compare_float(v1->z, v2->z, ulps)) + && (comp_count < 4 || compare_float(v1->w, v2->w, ulps)); +} + static inline void set_rect(RECT *rect, int left, int top, int right, int bottom) { rect->left = left;
On 5/10/22 08:08, Giovanni Mascellani wrote:
diff --git a/tests/shader_runner.h b/tests/shader_runner.h index 9685eaa3..a85b84e2 100644 --- a/tests/shader_runner.h +++ b/tests/shader_runner.h @@ -123,7 +123,8 @@ struct shader_runner_ops struct resource *(*create_resource)(struct shader_runner *runner, const struct resource_params *params); void (*destroy_resource)(struct shader_runner *runner, struct resource *resource); bool (*draw)(struct shader_runner *runner, D3D_PRIMITIVE_TOPOLOGY primitive_topology, unsigned int vertex_count);
- void (*probe_vec4)(struct shader_runner *runner, const RECT *rect, const struct vec4 *v, unsigned int ulps);
+ void (*probe_vec4)(struct shader_runner *runner, const RECT *rect, const struct vec4 *v,
+ unsigned int ulps, unsigned int comp_count); };
I ran into a similar case with UAVs, and given our current preponderance of backends, I'd rather move more logic out of them. I have some local patches to that effect.
Signed-off-by: Giovanni Mascellani gmascellani@codeweavers.com Signed-off-by: Francisco Casas fcasas@codeweavers.com --- Makefile.am | 1 + tests/matrix-semantics.shader_test | 71 ++++++++++++++++++++++++++++++ 2 files changed, 72 insertions(+) create mode 100644 tests/matrix-semantics.shader_test
diff --git a/Makefile.am b/Makefile.am index dd3ce1ff..74c23b2b 100644 --- a/Makefile.am +++ b/Makefile.am @@ -108,6 +108,7 @@ vkd3d_shader_tests = \ tests/hlsl-vector-indexing-uniform.shader_test \ tests/logic-operations.shader_test \ tests/math.shader_test \ + tests/matrix-semantics.shader_test \ tests/nointerpolation.shader_test \ tests/pow.shader_test \ tests/preproc-if.shader_test \ diff --git a/tests/matrix-semantics.shader_test b/tests/matrix-semantics.shader_test new file mode 100644 index 00000000..7106eb86 --- /dev/null +++ b/tests/matrix-semantics.shader_test @@ -0,0 +1,71 @@ +[pixel shader] +float4x1 main() : sv_target +{ + return float4(1.0, 2.0, 3.0, 4.0); +} + +[test] +todo draw quad +probe all rgba (1.0, 2.0, 3.0, 4.0) + +[pixel shader] +row_major float1x4 main() : sv_target +{ + return float4(1.0, 2.0, 3.0, 4.0); +} + +[test] +todo draw quad +probe all rgba (1.0, 2.0, 3.0, 4.0) + +[require] +shader model >= 4.0 + +[pixel shader] +row_major float4x1 main() : sv_target +{ + return float4(1.0, 2.0, 3.0, 4.0); +} + +[test] +todo draw quad +probe all r 1.0 + +[pixel shader] +float1x4 main() : sv_target +{ + return float4(1.0, 2.0, 3.0, 4.0); +} + +[test] +todo draw quad +probe all r 1.0 + +[pixel shader] +void main(out row_major float1x4 x : sv_target0, out float1x4 y : sv_target1) +{ + x = float4(1.0, 2.0, 3.0, 4.0); + y = float4(5.0, 6.0, 7.0, 8.0); +} + +[test] +todo draw quad +probe all rgba (1.0, 2.0, 3.0, 4.0) + +[pixel shader fail todo] +void main(out float1x4 x : sv_target0, out float1x4 y : sv_target1) +{ + x = float4(1.0, 2.0, 3.0, 4.0); + y = float4(5.0, 6.0, 7.0, 8.0); +} + +[pixel shader] +void main(out float1x4 x : sv_target0, out float1x4 y : sv_target4) +{ + x = float4(1.0, 2.0, 3.0, 4.0); + y = float4(5.0, 6.0, 7.0, 8.0); +} + +[test] +todo draw quad +probe all r 1.0
Signed-off-by: Giovanni Mascellani gmascellani@codeweavers.com Signed-off-by: Francisco Casas fcasas@codeweavers.com --- If this is reverted on top of "Lower numeric casts", then the shader runner crashes with:
shader_runner: ../vkd3d/libs/vkd3d-shader/vkd3d_shader_private.h:1165: vkd3d_write_mask_component_count: Assertion `1 <= count && count <= VKD3D_VEC4_SIZE' failed.
While the HLSL compiler is expected to output correct programs, the DXBC parser should not crash on malformed inputs anyway, so there is a bug to be fixed there too. --- libs/vkd3d-shader/hlsl_codegen.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libs/vkd3d-shader/hlsl_codegen.c b/libs/vkd3d-shader/hlsl_codegen.c index 78b22910..5fdcd4dc 100644 --- a/libs/vkd3d-shader/hlsl_codegen.c +++ b/libs/vkd3d-shader/hlsl_codegen.c @@ -1348,7 +1348,7 @@ static void allocate_variable_temp_register(struct hlsl_ctx *ctx, struct hlsl_ir var->last_read, var->data_type->reg_size); else var->reg = allocate_register(ctx, liveness, var->first_write, - var->last_read, var->data_type->dimx); + var->last_read, var->data_type->reg_size); TRACE("Allocated %s to %s (liveness %u-%u).\n", var->name, debug_register('r', var->reg, var->data_type), var->first_write, var->last_read); }
On 5/10/22 08:08, Giovanni Mascellani wrote:
Signed-off-by: Giovanni Mascellani gmascellani@codeweavers.com Signed-off-by: Francisco Casas fcasas@codeweavers.com
If this is reverted on top of "Lower numeric casts", then the shader runner crashes with:
shader_runner: ../vkd3d/libs/vkd3d-shader/vkd3d_shader_private.h:1165: vkd3d_write_mask_component_count: Assertion `1 <= count && count <= VKD3D_VEC4_SIZE' failed.
While the HLSL compiler is expected to output correct programs, the DXBC parser should not crash on malformed inputs anyway, so there is a bug to be fixed there too.
libs/vkd3d-shader/hlsl_codegen.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libs/vkd3d-shader/hlsl_codegen.c b/libs/vkd3d-shader/hlsl_codegen.c index 78b22910..5fdcd4dc 100644 --- a/libs/vkd3d-shader/hlsl_codegen.c +++ b/libs/vkd3d-shader/hlsl_codegen.c @@ -1348,7 +1348,7 @@ static void allocate_variable_temp_register(struct hlsl_ctx *ctx, struct hlsl_ir var->last_read, var->data_type->reg_size); else var->reg = allocate_register(ctx, liveness, var->first_write,
- var->last_read, var->data_type->dimx);
+ var->last_read, var->data_type->reg_size); TRACE("Allocated %s to %s (liveness %u-%u).\n", var->name, debug_register('r', var->reg, var->data_type), var->first_write, var->last_read); }
This uses more storage than necessary for e.g. sm1 scalars, though. I think we can use hlsl_type_component_count() instead.
(All of the RA logic desperately needs to be rewritten, but oh well...)
Hi,
Il 13/05/22 01:10, Zebediah Figura ha scritto:
This uses more storage than necessary for e.g. sm1 scalars, though. I think we can use hlsl_type_component_count() instead.
Ok, makes sense.
(All of the RA logic desperately needs to be rewritten, but oh well...)
Do we consider this a realistic/sensible target for 1.4? Or should we postpone that to 1.5 (or whatever comes next)? Given the number of things we'd like to have in 1.4 that we have already informally discussed, I guess we can postpone this.
Gio.
On 5/16/22 09:51, Giovanni Mascellani wrote:
Hi,
Il 13/05/22 01:10, Zebediah Figura ha scritto:
This uses more storage than necessary for e.g. sm1 scalars, though. I think we can use hlsl_type_component_count() instead.
Ok, makes sense.
(All of the RA logic desperately needs to be rewritten, but oh well...)
Do we consider this a realistic/sensible target for 1.4? Or should we postpone that to 1.5 (or whatever comes next)? Given a number of things we'd like to have in 1.4 we have already informally discussed, I guess we can postpone this.
Yeah, that's not happening soon, unfortunately, hence why I suggested hlsl_type_component_count() instead ;-)
Signed-off-by: Giovanni Mascellani gmascellani@codeweavers.com Signed-off-by: Francisco Casas fcasas@codeweavers.com --- libs/vkd3d-shader/hlsl_sm4.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/libs/vkd3d-shader/hlsl_sm4.c b/libs/vkd3d-shader/hlsl_sm4.c index a5fc094b..a6ac2aa3 100644 --- a/libs/vkd3d-shader/hlsl_sm4.c +++ b/libs/vkd3d-shader/hlsl_sm4.c @@ -891,7 +891,7 @@ static void sm4_register_from_deref(struct hlsl_ctx *ctx, struct sm4_register *r
if (has_idx) { - reg->idx[0] = var->semantic.index; + reg->idx[0] = var->semantic.index + offset / 4; reg->idx_count = 1; }
@@ -922,7 +922,7 @@ static void sm4_register_from_deref(struct hlsl_ctx *ctx, struct sm4_register *r
if (has_idx) { - reg->idx[0] = var->semantic.index; + reg->idx[0] = var->semantic.index + offset / 4; reg->idx_count = 1; }
Signed-off-by: Zebediah Figura zfigura@codeweavers.com
Signed-off-by: Henri Verbeet hverbeet@codeweavers.com
Signed-off-by: Giovanni Mascellani gmascellani@codeweavers.com Signed-off-by: Francisco Casas fcasas@codeweavers.com --- libs/vkd3d-shader/hlsl.y | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/libs/vkd3d-shader/hlsl.y b/libs/vkd3d-shader/hlsl.y index 698169d8..7d9c0a2d 100644 --- a/libs/vkd3d-shader/hlsl.y +++ b/libs/vkd3d-shader/hlsl.y @@ -2823,14 +2823,18 @@ func_prototype: /* var_modifiers is necessary to avoid shift/reduce conflicts. */ var_modifiers type var_identifier '(' parameters ')' colon_attribute { + unsigned int modifiers = $1; struct hlsl_ir_var *var; + struct hlsl_type *type;
- if ($1) + if (modifiers & ~HLSL_MODIFIERS_MAJORITY_MASK) { hlsl_error(ctx, &@1, VKD3D_SHADER_ERROR_HLSL_INVALID_MODIFIER, - "Modifiers are not allowed on functions."); + "Only majority modifiers are allowed on functions."); YYABORT; } + if (!(type = apply_type_modifiers(ctx, $2, &modifiers, @1))) + YYABORT; if ((var = hlsl_get_var(ctx->globals, $3))) { hlsl_error(ctx, &@3, VKD3D_SHADER_ERROR_HLSL_REDEFINED, "\"%s\" was previously declared here.", $3); YYABORT; } - if (hlsl_types_are_equal($2, ctx->builtin_types.Void) && $7.semantic.name) + if (hlsl_types_are_equal(type, ctx->builtin_types.Void) && $7.semantic.name) { hlsl_error(ctx, &@7, VKD3D_SHADER_ERROR_HLSL_INVALID_SEMANTIC, "Semantics are not allowed on void functions."); @@ -2848,7 +2852,7 @@ func_prototype: if ($7.reg_reservation.type) FIXME("Unexpected register reservation for a function.\n");
- if (!($$.decl = hlsl_new_func_decl(ctx, $2, $5, &$7.semantic, @3))) + if (!($$.decl = hlsl_new_func_decl(ctx, type, $5, &$7.semantic, @3))) YYABORT; $$.name = $3; ctx->cur_function = $$.decl;
Signed-off-by: Zebediah Figura zfigura@codeweavers.com
Signed-off-by: Henri Verbeet hverbeet@codeweavers.com
Signed-off-by: Giovanni Mascellani gmascellani@codeweavers.com Signed-off-by: Francisco Casas fcasas@codeweavers.com --- Functionality-wise, this should be able to wholly replace lower_broadcasts(). However, it generates less vectorized code, so I don't know what the general sentiment is: remove lower_broadcasts() immediately, remove it once more vectorized code is generated (or once an effective vectorization pass takes care of it), or something else. --- libs/vkd3d-shader/hlsl.h | 4 + libs/vkd3d-shader/hlsl.y | 52 +++++----- libs/vkd3d-shader/hlsl_codegen.c | 97 +++++++++++++++++++ tests/hlsl-duplicate-modifiers.shader_test | 2 +- tests/hlsl-initializer-matrix.shader_test | 2 +- ...lsl-return-implicit-conversion.shader_test | 10 +- tests/hlsl-shape.shader_test | 10 +- tests/matrix-semantics.shader_test | 14 +-- 8 files changed, 146 insertions(+), 45 deletions(-)
diff --git a/libs/vkd3d-shader/hlsl.h b/libs/vkd3d-shader/hlsl.h index a142fa48..34c6bfc3 100644 --- a/libs/vkd3d-shader/hlsl.h +++ b/libs/vkd3d-shader/hlsl.h @@ -709,6 +709,10 @@ struct vkd3d_string_buffer *hlsl_modifiers_to_string(struct hlsl_ctx *ctx, unsig const char *hlsl_node_type_to_string(enum hlsl_ir_node_type type);
void hlsl_add_function(struct hlsl_ctx *ctx, char *name, struct hlsl_ir_function_decl *decl, bool intrinsic); +struct hlsl_ir_node *hlsl_add_implicit_conversion(struct hlsl_ctx *ctx, struct list *instrs, + struct hlsl_ir_node *node, struct hlsl_type *dst_type, const struct vkd3d_shader_location *loc); +struct hlsl_ir_load *hlsl_add_load(struct hlsl_ctx *ctx, struct list *instrs, struct hlsl_ir_node *var_node, + struct hlsl_ir_node *offset, struct hlsl_type *data_type, const struct vkd3d_shader_location loc); bool hlsl_add_var(struct hlsl_ctx *ctx, struct hlsl_ir_var *decl, bool local_var);
void hlsl_dump_function(struct hlsl_ctx *ctx, const struct hlsl_ir_function_decl *func); diff --git a/libs/vkd3d-shader/hlsl.y b/libs/vkd3d-shader/hlsl.y index 7d9c0a2d..ac17f8b1 100644 --- a/libs/vkd3d-shader/hlsl.y +++ b/libs/vkd3d-shader/hlsl.y @@ -266,7 +266,7 @@ static bool implicit_compatible_data_types(struct hlsl_type *t1, struct hlsl_typ return false; }
-static struct hlsl_ir_node *add_implicit_conversion(struct hlsl_ctx *ctx, struct list *instrs, +struct hlsl_ir_node *hlsl_add_implicit_conversion(struct hlsl_ctx *ctx, struct list *instrs, struct hlsl_ir_node *node, struct hlsl_type *dst_type, const struct vkd3d_shader_location *loc) { struct hlsl_type *src_type = node->data_type; @@ -508,7 +508,7 @@ static struct hlsl_ir_jump *add_return(struct hlsl_ctx *ctx, struct list *instrs { struct hlsl_ir_store *store;
- if (!(return_value = add_implicit_conversion(ctx, instrs, return_value, return_type, &loc))) + if (!(return_value = hlsl_add_implicit_conversion(ctx, instrs, return_value, return_type, &loc))) return NULL;
if (!(store = hlsl_new_simple_store(ctx, ctx->cur_function->return_var, return_value))) @@ -528,7 +528,7 @@ static struct hlsl_ir_jump *add_return(struct hlsl_ctx *ctx, struct list *instrs return jump; }
-static struct hlsl_ir_load *add_load(struct hlsl_ctx *ctx, struct list *instrs, struct hlsl_ir_node *var_node, +struct hlsl_ir_load *hlsl_add_load(struct hlsl_ctx *ctx, struct list *instrs, struct hlsl_ir_node *var_node, struct hlsl_ir_node *offset, struct hlsl_type *data_type, const struct vkd3d_shader_location loc) { struct hlsl_ir_node *add = NULL; @@ -578,7 +578,7 @@ static bool add_record_load(struct hlsl_ctx *ctx, struct list *instrs, struct hl return false; list_add_tail(instrs, &c->node.entry);
- return !!add_load(ctx, instrs, record, &c->node, field->type, loc); + return !!hlsl_add_load(ctx, instrs, record, &c->node, field->type, loc); }
static struct hlsl_ir_expr *add_binary_arithmetic_expr(struct hlsl_ctx *ctx, struct list *instrs, @@ -618,7 +618,7 @@ static struct hlsl_ir_node *add_matrix_scalar_load(struct hlsl_ctx *ctx, struct if (!(add = add_binary_arithmetic_expr(ctx, instrs, HLSL_OP2_ADD, &mul->node, minor, loc))) return NULL;
- if (!(load = add_load(ctx, instrs, matrix, &add->node, scalar_type, *loc))) + if (!(load = hlsl_add_load(ctx, instrs, matrix, &add->node, scalar_type, *loc))) return NULL;
return &load->node; @@ -704,7 +704,7 @@ static bool add_array_load(struct hlsl_ctx *ctx, struct list *instrs, struct hls return false; }
- return !!add_load(ctx, instrs, array, index, data_type, loc); + return !!hlsl_add_load(ctx, instrs, array, index, data_type, loc); }
static struct hlsl_struct_field *get_struct_field(struct list *fields, const char *name) @@ -1214,7 +1214,7 @@ static struct hlsl_ir_expr *add_unary_logical_expr(struct hlsl_ctx *ctx, struct bool_type = hlsl_get_numeric_type(ctx, arg->data_type->type, HLSL_TYPE_BOOL, arg->data_type->dimx, arg->data_type->dimy);
- if (!(args[0] = add_implicit_conversion(ctx, instrs, arg, bool_type, loc))) + if (!(args[0] = hlsl_add_implicit_conversion(ctx, instrs, arg, bool_type, loc))) return NULL;
return add_expr(ctx, instrs, op, args, bool_type, loc); @@ -1235,10 +1235,10 @@ static struct hlsl_ir_expr *add_binary_arithmetic_expr(struct hlsl_ctx *ctx, str
common_type = hlsl_get_numeric_type(ctx, type, base, dimx, dimy);
- if (!(args[0] = add_implicit_conversion(ctx, instrs, arg1, common_type, loc))) + if (!(args[0] = hlsl_add_implicit_conversion(ctx, instrs, arg1, common_type, loc))) return NULL;
- if (!(args[1] = add_implicit_conversion(ctx, instrs, arg2, common_type, loc))) + if (!(args[1] = hlsl_add_implicit_conversion(ctx, instrs, arg2, common_type, loc))) return NULL;
return add_expr(ctx, instrs, op, args, common_type, loc); @@ -1293,10 +1293,10 @@ static struct hlsl_ir_expr *add_binary_comparison_expr(struct hlsl_ctx *ctx, str common_type = hlsl_get_numeric_type(ctx, type, base, dimx, dimy); return_type = hlsl_get_numeric_type(ctx, type, HLSL_TYPE_BOOL, dimx, dimy);
- if (!(args[0] = add_implicit_conversion(ctx, instrs, arg1, common_type, loc))) + if (!(args[0] = hlsl_add_implicit_conversion(ctx, instrs, arg1, common_type, loc))) return NULL;
- if (!(args[1] = add_implicit_conversion(ctx, instrs, arg2, common_type, loc))) + if (!(args[1] = hlsl_add_implicit_conversion(ctx, instrs, arg2, common_type, loc))) return NULL;
return add_expr(ctx, instrs, op, args, return_type, loc); @@ -1327,10 +1327,10 @@ static struct hlsl_ir_expr *add_binary_logical_expr(struct hlsl_ctx *ctx, struct
common_type = hlsl_get_numeric_type(ctx, type, HLSL_TYPE_BOOL, dimx, dimy);
- if (!(args[0] = add_implicit_conversion(ctx, instrs, arg1, common_type, loc))) + if (!(args[0] = hlsl_add_implicit_conversion(ctx, instrs, arg1, common_type, loc))) return NULL;
- if (!(args[1] = add_implicit_conversion(ctx, instrs, arg2, common_type, loc))) + if (!(args[1] = hlsl_add_implicit_conversion(ctx, instrs, arg2, common_type, loc))) return NULL;
return add_expr(ctx, instrs, op, args, common_type, loc); @@ -1370,10 +1370,10 @@ static struct hlsl_ir_expr *add_binary_shift_expr(struct hlsl_ctx *ctx, struct l return_type = hlsl_get_numeric_type(ctx, type, base, dimx, dimy); integer_type = hlsl_get_numeric_type(ctx, type, HLSL_TYPE_INT, dimx, dimy);
- if (!(args[0] = add_implicit_conversion(ctx, instrs, arg1, return_type, loc))) + if (!(args[0] = hlsl_add_implicit_conversion(ctx, instrs, arg1, return_type, loc))) return NULL;
- if (!(args[1] = add_implicit_conversion(ctx, instrs, arg2, integer_type, loc))) + if (!(args[1] = hlsl_add_implicit_conversion(ctx, instrs, arg2, integer_type, loc))) return NULL;
return add_expr(ctx, instrs, op, args, return_type, loc); @@ -1479,7 +1479,7 @@ static struct hlsl_ir_node *add_assignment(struct hlsl_ctx *ctx, struct list *in { writemask = (1 << lhs_type->dimx) - 1;
- if (!(rhs = add_implicit_conversion(ctx, instrs, rhs, lhs_type, &rhs->loc))) + if (!(rhs = hlsl_add_implicit_conversion(ctx, instrs, rhs, lhs_type, &rhs->loc))) return NULL; }
@@ -1598,10 +1598,10 @@ static void initialize_var_components(struct hlsl_ctx *ctx, struct list *instrs, return; list_add_tail(instrs, &c->node.entry);
- if (!(load = add_load(ctx, instrs, src, &c->node, src_comp_type, src->loc))) + if (!(load = hlsl_add_load(ctx, instrs, src, &c->node, src_comp_type, src->loc))) return;
- if (!(conv = add_implicit_conversion(ctx, instrs, &load->node, dst_comp_type, &src->loc))) + if (!(conv = hlsl_add_implicit_conversion(ctx, instrs, &load->node, dst_comp_type, &src->loc))) return;
if (!(c = hlsl_new_uint_constant(ctx, dst_reg_offset, &src->loc))) @@ -1886,7 +1886,7 @@ static struct hlsl_ir_node *intrinsic_float_convert_arg(struct hlsl_ctx *ctx, return arg;
type = hlsl_get_numeric_type(ctx, type->type, HLSL_TYPE_FLOAT, type->dimx, type->dimy); - return add_implicit_conversion(ctx, params->instrs, arg, type, loc); + return hlsl_add_implicit_conversion(ctx, params->instrs, arg, type, loc); }
static bool intrinsic_abs(struct hlsl_ctx *ctx, @@ -1923,10 +1923,10 @@ static bool intrinsic_cross(struct hlsl_ctx *ctx,
cast_type = hlsl_get_vector_type(ctx, base, 3);
- if (!(arg1_cast = add_implicit_conversion(ctx, params->instrs, arg1, cast_type, loc))) + if (!(arg1_cast = hlsl_add_implicit_conversion(ctx, params->instrs, arg1, cast_type, loc))) return false;
- if (!(arg2_cast = add_implicit_conversion(ctx, params->instrs, arg2, cast_type, loc))) + if (!(arg2_cast = hlsl_add_implicit_conversion(ctx, params->instrs, arg2, cast_type, loc))) return false;
if (!(arg1_swzl1 = hlsl_new_swizzle(ctx, HLSL_SWIZZLE(Z, X, Y, Z), 3, arg1_cast, loc))) @@ -2204,7 +2204,7 @@ static bool add_method_call(struct hlsl_ctx *ctx, struct list *instrs, struct hl hlsl_fixme(ctx, loc, "Tiled resource status argument.");
/* +1 for the mipmap level */ - if (!(coords = add_implicit_conversion(ctx, instrs, params->args[0], + if (!(coords = hlsl_add_implicit_conversion(ctx, instrs, params->args[0], hlsl_get_vector_type(ctx, HLSL_TYPE_INT, sampler_dim + 1), loc))) return false;
@@ -2248,13 +2248,13 @@ static bool add_method_call(struct hlsl_ctx *ctx, struct list *instrs, struct hl /* Only HLSL_IR_LOAD can return an object. */ sampler_load = hlsl_ir_load(params->args[0]);
- if (!(coords = add_implicit_conversion(ctx, instrs, params->args[1], + if (!(coords = hlsl_add_implicit_conversion(ctx, instrs, params->args[1], hlsl_get_vector_type(ctx, HLSL_TYPE_FLOAT, sampler_dim), loc))) return false;
if (params->args_count == 3) { - if (!(offset = add_implicit_conversion(ctx, instrs, params->args[2], + if (!(offset = hlsl_add_implicit_conversion(ctx, instrs, params->args[2], hlsl_get_vector_type(ctx, HLSL_TYPE_INT, sampler_dim), loc))) return false; } @@ -2329,7 +2329,7 @@ static bool add_method_call(struct hlsl_ctx *ctx, struct list *instrs, struct hl
if (params->args_count == 3 || params->args_count == 4) { - if (!(offset = add_implicit_conversion(ctx, instrs, params->args[2], + if (!(offset = hlsl_add_implicit_conversion(ctx, instrs, params->args[2], hlsl_get_vector_type(ctx, HLSL_TYPE_INT, sampler_dim), loc))) return false; } @@ -2359,7 +2359,7 @@ static bool add_method_call(struct hlsl_ctx *ctx, struct list *instrs, struct hl /* Only HLSL_IR_LOAD can return an object. */ sampler_load = hlsl_ir_load(params->args[0]);
- if (!(coords = add_implicit_conversion(ctx, instrs, params->args[1], + if (!(coords = hlsl_add_implicit_conversion(ctx, instrs, params->args[1], hlsl_get_vector_type(ctx, HLSL_TYPE_FLOAT, sampler_dim), loc))) return false;
diff --git a/libs/vkd3d-shader/hlsl_codegen.c b/libs/vkd3d-shader/hlsl_codegen.c index 5fdcd4dc..6d4b550d 100644 --- a/libs/vkd3d-shader/hlsl_codegen.c +++ b/libs/vkd3d-shader/hlsl_codegen.c @@ -273,6 +273,102 @@ static bool lower_broadcasts(struct hlsl_ctx *ctx, struct hlsl_ir_node *instr, v return false; }
+static bool lower_numeric_casts(struct hlsl_ctx *ctx, struct hlsl_ir_node *instr, void *context)
+{
+    struct hlsl_type *src_type, *dst_type = instr->data_type;
+    struct hlsl_ir_node *src, *value;
+    struct vkd3d_string_buffer *name;
+    static unsigned int counter = 0;
+    struct hlsl_ir_load *load;
+    struct hlsl_ir_expr *expr;
+    struct hlsl_ir_var *var;
+    unsigned int dst_idx;
+    bool broadcast;
+
+    if (instr->type != HLSL_IR_EXPR)
+        return false;
+    expr = hlsl_ir_expr(instr);
+    if (expr->op != HLSL_OP1_CAST)
+        return false;
+
+    src = expr->operands[0].node;
+    src_type = src->data_type;
+
+    if (dst_type->type > HLSL_CLASS_LAST_NUMERIC || src_type->type > HLSL_CLASS_LAST_NUMERIC)
+        return false;
+    if (dst_type->type == HLSL_CLASS_SCALAR && src_type->type == HLSL_CLASS_SCALAR)
+        return false;
+
+    broadcast = src_type->dimx == 1 && src_type->dimy == 1;
+    assert(dst_type->dimx * dst_type->dimy <= src_type->dimx * src_type->dimy || broadcast);
+    if (src_type->type == HLSL_CLASS_MATRIX && dst_type->type == HLSL_CLASS_MATRIX && !broadcast)
+    {
+        assert(dst_type->dimx <= src_type->dimx);
+        assert(dst_type->dimy <= src_type->dimy);
+    }
+
+    name = vkd3d_string_buffer_get(&ctx->string_buffers);
+    vkd3d_string_buffer_printf(name, "<cast-%u>", counter++);
+    var = hlsl_new_synthetic_var(ctx, name->buffer, dst_type, instr->loc);
+    vkd3d_string_buffer_release(&ctx->string_buffers, name);
+    if (!var)
+        return false;
+
+    for (dst_idx = 0; dst_idx < dst_type->dimx * dst_type->dimy; ++dst_idx)
+    {
+        struct hlsl_type *src_scalar_type, *dst_scalar_type;
+        unsigned int src_idx, src_offset, dst_offset;
+        struct hlsl_ir_store *store;
+        struct hlsl_ir_constant *c;
+
+        if (broadcast)
+        {
+            src_idx = 0;
+        }
+        else
+        {
+            if (src_type->type == HLSL_CLASS_MATRIX && dst_type->type == HLSL_CLASS_MATRIX)
+            {
+                unsigned int x = dst_idx % dst_type->dimx, y = dst_idx / dst_type->dimx;
+
+                src_idx = y * src_type->dimx + x;
+            }
+            else
+            {
+                src_idx = dst_idx;
+            }
+        }
+
+        dst_offset = hlsl_compute_component_offset(ctx, dst_type, dst_idx, &dst_scalar_type);
+        src_offset = hlsl_compute_component_offset(ctx, src_type, src_idx, &src_scalar_type);
+
+        if (!(c = hlsl_new_uint_constant(ctx, src_offset, &src->loc)))
+            return false;
+        list_add_before(&instr->entry, &c->node.entry);
+
+        if (!(load = hlsl_add_load(ctx, &instr->entry, src, &c->node, src_scalar_type, src->loc)))
+            return false;
+
+        if (!(value = hlsl_add_implicit_conversion(ctx, &instr->entry, &load->node, dst_scalar_type, &src->loc)))
+            return false;
+
+        if (!(c = hlsl_new_uint_constant(ctx, dst_offset, &src->loc)))
+            return false;
+        list_add_before(&instr->entry, &c->node.entry);
+
+        if (!(store = hlsl_new_store(ctx, var, &c->node, value, 0, src->loc)))
+            return false;
+        list_add_before(&instr->entry, &store->node.entry);
+    }
+
+    if (!(load = hlsl_new_load(ctx, var, NULL, dst_type, instr->loc)))
+        return false;
+    list_add_before(&instr->entry, &load->node.entry);
+    hlsl_replace_node(instr, &load->node);
+
+    return true;
+}
+
 enum copy_propagation_value_state
 {
     VALUE_STATE_NOT_WRITTEN = 0,
@@ -1904,6 +2000,7 @@ int hlsl_emit_bytecode(struct hlsl_ctx *ctx, struct hlsl_ir_function_decl *entry
     }
 
     transform_ir(ctx, lower_broadcasts, body, NULL);
+    transform_ir(ctx, lower_numeric_casts, body, NULL);
     while (transform_ir(ctx, fold_redundant_casts, body, NULL));
     do
     {
diff --git a/tests/hlsl-duplicate-modifiers.shader_test b/tests/hlsl-duplicate-modifiers.shader_test
index fcae12da..6491701a 100644
--- a/tests/hlsl-duplicate-modifiers.shader_test
+++ b/tests/hlsl-duplicate-modifiers.shader_test
@@ -7,5 +7,5 @@ float4 main() : sv_target
 }
 
 [test]
-todo draw quad
+draw quad
 probe all rgba (0.1, 0.2, 0.3, 0.4)
diff --git a/tests/hlsl-initializer-matrix.shader_test b/tests/hlsl-initializer-matrix.shader_test
index ea9de9c0..7e12b0a0 100644
--- a/tests/hlsl-initializer-matrix.shader_test
+++ b/tests/hlsl-initializer-matrix.shader_test
@@ -55,7 +55,7 @@ float4 main() : SV_TARGET
 }
 
 [test]
-todo draw quad
+draw quad
 probe all rgba (21, 22, 31, 32)
diff --git a/tests/hlsl-return-implicit-conversion.shader_test b/tests/hlsl-return-implicit-conversion.shader_test
index bf99d9cb..4fe8e7eb 100644
--- a/tests/hlsl-return-implicit-conversion.shader_test
+++ b/tests/hlsl-return-implicit-conversion.shader_test
@@ -5,7 +5,7 @@ float4 main() : sv_target
 }
 
 [test]
-todo draw quad
+draw quad
 probe all rgba (0.4, 0.3, 0.2, 0.1)
 
 [pixel shader]
@@ -15,7 +15,7 @@ float4 main() : sv_target
 }
 
 [test]
-todo draw quad
+draw quad
 probe all rgba (0.4, 0.3, 0.2, 0.1)
 
 [pixel shader]
@@ -25,7 +25,7 @@ float4 main() : sv_target
 }
 
 [test]
-todo draw quad
+draw quad
 probe all rgba (0.4, 0.3, 0.2, 0.1)
 
 [pixel shader]
@@ -35,8 +35,8 @@ float4x1 main() : sv_target
 }
 
 [test]
-todo draw quad
-probe all rgba (0.4, 0.3, 0.2, 0.1)
+draw quad
+todo probe all rgba (0.4, 0.3, 0.2, 0.1)
 
 [pixel shader]
 float3 func()
diff --git a/tests/hlsl-shape.shader_test b/tests/hlsl-shape.shader_test
index 57d59534..65cc322c 100644
--- a/tests/hlsl-shape.shader_test
+++ b/tests/hlsl-shape.shader_test
@@ -211,7 +211,7 @@ float4 main() : sv_target
 }
 
 [test]
-todo draw quad
+draw quad
 probe all rgba (2.0, 4.0, 6.0, 8.0)
 
 [pixel shader]
@@ -235,7 +235,7 @@ float4 main() : sv_target
 }
 
 [test]
-todo draw quad
+draw quad
 probe all rgba (2.0, 4.0, 6.0, 8.0)
 
 [pixel shader]
@@ -260,7 +260,7 @@ float4 main() : sv_target
 }
 
 [test]
-todo draw quad
+draw quad
 probe all rgba (2.0, 4.0, 6.0, 8.0)
 
 [pixel shader]
@@ -309,7 +309,7 @@ float4 main() : sv_target
 }
 
 [test]
-todo draw quad
+draw quad
 probe all rgba (2.0, 4.0, 0.0, 0.0)
 
 [pixel shader]
@@ -321,7 +321,7 @@ float4 main() : sv_target
 }
 
 [test]
-todo draw quad
+draw quad
 probe all rgba (2.0, 4.0, 0.0, 0.0)
 
 [pixel shader]
diff --git a/tests/matrix-semantics.shader_test b/tests/matrix-semantics.shader_test
index 7106eb86..acda4a16 100644
--- a/tests/matrix-semantics.shader_test
+++ b/tests/matrix-semantics.shader_test
@@ -5,8 +5,8 @@ float4x1 main() : sv_target
 }
 
 [test]
-todo draw quad
-probe all rgba (1.0, 2.0, 3.0, 4.0)
+draw quad
+todo probe all rgba (1.0, 2.0, 3.0, 4.0)
 
 [pixel shader]
 row_major float1x4 main() : sv_target
@@ -15,7 +15,7 @@ row_major float1x4 main() : sv_target
 }
 
 [test]
-todo draw quad
+draw quad
 probe all rgba (1.0, 2.0, 3.0, 4.0)
 
 [require]
@@ -28,7 +28,7 @@ row_major float4x1 main() : sv_target
 }
 
 [test]
-todo draw quad
+draw quad
 probe all r 1.0
 
 [pixel shader]
@@ -38,7 +38,7 @@ float1x4 main() : sv_target
 }
 
 [test]
-todo draw quad
+draw quad
 probe all r 1.0
 
 [pixel shader]
@@ -49,7 +49,7 @@ void main(out row_major float1x4 x : sv_target0, out float1x4 y : sv_target1)
 }
 
 [test]
-todo draw quad
+draw quad
 probe all rgba (1.0, 2.0, 3.0, 4.0)
 
 [pixel shader fail todo]
@@ -67,5 +67,5 @@ void main(out float1x4 x : sv_target0, out float1x4 y : sv_target4)
 }
 
 [test]
-todo draw quad
+draw quad
 probe all r 1.0
On 5/10/22 08:08, Giovanni Mascellani wrote:
> Signed-off-by: Giovanni Mascellani <gmascellani@codeweavers.com>
> Signed-off-by: Francisco Casas <fcasas@codeweavers.com>
>
> Functionality-wise, this should be able to wholly replace lower_broadcasts(). However, it generates less vectorized code, so I don't know what the general sentiment is WRT removing lower_broadcasts() immediately, removing it once more vectorized code is generated (or once an effective vectorization pass takes care of it), or whatever else.
>
>  libs/vkd3d-shader/hlsl.h                      |  4 +
>  libs/vkd3d-shader/hlsl.y                      | 52 +++++-----
>  libs/vkd3d-shader/hlsl_codegen.c              | 97 +++++++++++++++++++
>  tests/hlsl-duplicate-modifiers.shader_test    |  2 +-
>  tests/hlsl-initializer-matrix.shader_test     |  2 +-
>  ...lsl-return-implicit-conversion.shader_test | 10 +-
>  tests/hlsl-shape.shader_test                  | 10 +-
>  tests/matrix-semantics.shader_test            | 14 +--
>  8 files changed, 146 insertions(+), 45 deletions(-)
I don't think this feels like the right place. Can we handle this when parsing explicit casts instead?
Hi,
On 13/05/22 01:18, Zebediah Figura wrote:
> I don't think this feels like the right place. Can we handle this when parsing explicit casts instead?
Hm, this is useful in many places besides explicit casts. Actually, most tests fixed by this commit are not explicit casts. In principle we could do the same processing each time we add any cast, explicit or not, but that means we have to hook in many places, making everything more brittle. What's the problem with having this as an optimization pass? Why is that different from lower_broadcasts()?
Thanks,
Giovanni.
On 5/16/22 09:58, Giovanni Mascellani wrote:
> Hi,
>
> On 13/05/22 01:18, Zebediah Figura wrote:
>> I don't think this feels like the right place. Can we handle this when parsing explicit casts instead?
>
> Hm, this is useful in many places besides explicit casts. Actually, most tests fixed by this commit are not explicit casts.
Eh, explicit or implicit, sorry. You can't implicitly cast scalars to structs or arrays (at least not in the places I checked) but I forgot that you can cast them to matrices.
> In principle we could do the same processing each time we add any cast, explicit or not, but that means we have to hook in many places, making everything more brittle.
Would it be many places, though? I think it'd only be the explicit cast parsing rule and add_implicit_conversion(). We'd want a helper function anyway, that wraps hlsl_new_cast() and handles scalar-to-structure casts.
> What's the problem with having this as an optimization pass? Why is that different from lower_broadcasts()?
Broadly, it's the principle of making the IR as simple as possible. In this case it's especially helpful to be able to assume that types larger than a vector can only be generated by HLSL_IR_LOAD instructions. But I think it'd also be desirable to handle scalar-to-vector broadcasts at parse time; I just didn't consider that when writing the pass.
[This certainly isn't the only place where we can generate instructions with matrix type. Most notably they can also come from per-component arithmetic instructions. But I suspect the answer is that we want to lower those at parse time as well; I think it shouldn't be too painful since we can do it in add_expr() or something close to it.]
Hi,
On 16/05/22 19:42, Zebediah Figura wrote:
>> In principle we could do the same processing each time we add any cast, explicit or not, but that means we have to hook in many places, making everything more brittle.
>
> Would it be many places, though? I think it'd only be the explicit cast parsing rule and add_implicit_conversion(). We'd want a helper function anyway, that wraps hlsl_new_cast() and handles scalar-to-structure casts.
Whatever number they are, it is at least a few places versus just one well-defined place. Also, in the latter case the place is always the same (it would be one of the first optimization passes), while in the former case the places where you need to take care of this might change with time, further complicating a piece of code (the front end) which is already rather involved in itself. The optimization passes have a much nicer structure, it's just a pipeline for which hlsl_emit_bytecode() gives a quick and readable summary.
>> What's the problem with having this as an optimization pass? Why is that different from lower_broadcasts()?
>
> Broadly, it's the principle of making the IR as simple as possible. In this case it's especially helpful to be able to assume that types larger than a vector can only be generated by HLSL_IR_LOAD instructions. But I think it'd also be desirable to handle scalar-to-vector broadcasts at parse time; I just didn't consider that when writing the pass.
If the first few passes care about ensuring the invariants that we want in later "real" passes, then we get both the advantage of having a simpler IR (except in the first few stages, but that doesn't look like a huge drawback) and that of giving an easier-to-read structure to the compiler without overloading the front end.
But I already tried to convince you and Matteo of this and I didn't manage, so I guess I'll withdraw.
Giovanni.
On 5/17/22 08:21, Giovanni Mascellani wrote:
> Hi,
>
> On 16/05/22 19:42, Zebediah Figura wrote:
>>> In principle we could do the same processing each time we add any cast, explicit or not, but that means we have to hook in many places, making everything more brittle.
>>
>> Would it be many places, though? I think it'd only be the explicit cast parsing rule and add_implicit_conversion(). We'd want a helper function anyway, that wraps hlsl_new_cast() and handles scalar-to-structure casts.
>
> Whatever number they are, it is at least a few places versus just one well-defined place. Also, in the latter case the place is always the same (it would be one of the first optimization passes), while in the former case the places where you need to take care of this might change with time, further complicating a piece of code (the front end) which is already rather involved in itself. The optimization passes have a much nicer structure, it's just a pipeline for which hlsl_emit_bytecode() gives a quick and readable summary.
On the one hand, yes, but it comes at the cost of making the IR more complicated. That complexity may not always be reflected in the structures used to encode it, but it's a mental burden to consider "what is it legal to express in IR at what point in execution?"
In some cases, like this, it doesn't really make a difference either way, but then I also don't see a reason to perform a lowering pass instead of emitting the code already lowered. With the use of helper functions I have a hard time conceiving of that lowering as happening in any more than one place.
>>> What's the problem with having this as an optimization pass? Why is that different from lower_broadcasts()?
>>
>> Broadly, it's the principle of making the IR as simple as possible. In this case it's especially helpful to be able to assume that types larger than a vector can only be generated by HLSL_IR_LOAD instructions. But I think it'd also be desirable to handle scalar-to-vector broadcasts at parse time; I just didn't consider that when writing the pass.
>
> If the first few passes care about ensuring the invariants that we want in later "real" passes, then we get both the advantage of having a simpler IR (except in the first few stages, but that doesn't look like a huge drawback) and that of giving an easier-to-read structure to the compiler without overloading the front end.
>
> But I already tried to convince you and Matteo of this and I didn't manage, so I guess I'll withdraw.
>
> Giovanni.