Currently, we can't use `add_load_index()` or `add_load_component()` on the loads of the `"retval"` variables that come from `add_call()`, because these functions assume that the loaded value (`var_instr`) won't change between its location and the location of the new load to be created.
We can't get rid of this assumption either, because `add_load_index()`, at least, may be used in the lhs of an assignment, and in that case we can't store to the "deref" synthetic:

```
x[0] = 20;
```
Here I implemented the alternative solution of copying `"retval"` into a synthetic variable after each function call, but we may want to discuss this approach.
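As a condensed sketch of the idea (error handling simplified; the actual change is in patch 2/3 below), what `add_call()` ends up doing is roughly:

```c
/* Condensed sketch of the approach in 2/3 (error paths simplified to
 * "return NULL"): copy "retval" into a fresh synthetic variable right after
 * the call, and hand back a load of that copy, so a later call to the same
 * function can no longer clobber the value behind the load. */
if (!(var = hlsl_new_synthetic_var(ctx, "retval-copy", decl->return_var->data_type, loc)))
    return NULL;

if (!(load = hlsl_new_var_load(ctx, decl->return_var, *loc)))
    return NULL;
list_add_tail(args->instrs, &load->node.entry);

if (!(store = hlsl_new_simple_store(ctx, var, &load->node)))
    return NULL;
list_add_tail(args->instrs, &store->node.entry);

if (!(load = hlsl_new_var_load(ctx, var, *loc)))
    return NULL;
list_add_tail(args->instrs, &load->node.entry);
```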
The second problem, which is covered in 3/3, is fixing the number of components of the swizzle instructions generated by copy-prop. Without that fix, the following assertion in `hlsl_sm4.c` fails for me

```c
/* Narrowing casts were already lowered. */
assert(src_type->dimx == dst_type->dimx);
```

for some tests in one of my branches, because narrowing casts may end up appearing if a node is replaced with a node that has more components. This error happens under quite complex conditions (involving casts and matrices) that I am still trying to narrow down, but I think the fix is logical enough; without it, the asserts introduced in the patch fail in many tests.
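Put differently, the invariant that 3/3 wants hlsl_replace_node() to uphold is that a replacement never changes the shape of the value its users see; the asserts added there amount to the following check:

```c
/* A replacement node must keep the dimensions of the node it replaces;
 * otherwise its users silently start seeing a value of a different width,
 * which later passes can mistake for a narrowing cast. */
assert(old->data_type->dimx == new->data_type->dimx);
assert(old->data_type->dimy == new->data_type->dimy);
```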
From: Francisco Casas fcasas@codeweavers.com
```diff
---
 tests/hlsl-function.shader_test | 34 +++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/tests/hlsl-function.shader_test b/tests/hlsl-function.shader_test
index c836808d..a7cad6d7 100644
--- a/tests/hlsl-function.shader_test
+++ b/tests/hlsl-function.shader_test
@@ -261,3 +261,37 @@ float4 main() : sv_target
     func(true);
     return 0;
 }
+
+
+[pixel shader]
+float func(float a)
+{
+    return a + 1;
+}
+
+float4 main() : sv_target
+{
+    return float4(func(1.0), func(2.0), func(5.0), func(6.0));
+}
+
+[test]
+draw quad
+todo probe all rgba (2.0, 3.0, 6.0, 7.0)
+
+
+[pixel shader]
+float func(float a)
+{
+    return a + 1;
+}
+
+float4 main() : sv_target
+{
+    float4 a = {func(1.0), func(2.0), func(5.0), func(6.0)};
+
+    return a;
+}
+
+[test]
+draw quad
+todo probe all rgba (2.0, 3.0, 6.0, 7.0)
```
From: Francisco Casas fcasas@codeweavers.com
Otherwise we can't use add_load_index() or add_load_component() on these loads, because they assume that the loaded value (var_instr) won't change between its location and the new load.
We can't get rid of this assumption either, because add_load_index(), at least, may be used in the lhs of a store, and in that case we can't store to the "deref" synthetic.

```diff
---
 libs/vkd3d-shader/hlsl.y        | 20 ++++++++++++++++++++
 tests/hlsl-function.shader_test |  4 ++--
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/libs/vkd3d-shader/hlsl.y b/libs/vkd3d-shader/hlsl.y
index 95c1c0fa..e8ea0f30 100644
--- a/libs/vkd3d-shader/hlsl.y
+++ b/libs/vkd3d-shader/hlsl.y
@@ -641,6 +641,8 @@ static struct hlsl_ir_load *add_load_index(struct hlsl_ctx *ctx, struct list *in
 
     if (var_instr->type == HLSL_IR_LOAD)
     {
+        /* When var_instr is a load instruction, the value at the index is expected to not be
+         * modified between this var_instr and the point where the new load will be inserted. */
         src = &hlsl_ir_load(var_instr)->src;
     }
     else
@@ -674,6 +676,8 @@ static struct hlsl_ir_load *add_load_component(struct hlsl_ctx *ctx, struct list
 
     if (var_instr->type == HLSL_IR_LOAD)
     {
+        /* When var_instr is a load instruction, the value at the component is expected to not be
+         * modified between this var_instr and the point where the new load will be inserted. */
         src = &hlsl_ir_load(var_instr)->src;
     }
     else
@@ -3060,11 +3064,27 @@ static struct list *add_call(struct hlsl_ctx *ctx, const char *name,
 
     if (decl->return_var)
     {
+        struct hlsl_ir_store *store;
         struct hlsl_ir_load *load;
+        struct hlsl_ir_var *var;
+
+        if (!(var = hlsl_new_synthetic_var(ctx, "retval-copy", decl->return_var->data_type, loc)))
+        {
+            vkd3d_free(decl);
+            return NULL;
+        }
 
         if (!(load = hlsl_new_var_load(ctx, decl->return_var, *loc)))
             goto fail;
         list_add_tail(args->instrs, &load->node.entry);
+
+        if (!(store = hlsl_new_simple_store(ctx, var, &load->node)))
+            goto fail;
+        list_add_tail(args->instrs, &store->node.entry);
+
+        if (!(load = hlsl_new_var_load(ctx, var, *loc)))
+            goto fail;
+        list_add_tail(args->instrs, &load->node.entry);
     }
     else
     {
diff --git a/tests/hlsl-function.shader_test b/tests/hlsl-function.shader_test
index a7cad6d7..4d4c2e2b 100644
--- a/tests/hlsl-function.shader_test
+++ b/tests/hlsl-function.shader_test
@@ -276,7 +276,7 @@ float4 main() : sv_target
 
 [test]
 draw quad
-todo probe all rgba (2.0, 3.0, 6.0, 7.0)
+probe all rgba (2.0, 3.0, 6.0, 7.0)
 
 
 [pixel shader]
@@ -294,4 +294,4 @@ float4 main() : sv_target
 
 [test]
 draw quad
-todo probe all rgba (2.0, 3.0, 6.0, 7.0)
+probe all rgba (2.0, 3.0, 6.0, 7.0)
```
From: Francisco Casas fcasas@codeweavers.com
Otherwise we may create nodes of different dimensions than the ones we are replacing.

"count" is the number of components of the source deref (without considering the swizzle), while "instr_component_count" is the actual number of components of the instruction to be replaced.

```diff
---
 libs/vkd3d-shader/hlsl.c         | 3 +++
 libs/vkd3d-shader/hlsl_codegen.c | 2 +-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/libs/vkd3d-shader/hlsl.c b/libs/vkd3d-shader/hlsl.c
index 97713343..c5ae57c2 100644
--- a/libs/vkd3d-shader/hlsl.c
+++ b/libs/vkd3d-shader/hlsl.c
@@ -2292,6 +2292,9 @@ void hlsl_replace_node(struct hlsl_ir_node *old, struct hlsl_ir_node *new)
 {
     struct hlsl_src *src, *next;
 
+    assert(old->data_type->dimx == new->data_type->dimx);
+    assert(old->data_type->dimy == new->data_type->dimy);
+
     LIST_FOR_EACH_ENTRY_SAFE(src, next, &old->uses, struct hlsl_src, entry)
     {
         hlsl_src_remove(src);
diff --git a/libs/vkd3d-shader/hlsl_codegen.c b/libs/vkd3d-shader/hlsl_codegen.c
index 80d52dfc..87559502 100644
--- a/libs/vkd3d-shader/hlsl_codegen.c
+++ b/libs/vkd3d-shader/hlsl_codegen.c
@@ -1031,7 +1031,7 @@ static bool copy_propagation_replace_with_single_instr(struct hlsl_ctx *ctx,
     {
         struct hlsl_ir_swizzle *swizzle_node;
 
-        if (!(swizzle_node = hlsl_new_swizzle(ctx, ret_swizzle, count, new_instr, &instr->loc)))
+        if (!(swizzle_node = hlsl_new_swizzle(ctx, ret_swizzle, instr_component_count, new_instr, &instr->loc)))
             return false;
         list_add_before(&instr->entry, &swizzle_node->node.entry);
         new_instr = &swizzle_node->node;
```
I really don't like this solution; it feels like spaghetti, working around an architecturally broken bit of code by making it worse.
If we're passing a hlsl_ir_node to add_load_index() then it should be making use of that node's SSA value, not reaching through to the underlying variable.
Also, I think this is a problem we should deal with for other reasons. For a dumb contrived example, this actually works with native and does what it looks like:
```
float4 x;
x.yxwz[2] = 1;
```
I hate to suggest it, but if we need that unwrapping logic on the LHS of an assignment, I think the safest and most maintainable thing to do is to have an instr type that acts as a deref on any source type. Once again, ironically, that's what HLSL_IR_LOAD used to do. I guess Matteo was prescient when he designed the compiler like that the first time. I don't know if we want that to be HLSL_IR_LOAD or if we want a separate type that we'll lower to HLSL_IR_LOAD.
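To make that suggestion concrete, here is a purely hypothetical sketch of such a node type; none of these names exist in vkd3d-shader today, and whether it should be HLSL_IR_LOAD itself or something that gets lowered to it is exactly the open question:

```c
/* Hypothetical sketch only: a deref-like node that selects an index or
 * component out of an arbitrary source node, instead of reaching through
 * to the variable behind a load. */
struct hlsl_ir_index
{
    struct hlsl_ir_node node;
    struct hlsl_src val;    /* the value being indexed; any node, not necessarily a load */
    struct hlsl_src idx;    /* the index or component being selected */
};
```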
Okay, I agree that the solution in 2/3 is a workaround for a more complex underlying issue, and it will take some time to write a proper solution. So I will separate 3/3 into another MR, since it solves a different bug.