[PATCH v2 0/2] MR93: Draft: vkd3d-shader/hlsl: Fix assumptions in add_load_index() and add_load_component().
Currently, we can't use `add_load_index()` or `add_load_component()` on the loads to the `"retval"` variables that come from `add_call()` because these functions assume that the loaded value (`var_instr`) won't change between its location and the location and the new load to be created. We can't get rid of this assumptions either, because, at least `add_load_index()` may be used in the lhs of an assignment, and in that case we can't store to the "deref" synthetic: ``` x[0] = 20; ``` Here I implemented the alternative solution of copying `"retval"` into a synthetic variable after each function call, but we may want to discuss this approach. -- v2: https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/93
From: Francisco Casas <fcasas(a)codeweavers.com> --- tests/hlsl-function.shader_test | 34 +++++++++++++++++++++++++++++++++ 1 file changed, 34 insertions(+) diff --git a/tests/hlsl-function.shader_test b/tests/hlsl-function.shader_test index c836808d..a7cad6d7 100644 --- a/tests/hlsl-function.shader_test +++ b/tests/hlsl-function.shader_test @@ -261,3 +261,37 @@ float4 main() : sv_target func(true); return 0; } + + +[pixel shader] +float func(float a) +{ + return a + 1; +} + +float4 main() : sv_target +{ + return float4(func(1.0), func(2.0), func(5.0), func(6.0)); +} + +[test] +draw quad +todo probe all rgba (2.0, 3.0, 6.0, 7.0) + + +[pixel shader] +float func(float a) +{ + return a + 1; +} + +float4 main() : sv_target +{ + float4 a = {func(1.0), func(2.0), func(5.0), func(6.0)}; + + return a; +} + +[test] +draw quad +todo probe all rgba (2.0, 3.0, 6.0, 7.0) -- GitLab https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/93
From: Francisco Casas <fcasas(a)codeweavers.com> Otherwise we can't use add_load_index() or add_load_component() on these loads, becase they assume that the loaded value (var_instr) won't change between its location and the new load. We can't get rid of this assumptions either, because, at least add_load_index() may be used in the lhs of a store, and in that case we can't store to the "deref" synthetic. --- libs/vkd3d-shader/hlsl.y | 20 ++++++++++++++++++++ tests/hlsl-function.shader_test | 4 ++-- 2 files changed, 22 insertions(+), 2 deletions(-) diff --git a/libs/vkd3d-shader/hlsl.y b/libs/vkd3d-shader/hlsl.y index 95c1c0fa..e8ea0f30 100644 --- a/libs/vkd3d-shader/hlsl.y +++ b/libs/vkd3d-shader/hlsl.y @@ -641,6 +641,8 @@ static struct hlsl_ir_load *add_load_index(struct hlsl_ctx *ctx, struct list *in if (var_instr->type == HLSL_IR_LOAD) { + /* When var_instr is a load instruction, the value at the index is expected to not be + * modified between this var_instr and the point where the new load will be inserted. */ src = &hlsl_ir_load(var_instr)->src; } else @@ -674,6 +676,8 @@ static struct hlsl_ir_load *add_load_component(struct hlsl_ctx *ctx, struct list if (var_instr->type == HLSL_IR_LOAD) { + /* When var_instr is a load instruction, the value at the component is expected to not be + * modified between this var_instr and the point where the new load will be inserted. */ src = &hlsl_ir_load(var_instr)->src; } else @@ -3060,11 +3064,27 @@ static struct list *add_call(struct hlsl_ctx *ctx, const char *name, if (decl->return_var) { + struct hlsl_ir_store *store; struct hlsl_ir_load *load; + struct hlsl_ir_var *var; + + if (!(var = hlsl_new_synthetic_var(ctx, "retval-copy", decl->return_var->data_type, loc))) + { + vkd3d_free(decl); + return NULL; + } if (!(load = hlsl_new_var_load(ctx, decl->return_var, *loc))) goto fail; list_add_tail(args->instrs, &load->node.entry); + + if (!(store = hlsl_new_simple_store(ctx, var, &load->node))) + goto fail; + list_add_tail(args->instrs, &store->node.entry); + + if (!(load = hlsl_new_var_load(ctx, var, *loc))) + goto fail; + list_add_tail(args->instrs, &load->node.entry); } else { diff --git a/tests/hlsl-function.shader_test b/tests/hlsl-function.shader_test index a7cad6d7..4d4c2e2b 100644 --- a/tests/hlsl-function.shader_test +++ b/tests/hlsl-function.shader_test @@ -276,7 +276,7 @@ float4 main() : sv_target [test] draw quad -todo probe all rgba (2.0, 3.0, 6.0, 7.0) +probe all rgba (2.0, 3.0, 6.0, 7.0) [pixel shader] @@ -294,4 +294,4 @@ float4 main() : sv_target [test] draw quad -todo probe all rgba (2.0, 3.0, 6.0, 7.0) +probe all rgba (2.0, 3.0, 6.0, 7.0) -- GitLab https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/93
After thinking about this for a while, and grabbing some ideas from when we discussed this issue, I think that we should split `HLSL_IR_LOAD` into two: - `HLSL_IR_LOAD`: represents a load from a variable to a temp, cannot be used as an lhs. - `HLSL_IR_REFERENCE`: Merely contains a deref to a variable (or a part of it) but doesn't have an SSA value and cannot be used as a source. So, while parsing we always work with an `HLSL_IR_REFERENCE`, until a value is required (for instance, as part of an expression), in which case, an `HLSL_IR_LOAD` has to be created from the `HLSL_IR_REFERENCE` (basically just copying the deref). `HLSL_IR_REFERENCE` would probably also need an additional field with resource locations for creating `HLSL_IR_RESOURCE_LOAD` and `HLSL_IR_RESOURCE_STORE`, when a Texture or UAV is indexed. This could also be the solution for SM 5.1 resource arrays. A side-effect of these changes would be changing all the ``` /* Only HLSL_IR_LOAD can return an object. */ ``` comments to ``` /* Only HLSL_IR_REFERENCE can return an object. */ ``` or would allow to remove them. I am currently trying this approach. -- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/93#note_25614
After thinking about this for a while, and grabbing some ideas from when we discussed this issue, I think that we should split `HLSL_IR_LOAD` into two:
* `HLSL_IR_LOAD`: represents a load from a variable to a temp, cannot be used as an lhs. * `HLSL_IR_REFERENCE`: Merely contains a deref to a variable (or a part of it) but doesn't have an SSA value and cannot be used as a source.
Well, it'd have to have an SSA value. E.g. you can have expressions like "x[1].xyzw". HLSL_IR_REFERENCE isn't a bad name, I'll also throw out HLSL_IR_INDEX [to match e.g. hlsl_new_load_index().]
So, while parsing we always work with an `HLSL_IR_REFERENCE`, until a value is required (for instance, as part of an expression), in which case, an `HLSL_IR_LOAD` has to be created from the `HLSL_IR_REFERENCE` (basically just copying the deref).
I'd just do them all at once in a pass immediately after parsing; that way you don't have to worry about different instruction types.
`HLSL_IR_REFERENCE` would probably also need an additional field with resource locations for creating `HLSL_IR_RESOURCE_LOAD` and `HLSL_IR_RESOURCE_STORE`, when a Texture or UAV is indexed. This could also be the solution for SM 5.1 resource arrays.
Wait, why? -- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/93#note_25679
Well, it'd have to have an SSA value. E.g. you can have expressions like "x\[1\].xyzw".
Hmmp, in this case `x` would be an `HLSL_IR_REFERENCE`, and then `x[1]` would be another `HLSL_IR_REFERENCE` (note that `x[1]` doesn't point to `x`, because, to create `x[1]` the deref is copied and then an src to the `1` node is appended to it). Then, the swizzle should indeed have point to `x[1]`, but there are either 2 possibilities: - If this is in a lhs, we should invert the swizzle into the rhs as we are doing now, leaving only the `HLSL_IR_REFERENCE` at the lhs (in case there are multiple swizzles, we fold them). - If this is an rhs or part of an expression, then `x[1]` must be turned into a `HLSL_IR_LOAD` in order to be used by the swizzle. The "evaluation" function that replaces the `HLSL_IR_REFERENCE` with `HLSL_IR_LOADS` should also consider this. A scary alternative: adding swizzles to the derefs.
`HLSL_IR_REFERENCE` isn't a bad name, I'll also throw out `HLSL_IR_INDEX` \[to match e.g. hlsl_new_load_index().\]
I am not sure about `HLSL_IR_INDEX`, at first I interpreted it as a node that references 2 nodes: A value node and an index node (`x` and `1` for `x[1]`), I am not sure if that was the intent. I tried to implement that, but gets complicated fast (and it doesn't cover the base case: the reference of the variable alone, `x`). I realized that is far simpler to have the whole deref into one node (basically the same way as `HLSL_IR_LOAD`).
`HLSL_IR_REFERENCE` would probably also need an additional field with resource locations for creating `HLSL_IR_RESOURCE_LOAD` and `HLSL_IR_RESOURCE_STORE`, when a Texture or UAV is indexed. This could also be the solution for SM 5.1 resource arrays.
Wait, why?
I mean, it is not mandatory, but at the moment seems like a logical step. Currently we are using an `HLSL_IR_LOAD` to initialize the lhs of an `HLSL_IR_STORE`, but we only care about the deref, and now `HLSL_IR_REFERENCE` will care about that. In the same way, we are currently using an `HLSL_IR_RESOURCE_LOAD`'s deref and `coords` to initialize `HLSL_IR_RESOURCE_STORE`, but not the `HLSL_IR_RESOURCE_LOAD` itself; so we could use `HLSL_IR_REFERENCE` for that too, adding the additional fields. -- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/93#note_25688
`HLSL_IR_REFERENCE` isn't a bad name, I'll also throw out `HLSL_IR_INDEX` \[to match e.g. hlsl_new_load_index().\]
I am not sure about `HLSL_IR_INDEX`, at first I interpreted it as a node that references 2 nodes: A value node and an index node (`x` and `1` for `x[1]`), I am not sure if that was the intent. I tried to implement that, but gets complicated fast (and it doesn't cover the base case: the reference of the variable alone, `x`). I realized that is far simpler to have the whole deref into one node (basically the same way as `HLSL_IR_LOAD`).
That's exactly my intent, though. The base case could be covered by a union (as with the old HLSL_IR_DEREF) or simply by making it HLSL_IR_LOAD. What complications did you run into? -- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/93#note_25698
On Tue Feb 28 04:52:27 2023 +0000, Zebediah Figura wrote:
`HLSL_IR_REFERENCE` isn't a bad name, I'll also throw out `HLSL_IR_INDEX` \[to match e.g. hlsl_new_load_index().\]
I am not sure about `HLSL_IR_INDEX`, at first I interpreted it as a node that references 2 nodes: A value node and an index node (`x` and `1` for `x[1]`), I am not sure if that was the intent. I tried to implement that, but gets complicated fast (and it doesn't cover the base case: the reference of the variable alone, `x`). I realized that is far simpler to have the whole deref into one node (basically the same way as `HLSL_IR_LOAD`). That's exactly my intent, though. The base case could be covered by a union (as with the old HLSL_IR_DEREF) or simply by making it HLSL_IR_LOAD. What complications did you run into? Adding `HLSL_IR_INDEX` is not impossible, but some things get complicated, for instance, to build a deref from an index we have to move through the indexes until the base case is found, and then in reverse:
```c static bool lower_lhs(struct hlsl_ctx *ctx, struct hlsl_deref *deref, struct hlsl_ir_node *lhs) { struct hlsl_ir_node *last = NULL, *ptr; unsigned int path_len; hlsl_cleanup_deref(deref); /* Traverse index chain from the original load to the lhs node. */ while (last != lhs) { path_len = 0; ptr = lhs; while (ptr->type == HLSL_IR_INDEX) { struct hlsl_ir_index *index = hlsl_ir_index(ptr); path_len++; assert(index->val); if (index->val == last); break; ptr = index->val; } last = ptr; switch (ptr->type) { case HLSL_IR_LOAD: { struct hlsl_ir_load *load = hlsl_ir_load(ptr); assert(load->src.path_len == 0); hlsl_init_deref(deref, load->src.var, path_len); break; } case HLSL_IR_INDEX: { struct hlsl_ir_index *index = hlsl_ir_index(ptr); unsigned int i = deref->path_len - path_len; hlsl_src_from_node(&deref->path[i], index->idx.node); } default: { hlsl_error(ctx, &lhs->loc, VKD3D_SHADER_ERROR_HLSL_INVALID_LVALUE, "Invalid lvalue."); hlsl_cleanup_deref(deref); return false; } } } return true; } ``` but I think that the base case issue was what throw me off, because the natural solution for me was having to add a new node (using `HLSL_IR_LOAD` would kill the purpose of separating between mere references or loads (that have an SSA value), unless we add a bool field to differentiate the cases. I have to check the old HLSL_IR_DEREF and the union you mention. -- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/93#note_25761
Adding `HLSL_IR_INDEX` is not impossible, but some things get complicated, for instance, to build a deref from an index we have to move through the indexes until the base case is found, and then in reverse:
Actually I believe that lower_lhs() can't work as given, for the exact reason that this merge request exists: you can have sequences like 2: index a[0] 3: store a[0] = b 4: index @2[0] It's basically necessary to do it one step at a time, and synthesize a temp at every step.
but I think that the base case issue was what throw me off, because the natural solution for me was having to add a new node (using `HLSL_IR_LOAD` would kill the purpose of separating between mere references or loads (that have an SSA value), unless we add a bool field to differentiate the cases.
I don't have strong feelings about it. I don't dislike HLSL_IR_LOAD—the invariant then becomes "we never generate a load at parse time with a path length greater than zero, or at least we never return one from the unary_expr rule". But I don't hate the idea of avoiding HLSL_IR_LOAD at parse time completely either.
I have to check the old HLSL_IR_DEREF and the union you mention.
Fundamentally it was something like struct hlsl_ir_deref { struct hlsl_ir_node node; bool is_var; union { struct hlsl_ir_var *var; struct { struct hlsl_src base; struct hlsl_src offset; } array; } u; }; -- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/93#note_25785
Closing, because it is addressed by !124. -- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/93#note_28096
This merge request was closed by Francisco Casas. -- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/93
participants (3)
-
Francisco Casas -
Francisco Casas (@fcasas) -
Zebediah Figura (@zfigura)