Currently, we can't use `add_load_index()` or `add_load_component()` on the loads to the `"retval"` variables that come from `add_call()` because these functions assume that the loaded value (`var_instr`) won't change between its location and the location and the new load to be created.
We can't get rid of this assumptions either, because, at least `add_load_index()` may be used in the lhs of an assignment, and in that case we can't store to the "deref" synthetic: ``` x[0] = 20; ```
Here I implemented the alternative solution of copying `"retval"` into a synthetic variable after each function call, but we may want to discuss this approach.
-- v2: https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/93
From: Francisco Casas fcasas@codeweavers.com
--- tests/hlsl-function.shader_test | 34 +++++++++++++++++++++++++++++++++ 1 file changed, 34 insertions(+)
diff --git a/tests/hlsl-function.shader_test b/tests/hlsl-function.shader_test index c836808d..a7cad6d7 100644 --- a/tests/hlsl-function.shader_test +++ b/tests/hlsl-function.shader_test @@ -261,3 +261,37 @@ float4 main() : sv_target func(true); return 0; } + + +[pixel shader] +float func(float a) +{ + return a + 1; +} + +float4 main() : sv_target +{ + return float4(func(1.0), func(2.0), func(5.0), func(6.0)); +} + +[test] +draw quad +todo probe all rgba (2.0, 3.0, 6.0, 7.0) + + +[pixel shader] +float func(float a) +{ + return a + 1; +} + +float4 main() : sv_target +{ + float4 a = {func(1.0), func(2.0), func(5.0), func(6.0)}; + + return a; +} + +[test] +draw quad +todo probe all rgba (2.0, 3.0, 6.0, 7.0)
From: Francisco Casas fcasas@codeweavers.com
Otherwise we can't use add_load_index() or add_load_component() on these loads, becase they assume that the loaded value (var_instr) won't change between its location and the new load.
We can't get rid of this assumptions either, because, at least add_load_index() may be used in the lhs of a store, and in that case we can't store to the "deref" synthetic. --- libs/vkd3d-shader/hlsl.y | 20 ++++++++++++++++++++ tests/hlsl-function.shader_test | 4 ++-- 2 files changed, 22 insertions(+), 2 deletions(-)
diff --git a/libs/vkd3d-shader/hlsl.y b/libs/vkd3d-shader/hlsl.y index 95c1c0fa..e8ea0f30 100644 --- a/libs/vkd3d-shader/hlsl.y +++ b/libs/vkd3d-shader/hlsl.y @@ -641,6 +641,8 @@ static struct hlsl_ir_load *add_load_index(struct hlsl_ctx *ctx, struct list *in
if (var_instr->type == HLSL_IR_LOAD) { + /* When var_instr is a load instruction, the value at the index is expected to not be + * modified between this var_instr and the point where the new load will be inserted. */ src = &hlsl_ir_load(var_instr)->src; } else @@ -674,6 +676,8 @@ static struct hlsl_ir_load *add_load_component(struct hlsl_ctx *ctx, struct list
if (var_instr->type == HLSL_IR_LOAD) { + /* When var_instr is a load instruction, the value at the component is expected to not be + * modified between this var_instr and the point where the new load will be inserted. */ src = &hlsl_ir_load(var_instr)->src; } else @@ -3060,11 +3064,27 @@ static struct list *add_call(struct hlsl_ctx *ctx, const char *name,
if (decl->return_var) { + struct hlsl_ir_store *store; struct hlsl_ir_load *load; + struct hlsl_ir_var *var; + + if (!(var = hlsl_new_synthetic_var(ctx, "retval-copy", decl->return_var->data_type, loc))) + { + vkd3d_free(decl); + return NULL; + }
if (!(load = hlsl_new_var_load(ctx, decl->return_var, *loc))) goto fail; list_add_tail(args->instrs, &load->node.entry); + + if (!(store = hlsl_new_simple_store(ctx, var, &load->node))) + goto fail; + list_add_tail(args->instrs, &store->node.entry); + + if (!(load = hlsl_new_var_load(ctx, var, *loc))) + goto fail; + list_add_tail(args->instrs, &load->node.entry); } else { diff --git a/tests/hlsl-function.shader_test b/tests/hlsl-function.shader_test index a7cad6d7..4d4c2e2b 100644 --- a/tests/hlsl-function.shader_test +++ b/tests/hlsl-function.shader_test @@ -276,7 +276,7 @@ float4 main() : sv_target
[test] draw quad -todo probe all rgba (2.0, 3.0, 6.0, 7.0) +probe all rgba (2.0, 3.0, 6.0, 7.0)
[pixel shader] @@ -294,4 +294,4 @@ float4 main() : sv_target
[test] draw quad -todo probe all rgba (2.0, 3.0, 6.0, 7.0) +probe all rgba (2.0, 3.0, 6.0, 7.0)
After thinking about this for a while, and grabbing some ideas from when we discussed this issue, I think that we should split `HLSL_IR_LOAD` into two:
- `HLSL_IR_LOAD`: represents a load from a variable to a temp, cannot be used as an lhs. - `HLSL_IR_REFERENCE`: Merely contains a deref to a variable (or a part of it) but doesn't have an SSA value and cannot be used as a source.
So, while parsing we always work with an `HLSL_IR_REFERENCE`, until a value is required (for instance, as part of an expression), in which case, an `HLSL_IR_LOAD` has to be created from the `HLSL_IR_REFERENCE` (basically just copying the deref).
`HLSL_IR_REFERENCE` would probably also need an additional field with resource locations for creating `HLSL_IR_RESOURCE_LOAD` and `HLSL_IR_RESOURCE_STORE`, when a Texture or UAV is indexed. This could also be the solution for SM 5.1 resource arrays.
A side-effect of these changes would be changing all the ``` /* Only HLSL_IR_LOAD can return an object. */ ``` comments to ``` /* Only HLSL_IR_REFERENCE can return an object. */ ``` or would allow to remove them.
I am currently trying this approach.
After thinking about this for a while, and grabbing some ideas from when we discussed this issue, I think that we should split `HLSL_IR_LOAD` into two:
- `HLSL_IR_LOAD`: represents a load from a variable to a temp, cannot be used as an lhs.
- `HLSL_IR_REFERENCE`: Merely contains a deref to a variable (or a part of it) but doesn't have an SSA value and cannot be used as a source.
Well, it'd have to have an SSA value. E.g. you can have expressions like "x[1].xyzw".
HLSL_IR_REFERENCE isn't a bad name, I'll also throw out HLSL_IR_INDEX [to match e.g. hlsl_new_load_index().]
So, while parsing we always work with an `HLSL_IR_REFERENCE`, until a value is required (for instance, as part of an expression), in which case, an `HLSL_IR_LOAD` has to be created from the `HLSL_IR_REFERENCE` (basically just copying the deref).
I'd just do them all at once in a pass immediately after parsing; that way you don't have to worry about different instruction types.
`HLSL_IR_REFERENCE` would probably also need an additional field with resource locations for creating `HLSL_IR_RESOURCE_LOAD` and `HLSL_IR_RESOURCE_STORE`, when a Texture or UAV is indexed. This could also be the solution for SM 5.1 resource arrays.
Wait, why?
Well, it'd have to have an SSA value. E.g. you can have expressions like "x[1].xyzw".
Hmmp, in this case `x` would be an `HLSL_IR_REFERENCE`, and then `x[1]` would be another `HLSL_IR_REFERENCE` (note that `x[1]` doesn't point to `x`, because, to create `x[1]` the deref is copied and then an src to the `1` node is appended to it).
Then, the swizzle should indeed have point to `x[1]`, but there are either 2 possibilities: - If this is in a lhs, we should invert the swizzle into the rhs as we are doing now, leaving only the `HLSL_IR_REFERENCE` at the lhs (in case there are multiple swizzles, we fold them). - If this is an rhs or part of an expression, then `x[1]` must be turned into a `HLSL_IR_LOAD` in order to be used by the swizzle. The "evaluation" function that replaces the `HLSL_IR_REFERENCE` with `HLSL_IR_LOADS` should also consider this. A scary alternative: adding swizzles to the derefs.
`HLSL_IR_REFERENCE` isn't a bad name, I'll also throw out `HLSL_IR_INDEX` [to match e.g. hlsl_new_load_index().]
I am not sure about `HLSL_IR_INDEX`, at first I interpreted it as a node that references 2 nodes: A value node and an index node (`x` and `1` for `x[1]`), I am not sure if that was the intent. I tried to implement that, but gets complicated fast (and it doesn't cover the base case: the reference of the variable alone, `x`). I realized that is far simpler to have the whole deref into one node (basically the same way as `HLSL_IR_LOAD`).
`HLSL_IR_REFERENCE` would probably also need an additional field with resource locations for creating `HLSL_IR_RESOURCE_LOAD` and `HLSL_IR_RESOURCE_STORE`, when a Texture or UAV is indexed. This could also be the solution for SM 5.1 resource arrays.
Wait, why?
I mean, it is not mandatory, but at the moment seems like a logical step.
Currently we are using an `HLSL_IR_LOAD` to initialize the lhs of an `HLSL_IR_STORE`, but we only care about the deref, and now `HLSL_IR_REFERENCE` will care about that. In the same way, we are currently using an `HLSL_IR_RESOURCE_LOAD`'s deref and `coords` to initialize `HLSL_IR_RESOURCE_STORE`, but not the `HLSL_IR_RESOURCE_LOAD` itself; so we could use `HLSL_IR_REFERENCE` for that too, adding the additional fields.
`HLSL_IR_REFERENCE` isn't a bad name, I'll also throw out `HLSL_IR_INDEX` [to match e.g. hlsl_new_load_index().]
I am not sure about `HLSL_IR_INDEX`, at first I interpreted it as a node that references 2 nodes: A value node and an index node (`x` and `1` for `x[1]`), I am not sure if that was the intent. I tried to implement that, but gets complicated fast (and it doesn't cover the base case: the reference of the variable alone, `x`). I realized that is far simpler to have the whole deref into one node (basically the same way as `HLSL_IR_LOAD`).
That's exactly my intent, though. The base case could be covered by a union (as with the old HLSL_IR_DEREF) or simply by making it HLSL_IR_LOAD. What complications did you run into?
On Tue Feb 28 04:52:27 2023 +0000, Zebediah Figura wrote:
`HLSL_IR_REFERENCE` isn't a bad name, I'll also throw out
`HLSL_IR_INDEX` [to match e.g. hlsl_new_load_index().]
I am not sure about `HLSL_IR_INDEX`, at first I interpreted it as a
node that references 2 nodes: A value node and an index node (`x` and `1` for `x[1]`), I am not sure if that was the intent. I tried to implement that, but gets complicated fast (and it doesn't cover the base case: the reference of the variable alone, `x`). I realized that is far simpler to have the whole deref into one node (basically the same way as `HLSL_IR_LOAD`). That's exactly my intent, though. The base case could be covered by a union (as with the old HLSL_IR_DEREF) or simply by making it HLSL_IR_LOAD. What complications did you run into?
Adding `HLSL_IR_INDEX` is not impossible, but some things get complicated, for instance, to build a deref from an index we have to move through the indexes until the base case is found, and then in reverse:
```c static bool lower_lhs(struct hlsl_ctx *ctx, struct hlsl_deref *deref, struct hlsl_ir_node *lhs) { struct hlsl_ir_node *last = NULL, *ptr; unsigned int path_len;
hlsl_cleanup_deref(deref); /* Traverse index chain from the original load to the lhs node. */ while (last != lhs) { path_len = 0; ptr = lhs; while (ptr->type == HLSL_IR_INDEX) { struct hlsl_ir_index *index = hlsl_ir_index(ptr);
path_len++; assert(index->val); if (index->val == last); break; ptr = index->val; } last = ptr;
switch (ptr->type) { case HLSL_IR_LOAD: { struct hlsl_ir_load *load = hlsl_ir_load(ptr);
assert(load->src.path_len == 0); hlsl_init_deref(deref, load->src.var, path_len); break; }
case HLSL_IR_INDEX: { struct hlsl_ir_index *index = hlsl_ir_index(ptr); unsigned int i = deref->path_len - path_len;
hlsl_src_from_node(&deref->path[i], index->idx.node); }
default: { hlsl_error(ctx, &lhs->loc, VKD3D_SHADER_ERROR_HLSL_INVALID_LVALUE, "Invalid lvalue."); hlsl_cleanup_deref(deref); return false; } } }
return true; } ```
but I think that the base case issue was what throw me off, because the natural solution for me was having to add a new node (using `HLSL_IR_LOAD` would kill the purpose of separating between mere references or loads (that have an SSA value), unless we add a bool field to differentiate the cases.
I have to check the old HLSL_IR_DEREF and the union you mention.
Adding `HLSL_IR_INDEX` is not impossible, but some things get complicated, for instance, to build a deref from an index we have to move through the indexes until the base case is found, and then in reverse:
Actually I believe that lower_lhs() can't work as given, for the exact reason that this merge request exists: you can have sequences like
2: index a[0] 3: store a[0] = b 4: index @2[0]
It's basically necessary to do it one step at a time, and synthesize a temp at every step.
but I think that the base case issue was what throw me off, because the natural solution for me was having to add a new node (using `HLSL_IR_LOAD` would kill the purpose of separating between mere references or loads (that have an SSA value), unless we add a bool field to differentiate the cases.
I don't have strong feelings about it. I don't dislike HLSL_IR_LOAD—the invariant then becomes "we never generate a load at parse time with a path length greater than zero, or at least we never return one from the unary_expr rule". But I don't hate the idea of avoiding HLSL_IR_LOAD at parse time completely either.
I have to check the old HLSL_IR_DEREF and the union you mention.
Fundamentally it was something like
struct hlsl_ir_deref { struct hlsl_ir_node node; bool is_var; union { struct hlsl_ir_var *var; struct { struct hlsl_src base; struct hlsl_src offset; } array; } u; };
Closing, because it is addressed by !124.
This merge request was closed by Francisco Casas.