copy_propagation_transform_object_load() currently retrieves true, which indicates that there was progress, even if the input dereference remains the same.
This can happen repeatedly if an uninitialized object is copied onto itself, as in the tests added in this patch. This results on the compilation getting stuck on an endless loop.
This patch checks if the deref didn't change, to retrieve false in that case.
-- v7: vkd3d-shader/hlsl: Avoid infinite loop in copy propagation. tests: Test correct copy-prop object replacement.
From: Francisco Casas fcasas@codeweavers.com
--- tests/object-references.shader_test | 37 +++++++++++++++++++++++++++++ 1 file changed, 37 insertions(+)
diff --git a/tests/object-references.shader_test b/tests/object-references.shader_test index 5e8e2641..0c203d5a 100644 --- a/tests/object-references.shader_test +++ b/tests/object-references.shader_test @@ -1,6 +1,43 @@ [require] shader model >= 4.0
+ +[texture 0] +size (1, 1) +0.77 0.77 0.77 0.77 + +[texture 1] +size (1, 1) +0.64 0.64 0.64 0.64 + +[sampler 0] +filter linear linear linear +address clamp clamp clamp + +[pixel shader] +Texture2D t_good, t_bad; +sampler sam; + +float4 main() : sv_target +{ + Texture2D a, b[1]; + + // This is basically a 'b[0] = t_good' but so that the copy-prop is delayed + int4 co = {0, 0, 0, 0}; + b[(int) co.x + (int) co.y] = t_good; + + a = b[0]; + b[0] = t_bad; + + // 'a' should be 't_good', not 't_bad' at this point + return 100 * a.Sample(sam, float2(0, 0)) + t_good.Sample(sam, float2(0, 0)); +} + +[test] +draw quad +todo probe all rgba (77.77, 77.77, 77.77, 77.77) + + [texture 0] size (2, 2) 0.1 0.2 0.3 0.4 0.5 0.7 0.6 0.8
From: Giovanni Mascellani gmascellani@codeweavers.com
Co-authored-by: Francisco Casas fcasas@codeweavers.com
Because copy_propagation_transform_object_load() replaces a deref instead of an instruction, it is currently prone to two problems:
1- It can replace a deref with the same deref, returning true every time and getting the compilation stuck in an endless loop of copy-propagation iterations.
2- When performed multiple times in the same deref, the second time it can replace the deref with a deref from a temp that is only valid in another point of the program execution, resulting in an incorrect value.
This patch preempts this by avoiding propagating object loads when the variable is a real object (i.e., a uniform variable) which cannot have a replacement. --- libs/vkd3d-shader/hlsl_codegen.c | 6 +++++ tests/object-references.shader_test | 39 ++++++++++++++++++++++++++++- 2 files changed, 44 insertions(+), 1 deletion(-)
diff --git a/libs/vkd3d-shader/hlsl_codegen.c b/libs/vkd3d-shader/hlsl_codegen.c index 5e7cfff4..fa96ca4b 100644 --- a/libs/vkd3d-shader/hlsl_codegen.c +++ b/libs/vkd3d-shader/hlsl_codegen.c @@ -927,6 +927,12 @@ static bool copy_propagation_transform_object_load(struct hlsl_ctx *ctx, /* Only HLSL_IR_LOAD can produce an object. */ load = hlsl_ir_load(value->node);
+ if (!load->src.var->is_uniform) + { + TRACE("Ignoring load from non-uniform object variable %s\n", load->src.var->name); + return false; + } + hlsl_cleanup_deref(deref); hlsl_copy_deref(ctx, deref, &load->src);
diff --git a/tests/object-references.shader_test b/tests/object-references.shader_test index 0c203d5a..bc74ccd4 100644 --- a/tests/object-references.shader_test +++ b/tests/object-references.shader_test @@ -1,3 +1,15 @@ +[pixel shader fail] +sampler sam; + +float4 main() : sv_target +{ + Texture2D tex; + + tex = tex; // Uninitialized assignment to self. + return tex.Sample(sam, float2(0, 0)); +} + + [require] shader model >= 4.0
@@ -35,7 +47,7 @@ float4 main() : sv_target
[test] draw quad -todo probe all rgba (77.77, 77.77, 77.77, 77.77) +probe all rgba (77.77, 77.77, 77.77, 77.77)
[texture 0] @@ -201,3 +213,28 @@ float4 main(Texture2D tex2) : sv_target tex2 = tex1; return tex2.Load(int3(0, 0, 0)); } + + +[require] +shader model >= 5.0 + + +[texture 0] +size (1, 1) +1.0 2.0 3.0 4.0 + +[pixel shader todo] +struct apple { + Texture2D tex; + float4 fo : COLOR; +}; + +float4 main(struct apple input) : sv_target +{ + input.tex = input.tex; // Assignment to self. + return input.tex.Load(int3(0, 0, 0)); +} + +[test] +todo draw quad +todo probe (0, 0) rgba (1.0, 2.0, 3.0, 4.0)
:arrow_up: Giovanni's solution again, which consists in only allowing to replace derefs if they point to a uniform variable.
I added the tgood/tbad test and it passes it, but the recently merged !51 was required.
For simplicity I will send the uninitialized object references tests in another MR.
This really deserves an explanation in the code.
Note also that what makes this work is not so much that the load src is uniform (and hence "real") but that it's constant. In fact, more broadly, we can safely copy-prop an object load *if* the load var doesn't change between the prior load and this resource instruction. We don't have the infrastructure right now to do that with all variables (we could if we wanted to, but it's probably not worthwhile—doing it with uniforms alone should probably be enough?)
We could, hence, check for HLSL_MODIFIER_CONST on the variable instead of is_uniform. That would, notably, require that we create uniforms with a constant data type, which we don't currently. I dunno if it's worth changing that yet, but there is a bit of a "least surprise" element here.
This really deserves an explanation in the code.
Yes, a comment is not a bad idea here.
Note also that what makes this work is not so much that the load src is uniform (and hence "real") but that it's constant. In fact, more broadly, we can safely copy-prop an object load *if* the load var doesn't change between the prior load and this resource instruction. We don't have the infrastructure right now to do that with all variables (we could if we wanted to, but it's probably not worthwhile—doing it with uniforms alone should probably be enough?)
I think I agree, but this feels more subtle (for example, you have to be sure of what happens when you have a constant object initialized to a mutable one, which is later mutated, and be sure that the right thing happens). So I would avoid it until we have an indication that is required for feature parity with native.
OTOH, I think that the current implementation is still somewhat problematic. I couldn't write an actual failing example, but I am concerned by something like: ``` a = b; b = c; d = a; ``` Here `d` is supposed to be equal to the uniform `b`, but if you skip the first load of `b` during the first pass you end up pointing to the uniform `c`, in basically the same way as Zeb's example. I don't think this MR fixed that, because it is only concerned with object loads (and there there aren't). So either you do the check also for "regular" loads, or (better, I think) you don't even store a non-uniform object in the copy propagation state, with something like this patch (on top of this MR): ```patch diff --git a/libs/vkd3d-shader/hlsl_codegen.c b/libs/vkd3d-shader/hlsl_codegen.c index fa96ca4b..bab17c21 100644 --- a/libs/vkd3d-shader/hlsl_codegen.c +++ b/libs/vkd3d-shader/hlsl_codegen.c @@ -927,12 +927,6 @@ static bool copy_propagation_transform_object_load(struct hlsl_ctx *ctx, /* Only HLSL_IR_LOAD can produce an object. */ load = hlsl_ir_load(value->node);
- if (!load->src.var->is_uniform) - { - TRACE("Ignoring load from non-uniform object variable %s\n", load->src.var->name); - return false; - } - hlsl_cleanup_deref(deref); hlsl_copy_deref(ctx, deref, &load->src);
@@ -975,7 +969,20 @@ static void copy_propagation_record_store(struct hlsl_ctx *ctx, struct hlsl_ir_s unsigned int writemask = store->writemask;
if (store->rhs.node->data_type->type == HLSL_CLASS_OBJECT) + { + struct hlsl_ir_load *load; + writemask = VKD3DSP_WRITEMASK_0; + + /* Only HLSL_IR_LOAD can produce an object. */ + load = hlsl_ir_load(store->rhs.node); + if (!load->src.var->is_uniform) + { + TRACE("Ignoring store of non-uniform object variable %s\n", load->src.var->name); + copy_propagation_invalidate_variable_from_deref(ctx, var_def, lhs, store->writemask); + return; + } + } copy_propagation_set_value(var_def, start, writemask, store->rhs.node); } else ```
On Wed Jan 25 12:34:56 2023 +0000, Giovanni Mascellani wrote:
This really deserves an explanation in the code.
Yes, a comment is not a bad idea here.
Note also that what makes this work is not so much that the load src
is uniform (and hence "real") but that it's constant. In fact, more broadly, we can safely copy-prop an object load *if* the load var doesn't change between the prior load and this resource instruction. We don't have the infrastructure right now to do that with all variables (we could if we wanted to, but it's probably not worthwhile—doing it with uniforms alone should probably be enough?) I think I agree, but this feels more subtle (for example, you have to be sure of what happens when you have a constant object initialized to a mutable one, which is later mutated, and be sure that the right thing happens). So I would avoid it until we have an indication that is required for feature parity with native. OTOH, I think that the current implementation is still somewhat problematic. I couldn't write an actual failing example, but I am concerned by something like:
a = b; b = c; d = a;
Here `d` is supposed to be equal to the uniform `b`, but if you skip the first load of `b` during the first pass you end up pointing to the uniform `c`, in basically the same way as Zeb's example. I don't think this MR fixed that, because it is only concerned with object loads (and there there aren't). So either you do the check also for "regular" loads, or (better, I think) you don't even store a non-uniform object in the copy propagation state, with something like this patch (on top of this MR):
diff --git a/libs/vkd3d-shader/hlsl_codegen.c b/libs/vkd3d-shader/hlsl_codegen.c index fa96ca4b..bab17c21 100644 --- a/libs/vkd3d-shader/hlsl_codegen.c +++ b/libs/vkd3d-shader/hlsl_codegen.c @@ -927,12 +927,6 @@ static bool copy_propagation_transform_object_load(struct hlsl_ctx *ctx, /* Only HLSL_IR_LOAD can produce an object. */ load = hlsl_ir_load(value->node); - if (!load->src.var->is_uniform) - { - TRACE("Ignoring load from non-uniform object variable %s\n", load->src.var->name); - return false; - } - hlsl_cleanup_deref(deref); hlsl_copy_deref(ctx, deref, &load->src); @@ -975,7 +969,20 @@ static void copy_propagation_record_store(struct hlsl_ctx *ctx, struct hlsl_ir_s unsigned int writemask = store->writemask; if (store->rhs.node->data_type->type == HLSL_CLASS_OBJECT) + { + struct hlsl_ir_load *load; + writemask = VKD3DSP_WRITEMASK_0; + + /* Only HLSL_IR_LOAD can produce an object. */ + load = hlsl_ir_load(store->rhs.node); + if (!load->src.var->is_uniform) + { + TRACE("Ignoring store of non-uniform object variable %s\n", load->src.var->name); + copy_propagation_invalidate_variable_from_deref(ctx, var_def, lhs, store->writemask); + return; + } + } copy_propagation_set_value(var_def, start, writemask, store->rhs.node); } else
I don't think the problem is currently happening with regular stores/loads, because when we do a replacement, we are removing the whole instruction (and replacing all subsequent references to it), so there is not the possibility of copy-prop doing an incorrect replacement for the instruction being replaced a second time.
Writing directly ``` a = b; b = c; d = a; ``` could be misleading, since on regular stores what we have on the rhs is an hlsl_src, a reference to another node, so it would be more accurate to write it as: ``` 1 : load (b) 2 : a = @1; 3 : load (c) 4 : b = @3; 5 : load (a) 6 : d = @5; ``` which would be transformed to ``` 1 : load (b) 2 : a = @1; 3 : load (c) 4 : b = @3; 5 : <deleted> 6 : d = @1; ```
Note also that what makes this work is not so much that the load src is uniform (and hence "real") but that it's constant. In fact, more broadly, we can safely copy-prop an object load *if* the load var doesn't change between the prior load and this resource instruction. We don't have the infrastructure right now to do that with all variables (we could if we wanted to, but it's probably not worthwhile—doing it with uniforms alone should probably be enough?)
I think I agree, but this feels more subtle (for example, you have to be sure of what happens when you have a constant object initialized to a mutable one, which is later mutated, and be sure that the right thing happens). So I would avoid it until we have an indication that is required for feature parity with native.
I think we can probably assume that variables act like variables and not pointers ;-)
OTOH, I think that the current implementation is still somewhat problematic. I couldn't write an actual failing example, but I am concerned by something like:
Normal variables work fine, because what we're propagating is not the hlsl_deref but the hlsl_ir_load instruction pointer. That is, we're propagating an SSA pointer to a load that happened at a previous point in time.
Object loads in resource instructions are special and require this patch, because they shortcut the hlsl_ir_load instruction and use the hlsl_deref directly.