If a hlsl_ir_load loads a variable whose components are stored from different instructions, copy propagation doesn't replace it.
But if all these instructions are constants (which currently is the case for value constructors), the load can be replaced with a constant value, which is what the first patch of this series does.
For instance, this shader:
```
sampler s;
Texture2D t;

float4 main() : sv_target
{
    return t.Gather(s, float2(0.6, 0.6), int2(0, 0));
}
```
results in the following IR before applying the patch:
```
  2:      float | 6.00000024e-01
  3:      float | 6.00000024e-01
  4:       uint | 0
  5:            | = (<constructor-2>[@4].x @2)
  6:       uint | 1
  7:            | = (<constructor-2>[@6].x @3)
  8:     float2 | <constructor-2>
  9:        int | 0
 10:        int | 0
 11:       uint | 0
 12:            | = (<constructor-5>[@11].x @9)
 13:       uint | 1
 14:            | = (<constructor-5>[@13].x @10)
 15:       int2 | <constructor-5>
 16:     float4 | gather_red(resource = t, sampler = s, coords = @8, offset = @15)
 17:            | return
 18:            | = (<output-sv_target0> @16)
```
and this IR afterwards:
```
  2:     float2 | {6.00000024e-01 6.00000024e-01 }
  3:       int2 | {0 0 }
  4:     float4 | gather_red(resource = t, sampler = s, coords = @2, offset = @3)
  5:            | return
  6:            | = (<output-sv_target0> @4)
```
This is required to write texel_offsets as aoffimmi modifiers in the sm4 backend, since it expects the texel_offset arguments to be hlsl_ir_constant.
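To illustrate the replacement rule described above (a load is folded only when every component it reads was last stored from a constant), here is a minimal C sketch. The `tracked_component` struct and the `fold_load` function are hypothetical simplifications for illustration and are not vkd3d's actual data structures or API; values are modelled as plain floats.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the per-component state that copy
 * propagation keeps: which kind of instruction last stored each component. */
struct tracked_component
{
    bool known;        /* false if the last store is unknown */
    bool is_constant;  /* true if the storing instruction is a constant */
    float value;       /* the constant's value, valid if is_constant */
};

/* Try to fold a load of `count` components starting at `start` into a
 * constant vector. Returns true and fills `out` only if every loaded
 * component was last written by a constant. */
static bool fold_load(const struct tracked_component *var, size_t var_size,
        size_t start, size_t count, float out[4])
{
    size_t i;

    assert(count <= 4);
    if (start + count > var_size)
        return false;

    for (i = 0; i < count; ++i)
    {
        const struct tracked_component *c = &var[start + i];

        if (!c->known || !c->is_constant)
            return false;
        out[i] = c->value;
    }
    return true;
}
```

In this model, the `float2(0.6, 0.6)` constructor from the example corresponds to two tracked components that are both constant, so the load folds; if either component came from a non-constant instruction, the function would bail out and leave the load alone.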
This series also:
* Allows Gather() methods to use aoffimmi modifiers instead of an additional source register, when possible (the aoffimmi form is the only one allowed for shader model 4.1).
* Adds support for texel_offsets in the Load() method via aoffimmi modifiers (the only allowed method).
--
v4: vkd3d-shader/hlsl: Propagate swizzle chains in copy propagation.
    vkd3d-shader/hlsl: Replace swizzles with constants in copy prop.
    tests: Test constant propagation through swizzles.
    vkd3d-shader/hlsl: Support offset argument for the texture Load() method.
    tests: Test offset argument for the texture Load() method.
    vkd3d-shader/hlsl: Use aoffimmis when writing gather methods.
    vkd3d-shader/hlsl: Replace loads with constants in copy prop.
From: Francisco Casas <fcasas@codeweavers.com>
If a hlsl_ir_load loads a variable whose components are stored from different instructions, copy propagation doesn't replace it.
But if all these instructions are constants (which currently is the case for value constructors), the load could be replaced with a constant value, which is what some other instructions expect, e.g. texel_offsets when using aoffimmi modifiers.
For instance, this shader:
```
sampler s;
Texture2D t;

float4 main() : sv_target
{
    return t.Gather(s, float2(0.6, 0.6), int2(0, 0));
}
```
results in the following IR before applying the patch:
```
  2:      float | 6.00000024e-01
  3:      float | 6.00000024e-01
  4:       uint | 0
  5:            | = (<constructor-2>[@4].x @2)
  6:       uint | 1
  7:            | = (<constructor-2>[@6].x @3)
  8:     float2 | <constructor-2>
  9:        int | 0
 10:        int | 0
 11:       uint | 0
 12:            | = (<constructor-5>[@11].x @9)
 13:       uint | 1
 14:            | = (<constructor-5>[@13].x @10)
 15:       int2 | <constructor-5>
 16:     float4 | gather_red(resource = t, sampler = s, coords = @8, offset = @15)
 17:            | return
 18:            | = (<output-sv_target0> @16)
```
and this IR afterwards:
```
  2:     float2 | {6.00000024e-01 6.00000024e-01 }
  3:       int2 | {0 0 }
  4:     float4 | gather_red(resource = t, sampler = s, coords = @2, offset = @3)
  5:            | return
  6:            | = (<output-sv_target0> @4)
```
---
 libs/vkd3d-shader/hlsl_codegen.c           | 42 ++++++++++++++++++++++
 tests/hlsl-initializer-objects.shader_test |  8 ++---
 tests/object-references.shader_test        |  6 ++--
 tests/sampler-offset.shader_test           | 12 +++----
 tests/shader_runner_d3d12.c                |  2 +-
 5 files changed, 56 insertions(+), 14 deletions(-)

diff --git a/libs/vkd3d-shader/hlsl_codegen.c b/libs/vkd3d-shader/hlsl_codegen.c
index 6e4168fc..9bdbd57c 100644
--- a/libs/vkd3d-shader/hlsl_codegen.c
+++ b/libs/vkd3d-shader/hlsl_codegen.c
@@ -718,6 +718,41 @@ static struct hlsl_ir_node *copy_propagation_compute_replacement(struct hlsl_ctx
     return instr;
 }

+static struct hlsl_ir_node *copy_propagation_compute_load_constant_replacement(struct hlsl_ctx *ctx,
+        const struct copy_propagation_state *state, const struct hlsl_ir_load *load)
+{
+    const struct hlsl_ir_var *var = load->src.var;
+    union hlsl_constant_value values[4] = {0};
+    struct hlsl_ir_constant *cons;
+    unsigned int start, count, i;
+
+    if (load->node.data_type->type != HLSL_CLASS_SCALAR && load->node.data_type->type != HLSL_CLASS_VECTOR)
+        return NULL;
+
+    if (!hlsl_component_index_range_from_deref(ctx, &load->src, &start, &count))
+        return NULL;
+
+    for (i = 0; i < count; ++i)
+    {
+        struct copy_propagation_value *value = copy_propagation_get_value(state, var, start + i);
+
+        if (!value || value->node->type != HLSL_IR_CONSTANT)
+            return NULL;
+
+        values[i] = hlsl_ir_constant(value->node)->value[value->component];
+    }
+
+    if (!(cons = hlsl_new_constant(ctx, load->node.data_type, &load->node.loc)))
+        return NULL;
+    cons->value[0] = values[0];
+    cons->value[1] = values[1];
+    cons->value[2] = values[2];
+    cons->value[3] = values[3];
+
+    TRACE("Load from %s[%u-%u] turned into a constant %p.\n", var->name, start, start + count, cons);
+    return &cons->node;
+}
+
 static bool copy_propagation_transform_load(struct hlsl_ctx *ctx,
         struct hlsl_ir_load *load, struct copy_propagation_state *state)
 {
@@ -746,6 +781,13 @@ static bool copy_propagation_transform_load(struct hlsl_ctx *ctx,
             return false;
     }

+    if ((new_instr = copy_propagation_compute_load_constant_replacement(ctx, state, load)))
+    {
+        list_add_before(&instr->entry, &new_instr->entry);
+        hlsl_replace_node(instr, new_instr);
+        return true;
+    }
+
     if (!(new_instr = copy_propagation_compute_replacement(ctx, state, &load->src, &swizzle)))
         return false;

diff --git a/tests/hlsl-initializer-objects.shader_test b/tests/hlsl-initializer-objects.shader_test
index d40ede46..d9c0bc91 100644
--- a/tests/hlsl-initializer-objects.shader_test
+++ b/tests/hlsl-initializer-objects.shader_test
@@ -29,7 +29,7 @@ draw quad
 probe all rgba (0.2, 0.2, 0.2, 0.1)


-[pixel shader todo]
+[pixel shader]
 Texture2D tex;

 struct foo
@@ -48,11 +48,11 @@ float4 main() : sv_target
 }

 [test]
-todo draw quad
-todo probe all rgba (31.1, 41.1, 51.1, 61.1) 1
+draw quad
+probe all rgba (31.1, 41.1, 51.1, 61.1) 1


-[pixel shader todo]
+[pixel shader]
 Texture2D tex1;
 Texture2D tex2;

diff --git a/tests/object-references.shader_test b/tests/object-references.shader_test
index 12f745e6..ba9b1235 100644
--- a/tests/object-references.shader_test
+++ b/tests/object-references.shader_test
@@ -132,7 +132,7 @@ float4 main() : sv_target
 }


-[pixel shader todo]
+[pixel shader]
 Texture2D tex;
 uniform float f;

@@ -153,5 +153,5 @@ float4 main() : sv_target

 [test]
 uniform 0 float 10.0
-todo draw quad
-todo probe (0, 0) rgba (11.0, 12.0, 13.0, 11.0)
+draw quad
+probe (0, 0) rgba (11.0, 12.0, 13.0, 11.0)
diff --git a/tests/sampler-offset.shader_test b/tests/sampler-offset.shader_test
index 2aa8f9b3..6f8357df 100644
--- a/tests/sampler-offset.shader_test
+++ b/tests/sampler-offset.shader_test
@@ -12,7 +12,7 @@ size (3, 3)
 0.0 0.2 0.0 0.4 0.1 0.2 0.5 0.0 0.2 0.2 0.0 0.4


-[pixel shader todo]
+[pixel shader]
 sampler s;
 Texture2D t;

@@ -22,11 +22,11 @@ float4 main() : sv_target
 }

 [test]
-todo draw quad
+draw quad
 probe all rgba (0.1, 0.2, 0.5, 0.0)


-[pixel shader todo]
+[pixel shader]
 sampler s;
 Texture2D t;

@@ -36,11 +36,11 @@ float4 main() : sv_target
 }

 [test]
-todo draw quad
+draw quad
 probe all rgba (0.2, 0.2, 0.0, 0.4)


-[pixel shader todo]
+[pixel shader]
 sampler s;
 Texture2D t;

@@ -50,5 +50,5 @@ float4 main() : sv_target
 }

 [test]
-todo draw quad
+draw quad
 probe all rgba (0.0, 0.2, 0.0, 0.4)
diff --git a/tests/shader_runner_d3d12.c b/tests/shader_runner_d3d12.c
index bb4d9c5a..bd94b4c9 100644
--- a/tests/shader_runner_d3d12.c
+++ b/tests/shader_runner_d3d12.c
@@ -167,7 +167,7 @@ static ID3D12RootSignature *d3d12_runner_create_root_signature(struct d3d12_shad
         ID3D12GraphicsCommandList *command_list, unsigned int *uniform_index)
 {
     D3D12_ROOT_SIGNATURE_DESC root_signature_desc = {0};
-    D3D12_ROOT_PARAMETER root_params[3], *root_param;
+    D3D12_ROOT_PARAMETER root_params[4], *root_param;
     D3D12_STATIC_SAMPLER_DESC static_samplers[1];
     ID3D12RootSignature *root_signature;
     HRESULT hr;
From: Francisco Casas <fcasas@codeweavers.com>
---
 libs/vkd3d-shader/hlsl_sm4.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/libs/vkd3d-shader/hlsl_sm4.c b/libs/vkd3d-shader/hlsl_sm4.c
index ae5bb1ac..4059d618 100644
--- a/libs/vkd3d-shader/hlsl_sm4.c
+++ b/libs/vkd3d-shader/hlsl_sm4.c
@@ -2110,11 +2110,19 @@ static void write_sm4_gather(struct hlsl_ctx *ctx, struct vkd3d_bytecode_buffer

     sm4_src_from_node(&instr.srcs[instr.src_count++], coords, VKD3DSP_WRITEMASK_ALL);

-    /* FIXME: Use an aoffimmi modifier if possible. */
     if (texel_offset)
     {
-        instr.opcode = VKD3D_SM5_OP_GATHER4_PO;
-        sm4_src_from_node(&instr.srcs[instr.src_count++], texel_offset, VKD3DSP_WRITEMASK_ALL);
+        if (!encode_texel_offset_as_aoffimmi(&instr, texel_offset))
+        {
+            if (ctx->profile->major_version < 5)
+            {
+                hlsl_error(ctx, &texel_offset->loc, VKD3D_SHADER_ERROR_HLSL_INVALID_TEXEL_OFFSET,
+                        "Offset must resolve to integer literal in the range -8 to 7 for profiles < 5.");
+                return;
+            }
+            instr.opcode = VKD3D_SM5_OP_GATHER4_PO;
+            sm4_src_from_node(&instr.srcs[instr.src_count++], texel_offset, VKD3DSP_WRITEMASK_ALL);
+        }
     }

     sm4_src_from_deref(ctx, &instr.srcs[instr.src_count++], resource, resource_type, instr.dsts[0].writemask);
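
The range check in the error message above (-8 to 7) follows from each immediate offset component being a 4-bit signed value. The following C sketch shows that check and one way to pack the components. The helper names `offset_fits_aoffimmi` and `pack_aoffimmi` are hypothetical, and the bit positions chosen here are an illustrative assumption rather than the layout of the real SM4 instruction token or of `encode_texel_offset_as_aoffimmi()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Each immediate offset component must fit a 4-bit two's complement field. */
static bool offset_fits_aoffimmi(int v)
{
    return v >= -8 && v <= 7;
}

/* Pack up to three offsets (u, v, w) into consecutive 4-bit fields.
 * NOTE: the field positions used here (bits 0-3, 4-7, 8-11) are an
 * assumption for illustration, not the layout of the real SM4 token. */
static bool pack_aoffimmi(const int *offset, unsigned int count, uint32_t *packed)
{
    uint32_t result = 0;
    unsigned int i;

    assert(count <= 3);
    for (i = 0; i < count; ++i)
    {
        if (!offset_fits_aoffimmi(offset[i]))
            return false;
        /* Keep only the low 4 bits of the two's complement representation. */
        result |= ((uint32_t)offset[i] & 0xfu) << (4 * i);
    }
    *packed = result;
    return true;
}
```

With this sketch, an offset such as `int2(-2, 0)` packs successfully, while `int2(8, 1)` is rejected, which is the case where the patch either falls back to `gather4_po` (shader model 5) or reports an error.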
From: Francisco Casas <fcasas@codeweavers.com>
---
 Makefile.am                           |  1 +
 tests/texture-load-offset.shader_test | 51 +++++++++++++++++++++++++++
 2 files changed, 52 insertions(+)
 create mode 100644 tests/texture-load-offset.shader_test

diff --git a/Makefile.am b/Makefile.am
index 85cd4642..84d75497 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -147,6 +147,7 @@ vkd3d_shader_tests = \
     tests/swizzle-6.shader_test \
     tests/swizzle-7.shader_test \
     tests/texture-load.shader_test \
+    tests/texture-load-offset.shader_test \
     tests/texture-load-typed.shader_test \
     tests/trigonometry.shader_test \
     tests/uav.shader_test \
diff --git a/tests/texture-load-offset.shader_test b/tests/texture-load-offset.shader_test
new file mode 100644
index 00000000..ab233c58
--- /dev/null
+++ b/tests/texture-load-offset.shader_test
@@ -0,0 +1,51 @@
+[require]
+shader model >= 4.0
+
+[texture 0]
+size (3, 3)
+0 0 0 1 1 0 0 1 2 0 0 1
+0 1 0 1 1 1 0 1 2 1 0 1
+0 2 0 1 1 2 0 1 2 2 0 1
+
+
+[pixel shader]
+Texture2D t;
+
+float4 main(float4 pos : sv_position) : sv_target
+{
+    return t.Load(int3(pos.xy, 0), int2(0, 1));
+}
+
+
+[test]
+draw quad
+todo probe (0, 0) rgba (0, 1, 0, 1)
+todo probe (1, 0) rgba (1, 1, 0, 1)
+todo probe (0, 1) rgba (0, 2, 0, 1)
+todo probe (1, 1) rgba (1, 2, 0, 1)
+
+
+[pixel shader]
+Texture2D t;
+
+float4 main(float4 pos : sv_position) : sv_target
+{
+    return t.Load(int3(pos.xy, 0), int2(-2, 0));
+}
+
+
+[test]
+draw quad
+todo probe (3, 0) rgba (1, 0, 0, 1)
+todo probe (4, 0) rgba (2, 0, 0, 1)
+todo probe (3, 1) rgba (1, 1, 0, 1)
+todo probe (4, 1) rgba (2, 1, 0, 1)
+
+
+[pixel shader fail todo]
+Texture2D t;
+
+float4 main(float4 pos : sv_position) : sv_target
+{
+    return t.Load(int3(pos.xy, 0), int2(8, 1));
+}
From: Francisco Casas <fcasas@codeweavers.com>
---
 libs/vkd3d-shader/hlsl_sm4.c          | 16 ++++++++++++++--
 tests/texture-load-offset.shader_test | 18 +++++++++---------
 2 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/libs/vkd3d-shader/hlsl_sm4.c b/libs/vkd3d-shader/hlsl_sm4.c
index 4059d618..1595e5be 100644
--- a/libs/vkd3d-shader/hlsl_sm4.c
+++ b/libs/vkd3d-shader/hlsl_sm4.c
@@ -1418,7 +1418,8 @@ static void write_sm4_constant(struct hlsl_ctx *ctx,

 static void write_sm4_ld(struct hlsl_ctx *ctx, struct vkd3d_bytecode_buffer *buffer,
         const struct hlsl_type *resource_type, const struct hlsl_ir_node *dst,
-        const struct hlsl_deref *resource, const struct hlsl_ir_node *coords)
+        const struct hlsl_deref *resource, const struct hlsl_ir_node *coords,
+        const struct hlsl_ir_node *texel_offset)
 {
     bool uav = (resource_type->base_type == HLSL_TYPE_UAV);
     struct sm4_instruction instr;
@@ -1427,6 +1428,16 @@ static void write_sm4_ld(struct hlsl_ctx *ctx, struct vkd3d_bytecode_buffer *buf
     memset(&instr, 0, sizeof(instr));
     instr.opcode = uav ? VKD3D_SM5_OP_LD_UAV_TYPED : VKD3D_SM4_OP_LD;

+    if (texel_offset)
+    {
+        if (!encode_texel_offset_as_aoffimmi(&instr, texel_offset))
+        {
+            hlsl_error(ctx, &texel_offset->loc, VKD3D_SHADER_ERROR_HLSL_INVALID_TEXEL_OFFSET,
+                    "Offset must resolve to integer literal in the range -8 to 7.");
+            return;
+        }
+    }
+
     sm4_dst_from_node(&instr.dsts[0], dst);
     instr.dst_count = 1;

@@ -2179,7 +2190,8 @@ static void write_sm4_resource_load(struct hlsl_ctx *ctx,
     switch (load->load_type)
     {
         case HLSL_RESOURCE_LOAD:
-            write_sm4_ld(ctx, buffer, resource_type, &load->node, &load->resource, coords);
+            write_sm4_ld(ctx, buffer, resource_type, &load->node, &load->resource,
+                    coords, texel_offset);
             break;

         case HLSL_RESOURCE_SAMPLE:
diff --git a/tests/texture-load-offset.shader_test b/tests/texture-load-offset.shader_test
index ab233c58..52b6a5f9 100644
--- a/tests/texture-load-offset.shader_test
+++ b/tests/texture-load-offset.shader_test
@@ -19,10 +19,10 @@ float4 main(float4 pos : sv_position) : sv_target

 [test]
 draw quad
-todo probe (0, 0) rgba (0, 1, 0, 1)
-todo probe (1, 0) rgba (1, 1, 0, 1)
-todo probe (0, 1) rgba (0, 2, 0, 1)
-todo probe (1, 1) rgba (1, 2, 0, 1)
+probe (0, 0) rgba (0, 1, 0, 1)
+probe (1, 0) rgba (1, 1, 0, 1)
+probe (0, 1) rgba (0, 2, 0, 1)
+probe (1, 1) rgba (1, 2, 0, 1)


 [pixel shader]
@@ -36,13 +36,13 @@ float4 main(float4 pos : sv_position) : sv_target

 [test]
 draw quad
-todo probe (3, 0) rgba (1, 0, 0, 1)
-todo probe (4, 0) rgba (2, 0, 0, 1)
-todo probe (3, 1) rgba (1, 1, 0, 1)
-todo probe (4, 1) rgba (2, 1, 0, 1)
+probe (3, 0) rgba (1, 0, 0, 1)
+probe (4, 0) rgba (2, 0, 0, 1)
+probe (3, 1) rgba (1, 1, 0, 1)
+probe (4, 1) rgba (2, 1, 0, 1)


-[pixel shader fail todo]
+[pixel shader fail]
 Texture2D t;

 float4 main(float4 pos : sv_position) : sv_target
From: Francisco Casas <fcasas@codeweavers.com>
The Load() method offsets are used for these tests since they must resolve to constants in order to pass.
---
 Makefile.am                             |  1 +
 tests/swizzle-constant-prop.shader_test | 45 +++++++++++++++++++++++++
 2 files changed, 46 insertions(+)
 create mode 100644 tests/swizzle-constant-prop.shader_test

diff --git a/Makefile.am b/Makefile.am
index 84d75497..e3ff2941 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -146,6 +146,7 @@ vkd3d_shader_tests = \
     tests/swizzle-5.shader_test \
     tests/swizzle-6.shader_test \
     tests/swizzle-7.shader_test \
+    tests/swizzle-constant-prop.shader_test \
     tests/texture-load.shader_test \
     tests/texture-load-offset.shader_test \
     tests/texture-load-typed.shader_test \
diff --git a/tests/swizzle-constant-prop.shader_test b/tests/swizzle-constant-prop.shader_test
new file mode 100644
index 00000000..0a13b4df
--- /dev/null
+++ b/tests/swizzle-constant-prop.shader_test
@@ -0,0 +1,45 @@
+[require]
+shader model >= 4.0
+
+
+[texture 0]
+size (4, 4)
+ 1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4
+ 5 5 5 5 6 6 6 6 7 7 7 7 8 8 8 8
+ 9 9 9 9 10 10 10 10 11 11 11 11 12 12 12 12
+13 13 13 13 14 14 14 14 14 15 15 15 16 16 16 16
+
+
+[pixel shader todo]
+Texture2D tex;
+uniform int i;
+
+float4 main() : sv_target
+{
+    int4 a = {1, 2, i, i};
+    return 100 * a + tex.Load(int3(0, 0, 0), a.xy);
+}
+
+[test]
+uniform 0 int 4
+todo draw quad
+todo probe all rgba (110, 210, 410, 410)
+
+
+[pixel shader todo]
+Texture2D tex;
+uniform int i;
+
+float4 main() : sv_target
+{
+    int4 a = {0, 1, 2, i};
+    int4 b = a.yxww;
+    int3 c = b.wyx;
+    return 100 * b + tex.Load(int3(0, 0, 0), c.yz);
+}
+
+
+[test]
+uniform 0 int 3
+todo draw quad
+todo probe all rgba (105, 5, 305, 305)
From: Francisco Casas <fcasas@codeweavers.com>
---
 libs/vkd3d-shader/hlsl_codegen.c        | 68 +++++++++++++++++++++++++
 tests/swizzle-constant-prop.shader_test |  6 +--
 2 files changed, 71 insertions(+), 3 deletions(-)

diff --git a/libs/vkd3d-shader/hlsl_codegen.c b/libs/vkd3d-shader/hlsl_codegen.c
index 9bdbd57c..4cd08be9 100644
--- a/libs/vkd3d-shader/hlsl_codegen.c
+++ b/libs/vkd3d-shader/hlsl_codegen.c
@@ -753,6 +753,56 @@ static struct hlsl_ir_node *copy_propagation_compute_load_constant_replacement(s
     return &cons->node;
 }

+static struct hlsl_ir_node *copy_propagation_compute_swizzle_constant_replacement(struct hlsl_ctx *ctx,
+        const struct copy_propagation_state *state, const struct hlsl_ir_swizzle *swizzle)
+{
+    unsigned int start, count, i, c, swizzle_bits, n_comps;
+    union hlsl_constant_value values[4] = {0};
+    const struct hlsl_ir_load *load;
+    struct hlsl_ir_constant *cons;
+    const struct hlsl_ir_var *var;
+
+    if (swizzle->val.node->type != HLSL_IR_LOAD)
+        return NULL;
+    load = hlsl_ir_load(swizzle->val.node);
+    var = load->src.var;
+
+    if (load->node.data_type->type != HLSL_CLASS_SCALAR && load->node.data_type->type != HLSL_CLASS_VECTOR)
+        return NULL;
+
+    if (!hlsl_component_index_range_from_deref(ctx, &load->src, &start, &count))
+        return NULL;
+
+    swizzle_bits = swizzle->swizzle;
+    n_comps = swizzle->node.data_type->dimx;
+    for (i = 0; i < n_comps; ++i)
+    {
+        struct copy_propagation_value *value;
+
+        c = swizzle_bits & 3;
+        assert(c < count);
+        value = copy_propagation_get_value(state, var, start + c);
+
+        if (!value || value->node->type != HLSL_IR_CONSTANT)
+            return NULL;
+
+        values[i] = hlsl_ir_constant(value->node)->value[value->component];
+
+        swizzle_bits >>= 2;
+    }
+
+    if (!(cons = hlsl_new_constant(ctx, swizzle->node.data_type, &swizzle->node.loc)))
+        return NULL;
+    cons->value[0] = values[0];
+    cons->value[1] = values[1];
+    cons->value[2] = values[2];
+    cons->value[3] = values[3];
+
+    TRACE("Swizzle from %s[%u-%u]%s turned into a constant %p.\n", var->name, start, start + count,
+            debug_hlsl_swizzle(swizzle->swizzle, n_comps), cons);
+    return &cons->node;
+}
+
 static bool copy_propagation_transform_load(struct hlsl_ctx *ctx,
         struct hlsl_ir_load *load, struct copy_propagation_state *state)
 {
@@ -832,6 +882,20 @@ static bool copy_propagation_transform_resource_load(struct hlsl_ctx *ctx,
     return progress;
 }

+static bool copy_propagation_transform_swizzle(struct hlsl_ctx *ctx,
+        struct hlsl_ir_swizzle *swizzle, struct copy_propagation_state *state)
+{
+    struct hlsl_ir_node *instr = &swizzle->node, *new_instr;
+
+    if ((new_instr = copy_propagation_compute_swizzle_constant_replacement(ctx, state, swizzle)))
+    {
+        list_add_before(&instr->entry, &new_instr->entry);
+        hlsl_replace_node(instr, new_instr);
+        return true;
+    }
+    return false;
+}
+
 static bool copy_propagation_transform_resource_store(struct hlsl_ctx *ctx,
         struct hlsl_ir_resource_store *store, struct copy_propagation_state *state)
 {
@@ -995,6 +1059,10 @@ static bool copy_propagation_transform_block(struct hlsl_ctx *ctx, struct hlsl_b
                 copy_propagation_record_store(ctx, hlsl_ir_store(instr), state);
                 break;

+            case HLSL_IR_SWIZZLE:
+                copy_propagation_transform_swizzle(ctx, hlsl_ir_swizzle(instr), state);
+                break;
+
             case HLSL_IR_IF:
                 progress |= copy_propagation_process_if(ctx, hlsl_ir_if(instr), state);
                 break;
diff --git a/tests/swizzle-constant-prop.shader_test b/tests/swizzle-constant-prop.shader_test
index 0a13b4df..86eb3f19 100644
--- a/tests/swizzle-constant-prop.shader_test
+++ b/tests/swizzle-constant-prop.shader_test
@@ -10,7 +10,7 @@ size (4, 4)
 13 13 13 13 14 14 14 14 14 15 15 15 16 16 16 16


-[pixel shader todo]
+[pixel shader]
 Texture2D tex;
 uniform int i;

@@ -22,8 +22,8 @@ float4 main() : sv_target

 [test]
 uniform 0 int 4
-todo draw quad
-todo probe all rgba (110, 210, 410, 410)
+draw quad
+probe all rgba (110, 210, 410, 410)


 [pixel shader todo]
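
The swizzle case differs from the plain load case in that only the components selected by the swizzle need to be constant. A minimal C sketch of that idea follows, reading two bits per output component as the patch does with `swizzle_bits & 3`. The `SWIZZLE4` macro and the `fold_swizzle` function are hypothetical names introduced for illustration, and integer values stand in for vkd3d's constant values.

```c
#include <assert.h>
#include <stdbool.h>

/* Encode a swizzle as 2 bits per output component (x=0, y=1, z=2, w=3),
 * lowest bits first. */
#define SWIZZLE4(a, b, c, d) ((a) | ((b) << 2) | ((c) << 4) | ((d) << 6))

/* Apply `swizzle` to a source whose components may or may not be known
 * constants. Fails if any selected component is not a constant; components
 * that the swizzle does not select are never inspected. */
static bool fold_swizzle(const int *src, const bool *src_is_const, unsigned int src_count,
        unsigned int swizzle, unsigned int n_comps, int out[4])
{
    unsigned int i;

    assert(n_comps <= 4);
    for (i = 0; i < n_comps; ++i)
    {
        unsigned int c = swizzle & 3;

        if (c >= src_count || !src_is_const[c])
            return false;
        out[i] = src[c];
        swizzle >>= 2;
    }
    return true;
}
```

This mirrors the first test above: with `int4 a = {1, 2, i, i}`, a load of all of `a` is not constant because of the uniform `i`, but `a.xy` selects only constant components and can be folded to `{1, 2}`.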
From: Francisco Casas <fcasas@codeweavers.com>
---
 libs/vkd3d-shader/hlsl_codegen.c        | 22 ++++++++++++++++++++++
 tests/swizzle-constant-prop.shader_test |  6 +++---
 2 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/libs/vkd3d-shader/hlsl_codegen.c b/libs/vkd3d-shader/hlsl_codegen.c
index 4cd08be9..3e7c7850 100644
--- a/libs/vkd3d-shader/hlsl_codegen.c
+++ b/libs/vkd3d-shader/hlsl_codegen.c
@@ -886,6 +886,8 @@ static bool copy_propagation_transform_swizzle(struct hlsl_ctx *ctx,
         struct hlsl_ir_swizzle *swizzle, struct copy_propagation_state *state)
 {
     struct hlsl_ir_node *instr = &swizzle->node, *new_instr;
+    struct hlsl_ir_node *next_val = swizzle->val.node;
+    unsigned int combined_swizzle = swizzle->swizzle;

     if ((new_instr = copy_propagation_compute_swizzle_constant_replacement(ctx, state, swizzle)))
     {
@@ -893,6 +895,26 @@ static bool copy_propagation_transform_swizzle(struct hlsl_ctx *ctx,
         hlsl_replace_node(instr, new_instr);
         return true;
     }
+
+    while (next_val->type == HLSL_IR_SWIZZLE)
+    {
+        combined_swizzle = hlsl_combine_swizzles(hlsl_ir_swizzle(next_val)->swizzle,
+                combined_swizzle, instr->data_type->dimx);
+        next_val = hlsl_ir_swizzle(next_val)->val.node;
+    }
+    if (next_val != swizzle->val.node)
+    {
+        struct hlsl_ir_swizzle *new_swizzle;
+
+        if (!(new_swizzle = hlsl_new_swizzle(ctx, combined_swizzle, instr->data_type->dimx, next_val, &instr->loc)))
+            return false;
+
+        new_instr = &new_swizzle->node;
+        list_add_before(&instr->entry, &new_instr->entry);
+        hlsl_replace_node(instr, new_instr);
+        return true;
+    }
+
     return false;
 }

diff --git a/tests/swizzle-constant-prop.shader_test b/tests/swizzle-constant-prop.shader_test
index 86eb3f19..468a66b6 100644
--- a/tests/swizzle-constant-prop.shader_test
+++ b/tests/swizzle-constant-prop.shader_test
@@ -26,7 +26,7 @@ draw quad
 probe all rgba (110, 210, 410, 410)


-[pixel shader todo]
+[pixel shader]
 Texture2D tex;
 uniform int i;

@@ -41,5 +41,5 @@ float4 main() : sv_target

 [test]
 uniform 0 int 3
-todo draw quad
-todo probe all rgba (105, 5, 305, 305)
+draw quad
+probe all rgba (105, 5, 305, 305)
Wrt patch 7/7, why not just add a pass that combines multiple swizzles outside of copy-prop?
On Mon Dec 12 21:53:22 2022 +0000, Zebediah Figura wrote:
> Wrt patch 7/7, why not just add a pass that combines multiple swizzles outside of copy-prop?
Hmm, good point, this could indeed be separated from copy-prop, as part of the `while(progress)` loop.
I think that this process is very related to copy-prop though, since, strictly speaking, it doesn't combine swizzles, but instead makes swizzle nodes that reference other swizzles to directly reference the source of the value.
e.g. if we have multiple swizzles:
```
b = a.xz;
c = b.yy;
```
this keeps the same number of swizzles, but it lowers `c` so that it directly references the first non-swizzle node (`a` in this case):
```
b = a.xz;
c = a.zz;
```
I think `swizzle_propagation` could be a good name for this pass, but I am not totally sure.
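
For reference, the rewrite of `c = b.yy` into `c = a.zz` amounts to composing two swizzles. A minimal C sketch of such a composition, assuming the 2-bits-per-component encoding used in the patches, could look like this. `combine_swizzles`, the `SWIZZLE` macro and the component enum are written here for illustration only and are not copied from vkd3d's `hlsl_combine_swizzles()`.

```c
#include <assert.h>

enum { X = 0, Y = 1, Z = 2, W = 3 };

/* 2 bits per output component, lowest bits first. */
#define SWIZZLE(a, b, c, d) ((a) | ((b) << 2) | ((c) << 4) | ((d) << 6))

/* Compose two swizzles: `first` is applied to the original value and
 * `second` is applied to the result of `first`. The returned swizzle can be
 * applied to the original value directly and yields `dim` components. */
static unsigned int combine_swizzles(unsigned int first, unsigned int second, unsigned int dim)
{
    unsigned int ret = 0, i;

    assert(dim <= 4);
    for (i = 0; i < dim; ++i)
    {
        /* Component of the intermediate value selected by `second`... */
        unsigned int s = (second >> (2 * i)) & 3;

        /* ...mapped back to the component of the original value. */
        ret |= ((first >> (2 * s)) & 3) << (2 * i);
    }
    return ret;
}
```

Under this encoding, combining `.xz` with `.yy` gives `.zz`, and the chain from the second test (`b = a.yxww; c = b.wyx;`) collapses to `c = a.wxy`.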
On Mon Dec 12 21:53:50 2022 +0000, Francisco Casas wrote:
> Hmm, good point, this could indeed be separated from copy-prop, as part of the `while(progress)` loop.
>
> I think that this process is very related to copy-prop though, since, strictly speaking, it doesn't combine swizzles, but instead makes swizzle nodes that reference other swizzles to directly reference the source of the value.
>
> e.g. if we have multiple swizzles:
>
>     b = a.xz;
>     c = b.yy;
>
> this keeps the same number of swizzles, but it lowers `c` so that it directly references the first non-swizzle node (`a` in this case):
>
>     b = a.xz;
>     c = a.zz;
>
> I think `swizzle_propagation` could be a good name for this pass, but I am not totally sure.
Sure, but a separate pass would also deal with "a.xz.yy". Granted, probably nobody would ever write that intentionally (well, maybe with macros it's plausible, though?)
On Mon Dec 12 22:20:54 2022 +0000, Zebediah Figura wrote:
> Sure, but a separate pass would also deal with "a.xz.yy". Granted, probably nobody would ever write that intentionally (well, maybe with macros it's plausible, though?)
Okay, writing it as a separate pass!
I think the current version also deals with swizzles like "a.xz.yy" btw. Gotta add tests for those too.
Francisco Casas (@fcasas) commented about libs/vkd3d-shader/hlsl_codegen.c:
>                 copy_propagation_record_store(ctx, hlsl_ir_store(instr), state);
>                 break;
>
>             case HLSL_IR_SWIZZLE:
>                 copy_propagation_transform_swizzle(ctx, hlsl_ir_swizzle(instr), state);
I missed adding `progress |=` here; I will add that as well.