Re: [PATCH v10 0/8] MR84: vkd3d-shader/spirv: Merge all shader IR fork and join phases into a single phase.

14 Feb 2023

Patch 7/8 ("vkd3d-shader/spirv: Delete an unnecessary call to spirv_compiler_get_register_id().") is pretty much independent from this series, and could just go in as a separate MR.
Patches 8/8 ("vkd3d-shader/normalise: Insert hull shader control point input declarations if no control point phase is defined.") and 6/8 ("vkd3d-shader/normalise: Normalise control point phase output registers to include the control point id.") are probably significant enough on their own that they could be split off as well.
The changes from patch 5/8 ("vkd3d-shader: Introduce an internal sm6 signature structure.") don't seem needed by this series, and I imagine they won't be needed until the actual shader model 6 bits start getting introduced.
That leaves the first four patches in this series.
From patch 1/8:
...

diff --git a/libs/vkd3d-shader/normalise.c b/libs/vkd3d-shader/normalise.c
new file mode 100644
index 00000000..cb0c48a0
--- /dev/null
+++ b/libs/vkd3d-shader/normalise.c
@@ -0,0 +1,121 @@

I imagine the scope of this file may end up being a bit broader than normalisation; it seems like we'd want to put (LL)IR transformations in general in here. I don't think we necessarily need to get the naming right on the first try though; I just wanted to mention it.
...
+static inline bool shader_normaliser_new_instructions(struct vkd3d_shader_normaliser *normaliser, unsigned int extra)
+{
+    return shader_instruction_array_reserve(&normaliser->instructions, normaliser->instructions.count + extra);
+}

I'd argue for shader_normaliser_reserve_instructions() instead, analogous to shader_instruction_array_reserve() and vkd3d_array_reserve(). Mostly for internal consistency of interfaces.
...
+static bool shader_register_clone_relative_addresses(struct vkd3d_shader_register *reg,
+        struct vkd3d_shader_instruction_array *instructions)
+{
+    struct vkd3d_shader_src_param *src;
+    unsigned int i;
+
+    for (i = 0; i < ARRAY_SIZE(reg->idx) && reg->idx[i].offset != ~0u; ++i)
+    {
+        if (!reg->idx[i].rel_addr)
+            continue;
+
+        if (!(src = shader_src_param_allocator_get(&instructions->src_params, 1)))
+            return false;
+
+        memcpy(src, reg->idx[i].rel_addr, sizeof(*src));
+        reg->idx[i].rel_addr = src;
+        assert(!src->reg.idx[0].rel_addr && src->reg.idx[1].offset == ~0u);
+    }
+
+    return true;
+}

Should we perhaps introduce _clone_src_param() and _clone_dst_param() helpers? We could then just use those to (recursively) clone the "rel_addr" parameters. I.e.:
```c
    reg->idx[i].rel_addr = shader_register_clone_src_param(reg->idx[i].rel_addr);
```
...
+static bool shader_instruction_array_clone_instruction(struct vkd3d_shader_instruction_array *instructions,
+        unsigned int dst, const struct vkd3d_shader_instruction *src)
+{
+    struct vkd3d_shader_instruction *ins = &instructions->elements[dst];
 [...]
+}

I don't like how we're passing "dst" as the location of the instruction to clone to here. There are probably different ways to address that; one approach would to make this function always append to the end of the instruction array. This ties a bit into some of the later comments about the general structure of the normaliser code as well.
...
+enum vkd3d_result shader_normaliser_normalise(struct vkd3d_shader_normaliser *normaliser,
+        const struct vkd3d_shader_parser *parser)
+{
+    const struct vkd3d_shader_instruction_array *instructions = &parser->instructions;
 [...]
+}

Do we need "parser" here? It might be nicer to directly pass in an instruction array. The other thing I find awkward about this function is that it takes an uninitialised normaliser and initialises it, but it's not an _init() or _create() function. One way to address that would be to separate initialisation and cloning from the actual transformations. Another approach would be to do something like "struct vkd3d_shader_instruction_array *dst = shader_instruction_array_normalise(&parser->instructions);". That would imply keeping the vkd3d_shader_normaliser structure internal to normalise.c, which may not be a bad thing.
Relatedly, exposing a single shader_normaliser_normalise() function means that users of this code can't decide to only run some passes but not others, something that is possible with the different transformation passes in the HLSL compiler. That's not a problem as long as the SPIR-V backend is the only user of this code, but it seems like a desirable property regardless.
...
@@ -9955,16 +9955,27 @@ int spirv_compiler_generate_spirv(struct spirv_compiler *compiler,
 [...]
+    if (compiler->shader_type == VKD3D_SHADER_TYPE_HULL)
     {
-        if ((result = spirv_compiler_handle_instruction(compiler, &instructions->elements[i])) < 0)
+        if ((result = shader_normaliser_normalise(&normaliser, parser)) < 0)
             return result;
+        instructions = &normaliser.instructions;
     }
 [...]
+    if (compiler->shader_type == VKD3D_SHADER_TYPE_HULL)
+        shader_normaliser_destroy(&normaliser);

I'd propose
```c
    if (instructions != &parser->instructions)
        shader_normaliser_destroy(&normaliser);
```
avoiding the need to have matching conditions in the two if-statements.
From patch 2/8:
...
@@ -109,7 +286,21 @@ enum vkd3d_result shader_normaliser_normalise(struct vkd3d_shader_normaliser *no
         if (!shader_instruction_array_clone_instruction(&normaliser->instructions, normaliser->instructions.count, ins))
             return VKD3D_ERROR_OUT_OF_MEMORY;

-        ++normaliser->instructions.count;
+        instruction_count = 1;
+        if ((result = shader_normaliser_eliminate_phase_instance_id(normaliser, &instruction_count)) < 0)
+            return result;
+
+        normaliser->instructions.count += instruction_count;
+    }

Something that's a bit ugly here is that we first clone the instruction (including source and destination parameters), but then essentially orphan it if shader_normaliser_eliminate_phase_instance_id() decides to skip it. That ends up working out because destroying the allocators also frees anything allocated from them, but it's not very pretty and feels a bit fragile.
...
+static enum vkd3d_result shader_normaliser_eliminate_phase_instance_id(struct vkd3d_shader_normaliser *normaliser,
+        unsigned int *instruction_count)
+{
 [...]
+    if (ins->handler_idx == VKD3DSIH_MOV && shader_register_is_phase_instance_id(&ins->src->reg))
+    {
+        if (ins->dst->reg.type != VKD3DSPR_TEMP)
+        {
+            FIXME("Instance id not assigned to temp register.\n");
+            return VKD3D_OK;
+        }
+        /* Instance id always seems to be assigned to r{n}.x */
+        if (ins->dst->write_mask != VKD3DSP_WRITEMASK_0)
+            FIXME("Unsupported write mask %#x.\n", ins->dst->write_mask);
+        if (ins->dst->reg.idx[0].offset >= sizeof(normaliser->temp_is_instance_id) * CHAR_BIT)
+            FIXME("Unsupported temp idx %u.\n", ins->dst->reg.idx[0].offset);
+        else
+            normaliser->temp_is_instance_id |= 1ull << ins->dst->reg.idx[0].offset;
+        *instruction_count = 0;
+        return VKD3D_OK;
+    }
 [...]
+    if (ins->handler_idx != VKD3DSIH_RET)
+        return VKD3D_OK;
+
+    count = normaliser->instructions.count - normaliser->phase_body_idx;
+
+    if (!shader_instruction_array_reserve(&normaliser->instructions, count * (normaliser->instance_count - 1)))
+        return VKD3D_ERROR_OUT_OF_MEMORY;
+
+    /* Make a copy of the non-dcl instructions for each instance. */
+    ins = &normaliser->instructions.elements[normaliser->phase_body_idx];
+    for (i = 1; i < normaliser->instance_count; ++i)
+    {
+        for (j = 0; j < count; ++j)
+        {
+            if (!shader_instruction_array_clone_instruction(&normaliser->instructions,
+                    normaliser->phase_body_idx + count * i + j, &ins[j]))
+                return VKD3D_ERROR_OUT_OF_MEMORY;
+        }
+    }
 [...]
+}

Relatedly, the "*instruction_count" handling above feels awkward to me; it seems preferable to either just clone instructions inside shader_normaliser_eliminate_phase_instance_id() if we don't skip them, or always clone them on the outside and replace them with e.g. NOPs in shader_normaliser_eliminate_phase_instance_id() is we do skip them. Note that we do clone instructions for phase bodies inside shader_normaliser_eliminate_phase_instance_id().
Why do we have explicitly handling for MOVs above? Does shader_instruction_eliminate_phase_instance_id() not catch those? Or is this an attempt at doing constant propagation? In the latter case, I'm not sure that's worth it, but it should probably be a separate patch.
...
@@ -6870,30 +6813,12 @@ static void spirv_compiler_emit_hull_shader_main(struct spirv_compiler *compiler
     else
         spirv_compiler_emit_default_control_point_phase(compiler);

-    if (compiler->use_vocp)
-        spirv_compiler_emit_hull_shader_barrier(compiler);
 [...]
+    /* TODO: only call the patch constant function for invocation 0. The simplest way
+     * is to avoid use of private variables there, otherwise we would need a separate
+     * patch constant epilogue also only called from invocation 0. */
+    spirv_compiler_emit_hull_shader_barrier(compiler);

This eliminates the "use_vocp" condition on spirv_compiler_emit_hull_shader_barrier(). Is that intentional?
From patch 4/8:
...
Subject: [PATCH 4/8] vkd3d-shader/spirv: Move the function declaration from
 spirv_compiler_begin_shader_phase() to spirv_compiler_enter_shader_phase().
The subject line probably has some room for improvement. Specifically, this patch is about the phase function declarations in the generated SPIR-V, not C function declarations in the vkd3d-shader source.
-- 
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/84#note_24087

    

2025

2024

2023

2022

Re: [PATCH v10 0/8] MR84: vkd3d-shader/spirv: Merge all shader IR fork and join phases into a single phase.