Re: [PATCH 0/2] MR37: vkd3d-shader: Add some DXIL support.

2 Nov 2022

      ...
After reviewing the possibilities, I think converting TPF to the DXIL-like IR is a far better option in the long run, at least for SM6 (I'm not sure how this would impact HLSL work). For reference, Microsoft has released code for such a conversion. This option would mean having an extra SM6 SPIR-V backend for a while at least, until the conversion code is written.
We generally try very hard to avoid big rewrites like that, and I'm not convinced this should be an exception. I think the right way to approach this would be to make an (exhaustive) list of issues with the current IR for the purpose of representing DXIL, as well as a corresponding list with potential solutions to those issues. (Ideally with concrete examples.) The main point would be to get everyone involved on the same page in terms of understanding the issues and possible solutions. You have significantly more experience with translating DXIL, and are probably aware of issues that the rest of us aren't; at the same time, we may be able to come up with solutions/approaches you hadn't considered yet.
...
If we want DXIL trace output to look similar to that from 'dxc -dumpbin' we need a separate trace backend. I'd argue this is more useful for debugging than emitting something very similar to the current TPF trace.
Sure, broadly. It could perhaps be argued whether this should be a completely separate backend or if this could instead be an alternate output mode for the existing backend, but the difference would largely be trivial.
It does however depend on the approach taken for the IR. Specifically, if we're going to do significant transformations on DXIL IR before turning it into vkd3d IR (e.g. vectorisation, eliminating PHI-instructions, running it through the structuriser), that implies it would make more sense to run the disassembler directly on the parsed DXIL IR, instead of on the vkd3d IR.
...
...

Declarations being part of the same instruction stream as the rest of the shader can be a bit awkward, particularly for shader models 1-3, which don't necessarily have them. I suspect you're either going to run into this with your d3dbc->spirv efforts, or already have. In wined3d that's somewhat addressed by constructing the wined3d_shader_reg_maps structure and then ignoring declaration instructions in the GLSL backend.

In terms of translating sm1->spirv (and in general translating out of vkd3d_shader_instruction), sort of, although for the most part spirv.c should be capable of lazily initializing varyings, and most of my difficulty with sm1 thus far has been rearranging it so that it will. (And also so that it doesn't demand an input signature.)
I'd argue the frontend should just generate the required information during parsing. The "reg_maps" approach from wined3d would be one option; another option would be to generate dcl_ instructions as needed into a separate instruction stream and merge the two at the end, similar to how the SPIR-V backend has "global_stream" and "function_stream". Taking that option one step further, we could decide to not merge them, and that would then allow getting rid of the "after_declarations_section" flag in the SPIR-V backend.
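To make the second option a bit more concrete, here's a rough sketch of a frontend that lazily generates declarations into their own stream while parsing, with an optional merge at the end. None of this is actual vkd3d-shader code; every "sketch_" type and function below is made up for illustration, and the fixed-size arrays are only there to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; these are NOT the vkd3d-shader types. */
enum sketch_opcode
{
    SKETCH_OP_DCL_INPUT,
    SKETCH_OP_MOV,
    SKETCH_OP_RET,
};

struct sketch_instruction
{
    enum sketch_opcode opcode;
    unsigned int reg;
};

#define SKETCH_STREAM_MAX 64

struct sketch_stream
{
    struct sketch_instruction ins[SKETCH_STREAM_MAX];
    size_t count;
};

struct sketch_parser
{
    struct sketch_stream decls;   /* declarations only */
    struct sketch_stream body;    /* everything else */
    unsigned int declared_inputs; /* bitmask of input registers already declared */
};

static void sketch_stream_append(struct sketch_stream *s, enum sketch_opcode opcode, unsigned int reg)
{
    assert(s->count < SKETCH_STREAM_MAX);
    s->ins[s->count].opcode = opcode;
    s->ins[s->count].reg = reg;
    ++s->count;
}

/* Emit a declaration the first time an input register is referenced. */
static void sketch_parser_use_input(struct sketch_parser *p, unsigned int reg)
{
    assert(reg < 32);
    if (p->declared_inputs & (1u << reg))
        return;
    p->declared_inputs |= 1u << reg;
    sketch_stream_append(&p->decls, SKETCH_OP_DCL_INPUT, reg);
}

static void sketch_parser_emit_mov_from_input(struct sketch_parser *p, unsigned int reg)
{
    sketch_parser_use_input(p, reg);
    sketch_stream_append(&p->body, SKETCH_OP_MOV, reg);
}

/* Optional final step: declarations first, then the body.
 * Returns the number of instructions written, or 0 if dst is too small. */
static size_t sketch_parser_merge(const struct sketch_parser *p,
        struct sketch_instruction *dst, size_t dst_size)
{
    size_t total = p->decls.count + p->body.count;

    if (total > dst_size)
        return 0;
    memcpy(dst, p->decls.ins, p->decls.count * sizeof(*dst));
    memcpy(dst + p->decls.count, p->body.ins, p->body.count * sizeof(*dst));
    return total;
}
```

A backend that consumes "decls" and "body" separately would simply never call the merge function, and would not need to track whether it has already passed the declarations.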
...
In terms of translating hlsl->smX (and in general translating from something more high-level into vkd3d_shader_instruction), I don't think declarations are awkward at all?
Sure, it's probably fine for HLSL.
...
I can definitely see this being an improvement—it'd make the spirv code for handling declarations less complicated—but as below, it'd also mean doing more passes, and probably more allocations as well.
I don't think we'd end up with more passes, certainly not in total. It would probably mean allocating more memory, but I don't think it would be prohibitively more.
...
...

Somewhat similar to declaration instructions, it probably makes sense to make hull shader phases available as separate blocks of instructions, instead of a single instruction stream.

I can't much comment on this as I have thus far avoided touching or understanding tessellation. God only knows why it's so complicated...
Mostly just for illustrative purposes, here's a random hull shader from my collection:
```
hs_5_0
hs_decls
dcl_input_control_point_count 4
dcl_output_control_point_count 4
dcl_tessellator_domain domain_quad
dcl_tessellator_partitioning partitioning_integer
dcl_tessellator_output_primitive output_triangle_ccw
dcl_globalFlags refactoringAllowed
dcl_constantBuffer cb0[2], immediateIndexed
hs_fork_phase
dcl_output_siv o0.x, finalQuadUeq0EdgeTessFactor
mov o0.x, cb0[0].x
ret
hs_fork_phase
dcl_output_siv o1.x, finalQuadVeq0EdgeTessFactor
mov o1.x, cb0[0].y
ret
hs_fork_phase
dcl_output_siv o2.x, finalQuadUeq1EdgeTessFactor
mov o2.x, cb0[0].z
ret
hs_fork_phase
dcl_output_siv o3.x, finalQuadVeq1EdgeTessFactor
mov o3.x, cb0[0].w
ret
hs_fork_phase
dcl_output_siv o4.x, finalQuadUInsideTessFactor
mov o4.x, cb0[1].x
ret
hs_fork_phase
dcl_output_siv o5.x, finalQuadVInsideTessFactor
mov o5.x, cb0[1].y
ret
```
We end up turning phases into their own functions and then invoking them in vkd3d_dxbc_compiler_emit_hull_shader_main().
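As a sketch of what "separate blocks" could look like, the following splits a flat instruction stream into per-phase ranges in a single scan. This is illustrative only: the "sketch_hs_" names are invented here, the opcode list is reduced to what the example shader above needs, and it is not how the existing parser or SPIR-V backend is structured.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced opcode set; not the real vkd3d-shader enumeration. */
enum sketch_hs_opcode
{
    SKETCH_HS_DECLS,
    SKETCH_HS_CONTROL_POINT_PHASE,
    SKETCH_HS_FORK_PHASE,
    SKETCH_HS_JOIN_PHASE,
    SKETCH_HS_DCL,
    SKETCH_HS_MOV,
    SKETCH_HS_RET,
};

/* One block of instructions: the phase marker plus the range that follows it. */
struct sketch_hs_phase
{
    enum sketch_hs_opcode type;
    size_t first; /* index of the first instruction after the marker */
    size_t count; /* number of instructions in the phase, excluding the marker */
};

static int sketch_hs_is_phase_marker(enum sketch_hs_opcode opcode)
{
    return opcode == SKETCH_HS_DECLS
            || opcode == SKETCH_HS_CONTROL_POINT_PHASE
            || opcode == SKETCH_HS_FORK_PHASE
            || opcode == SKETCH_HS_JOIN_PHASE;
}

/* Split a flat stream into phases. Instructions before the first marker are
 * ignored. Returns the number of phases found; 0 means either that there were
 * no phase markers or that the phases array was too small. */
static size_t sketch_hs_split_phases(const enum sketch_hs_opcode *ins, size_t ins_count,
        struct sketch_hs_phase *phases, size_t max_phases)
{
    size_t i, phase_count = 0;

    for (i = 0; i < ins_count; ++i)
    {
        if (sketch_hs_is_phase_marker(ins[i]))
        {
            if (phase_count == max_phases)
                return 0;
            phases[phase_count].type = ins[i];
            phases[phase_count].first = i + 1;
            phases[phase_count].count = 0;
            ++phase_count;
        }
        else if (phase_count)
        {
            ++phases[phase_count - 1].count;
        }
    }

    return phase_count;
}
```

With the shader above as input, this would yield one hs_decls block followed by six fork phase blocks of three instructions each, which a backend could then turn into one function per block.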
...
...
Note that I'm quite explicitly not suggesting throwing out vkd3d_shader_instruction and replacing it with something new; the suggestion is that the vkd3d_shader_instruction interface could likely be made to work for both HLSL and DXIL with a reasonable number of adjustments. If HLSL could use it as-is, that's all the better.
Potentially. The main reason I haven't thus far is that it's extra work (if not per se a *lot* of extra work) for no clear benefit, versus our more ad-hoc struct smX_instruction infrastructure. If there were other frontends that wanted to generate sm1/sm4, or a reason for hlsl to feed directly to glsl or spirv instead of going through sm4 first, that'd more easily tip the scales.
The assembler comes to mind. In terms of benefits, there would of course be not having to deal with 3 similar, but slightly different low-level IRs. Also, I think the discussion that prompted this was that there were passes the HLSL compiler would like to do on a common low-level IR, instead of either duplicating those passes for SM1/4 or doing them on the HLSL IR.
...
If we do want to adopt it in more places, I support throwing out the vkd3d_shader_instruction naming and replacing it with something new :D
Good suggestions are always welcome. :)
...
...
...
Maybe there's an argument to have one or more unified IRs across all of vkd3d-shader? It would be nice in some respects, but I imagine that compilation speed would be a concern. I gather that one reason that the dxbc->glsl/spirv path is arranged the way it is, is that we only want to do one pass over the dxbc, and want to avoid allocating memory as much as we can.
Well, things started out as directly translating shader model 1 bytecode to ARB_vertex_program/ARB_fragment_program instructions. (Compare e.g. IWineD3DVertexShaderImpl_GenerateProgramArbHW() from dlls/wined3d/vertexshader.c in wine-0.9.) Certain abstractions were introduced as needed; the introduction of the GLSL backend was an important event, as was the introduction of shader model 4 support. We've probably reached a similar point again, although this time it seems both DXIL and HLSL are getting there at roughly the same time.
I guess my point is, if we have a unified IR and it's basically vkd3d_shader_instruction, that's probably fine (and it may honestly be possible to do that, the way things are). But if we want to change vkd3d_shader_instruction, that might mean making the sm4->spirv path slower, which doesn't seem desirable.
Sure, the intention would be for these changes to be fairly minor. Though in terms of shader compilation speed, the SPIR-V->GPU part is likely much worse than anything we could do to the vkd3d IR. There are also things in the existing code, like get_opcode_info() that could be improved.
-- 
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/37#note_12805
