Here's one RFC regarding use of vkd3d_shader_instruction (still working on a new name for that) in the HLSL compiler (and, at the same time, elsewhere): how CISC should the IR be? I'm thinking about things like source/dest modifiers, complex instructions like "div", restrictions on what register types can be used in the same instruction, and so on. I see four options:
(1) Maximally CISC, i.e. as CISC as the union of sm1, sm4, sm6, and anything else we translate into it. The disadvantage is that we're going to need specific HLSL backends for sm1 and sm4 anyway (this may be unavoidable, frankly, but it at least increases the extent of that code), which conceptually defeats the point of using a common IR: all we really get out of it is structure definitions. This is basically the current state as far as HLSL is concerned, except we don't even use the structure definitions.
(2) Minimally CISC, i.e. as CISC as the intersection of sm1 and sm4 [and sm6?], or less. This is great for HLSL, and fits the model of vkd3d_shader_instruction as a generic IR, but it means that we need to modify the instruction stream in some nontrivial ways when translating dxbc -> vazir. (Although maybe not that nontrivial? All we really need to do is split up some instructions, and that can be done as we read—i.e. it doesn't require inserting into the middle of an already built instruction list.) The other disadvantage is that we can't use vkd3d_shader_instruction for disassembly anymore.
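To make the "split as we read" idea in (2) concrete, here is a rough sketch. Every type and function name in it is made up for illustration and is not the real vkd3d-shader structure layout; it also assumes temporaries can be allocated from a simple counter. The reader only ever appends to the list, so nothing is inserted into the middle of an already built instruction stream.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical; these are not the real vkd3d structures. */
enum opcode { OP_MOV, OP_ADD, OP_NEG };
enum src_modifier { MOD_NONE, MOD_NEG };

/* An instruction as decoded from bytecode, with per-source modifiers. */
struct cisc_instruction
{
    enum opcode op;
    unsigned int dst;
    unsigned int src_count;
    struct { unsigned int reg; enum src_modifier mod; } src[2];
};

/* The simpler IR form: sources are plain registers, no modifiers. */
struct ir_instruction
{
    enum opcode op;
    unsigned int dst;
    unsigned int src_count;
    unsigned int src[2];
};

#define MAX_INSTRUCTIONS 64

struct ir_program
{
    struct ir_instruction insns[MAX_INSTRUCTIONS];
    size_t count;
    unsigned int next_temp; /* next register index free for temporaries */
};

/* Append one IR instruction to the end of the program. */
void emit(struct ir_program *p, enum opcode op, unsigned int dst,
        unsigned int src_count, unsigned int src0, unsigned int src1)
{
    struct ir_instruction *ins;

    assert(p->count < MAX_INSTRUCTIONS);
    ins = &p->insns[p->count++];
    ins->op = op;
    ins->dst = dst;
    ins->src_count = src_count;
    ins->src[0] = src0;
    ins->src[1] = src1;
}

/* Translate one decoded instruction. A negated source becomes a separate
 * OP_NEG into a fresh temporary, emitted before the main instruction. */
void read_instruction(struct ir_program *p, const struct cisc_instruction *in)
{
    unsigned int src[2] = {0, 0};
    unsigned int i;

    assert(in->src_count <= 2);
    for (i = 0; i < in->src_count; ++i)
    {
        src[i] = in->src[i].reg;
        if (in->src[i].mod == MOD_NEG)
        {
            unsigned int tmp = p->next_temp++;

            emit(p, OP_NEG, tmp, 1, src[i], 0);
            src[i] = tmp;
        }
    }
    emit(p, in->op, in->dst, in->src_count, src[0], src[1]);
}
```

With this, something like "add r0, r1, -r2" would come out as a neg into a temporary followed by a plain add reading that temporary.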
(3) Maximally CISC, but we don't use the whole CISC set when translating from HLSL, and instead do a lot of peepholing *after* translating into vkd3d_shader_instruction. This is conceptually nice as far as the HLSL -> smX translation is concerned, and avoids the potential overhead of (2), but I feel like it also adds a bit of extra mental burden on understanding what's legal IR for the HLSL compiler to output.
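For comparison, the kind of peephole that (3) implies might look roughly like the sketch below. Again, all names are hypothetical. It also leans on an assumption that is not guaranteed in general: the result of a neg is only ever read by the instruction immediately following it. A real pass would need proper use tracking before dropping the neg.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical IR that allows a "negate" modifier on each source. */
enum opcode { OP_MOV, OP_ADD, OP_NEG };

struct src { unsigned int reg; int negate; /* 0 or 1 */ };

struct insn
{
    enum opcode op;
    unsigned int dst;
    unsigned int src_count;
    struct src src[2];
};

/* Fold "neg t, x; op d, ..., t" into "op d, ..., -x", compacting the array
 * in place. Returns the new instruction count.
 * ASSUMPTION: t is not read anywhere except the next instruction. */
size_t fold_neg(struct insn *insns, size_t count)
{
    size_t in, out = 0;
    unsigned int i;

    for (in = 0; in < count; ++in)
    {
        struct insn *cur = &insns[in];
        int folded = 0;

        if (cur->op == OP_NEG && in + 1 < count)
        {
            struct insn *next = &insns[in + 1];

            for (i = 0; i < next->src_count; ++i)
            {
                if (next->src[i].reg != cur->dst)
                    continue;
                /* Read the neg's operand directly; the neg itself contributes
                 * one negation on top of any modifier its operand carried. */
                next->src[i].reg = cur->src[0].reg;
                next->src[i].negate ^= !cur->src[0].negate;
                folded = 1;
            }
        }
        if (!folded)
            insns[out++] = *cur;
    }
    return out;
}
```

The mental burden mentioned above shows up here: whoever writes the HLSL side has to know that modifiers are legal in the IR but are only expected to appear after this pass has run.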
(4) Maximally CISC, but we lower some instructions into multiple instructions when converting out of vkd3d_shader_instruction. This avoids all the disadvantages mentioned thus far, but it means that adding a new backend might imply modifying all the other backends to explicitly lower or fail on new CISC constructions.
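As a sketch of what (4) asks of each backend, consider a target with no native "div" (which, as far as I know, is the case for sm1, where division is done as rcp followed by mul). The names below are hypothetical. The backend has to lower "div" itself and explicitly reject anything it doesn't recognize:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes; IR_SINCOS stands in for "some construct this
 * backend has not been taught about". */
enum ir_opcode { IR_MUL, IR_RCP, IR_DIV, IR_SINCOS };

enum lower_status { LOWER_OK = 0, LOWER_UNSUPPORTED = -1 };

struct ir_insn
{
    enum ir_opcode op;
    unsigned int dst, src0, src1;
};

#define MAX_OUT 16

struct out_buffer
{
    struct ir_insn insns[MAX_OUT];
    size_t count;
    unsigned int next_temp; /* next register index free for temporaries */
};

void push(struct out_buffer *out, enum ir_opcode op,
        unsigned int dst, unsigned int src0, unsigned int src1)
{
    struct ir_insn *ins;

    assert(out->count < MAX_OUT);
    ins = &out->insns[out->count++];
    ins->op = op;
    ins->dst = dst;
    ins->src0 = src0;
    ins->src1 = src1;
}

/* Write one IR instruction for a target without a native div. */
enum lower_status backend_write(struct out_buffer *out, const struct ir_insn *in)
{
    unsigned int tmp;

    switch (in->op)
    {
        case IR_MUL:
        case IR_RCP:
            /* Directly representable on the target. */
            push(out, in->op, in->dst, in->src0, in->src1);
            return LOWER_OK;

        case IR_DIV:
            /* dst = src0 / src1  =>  tmp = rcp(src1); dst = src0 * tmp */
            tmp = out->next_temp++;
            push(out, IR_RCP, tmp, in->src1, 0);
            push(out, IR_MUL, in->dst, in->src0, tmp);
            return LOWER_OK;

        default:
            /* Every backend needs a case like this for constructs that
             * other frontends or backends introduced. */
            return LOWER_UNSUPPORTED;
    }
}
```

Each new CISC construction added to the IR means revisiting a switch like this in every backend, which is the cost described above.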
I think (2) is my favourite solution. I was originally worried about the overhead of modifying the instruction stream, but then I realized that we don't really need to modify it, and it doesn't really increase the number of IR passes either. It's also generally nicer the simpler an IR is. I also think that it's worth having to write a separate disassembly pass, but I can also anticipate disagreement there...