On Thu Apr 27 01:58:24 2023 +0000, Zebediah Figura wrote:
Among other things, this means that it's still not completely clear to me what you mean by "maximally CISC" or "minimally CISC" (and while I think I sort of get the general idea for the terms, that's not nearly enough to understand options (3) and (4)).

Ah, I'm sorry. To put it into more concrete terms: the HLSL IR is intentionally quite simple. I would describe it as erring on the side of RISC in design. Everything operates on SSA values except for extern and resource loads; single instructions do pretty much maximally modular things; we avoid adding new expression types or instruction types if we can instead just lower them to simpler expressions immediately.

By contrast, both sm1 and sm4 are more CISC. One instruction can do multiple arithmetic operations (abs, neg, sat) in addition to whatever else it's doing; instructions load directly from multiple types of registers (instead of always going through an SSA value, or even just always going through a temporary register).

Part of being able to make statements like "HLSL IR is RISC" is the knowledge that it used to be less so (e.g. we used to have HLSL_IR_CONSTRUCTOR, which I assume you can guess the idea of) and also that we've considered making it less so. On the basis that one HLSL instruction corresponds to one smX instruction, we've considered adding some of those features to HLSL IR (for example, making hlsl_src hold a union that includes not just SSA values but also immediate constants or something similar to hlsl_deref). We eventually decided against those on the grounds that it would make the IR more complex, and harder to reason about when doing optimization passes.
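To make the RISC/CISC contrast above concrete, here is a minimal sketch in C. All type and function names (`risc_node`, `cisc_src`, `fold_modifiers`) are hypothetical and are not taken from vkd3d; the sketch also assumes that an operand's abs modifier is applied before its neg modifier. It shows how a chain of separate single-purpose NEG/ABS nodes in a RISC-style IR could collapse into modifiers on one CISC-style source operand.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical RISC-style node: one operation per node, and operands are
 * always SSA values, given as indices of earlier nodes (-1 when unused). */
enum risc_op { RISC_LOAD, RISC_NEG, RISC_ABS, RISC_ADD };

struct risc_node
{
    enum risc_op op;
    int src[2];
};

/* Hypothetical CISC-style source operand: a register plus modifiers.
 * Assumed semantics: value = neg ? -(abs ? |r| : r) : (abs ? |r| : r). */
struct cisc_src
{
    int reg;
    bool neg, abs;
};

/* Walk from the outermost NEG/ABS node inwards and fold the chain into
 * modifiers on a single source operand. */
static struct cisc_src fold_modifiers(const struct risc_node *nodes, int idx)
{
    struct cisc_src src = {idx, false, false};

    while (nodes[idx].op == RISC_NEG || nodes[idx].op == RISC_ABS)
    {
        if (nodes[idx].op == RISC_ABS)
            src.abs = true;
        else if (!src.abs) /* A NEG inside an ABS has no effect. */
            src.neg = !src.neg;
        idx = nodes[idx].src[0];
        assert(idx >= 0);
    }
    src.reg = idx; /* For simplicity, the node index doubles as the register. */
    return src;
}
```

With this shape, an expression like `-abs(x) + y` is four nodes on the RISC side but could be emitted as a single add with one modified source on the CISC side.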
- In general, I think that code processing should be done in the form of IR passes as much as possible, rather than being embedded in the frontends or backends. This helps modularity and code sharing. I think Conor's patches go in this direction, which makes sense to me. Frontends and backends already take care of serialization and deserialization and should not be loaded with excessive other duties.

Yeah. Well, in a sense the question can be rephrased as: should v-s IR exist just as a common, "neutral" format, and then we'd have basically frontend- and/or backend-specific IR that we'd do passes over? E.g. in this case the backend IR would probably be adapted from struct sm4_instruction, which currently exists just to be a slightly more structured version of the byte code, but could grow to be more than that. It seems we probably won't be moving this way, but that's kind of what I was envisioning with that proposal. In that case struct vkd3d_shader_instruction wouldn't have any optimization passes done over it; its only raison d'être would be to help allow mixing any frontend with any backend.
- Currently my understanding is that `vkd3d_shader_instruction` is basically modeled after SM4. When converting SM4 -> SPIR-V, the SM4 code is basically deserialized to `vkd3d_shader_instruction` and then rewritten to SPIR-V in a rather naive way. The deserialization step is very syntactical, to the point that the original SM4 code can be faithfully disassembled from the IR.

I believe that vkd3d_shader_instruction is modeled after (or even "adapted directly from") struct wined3d_shader_instruction, which was designed to handle both sm1 and sm4. Fundamentally the formats are relatively similar, enough that it's possible to write a single disassembly routine, and a single GLSL shader backend, that mostly handles both, although there did need to be a lot of version-specific code in the latter case.
If we want `vkd3d_shader_instruction` to be flexible enough to support different frontends and backends, I think it must somehow be unchained from SM4. In particular, SM4 disassembling has to go through a different path.

Right, and there's part of the rub. If we want it to be usable for things like disassembly (and assembly), we need v_s_i to be able to express everything that *any* frontend or backend can. This ends up kind of bloating the structure, which is one of the reasons I'm not sure we want that "maximally CISC" kind of IR. Like I mentioned, the less complicated an IR is, the easier it is to work with.

On the other hand, the kind of passes we'd be potentially doing over a maximally CISC v_s_ir aren't the same as the work we do with HLSL IR. HLSL has to bridge the gap from text all the way down to byte code, but v_s_ir would potentially just be a bunch of peepholes. The fact that it doesn't have to deal with *types*, or well, doesn't have to deal with data structures, is already quite a benefit. So I'm not sure anymore that that per se is a concern. And of course when trading off one complex IR against multiple (ideally less complex) IRs, it takes some judgement to decide which is the best option.
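As an illustration of what "a bunch of peepholes" over a flat, untyped instruction stream might look like, here is a minimal sketch in C. The `peep_instr` structure and the `peephole_remove_self_moves` function are hypothetical and far simpler than `vkd3d_shader_instruction`; the point is only that such a pass looks at a small window of instructions (here, one) and needs no type or data-structure information.

```c
#include <stddef.h>

/* Hypothetical flat instruction: an opcode, a destination register and up
 * to two source registers. */
enum peep_op { PEEP_MOV, PEEP_ADD, PEEP_MUL };

struct peep_instr
{
    enum peep_op op;
    int dst;
    int src[2];
};

/* Peephole pass: drop "mov rN, rN" instructions, which have no effect.
 * Compacts the array in place and returns the new instruction count. */
static size_t peephole_remove_self_moves(struct peep_instr *ins, size_t count)
{
    size_t i, out = 0;

    for (i = 0; i < count; ++i)
    {
        if (ins[i].op == PEEP_MOV && ins[i].dst == ins[i].src[0])
            continue;
        ins[out++] = ins[i];
    }
    return out;
}
```

A real pass would also have to account for things this sketch ignores, such as write masks, swizzles and modifiers on the move, which is where a more CISC instruction format makes even simple peepholes more involved.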
Thanks for the explanation. Keeping in mind my remark about salt and cargo ships from yesterday, here are my thoughts.
If we want it to be usable for things like disassembly (and assembly), we need v_s_i to be able to express everything that *any* frontend or backend can.
And I don't think we want that; it would be unmanageable.
should v-s IR exist just as a common, "neutral" format, and then we'd have basically frontend- and/or backend-specific IR that we'd do passes over?
Ideally this would feel like a good design to me. Though I would say that it is not a hard requirement for frontends and backends to have their specific IR, even if it is conceivable that past a certain complexity threshold it is difficult to do without. Also, my idea is that you can do passes over {front,back}end specific IRs, but also over the common IR. At least, I'd design the common IR in such a way that they can be done.
In practice I see a number of difficulties:
* Shaders are not just code; they also embed interface information (input and output signatures, uniforms, buffers, object bindings, ...) and other random metadata (for example for the hull shaders). Should this be representable in the common IR?
* How essential (or "RISC") should the IR be? Should it be scalar or vector? Making it more essential means that frontends have to do more work and common IR passes are easier to write; making it less so means that backends have to do more work, though it also has the advantage that passes that make sense for different frontends and backends can be deduplicated.
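
The scalar-versus-vector question can be made concrete with a small sketch in C. The types and the `scalarize` function below are hypothetical; the sketch assumes a four-component write mask (bit 0 = x through bit 3 = w) and ignores swizzles entirely. It shows the extra lowering step that something would have to perform if the common IR were scalar while a frontend format is vector-based.

```c
#include <stddef.h>

enum vec_op { VEC_ADD, VEC_MUL };

/* Hypothetical vector instruction with a write mask on the destination. */
struct vec_instr
{
    enum vec_op op;
    int dst, src0, src1;
    unsigned int write_mask; /* Bit i set means component i is written. */
};

/* Hypothetical scalar instruction operating on one component. */
struct scalar_instr
{
    enum vec_op op;
    int dst, src0, src1;
    unsigned int component;
};

/* Expand one vector instruction into one scalar instruction per written
 * component. Returns the number of scalar instructions produced (0 to 4). */
static size_t scalarize(const struct vec_instr *v, struct scalar_instr out[4])
{
    size_t n = 0;
    unsigned int c;

    for (c = 0; c < 4; ++c)
    {
        if (!(v->write_mask & (1u << c)))
            continue;
        out[n].op = v->op;
        out[n].dst = v->dst;
        out[n].src0 = v->src0;
        out[n].src1 = v->src1;
        out[n].component = c;
        ++n;
    }
    return n;
}
```

A vector-based backend would then need the opposite transformation to recombine such scalar instructions, which is the kind of trade-off between frontend and backend work that the bullet above describes.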
Maybe I am just rehashing the same elements around and around. It's quite hard to design this thing...