On Mon May 29 10:08:54 2023 +0000, Giovanni Mascellani wrote:
The thing that bothers me is not really that the generated IR is different, but that the IR semantics themselves are different. Specifically, it seems that in SM4 an instruction of type `struct hlsl_ir_jump` means "jump if `arg`, which is supposed to be a scalar, is non-zero", while in SM1 it is "jump if `arg` has any negative component". This must be taken in consideration by all the downstream code, and it is not a good idea in my opinion. Incidentally, `arg` might probably use a better name (e.g., `condition`?). Also, I would add a comment in `struct hlsl_ir_jump` specifying that `arg` is only considered for jumps of type `HLSL_IR_JUMP_DISCARD`, which is otherwise non obvious. On top of that, there is probably a decision that should at least be more explicit here: when I proposed to add an IR operator corresponding to the C ternary operator this was rejected because, if I understood well, we don't want to make the IR too rich, so that it's easier to write the optimizer (because there are fewer possible ways to express the same constructs). So, it is intended that HLSL `a ? b : c` must be translated to IR `if (a) { store(x, b); } else { store(x, c); } load(x)` rather than using an IR-level ternary operator. Here we've got essentially the same dilemma, so according to the same principle we should avoid adding `arg`, but rather use an `if` construct; and the task of lowering that to a better TPF program should be delegated to some other optimizer that might in principle appear at some point in the future. Now, I was against this decision at the time, so I should be in favor to what Nikolay is doing here. And in a sense I am, except that I feel we should a bit more consistent about these design decisions. Is this case different from the ternary operator? Or did we decide that since the lower-level optimizer doesn't exist (or, more precisely, while it doesn't exist yet) we agree that it makes sense for the current IR to be a little more flexible, so that some optimizations can still be done on it? That wouldn't be similar in spirit to !215.
Why downstream code needs to be aware of this? If SM4+ condition is explicitly written out here, or later, it will only matter if we wanted to remove dead discard calls, by evaluating arguments differently. But even then we might scan for resulting "texkill/discard_nz" later and remove them, and not at IR level.
Calling this a design decision is a bit too much, it's too minor for that. Having separate texkill and discard jump types wouldn't bother me and won't add a new type of IR node, but will also need separate paths depending on profile.