On 4/28/20 4:22 PM, Matteo Bruni wrote:
I have another proposal, as a way to sidestep the whole deref story. As usual, with the caveat that this is all armchair planning and it is known that plans rarely survive contact with the enemy. Anyway...
With the current IR we end up having arbitrarily complex deref chains to represent the composition of multiple levels of array and struct "indexing", which are necessarily cumbersome to handle. What if we transform all of that into offsetting instead? An arbitrary variable dereference would be represented as (variable, offset), with offset an arbitrary expression with an integer scalar value. That requires that we know the layout of complex data types early on, which now we can do, I think. The offsets could be in scalar or vec4 units, I can see benefits for either choice. Ideally we would directly generate offsets right during parsing and skip derefs altogether, although that might not be practical. An obvious downside coming from this is that constant folding becomes pretty much required, to compute a register index at codegen time. It does have the advantage that non-constant offsets are no different from constant ones as far as IR is concerned. Obviously we can skip supporting those initially, regardless of all this.
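To make the (variable, offset) idea concrete, here is a minimal sketch in C of folding a chain of constant array and struct indices into a single offset in scalar units. The types and names (struct type, struct field, fold_offset) are hypothetical stand-ins for illustration, not the compiler's actual IR, and it assumes each type's size and each field's offset have already been computed when the type was created.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified type description (not the real IR types). */
enum type_kind { TYPE_SCALAR, TYPE_ARRAY, TYPE_STRUCT };

struct type;

struct field
{
    const struct type *type;
    unsigned int offset;            /* offset within the struct, in scalars */
};

struct type
{
    enum type_kind kind;
    unsigned int size;              /* total size, in scalars */
    const struct type *element;     /* TYPE_ARRAY: element type */
    unsigned int count;             /* TYPE_ARRAY: number of elements */
    const struct field *fields;     /* TYPE_STRUCT: field list */
    unsigned int field_count;       /* TYPE_STRUCT: number of fields */
};

/* Fold a path of constant indices (array index or field index at each
 * level) into one scalar offset. Iterative, no recursion needed. */
unsigned int fold_offset(const struct type *type, const unsigned int *path, size_t path_len)
{
    unsigned int offset = 0;
    size_t i;

    for (i = 0; i < path_len; ++i)
    {
        if (type->kind == TYPE_ARRAY)
        {
            assert(path[i] < type->count);
            offset += path[i] * type->element->size;
            type = type->element;
        }
        else
        {
            assert(type->kind == TYPE_STRUCT && path[i] < type->field_count);
            offset += type->fields[path[i]].offset;
            type = type->fields[path[i]].type;
        }
    }
    return offset;
}

/* Example layout: struct { float a; float b[3]; } s[4]; */
static const struct type float_type = {.kind = TYPE_SCALAR, .size = 1};
static const struct type float3_array = {.kind = TYPE_ARRAY, .size = 3, .element = &float_type, .count = 3};
static const struct field s_fields[] = {{&float_type, 0}, {&float3_array, 1}};
static const struct type s_type = {.kind = TYPE_STRUCT, .size = 4, .fields = s_fields, .field_count = 2};
static const struct type s_array = {.kind = TYPE_ARRAY, .size = 16, .element = &s_type, .count = 4};
```

With this layout, s[2].b[1] corresponds to the path {2, 1, 1} and folds to 2 * 4 + 1 + 1 = 10 scalars. A non-constant index would turn the same sum into an expression tree instead of an integer, which is where constant folding comes in.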
I like this inasmuch as it's basically "hlsl_deref chains, but simpler"; I guess it requires no recursion anywhere, and so kind of sidesteps the constraints that led us here in the first place.
Yeah, I didn't really love any of the options coming from derefs so decided to throw them away and see what happens :P
What do you think? I have a complication right away, which is the fact that in SM3 struct variables can be allocated into multiple register sets (see http://www.winehq.org/pipermail/wine-devel/2013-August/100705.html). I don't think it's a blocker but it is annoying...
The way I'd solve that is just to allow for potentially multiple hlsl_reg structures for any given uniform. That could be as simple as including three separate hlsl_reg structures inside hlsl_var. sm2-3 apparently allocates structs by limiting the size to include the last used element (whereas earlier unused elements are still allocated). That'd just have to be done per type.
Hmm, not sure if it does allocate "unnecessary" elements but, if that's the case, that's quite nice indeed since it means offsets (when counting in units of actual SM2 registers) are the same regardless of the register set.
From my testing it essentially does, yes, i.e. if you have
    struct
    {
        int unused;
        float f;
        bool b;
    } s;

    float4 main(float4 pos : POSITION) : POSITION
    {
        if (s.b)
            return s.f;
        return pos;
    }
then "s" gets allocated to registers b0-b2 and c0-c1, but only b2 and c1 are ever used.
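As a rough sketch of that observed rule, the C below keeps one register range per register set for a variable and sizes each range to reach the last used field in that set. Everything here (enum regset, struct field_use, struct var_alloc, compute_alloc) is a hypothetical illustration rather than the actual hlsl_reg or hlsl_var layout, and it assumes every field occupies exactly one register, that the variable starts at register 0 in each set, and that the set of each field is already known.

```c
#include <assert.h>
#include <stdbool.h>

/* Constant register sets: bool (b#), int (i#), float (c#). */
enum regset { REGSET_BOOL, REGSET_INT, REGSET_FLOAT, REGSET_COUNT };

/* Hypothetical per-field info: which set it lives in and whether the
 * shader actually reads it. */
struct field_use
{
    enum regset set;
    bool used;
};

/* Hypothetical per-variable allocation: one register range per set. */
struct reg_range
{
    unsigned int index;     /* first register in the set */
    unsigned int count;     /* number of registers reserved */
};

struct var_alloc
{
    struct reg_range regs[REGSET_COUNT];
};

/* For each set, reserve registers up to and including the last used
 * field of that set. Earlier fields are reserved even if unused, so a
 * field's offset is the same in every set. */
void compute_alloc(const struct field_use *fields, unsigned int field_count,
        struct var_alloc *alloc)
{
    unsigned int i;

    for (i = 0; i < REGSET_COUNT; ++i)
    {
        alloc->regs[i].index = 0;
        alloc->regs[i].count = 0;
    }

    for (i = 0; i < field_count; ++i)
    {
        assert(fields[i].set < REGSET_COUNT);
        if (fields[i].used)
            alloc->regs[fields[i].set].count = i + 1;
    }
}
```

Fed the three fields from the example (unused int, used float, used bool), this reserves three bool registers and two float registers and no int registers, matching b0-b2 and c0-c1.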
So yeah, it makes things pretty simple. I can see how it would have been a lot uglier otherwise.
I'm admittedly not sure how changing our representation of derefs affects this specific problem, but maybe I'm missing something.
It doesn't seem insurmountable either way, just something that's maybe a bit more of a hassle with offsets vs derefs. Or not, if the above is how it works.
To my simultaneous delight and chagrin I can't come up with anything else that this approach would break.
I call it a success! Until you find it while updating the code :D