On Thu, Nov 11, 2021 at 4:41 PM Giovanni Mascellani gmascellani@codeweavers.com wrote:
Hi,
On 11/11/21 14:43, Matteo Bruni wrote:
Continuing from your example: assuming a, b and c are temporaries, you split them into 4 vectors each and update the LOAD and STORE instructions to point to the specific vector. Once that is done, it becomes explicit that those groups of 4 instructions (LOAD x2, ADD, STORE) are in fact entirely independent from each other. That alone might help further transformations down the road. It's also pretty nice for register allocation, as it's easier to allocate 4 groups of 4 registers rather than a single contiguous group of 16. Sometimes you can even find out that whole rows / columns are unused and drop them altogether.
The same applies to all the complex types of course, not just matrices. There is a complication with the above in that sometimes it can be impossible to split the vars. That is, when the load / store offset is not always known at compile time. That's a bit unfortunate but it should also be pretty rare in HLSL shaders. I think it's worthwhile to optimize for the common case and accept that we won't necessarily have the best code when things align badly for us.
Ok, that is the part I was missing: it's possible that a variable cannot be split, and we want to handle that case as well (meaning that it's sensible to optimize for the common case, but we also want to be correct in _all_ cases).
Exactly. At least that's how I see it.
With all that said: WRT copy propagation and this patch specifically, I think it's a good idea to only handle vector variables if it makes things easier (as it should). Notice that you don't have to bail out entirely even in the "bad" case, as a non-vector is perfectly fine as a "value". It's only when the complex variable is the destination of a store that we're in trouble.
Though I don't understand why we should arbitrarily ignore non-vector variables. They don't require any special handling at all, just using a dynamically-sized array instead of a static one. All the rest is perfectly identical, the is no simplification in the algorithm itself. Arguably, we would make it more complicated, because we would arbitrarily drop some cases, which would make a reader why we are doing that.
I think focusing on vectors is a good general direction and keeping that in mind might lead to simpler passes down the line. I guess it's not so useful when taken into consideration after the fact, like in this particular case. It's not a big deal either way.
In the specific case of my copy propagation pass, this would make things more complicated. For example, if right now I cannot reconstruct the offset of a store, I can just invalidate the whole variable. In your model, as I get it, I'd have to also invalidate other variables, that are unrelated by that point.
I don't think that's the case? A STORE is always directed to a specific temporary variable and will affect that one alone. I guess you were thinking of a model where you always split variables into vectors no matter what, in which case you're right, it quickly becomes a mess...
Yeah, I has understood that way.
Thanks, Giovanni.