We want (a) at least since there are many cases where we index arrays with swizzles like `arr[a.x]` and we want to avoid relative addressing (for SM1 for instance). Also for instructions that expect constant values like the gather offsets.
But copy_propagation_transform_swizzle() only does something useful when a variable has components from different sources *and* those components aren't all constant. Unless I'm mistaken, that is.
Even the example of "arr[a.x]" you give shouldn't actually need copy_propagation_transform_swizzle(). Consider the following:
```
const uint2 indices = {1, 2};
return array[indices.x];
```
which in IR should be
```
2: 1
3: 2
4: indices.x = @2
5: indices.y = @3
6: indices
7: @6.x
```
I'd argue this shader is still a little pathological (why would someone define that as a vector?) but even it is covered by current passes. copy_propagation_transform_load() -> copy_propagation_replace_with_constant_vector() should vectorize @6 into a constant array, and then hlsl_fold_constant_swizzles() deals with @7.
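To make that argument concrete, here is a toy Python model of the two steps. It is only a sketch: the `Const`/`Store`/`Load`/`Swizzle` classes and both function names are made up for illustration and are not vkd3d's actual IR or pass implementations. It assumes a load can be replaced by a constant vector whenever every component was last stored from a constant, and that a swizzle of a constant can then be folded.

```python
from dataclasses import dataclass


# Hypothetical, simplified IR nodes; not vkd3d's real structures.
@dataclass
class Const:
    values: tuple


@dataclass
class Store:
    var: str
    component: int
    src: int  # index of the instruction that produces the stored value


@dataclass
class Load:
    var: str
    width: int


@dataclass
class Swizzle:
    src: int
    components: tuple


def propagate_constant_loads(program):
    """Replace a Load by a Const when all its components come from constants."""
    known = {}  # (var, component) -> last stored constant value
    out = {}
    for idx, instr in program.items():
        if isinstance(instr, Store):
            src = out[instr.src]
            if isinstance(src, Const) and len(src.values) == 1:
                known[(instr.var, instr.component)] = src.values[0]
            else:
                # A non-constant store invalidates what we knew.
                known.pop((instr.var, instr.component), None)
        elif isinstance(instr, Load):
            comps = [known.get((instr.var, c)) for c in range(instr.width)]
            if all(c is not None for c in comps):
                instr = Const(tuple(comps))
        out[idx] = instr
    return out


def fold_constant_swizzles(program):
    """Replace a Swizzle of a Const by the selected constant components."""
    out = {}
    for idx, instr in program.items():
        if isinstance(instr, Swizzle) and isinstance(out[instr.src], Const):
            src = out[instr.src]
            instr = Const(tuple(src.values[c] for c in instr.components))
        out[idx] = instr
    return out


# The IR from the example above, keyed by instruction index.
program = {
    2: Const((1,)),
    3: Const((2,)),
    4: Store("indices", 0, 2),
    5: Store("indices", 1, 3),
    6: Load("indices", 2),
    7: Swizzle(6, (0,)),
}

folded = fold_constant_swizzles(propagate_constant_loads(program))
```

In this model, @6 becomes a constant vector and @7 becomes a scalar constant, with no swizzle-specific copy propagation involved.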
b7d34e83077 originated from discussion in 51. I'm not really sure why it was written, but I think it came down to something like "we should vectorize instead of extending copy-prop" followed by "well, that won't handle this case", but the case in question wasn't really realistic.
The only problem I have is that an implementation may be complicated enough that we won't want it during the code freeze, and a fix for the infinite loops is a higher priority.
So, summarizing we have 4 options:
1. Going for this MR as it is.
2. Removing (b) alone, with a slight chance that some unknown apps may break.
3. Removing the whole copy_propagation_transform_swizzle(), breaking some tests and with more chance of breaking apps.
4. Implementing the "log" proposal.
I would prefer (1) during the freeze and (4) after the freeze. In fact, I had better start writing (4) now, while I have the copy-prop cartridge loaded in my head.
I'm inclined to disagree honestly. This fix makes me very nervous. I would be a lot less nervous about accepting a proper fix. I also suspect that