I have several immediate thoughts on this:
(1) Q: Why does this only affect object loads? A: Because only object loads grab the deref directly instead of just referring to the LOAD instruction. Q: But don't regular loads have the same problem? A: No, because a sequence like
    2: var
    3: var = @2
    4: var
gets turned into
    ...
    4: @2
(modulo replicate swizzle, which I think we have a pass to remove), and we can't ever turn that *back* into "4: var" because var was written in the previous instruction.
In theory the same restriction *should* be what's preventing object loads from being affected, but that's not really feasible, is it...
Still, we should have tests for that case. We should also have tests for assignment-to-self where swizzles are involved, probably including both swizzles that have "loops" (e.g. "var.xy = var.xz") and those that don't (e.g. "var.xy = var.yx").
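To pin down what those swizzle tests should expect, here is a small reference model in C. It is only a sketch and not vkd3d code: ref_swizzle_self_assign() and component_index() are hypothetical names, and the model assumes the usual semantics that every source component is read before any destination component is written.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static const char component_names[] = "xyzw";

/* Map a component letter to its index; anything other than x/y/z/w asserts. */
static size_t component_index(char c)
{
    const char *p = c ? strchr(component_names, c) : NULL;

    assert(p);
    return (size_t)(p - component_names);
}

/* Hypothetical reference model of "v.<dst> = v.<src>" on a 4-component
 * vector. All source components are read into a temporary first, so a
 * destination write can never feed a later source read. */
static void ref_swizzle_self_assign(float v[4], const char *dst, const char *src)
{
    size_t count = strlen(dst);
    float tmp[4];

    assert(count == strlen(src) && count <= 4);
    for (size_t i = 0; i < count; ++i)
        tmp[i] = v[component_index(src[i])];
    for (size_t i = 0; i < count; ++i)
        v[component_index(dst[i])] = tmp[i];
}
```

Starting from (1, 2, 3, 4), this model gives (1, 3, 3, 4) for "var.xy = var.xz" and (2, 1, 3, 4) for "var.xy = var.yx", which is what I'd expect the tests to check.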
I also feel like there's some way that we should be able to remember why this only matters for object loads. Maybe we can write a code comment to the effect of the above. But it's also possible that we should be handling this not in copy_propagation_transform_object_load() but rather in copy_propagation_record_store(), even though that requires peeking at the RHS. Not sure about that one.
(2) We're going to need to get rid of these assignments anyway, though. Consider the last test, which still fails—I assume because we're left with object load/stores in the IR and we can't translate those to sm4.
So I'd advocate for doing this in a pass before copy-prop. We don't actually need copy-prop to achieve it, we just want to look for stores whose rhs is a load from the same variable.
We could leave an "assert(!hlsl_derefs_are_equal())" in copy-prop though.
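To make (2) a bit more concrete, here is a rough sketch of such a pass over a toy IR. Everything in it is made up for illustration (struct toy_instr, toy_remove_self_stores(), and so on are not the actual HLSL IR types or helpers). It only handles whole-variable loads and stores in straight-line code, and ignores swizzles, partial derefs and control flow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy IR, purely illustrative; none of these names exist in vkd3d. */
enum toy_op { TOY_LOAD, TOY_STORE, TOY_OTHER };

struct toy_instr
{
    enum toy_op op;
    int var;      /* Variable read (TOY_LOAD) or written (TOY_STORE). */
    int rhs;      /* TOY_STORE only: index of the instruction supplying the value. */
    bool removed;
};

/* True if a live store to "var" lies strictly between "first" and "last".
 * Stores already marked as removed are no-ops and can be ignored. */
static bool toy_var_written_between(const struct toy_instr *instrs,
        size_t first, size_t last, int var)
{
    for (size_t i = first + 1; i < last; ++i)
    {
        if (!instrs[i].removed && instrs[i].op == TOY_STORE && instrs[i].var == var)
            return true;
    }
    return false;
}

/* Mark every "var = load(var)" store as removed, provided that var was not
 * written between the load and the store. Returns the number removed. */
static size_t toy_remove_self_stores(struct toy_instr *instrs, size_t count)
{
    size_t removed = 0;

    for (size_t i = 0; i < count; ++i)
    {
        struct toy_instr *store = &instrs[i];

        if (store->op != TOY_STORE)
            continue;
        assert(store->rhs >= 0 && (size_t)store->rhs < i);

        const struct toy_instr *rhs = &instrs[store->rhs];

        if (rhs->op != TOY_LOAD || rhs->var != store->var)
            continue;
        if (toy_var_written_between(instrs, (size_t)store->rhs, i, store->var))
            continue;

        store->removed = true;
        ++removed;
    }
    return removed;
}
```

The intervening-write check matters: if var is overwritten between the load and the store, the store restores the old value and is not a no-op. A real pass would compare derefs rather than plain variable indices.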
(3) Why does that test require sm5? For that matter, can't we write it so that it uses a sample rather than a load, so that it can work on sm1 as well?