Naturally, the proper solution is to detect usage with component-level granularity. That means making the first_write and last_read fields of hlsl_ir_var component-wise arrays.
Yeah, I think so. The other option is to actually split temporary variables into smaller pieces, which I think was proposed at some point, but thinking about it further, that doesn't really have much of an advantage over doing things component-wise. In particular, you'd have to either completely split up vectors or forgo doing per-component passes (DCE, copy-prop), and if you have to implement per-component passes anyway, then there's not really a difference between that and doing them for larger variables.
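To make the component-wise idea concrete, here is a rough sketch of what per-component first_write/last_read tracking could look like. It is not the actual hlsl_ir_var definition: struct var_liveness, MAX_COMPONENTS, and all the helper names are hypothetical, and it assumes a single vec4-sized temporary with instruction indices starting at 1 (so 0 can mean "never read").

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Assumption: one vec4-sized temporary; a real variable may span more components. */
#define MAX_COMPONENTS 4

/* Hypothetical stand-in for the liveness fields of hlsl_ir_var. */
struct var_liveness
{
    unsigned int first_write[MAX_COMPONENTS]; /* UINT_MAX means "never written" */
    unsigned int last_read[MAX_COMPONENTS];   /* 0 means "never read" */
};

static void liveness_init(struct var_liveness *l)
{
    for (unsigned int i = 0; i < MAX_COMPONENTS; ++i)
    {
        l->first_write[i] = UINT_MAX;
        l->last_read[i] = 0;
    }
}

/* Record a store at instruction `index` to the components selected by `writemask`. */
static void record_write(struct var_liveness *l, unsigned int writemask, unsigned int index)
{
    for (unsigned int i = 0; i < MAX_COMPONENTS; ++i)
    {
        if ((writemask & (1u << i)) && index < l->first_write[i])
            l->first_write[i] = index;
    }
}

/* Record a load at instruction `index` from the components selected by `readmask`. */
static void record_read(struct var_liveness *l, unsigned int readmask, unsigned int index)
{
    for (unsigned int i = 0; i < MAX_COMPONENTS; ++i)
    {
        if ((readmask & (1u << i)) && index > l->last_read[i])
            l->last_read[i] = index;
    }
}

/* A component is live at `index` if it lies between its first write and last read. */
static bool component_is_live(const struct var_liveness *l, unsigned int component, unsigned int index)
{
    assert(component < MAX_COMPONENTS);
    return l->first_write[component] <= index && index <= l->last_read[component];
}
```

With this shape, a component that is written but never read ends up with an empty live range, so an allocator could reuse its register slot while the other components of the same variable stay live.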