The idea is: if you have something like that `float4(f, 1, 2, 3)` sequence, you can't copy-prop loads from `<constructor-2>` with the current pass, because its components don't all have the same instruction as a source. So instead, any time `<constructor-2>` is itself used as the rhs of a store, you split that store into multiple stores. You could do one per unique source, but it's simpler to do one per component (and let vectorization clean that up later). That way you're guaranteed to be able to copy-prop them. You increase the number of stores but (hopefully) reduce the number of intermediate variables.
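To make the split-then-copy-prop idea concrete, here is a toy sketch in Python. It is not the compiler's actual IR: the tuple encoding, the names `ctor` and `out`, and both function names are made up for illustration, and it only handles straight-line code with no control flow.

```python
COMPONENTS = "xyzw"

def split_vector_copies(program):
    """Turn each ("copy", dst, src) into one scalar store per component."""
    out = []
    for instr in program:
        if instr[0] == "copy":
            _, dst, src = instr
            for comp in COMPONENTS:
                out.append(("store", dst, comp, ("load", src, comp)))
        else:
            out.append(instr)
    return out

def copy_propagate(program):
    """Replace scalar loads by the value last stored to that component."""
    known = {}  # (variable, component) -> value last stored there
    out = []
    for op, dst, comp, value in program:
        if value[0] == "load":
            value = known.get((value[1], value[2]), value)
        # Overwriting dst.comp invalidates any recorded copies of it.
        known = {k: v for k, v in known.items() if v != ("load", dst, comp)}
        known[(dst, comp)] = value
        out.append((op, dst, comp, value))
    return out

# Hypothetical stand-in for float4(f, 1, 2, 3) followed by a copy to the output.
program = [
    ("store", "ctor", "x", ("node", "f")),
    ("store", "ctor", "y", ("const", 1.0)),
    ("store", "ctor", "z", ("const", 2.0)),
    ("store", "ctor", "w", ("const", 3.0)),
    ("copy", "out", "ctor"),
]
result = copy_propagate(split_vector_copies(program))
```

After both steps, the last four entries of `result` store `f`, `1`, `2`, `3` directly into `out`, so the stores to `ctor` become dead and a later dead-code pass (not shown) could drop the intermediate variable.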
The way I read your proposal (for the second step) is as a sort of *variable deduplication* (or maybe *dealiasing*?). What's happening in your test program is that `<constructor-2>` and `<output-sv_target0>` are essentially the same variable: as soon as they are both initialized they have the same value, and they keep it for their whole lifespan. So you can basically replace one with the other. In theory you can do it either way, but in this case you need `<output-sv_target0>` to survive because it has an externally visible semantic.
Kind of, but it's more general than that. In theory the "intermediate" variable (in this case the constructor) can be used elsewhere, and it doesn't have to have the same component count as the "final" one. Although in practice it'll probably just be useful for the more restricted case of equal and "duplicate" variables.
Though I am not sure what you mean by "generic vectorization of stores". The peculiar feature exploited by this patch is that immediate constants can be vectorized, i.e., you can replace
`a.x = 1; a.y = 2; a.z = 3;`
with
`a.xyz = {1, 2, 3};`
But as I said, that's a property of immediate constants rather than of stores. In other words, I would describe a "generic vectorization of stores" as something that replaces
`a.x = b.x; a.y = b.y; a.z = b.z;`
with
`a = b;`
That's not something you can do in Francisco's example unless you know that immediate constants are special, i.e., that you can vectorize them even if they come from different sources. Which is basically what my patch was doing (and what I believe this patch does; I say "believe" because I haven't read it in detail yet). So it seems to me that one way or another you always need something like this patch, although it would be good to also handle the cases in which only some components are immediate constants.
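As a rough illustration of why constants are the special case, here is a toy Python sketch that merges adjacent constant stores to different components of the same variable. The `(dst, writemask, value)` encoding and the function name are assumptions made up for this example, not anything from the patch.

```python
ORDER = "xyzw"

def vectorize_constant_stores(program):
    """Merge adjacent constant stores to distinct components of one variable."""
    out = []
    for dst, mask, value in program:
        if out and value[0] == "const":
            p_dst, p_mask, p_value = out[-1]
            if p_dst == dst and p_value[0] == "const" and not set(mask) & set(p_mask):
                # Keep the merged writemask in canonical x, y, z, w order.
                pairs = sorted(zip(p_mask + mask, p_value[1] + value[1]),
                               key=lambda pair: ORDER.index(pair[0]))
                new_mask = "".join(c for c, _ in pairs)
                new_vals = tuple(v for _, v in pairs)
                out[-1] = (dst, new_mask, ("const", new_vals))
                continue
        out.append((dst, mask, value))
    return out

# a.x = 1; a.y = 2; a.z = 3;
consts = [("a", "x", ("const", (1.0,))),
          ("a", "y", ("const", (2.0,))),
          ("a", "z", ("const", (3.0,)))]

# a.x = f; a.y = 1; a.z = 2; a.w = 3;  (only some components are constants)
mixed = [("a", "x", ("node", "f")),
         ("a", "y", ("const", (1.0,))),
         ("a", "z", ("const", (2.0,))),
         ("a", "w", ("const", (3.0,)))]
```

In this sketch `consts` collapses to a single store to `a.xyz`, while `mixed` keeps the store of `f` to `a.x` and merges the rest into one store to `a.yzw`. Stores whose rhs is a load (the `a.x = b.x; ...` case) are left alone, since merging those needs the sources to match.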
No, I think language is just hard :D
By "vectorization of stores" I meant the store *to* `a` in this example. I suppose it's redundant since it's not like it's really possible to vectorize anything else.
The pass I describe is kind of specific to constants, although there are other vectorization passes you can do that might end up reusing the same infrastructure. E.g. in your second example you can vectorize loads from the same variable. There's also a possibility of vectorizing stores from the same node by creating a broadcast swizzle.
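For the non-constant cases, a toy Python sketch of what reusing the same merging logic might look like: adjacent stores to the same variable are merged when their sources are the same variable or the same node, by concatenating swizzles. Again, the encoding and names are hypothetical, and this is straight-line code only.

```python
def vectorize_same_source_stores(program):
    """Merge adjacent stores whose rhs reads the same variable or node."""
    out = []
    for dst, mask, (kind, name, swizzle) in program:
        # Do not merge when a store reads the variable it writes; an earlier
        # store in the group could have changed the component being read.
        if out and not (kind == "load" and name == dst):
            p_dst, p_mask, (p_kind, p_name, p_swizzle) = out[-1]
            if (p_dst, p_kind, p_name) == (dst, kind, name) and not set(mask) & set(p_mask):
                out[-1] = (dst, p_mask + mask, (kind, name, p_swizzle + swizzle))
                continue
        out.append((dst, mask, (kind, name, swizzle)))
    return out

# a.x = b.x; a.y = b.y; a.z = b.z;  (loads from the same variable)
loads = [("a", "x", ("load", "b", "x")),
         ("a", "y", ("load", "b", "y")),
         ("a", "z", ("load", "b", "z"))]

# a.x = n; a.y = n; a.z = n;  (same scalar node, becomes a broadcast swizzle)
broadcast = [("a", "x", ("node", "n", "x")),
             ("a", "y", ("node", "n", "x")),
             ("a", "z", ("node", "n", "x"))]
```

Here `loads` becomes one store of `b.xyz` to `a.xyz`, and `broadcast` becomes one store of `n.xxx` to `a.xyz`.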