I just remembered that we probably want to run this pass after folding constant expressions, so that we don't add instructions we then can't fold, e.g. just because of an unfolded cast in the index. The alternative is to make the later passes smart enough to fold `x + 0 = x` and `x * 0 = 0`.
I came to the same conclusion while reviewing. Going even further: it might not be computationally optimal, but in the end the compiler structure might be simpler if we just lower *all* vector indexing operations to a dot product and then let constant folding and other passes recover the indexing operations that can be made constant.
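
To make the idea concrete, here is a minimal Python sketch of it, not the actual pass. The tiny IR (`Const`, `Var`, `Add`, `Mul`, `Eq`) and the helpers `lower_index` and `fold` are hypothetical names made up for illustration, and the folder assumes integer semantics, where `x * 0 = 0` is always valid (it would not be for floats with NaN or infinity).

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical toy IR, for illustration only.
@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Add:
    lhs: "Expr"
    rhs: "Expr"


@dataclass(frozen=True)
class Mul:
    lhs: "Expr"
    rhs: "Expr"


@dataclass(frozen=True)
class Eq:
    """Evaluates to 1 if both sides are equal, else 0."""
    lhs: "Expr"
    rhs: "Expr"


Expr = Union[Const, Var, Add, Mul, Eq]


def lower_index(components, index):
    """Lower v[index] to a dot product with a one-hot mask:
    sum over i of components[i] * (index == i)."""
    terms = [Mul(c, Eq(index, Const(i))) for i, c in enumerate(components)]
    result = terms[0]
    for term in terms[1:]:
        result = Add(result, term)
    return result


def fold(e):
    """Constant folding plus the identities x + 0 = x, x * 0 = 0, x * 1 = x."""
    if isinstance(e, (Const, Var)):
        return e
    lhs, rhs = fold(e.lhs), fold(e.rhs)
    both_const = isinstance(lhs, Const) and isinstance(rhs, Const)
    if isinstance(e, Eq):
        return Const(int(lhs.value == rhs.value)) if both_const else Eq(lhs, rhs)
    if isinstance(e, Mul):
        if Const(0) in (lhs, rhs):  # x * 0 = 0 (integer semantics assumed)
            return Const(0)
        if lhs == Const(1):
            return rhs
        if rhs == Const(1):
            return lhs
        return Const(lhs.value * rhs.value) if both_const else Mul(lhs, rhs)
    # Remaining case: Add
    if lhs == Const(0):  # 0 + x = x
        return rhs
    if rhs == Const(0):  # x + 0 = x
        return lhs
    return Const(lhs.value + rhs.value) if both_const else Add(lhs, rhs)


v = [Var("a"), Var("b"), Var("c")]
constant_case = fold(lower_index(v, Const(1)))  # index known at compile time
dynamic_case = fold(lower_index(v, Var("i")))   # index only known at run time
```

With a constant index the folder collapses the whole dot product back to the single component `Var("b")`, while the dynamic index keeps the full sum of masked terms. If the index were wrapped in something the folder cannot see through (such as an unfolded cast), the `Eq` nodes would stay symbolic and nothing would be recovered, which is the ordering concern raised above.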
Just saying, though, I'd first have this MR accepted and then this can be revisited later.