In principle we could avoid touching the IR and do everything while lowering to SM1. The only difficulty is that, AFAIK, you cannot write literal constants in SM1, so you'd need to make sure a zero constant is available to pass in when calling `write_sm1_ternary_op()`, and that's probably needlessly complicated. So I agree that the best solution seems to be adding `HLSL_OP_DP2ADD` and a lowering pass.
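
To make the idea concrete, here is a rough, self-contained sketch of what such a lowering pass would do: rewrite a 2-component dot product into a dp2add whose addend is an explicit zero operand. Everything prefixed with `toy_` is hypothetical and only models the transformation; it is not the actual vkd3d-shader IR or API, and the real pass would presumably create a regular constant node so that the existing constant handling takes care of it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; these are NOT the real vkd3d-shader types. */
enum toy_op
{
    TOY_OP2_DOT,
    TOY_OP3_DP2ADD,
};

struct toy_expr
{
    enum toy_op op;
    unsigned int dimx;         /* component count of the vector operands */
    const float *operands[3];  /* operands[2] is only used by dp2add */
};

/* The zero addend that the lowering pass attaches as an explicit operand. */
static const float toy_zero[1] = {0.0f};

/* Rewrite dot(a, b) on 2-component vectors as dp2add(a, b, 0).
 * Returns 1 if the expression was rewritten, 0 otherwise. */
static int toy_lower_dot2(struct toy_expr *expr)
{
    if (expr->op != TOY_OP2_DOT || expr->dimx != 2)
        return 0;

    expr->op = TOY_OP3_DP2ADD;
    expr->operands[2] = toy_zero;
    return 1;
}

/* Reference evaluation, used only to check that the rewrite preserves the value. */
static float toy_eval(const struct toy_expr *expr)
{
    const float *a = expr->operands[0], *b = expr->operands[1];
    float sum = 0.0f;
    unsigned int i;

    if (expr->op == TOY_OP3_DP2ADD)
    {
        /* dp2add: a.x * b.x + a.y * b.y + c.x */
        assert(expr->operands[2] != NULL);
        return a[0] * b[0] + a[1] * b[1] + expr->operands[2][0];
    }

    for (i = 0; i < expr->dimx; ++i)
        sum += a[i] * b[i];
    return sum;
}
```

With the zero made explicit at the IR level, the SM1 writer only has to emit a plain three-source instruction and never needs to conjure a constant on its own.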