On Mon Jan 30 14:57:33 2023 +0000, Henri Verbeet wrote:
There is of course another option: translate this to mul+mad. And that's something we'd have to do anyway; dp2add is only supported in shader model 2 and up for pixel shaders, and not at all for vertex shaders.
This would need to be a lowering pass; otherwise we would have to allocate a new temp while writing the bytecode. It can work, but I am not sure whether there would be discrepancies in precision compared to a native dp2add, or whether they would matter.
I would go for creating a new `DP2ADD` op (maybe it should be called `HLSL_OP3_DP2ADD`?) and adding the lowering pass for SM1.
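
To make the trade-off concrete, here is a rough C model of the two forms. It is only a sketch: it assumes the case in question is a two-component dot product (so the addend passed to dp2add would be zero), and `struct vec2`, `dp2add()`, `dot2_mul_mad()` and `check_lowering()` are made-up names for illustration, not existing compiler code.

```c
#include <assert.h>

/* Hypothetical two-component vector, for illustration only. */
struct vec2
{
    float x, y;
};

/* Reference semantics of dp2add: a 2D dot product plus a scalar addend. */
static float dp2add(struct vec2 a, struct vec2 b, float c)
{
    return a.x * b.x + a.y * b.y + c;
}

/* Assumed mul+mad lowering of a plain 2D dot product (addend of zero).
 * The intermediate value is the extra temp a lowering pass would allocate. */
static float dot2_mul_mad(struct vec2 a, struct vec2 b)
{
    float tmp = a.x * b.x;   /* mul tmp, a.x, b.x      */
    return a.y * b.y + tmp;  /* mad dst, a.y, b.y, tmp */
}

/* Small-integer inputs are exactly representable, so both forms agree here.
 * Other inputs may round differently depending on how the hardware
 * evaluates each instruction, which is the precision question above. */
static void check_lowering(void)
{
    struct vec2 a = {1.0f, 2.0f};
    struct vec2 b = {3.0f, 4.0f};

    assert(dp2add(a, b, 0.0f) == 11.0f);
    assert(dot2_mul_mad(a, b) == 11.0f);
}
```

With a non-zero addend, the lowering would presumably need either an extra add or two chained mads instead.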