On Mon Jan 30 15:02:57 2023 +0000, Francisco Casas wrote:
Huh, I just realized that you could use the same temp to store the result of both instructions, so it wouldn't require a new temp. I don't think we do that anywhere else, though.
small correction: it seems that in vertex shaders the dot product is translated to mul+add (not mul+mad):
https://shader-playground.timjones.io/5a12e96e87f2170080123a81305b8444
I would suggest lowering dp2 to mul+add or dp2add, according to the shader model, in the same lowering pass.
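To make the suggestion concrete, here is a rough sketch of the selection logic and of what each lowering computes. All type and function names are hypothetical and do not come from the code base, and the rule that dp2add is only usable in pixel shaders of model 2 and up is an assumption I would double check against the compiler output. The mul+add variant also shows the same temp being reused for both instructions.

```c
#include <assert.h>

/* Hypothetical names throughout; none of these exist in the actual code base. */
enum shader_type { SHADER_TYPE_VERTEX, SHADER_TYPE_PIXEL };
enum dp2_lowering { DP2_AS_MUL_ADD, DP2_AS_DP2ADD };

struct vec2 { float x, y; };

/* Assumption: dp2add is only usable in pixel shaders of model 2 and up;
 * everything else falls back to mul+add. */
static enum dp2_lowering choose_dp2_lowering(enum shader_type type, unsigned int major_version)
{
    if (type == SHADER_TYPE_PIXEL && major_version >= 2)
        return DP2_AS_DP2ADD;
    return DP2_AS_MUL_ADD;
}

/* Models "mul t.xy, a, b" followed by "add t.x, t.x, t.y".
 * The same temp t holds the result of both instructions. */
static float dp2_as_mul_add(struct vec2 a, struct vec2 b)
{
    struct vec2 t;

    t.x = a.x * b.x;
    t.y = a.y * b.y;
    t.x = t.x + t.y;
    return t.x;
}

/* Models "dp2add t, a, b, c" with the addend c set to zero. */
static float dp2_as_dp2add(struct vec2 a, struct vec2 b)
{
    const float c = 0.0f;

    return a.x * b.x + a.y * b.y + c;
}

/* Single entry point, mirroring the idea of doing both in one lowering pass. */
static float lower_dp2(enum shader_type type, unsigned int major_version,
        struct vec2 a, struct vec2 b)
{
    if (choose_dp2_lowering(type, major_version) == DP2_AS_DP2ADD)
        return dp2_as_dp2add(a, b);
    return dp2_as_mul_add(a, b);
}

/* Usage example: both lowerings must agree on the dot product value. */
static void check_lowerings_agree(void)
{
    struct vec2 a = {1.0f, 2.0f}, b = {3.0f, 4.0f};

    assert(lower_dp2(SHADER_TYPE_VERTEX, 3, a, b) == lower_dp2(SHADER_TYPE_PIXEL, 3, a, b));
}
```

Keeping the choice in one helper means the pass only has one place that needs to know about the shader model.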