The following shouldn't affect this MR but I found it worth noting:
When compiling
```hlsl
uniform float2 a, b;
float4 main() : sv_target { return dot(a, b); }
```
I see a small discrepancy between how the swizzles are written in native:
```
dp2add r0.x, r0, r1, c2.x
```
versus ours:
```
dp2add r0, r1, r0, r2
```
The [documentation of `dp2add`](https://learn.microsoft.com/en-us/windows/win32/direct3dhlsl/dp2add---ps) says that the `.x` is a "required replicate swizzle". I have no idea what that means...
Regardless, that swizzle would appear if we just made sure that `arg3->reg.writemask = 1`, but it seems that we are not setting the writemasks to 1 when allocating scalar types in SM1. This may be a bug.
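
To illustrate the link between the writemask and the swizzle: assuming a "replicate swizzle" means one component broadcast to all four lanes (so `.x` would be shorthand for `.xxxx`), a writemask with exactly one bit set maps directly to such a swizzle. A minimal sketch in C follows; the `SWIZZLE` macro, its 2-bits-per-component encoding, and `replicate_swizzle_from_writemask()` are hypothetical names made up for this example, not the actual vkd3d helpers:

```c
#include <stdint.h>

/* Hypothetical encoding: 2 bits per swizzle component, x=0, y=1, z=2, w=3. */
#define SWIZZLE(x, y, z, w) \
    ((uint32_t)((x) | ((y) << 2) | ((z) << 4) | ((w) << 6)))

/* Return the swizzle that replicates the single component selected by
 * "writemask" to all four lanes, or -1 if the writemask does not have
 * exactly one of its low four bits set. */
int replicate_swizzle_from_writemask(unsigned int writemask)
{
    unsigned int i;

    for (i = 0; i < 4; ++i)
    {
        if (writemask == (1u << i))
            return (int)SWIZZLE(i, i, i, i);
    }

    return -1;
}
```

Under that assumption, a writemask of 1 gives `SWIZZLE(0, 0, 0, 0)`, i.e. `.xxxx`, which would be printed as `.x`, while a writemask with zero or several bits set has no replicate swizzle.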