DXIL handling will have many more cases like the ones shader_normaliser_new_instructions() serves, so I think it's less hazardous to eliminate the need to repeat the `count + 1` formula each time.
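To illustrate the idea, here is a minimal sketch of an append helper that keeps the `count + 1` arithmetic in one place. The types and the names `instruction_array_reserve()` and `instruction_array_append()` are hypothetical stand-ins for illustration only, not the existing vkd3d structures or the signature of shader_normaliser_new_instructions().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an instruction; not the real vkd3d type. */
struct instruction
{
    int opcode;
};

/* Hypothetical growable instruction array. */
struct instruction_array
{
    struct instruction *elements;
    size_t capacity;
    size_t count;
};

/* Ensure there is room for `extra` more elements. This is the only place
 * where "count + extra" is computed, so callers never repeat the formula. */
static bool instruction_array_reserve(struct instruction_array *array, size_t extra)
{
    const size_t max_count = SIZE_MAX / sizeof(*array->elements);
    struct instruction *new_elements;
    size_t required, new_capacity;

    assert(array->count <= array->capacity);

    if (extra > max_count - array->count)
        return false;
    required = array->count + extra;
    if (required <= array->capacity)
        return true;

    /* Grow geometrically, but never below what is required. */
    new_capacity = array->capacity * 2;
    if (new_capacity < required || new_capacity > max_count)
        new_capacity = required;

    if (!(new_elements = realloc(array->elements, new_capacity * sizeof(*new_elements))))
        return false;
    array->elements = new_elements;
    array->capacity = new_capacity;
    return true;
}

/* Append one zero-initialised instruction; returns NULL on allocation failure. */
static struct instruction *instruction_array_append(struct instruction_array *array)
{
    struct instruction *ins;

    if (!instruction_array_reserve(array, 1))
        return NULL;
    ins = &array->elements[array->count++];
    memset(ins, 0, sizeof(*ins));
    return ins;
}
```

With something of this shape, a caller only asks for a new slot and checks for NULL; the size computation cannot be mistyped at an individual call site.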
The special MOV handler tracks the fork/join instance id. Theoretically fxc could use a FORKINSTID register directly as a register address, but instead it emits `mov r0, vForkInstanceId` and then uses r0 as the address.
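As a rough sketch of that tracking, assuming simplified hypothetical types rather than the real vkd3d instruction and register structures: the handler remembers which temp register last received the instance id via `mov`, and a later address lookup treats that temp as equivalent to the instance id register.

```c
#include <stdbool.h>

/* Hypothetical, simplified register and instruction model. */
enum reg_type { REG_TEMP, REG_FORKINSTID, REG_JOININSTID, REG_OTHER };
enum opcode { OP_MOV, OP_OTHER };

struct reg
{
    enum reg_type type;
    unsigned int index;
};

struct instruction
{
    enum opcode opcode;
    struct reg dst;
    struct reg src;
};

/* Remembers which temp register currently holds the fork/join instance id. */
struct instance_id_tracker
{
    bool valid;
    unsigned int temp_index;
};

static bool reg_is_instance_id(const struct reg *reg)
{
    return reg->type == REG_FORKINSTID || reg->type == REG_JOININSTID;
}

/* Update the tracker for one instruction. Returns true when the instruction
 * is a plain copy of the instance id into a temp, e.g.
 * "mov r0, vForkInstanceId". */
static bool tracker_handle_instruction(struct instance_id_tracker *tracker,
        const struct instruction *ins)
{
    if (ins->dst.type != REG_TEMP)
        return false;

    if (ins->opcode == OP_MOV && reg_is_instance_id(&ins->src))
    {
        tracker->valid = true;
        tracker->temp_index = ins->dst.index;
        return true;
    }

    /* Any other write to the tracked temp means it no longer holds the id. */
    if (tracker->valid && ins->dst.index == tracker->temp_index)
        tracker->valid = false;
    return false;
}

/* True if a register used as an address is the instance id, either directly
 * or through the tracked temp. */
static bool tracker_is_instance_id_address(const struct instance_id_tracker *tracker,
        const struct reg *addr)
{
    if (reg_is_instance_id(addr))
        return true;
    return tracker->valid && addr->type == REG_TEMP && addr->index == tracker->temp_index;
}
```

The direct FORKINSTID case is still accepted in the lookup, so the sketch covers both the form fxc emits and the theoretical direct use.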