On Wed Apr 10 14:22:12 2024 +0000, Conor McCarthy wrote:
We have no other passes which may need to insert instructions. It's a trade-off between adding yet another instruction-copying pass and favouring performance, even if the result is a little messy. If the former is the priority, I'll add a pass for all the small operations.
I added it to `lower_texkills()`. Moving the trailing instructions for each occurrence takes quadratic time in the worst case, so we could see pathological cases. For example, Cyberpunk has a colossal compute shader with over 40,000 SSA values, so a shader of similar size which uses precise `MAD` could occur. If necessary, we could convert `lower_texkills_and_precise_mad()` to copy instead of insert.
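
To illustrate the cost difference between inserting and copying, here is a minimal, self-contained sketch. It is not the actual vkd3d code: instructions are reduced to plain `int` opcodes, and the names `lower_in_place()`, `lower_by_copy()` and the `OP_*` values are hypothetical. It assumes each precise `MAD` is split into two instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical opcodes; a real instruction would be a much larger struct. */
enum { OP_OTHER = 0, OP_MAD = 1, OP_MUL = 2, OP_ADD = 3 };

/* Insert in place: every MAD shifts the tail of the array by one slot,
 * so k MADs among n instructions cost O(n * k) element moves. */
static size_t lower_in_place(int *ins, size_t count, size_t capacity)
{
    size_t i;

    for (i = 0; i < count; ++i)
    {
        if (ins[i] != OP_MAD)
            continue;
        assert(count < capacity);
        /* Make room for one extra instruction after position i. */
        memmove(&ins[i + 2], &ins[i + 1], (count - i - 1) * sizeof(*ins));
        ins[i] = OP_MUL;
        ins[i + 1] = OP_ADD;
        ++count;
        ++i; /* Skip the ADD we just wrote. */
    }
    return count;
}

/* Copy: a single pass into a separate destination array, O(n) overall. */
static size_t lower_by_copy(const int *src, size_t count, int *dst, size_t capacity)
{
    size_t i, j = 0;

    for (i = 0; i < count; ++i)
    {
        if (src[i] == OP_MAD)
        {
            assert(j + 2 <= capacity);
            dst[j++] = OP_MUL;
            dst[j++] = OP_ADD;
        }
        else
        {
            assert(j < capacity);
            dst[j++] = src[i];
        }
    }
    return j;
}
```

Both functions produce the same instruction sequence; the copying variant trades an extra destination buffer for linear time, which is the cost that matters if precise `MAD` appears many times in a very large shader.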