Alright, I pushed something that implements conditional discard_nz. I'm sure it's doable to move everything sm4-specific from clip() to some later pass, but that would involve sharing more expression helpers that are not otherwise used outside of hlsl.y.