I see your point. AFAIK `hlsl_sm4_write()` always gets called by a single thread.
Why? If the client calls `vkd3d_shader_compile()` from two different threads at the same time, the two `hlsl_sm4_write()` calls can end up executing concurrently, can't they?
I agree that in practice nothing bad is likely to happen, at least on the architectures we care about, because you're just going to do aligned pointer accesses. But technically, concurrent non-atomic access to the same memory location is a data race, and therefore UB in C, if at least one of the accesses is a write, so I think we'd better stay away from that. Not sure if others have different opinions on the matter.
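
For illustration, here is a minimal sketch of one way to avoid the race, assuming the shared state is a lookup table that gets filled in on first use. That is my assumption about the shape of the problem, not necessarily what the patch does. All the `example_*` names are hypothetical, and I'm using plain POSIX `pthread_once()` here rather than any helper we may already have:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define EXAMPLE_TABLE_SIZE 4

/* Hypothetical lookup table shared by every caller; filled in lazily. */
static const char *example_table[EXAMPLE_TABLE_SIZE];
static pthread_once_t example_table_once = PTHREAD_ONCE_INIT;

/* Runs exactly once, no matter how many threads race to get here. */
static void example_table_init(void)
{
    example_table[0] = "a";
    example_table[1] = "b";
    example_table[2] = "c";
    example_table[3] = "d";
}

/* Safe to call from several threads at the same time: pthread_once()
 * guarantees the initialisation has completed before any caller returns,
 * so the subsequent accesses are reads only. */
const char *example_table_get(size_t idx)
{
    int ret = pthread_once(&example_table_once, example_table_init);

    assert(ret == 0);
    (void)ret;

    return idx < EXAMPLE_TABLE_SIZE ? example_table[idx] : NULL;
}
```

If the table contents are known at compile time, a `static const` table with an initialiser would be simpler still and needs no synchronisation at all.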