Note, though, that if we introduce such a thread on the vkd3d-shader side of the API (as opposed to the user side), the current API design would imply one thread per opened cache.
I don't think it necessitates that. We could use a single shared worker thread that accepts work items from any cache. That worker API could even be backed by host functionality like RtlQueueWorkItem where available.
That said, using one thread per cache might just be simpler.
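For illustration, here is a rough sketch of what the shared-worker variant could look like, using plain pthreads. All names here (struct cache_worker, cache_worker_queue(), example_work() and so on) are made up for this example and are not existing vkd3d-shader API. The cache is passed as an opaque pointer, and the RtlQueueWorkItem path is left out entirely.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical names throughout; none of this is existing vkd3d-shader API. */
struct cache_work_item
{
    void (*fn)(void *cache, void *ctx);
    void *cache; /* Opaque: any opened cache may queue items. */
    void *ctx;
    struct cache_work_item *next;
};

struct cache_worker
{
    pthread_t thread;
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    struct cache_work_item *head, *tail;
    bool stop;
};

static void *cache_worker_main(void *arg)
{
    struct cache_worker *w = arg;
    struct cache_work_item *item;

    pthread_mutex_lock(&w->mutex);
    for (;;)
    {
        while (!w->head && !w->stop)
            pthread_cond_wait(&w->cond, &w->mutex);
        if (!(item = w->head))
            break; /* Stop was requested and the queue is drained. */
        if (!(w->head = item->next))
            w->tail = NULL;

        /* Run the item without holding the lock, so caches can keep queuing. */
        pthread_mutex_unlock(&w->mutex);
        item->fn(item->cache, item->ctx);
        free(item);
        pthread_mutex_lock(&w->mutex);
    }
    pthread_mutex_unlock(&w->mutex);
    return NULL;
}

static int cache_worker_init(struct cache_worker *w)
{
    w->head = w->tail = NULL;
    w->stop = false;
    pthread_mutex_init(&w->mutex, NULL);
    pthread_cond_init(&w->cond, NULL);
    return pthread_create(&w->thread, NULL, cache_worker_main, w);
}

static int cache_worker_queue(struct cache_worker *w,
        void (*fn)(void *cache, void *ctx), void *cache, void *ctx)
{
    struct cache_work_item *item;

    assert(fn);
    if (!(item = malloc(sizeof(*item))))
        return -1;
    item->fn = fn;
    item->cache = cache;
    item->ctx = ctx;
    item->next = NULL;

    pthread_mutex_lock(&w->mutex);
    if (w->tail)
        w->tail->next = item;
    else
        w->head = item;
    w->tail = item;
    pthread_cond_signal(&w->cond);
    pthread_mutex_unlock(&w->mutex);
    return 0;
}

/* Finishes any items still queued, then joins the thread. */
static void cache_worker_shutdown(struct cache_worker *w)
{
    pthread_mutex_lock(&w->mutex);
    w->stop = true;
    pthread_cond_signal(&w->cond);
    pthread_mutex_unlock(&w->mutex);
    pthread_join(w->thread, NULL);
    pthread_mutex_destroy(&w->mutex);
    pthread_cond_destroy(&w->cond);
}

/* Stand-in work function: marks the "cache" as processed and counts the call. */
static void example_work(void *cache, void *ctx)
{
    *(int *)cache = 1;
    ++*(int *)ctx;
}
```

With something like this, closing a cache would additionally need to wait for (or cancel) that cache's pending items, which is extra bookkeeping the thread-per-cache approach avoids.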