This looks fine, as far as I can tell. The only part I'm a bit concerned about is how many threads this is going to create. Is it possible for an application to create a lot of descriptor heaps, each of them spawning a new thread? That wouldn't be very gentle on the OS. Would it be possible to have one thread per device, taking care of all the heaps belonging to that device? Or, if parallel processing is advisable, a per-device pool of threads taking care of all the heaps belonging to that device?
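
To make the one-thread-per-device idea a bit more concrete, here is a rough sketch in plain C++. It is not based on the actual code in this patch: `DeviceWorker` and `enqueue` are hypothetical names, and the work items are generic callables standing in for whatever each heap's thread currently does. The device would own a single worker, and every heap would submit its work to it instead of spawning its own thread.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical helper: one instance per device, owning exactly one thread.
class DeviceWorker {
public:
    DeviceWorker() : thread_([this] { run(); }) {}

    ~DeviceWorker() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_one();
        thread_.join();  // the worker drains any queued work before it exits
    }

    // Called by any heap belonging to this device.
    void enqueue(std::function<void()> work) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(!stopping_);
            queue_.push(std::move(work));
        }
        cv_.notify_one();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            cv_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
            if (queue_.empty())
                return;  // only reachable once stopping_ is set
            std::function<void()> work = std::move(queue_.front());
            queue_.pop();
            lock.unlock();  // run the work item without holding the lock
            work();
            lock.lock();
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> queue_;
    bool stopping_ = false;
    std::thread thread_;  // declared last so the other members exist before it starts
};
```

With something along these lines, the number of threads scales with the number of devices rather than the number of heaps. If parallel processing turns out to matter, the single `std::thread` could presumably be replaced by a small fixed-size set of threads all running the same `run()` loop on the shared queue.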