Using a callback here works, but I'd prefer having explicit operation IDs.
This series has a small but significant performance cost, so I would like to avoid loading the handler pointer from a table.
I'm not sure it makes a difference, but I'm also not sure loading that pointer from the command buffer instead is necessarily better/faster than loading it from a fixed table. If we want to optimise this code we should perhaps look at the threaded dispatch techniques typically used by JIT compilers and interpreters. I have a suspicion that the kinds of operations we're doing here are complex enough that dispatch isn't going to be much of a bottleneck, but I could of course be wrong. In any case, the way to go about it would be to start with the naive implementation to establish a baseline, and then show that making it more complicated actually improves things.
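To make the comparison concrete, here is a rough sketch of the two variants being discussed. All names (`op_id`, `cmd_id`, `cmd_cb`, `run_by_id()`, `run_by_callback()`) are made up for illustration and are not taken from the series; the handlers are deliberately trivial.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operation IDs; not the names used in the series. */
enum op_id
{
    OP_ADD,
    OP_MUL,
    OP_COUNT,
};

struct state
{
    int acc;
};

typedef void (*op_handler)(struct state *state, int arg);

static void op_add(struct state *state, int arg) { state->acc += arg; }
static void op_mul(struct state *state, int arg) { state->acc *= arg; }

/* Variant A: the command stores an explicit ID, and the handler pointer
 * is loaded from a fixed table. */
struct cmd_id
{
    enum op_id id;
    int arg;
};

static const op_handler op_table[OP_COUNT] =
{
    [OP_ADD] = op_add,
    [OP_MUL] = op_mul,
};

static void run_by_id(struct state *state, const struct cmd_id *cmds, size_t count)
{
    size_t i;

    for (i = 0; i < count; ++i)
    {
        assert(cmds[i].id < OP_COUNT);
        op_table[cmds[i].id](state, cmds[i].arg);
    }
}

/* Variant B: the command stores the handler pointer itself, so it is
 * loaded from the command buffer instead of from a table. */
struct cmd_cb
{
    op_handler handler;
    int arg;
};

static void run_by_callback(struct state *state, const struct cmd_cb *cmds, size_t count)
{
    size_t i;

    for (i = 0; i < count; ++i)
        cmds[i].handler(state, cmds[i].arg);
}
```

Both variants end in an indirect call, and they differ only in where the pointer is loaded from. That is why I'd want to see measurements before preferring one over the other on performance grounds.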
In a later patch the buffer can no longer simply be reused, so this will change: the minimum becomes the largest of the command list's previous capacities, and the previous capacity is set to an initial value in `d3d12_command_heap_init()`.
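Roughly, the rule would look like the sketch below. The names `heap_sketch`, `heap_sketch_init()` and `heap_sketch_next_capacity()`, and the value of `INITIAL_CAPACITY`, are placeholders for illustration only and do not come from the actual patch.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed initial value; the real one would be chosen in
 * d3d12_command_heap_init(). */
#define INITIAL_CAPACITY 4096

struct heap_sketch
{
    /* Largest capacity used so far; serves as the minimum next time. */
    size_t prev_capacity;
};

static void heap_sketch_init(struct heap_sketch *heap)
{
    heap->prev_capacity = INITIAL_CAPACITY;
}

/* Return the capacity to allocate for a buffer that needs at least
 * "needed" bytes, and remember it as the new minimum. */
static size_t heap_sketch_next_capacity(struct heap_sketch *heap, size_t needed)
{
    size_t capacity = heap->prev_capacity;

    /* Catches use of a heap that was never initialised. */
    assert(capacity >= INITIAL_CAPACITY);

    if (capacity < needed)
        capacity = needed;
    heap->prev_capacity = capacity;
    return capacity;
}
```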
Is that patch in this series? I wasn't immediately able to find it. It's generally best to avoid depending on later patches for justifying code though.