These submits would happen from the CSMT thread, right?
Yes.
Does the command stream queue ever run empty? Should we submit any pending Vulkan commands in this case?
In Grounded? I can try that, though it's not clear it's better than the current approach...
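To make sure we're talking about the same thing, here is a rough sketch of the "submit when the queue runs empty" idea. It is only a single-threaded model, not wined3d code: `cs_queue`, `cs_state`, `submit_pending()` and `cs_run_until_idle()` are all hypothetical names, and the actual submission is reduced to a counter where a real implementation would presumably end the command buffer and call vkQueueSubmit().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; none of these names exist in wined3d. */
#define QUEUE_SIZE 64

struct cs_queue
{
    int ops[QUEUE_SIZE];
    size_t head, tail; /* head: next slot to read, tail: next slot to write */
};

struct cs_state
{
    struct cs_queue queue;
    unsigned int pending; /* commands recorded but not yet submitted */
    unsigned int submits; /* number of submissions issued so far */
};

static bool queue_empty(const struct cs_queue *q)
{
    return q->head == q->tail;
}

static void queue_push(struct cs_queue *q, int op)
{
    assert(q->tail - q->head < QUEUE_SIZE); /* the sketch never blocks on a full queue */
    q->ops[q->tail++ % QUEUE_SIZE] = op;
}

static int queue_pop(struct cs_queue *q)
{
    assert(!queue_empty(q));
    return q->ops[q->head++ % QUEUE_SIZE];
}

/* Assumption: a real version would submit the recorded Vulkan commands here. */
static void submit_pending(struct cs_state *cs)
{
    if (!cs->pending)
        return; /* nothing recorded, so no submission */
    cs->pending = 0;
    ++cs->submits;
}

/* Record every queued op, then flush once the queue has run empty. */
static void cs_run_until_idle(struct cs_state *cs)
{
    while (!queue_empty(&cs->queue))
    {
        (void)queue_pop(&cs->queue); /* "execute" the op by recording it */
        ++cs->pending;
    }
    submit_pending(cs);
}
```

In this model, pushing a few ops and calling `cs_run_until_idle()` produces exactly one submission, and calling it again on an already empty queue produces none. Whether that batching actually beats the current approach is exactly the open question.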
Did you benchmark the impact of a full NT syscall on every d3d draw?
I believe I tested it with this game at least, though ultimately I don't know how much that says. On the other hand, as is often the case, I don't know how to fashion a benchmark that would meaningfully prove that it does or doesn't matter.
In theory, QueryPerformanceCounter() can be modified to avoid the syscall, at least on some architectures, though I'm told this is kind of hard.
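For what it's worth, the raw per-call cost of a timer query is easy enough to measure in isolation, even if that says little about a real game. A minimal sketch, assuming POSIX clock_gettime(CLOCK_MONOTONIC) as a stand-in for QueryPerformanceCounter(); that is not the same code path, and on Linux this call is usually served from the vDSO without entering the kernel, so it is closer to the "no syscall" case. `now_ns()` and `timer_query_cost_ns()` are hypothetical helper names.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Current monotonic time in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

/* Average cost in nanoseconds of one timer query, over 'iterations' calls. */
static double timer_query_cost_ns(unsigned int iterations)
{
    volatile uint64_t sink = 0; /* keeps the compiler from dropping the loop */
    uint64_t start, end;
    unsigned int i;

    assert(iterations > 0);

    start = now_ns();
    for (i = 0; i < iterations; ++i)
        sink += now_ns();
    end = now_ns();

    (void)sink;
    return (double)(end - start) / iterations;
}
```

Calling `timer_query_cost_ns(1000000)` gives an average per-query cost that could be multiplied by the number of draws per frame for a rough upper bound. It does not capture cache effects or contention with the rest of the CSMT thread's work, which is why I wouldn't treat it as proof either way.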