On my machine, one call to QueryPerformanceCounter() seems to take about 80 ns.
How does that compare to e.g. QueryInterruptTime()?
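For what it's worth, a rough way to get comparable numbers is a micro-benchmark along these lines. This is only a sketch under some assumptions: the helper names (`wall_ns`, `ns_per_call`, `probe_qpc`, `probe_interrupt_time`, `probe_clock_gettime`, `report_timer_costs`) are made up for illustration, QueryInterruptTime() is assumed to be available (per the documentation it needs Windows 10 or later and linking against mincore.lib), and the non-Windows branch exists only so the file still compiles elsewhere. The function-pointer call adds a little overhead to each iteration, so the figures are upper bounds rather than exact costs.

```c
#if !defined(_WIN32)
#define _POSIX_C_SOURCE 199309L   /* exposes clock_gettime() on the fallback path */
#endif

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

typedef void (*probe_fn)(void);      /* hypothetical: one call to the API under test */
static volatile uint64_t sink;       /* keeps the probe results observable */

#if defined(_WIN32)
#include <windows.h>  /* QueryInterruptTime: assumes Windows 10+ headers, link mincore.lib */

/* Current QueryPerformanceCounter reading converted to nanoseconds. */
static uint64_t wall_ns(void) {
    LARGE_INTEGER freq, ticks;
    QueryPerformanceFrequency(&freq);
    QueryPerformanceCounter(&ticks);
    uint64_t f = (uint64_t)freq.QuadPart, t = (uint64_t)ticks.QuadPart;
    /* Split the conversion so the multiplication cannot overflow 64 bits. */
    return (t / f) * 1000000000ull + (t % f) * 1000000000ull / f;
}

static void probe_qpc(void) {
    LARGE_INTEGER t;
    QueryPerformanceCounter(&t);
    sink ^= (uint64_t)t.QuadPart;
}

static void probe_interrupt_time(void) {
    ULONGLONG t;
    QueryInterruptTime(&t);
    sink ^= (uint64_t)t;
}
#else
#include <time.h>

/* Fallback so the sketch builds off Windows: CLOCK_MONOTONIC in nanoseconds. */
static uint64_t wall_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

static void probe_clock_gettime(void) {
    sink ^= wall_ns();
}
#endif

/* Average cost of one probe() call in nanoseconds, including loop overhead. */
static double ns_per_call(probe_fn probe, int iterations) {
    assert(iterations > 0);
    uint64_t start = wall_ns();
    for (int i = 0; i < iterations; ++i) {
        probe();
    }
    uint64_t end = wall_ns();
    return (double)(end - start) / (double)iterations;
}

/* Print one line per timer API measured on this platform. */
static void report_timer_costs(int iterations) {
#if defined(_WIN32)
    printf("QueryPerformanceCounter: %.1f ns/call\n",
           ns_per_call(probe_qpc, iterations));
    printf("QueryInterruptTime:      %.1f ns/call\n",
           ns_per_call(probe_interrupt_time, iterations));
#else
    printf("clock_gettime (stand-in): %.1f ns/call\n",
           ns_per_call(probe_clock_gettime, iterations));
#endif
}
```

Calling `report_timer_costs(1000000)` from `main` should be enough to see whether the two calls differ meaningfully on a given machine. The result will vary with hardware and Windows version.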
If we are worried about performance (and I'm not sure we need to be), then I'm open to trying the aforementioned approach with a separate submit thread.
How about something like CreateTimerQueueTimer() or SetWaitableTimer()?