I constructed a simple benchmark that just makes a bunch of draw calls per frame (and nothing else). The benchmark is artificial, but it shows a consistent performance hit from this commit:
- 10000 draws: 217 vs 183
- 2000 draws: 860 vs 772
- 500 draws: 1900 vs 1840
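To compare the three cases on equal footing, it can help to turn each pair into an added cost per draw call. The sketch below assumes the figures are frames per second and that the first number in each pair is the result without the commit; neither is stated above, so treat the output as a rough estimate. The function name `per_draw_overhead_ns` is made up for illustration.

```cpp
// Assumption: both throughput figures are frames per second, and
// fps_before is the run without the commit. Returns the extra time
// per draw call, in nanoseconds, implied by the drop in frame rate.
inline double per_draw_overhead_ns(double fps_before,
                                   double fps_after,
                                   double draws_per_frame) {
    // Frame time is the reciprocal of the frame rate.
    const double frame_delta_s = 1.0 / fps_after - 1.0 / fps_before;
    // Spread the extra frame time evenly over the draws in that frame.
    return frame_delta_s * 1e9 / draws_per_frame;
}
```

Under those assumptions, `per_draw_overhead_ns(217, 183, 10000)` comes out to roughly 86 ns per draw, `per_draw_overhead_ns(860, 772, 2000)` to roughly 66 ns, and `per_draw_overhead_ns(1900, 1840, 500)` to roughly 34 ns.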
Is that just the `QueryPerformanceCounter()` overhead? I.e., what does the benchmark say if you always return false from `should_periodic_submit()`?
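One way to run that experiment is to give the check a switch that separates the timer cost from the submit cost. This is only a sketch: `PeriodicSubmitState`, `SubmitMode`, and the 1 ms interval are hypothetical and not taken from the commit, and `std::chrono::steady_clock` stands in for `QueryPerformanceCounter()` so the example stays portable.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical modes for isolating where the overhead comes from.
enum class SubmitMode {
    Normal,     // query the clock and submit once the interval has elapsed
    TimerOnly,  // query the clock but never submit (timer cost only)
    Disabled    // return false immediately (no timer query, no submits)
};

struct PeriodicSubmitState {
    using Clock = std::chrono::steady_clock;
    SubmitMode mode = SubmitMode::Normal;
    Clock::duration interval = std::chrono::milliseconds(1);  // assumed value
    Clock::time_point last_submit = Clock::now();
    std::uint64_t clock_queries = 0;  // how many times the clock was read
};

// Sketch of the per-draw check; the real function's signature may differ.
inline bool should_periodic_submit(PeriodicSubmitState& s) {
    if (s.mode == SubmitMode::Disabled) {
        return false;  // skips the clock query entirely
    }
    const auto now = PeriodicSubmitState::Clock::now();
    ++s.clock_queries;
    if (s.mode == SubmitMode::TimerOnly) {
        return false;  // pays for the clock query but never submits
    }
    if (now - s.last_submit >= s.interval) {
        s.last_submit = now;
        return true;
    }
    return false;
}
```

If the `TimerOnly` numbers match the regressed ones, the clock query itself is the likely cost; if they match the baseline, the extra submits are the more likely cause.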