I constructed a simple benchmark that just makes a bunch of draw calls per frame (and nothing else). The benchmark is artificial, but it shows a consistent performance hit from this commit:
* 10000 draws: 217 vs 183 fps
* 2000 draws: 860 vs 772 fps
* 500 draws: 1900 vs 1840 fps
Is that just the `QueryPerformanceCounter()` overhead? I.e., what does the benchmark say if you always return false from `should_periodic_submit()`?
So... I don't know what I was testing originally, but I can't reproduce that huge difference anymore. Skipping should_periodic_submit entirely, I see 217 vs 213 fps for 10000 draws, and no consistent measurable difference for 2000 draws.
I'm pretty sure my original tests were at least missing patch 1/2, so maybe that made the difference.
Oops, it did. By calling wined3d_context_vk_begin_render_pass() and then accessing the command buffer directly, it bypassed the periodic submit logic. With that fixed I'm getting the same numbers as before.
You already have a benchmark, see if it improves? :) The hit from this doesn't necessarily seem bad enough to block the MR, but I think the benchmark results do seem to hint at adding roughly 10% draw call overhead. If we could get that down to e.g. 1% or less, that seems worth pursuing.
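For reference, the fps figures quoted earlier can be turned into a per-draw cost. This is only a back-of-the-envelope sketch: it assumes the first number in each pair is the baseline and the second is with the commit applied, and that the whole frame-time difference is attributable to the draw calls. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Extra time per draw call in nanoseconds, derived from two frame rates.
 * Assumes the entire frame-time difference comes from the draw calls. */
static double per_draw_overhead_ns(double fps_base, double fps_patched, unsigned int draws)
{
    assert(fps_base > 0.0 && fps_patched > 0.0 && draws > 0);
    double frame_delta_s = 1.0 / fps_patched - 1.0 / fps_base;
    return frame_delta_s * 1e9 / draws;
}

/* Relative increase in frame time, e.g. 0.10 for a 10% longer frame. */
static double frame_time_increase(double fps_base, double fps_patched)
{
    assert(fps_base > 0.0 && fps_patched > 0.0);
    return fps_base / fps_patched - 1.0;
}

/* Prints both figures for the three measurements quoted in this thread. */
void print_overhead_table(void)
{
    static const struct { unsigned int draws; double base, patched; } runs[] =
    {
        {10000,  217.0,  183.0},
        { 2000,  860.0,  772.0},
        {  500, 1900.0, 1840.0},
    };

    for (size_t i = 0; i < sizeof(runs) / sizeof(runs[0]); ++i)
        printf("%5u draws: %.1f ns/draw, frame time +%.1f%%\n", runs[i].draws,
                per_draw_overhead_ns(runs[i].base, runs[i].patched, runs[i].draws),
                100.0 * frame_time_increase(runs[i].base, runs[i].patched));
}
```

Under those assumptions the extra cost works out to roughly 86, 66 and 34 ns per draw for the 10000-, 2000- and 500-draw cases respectively.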
I tried this [set a waitable timer to 1 ms, and use QueryInterruptTime()], and it gets me about 211 fps. The difference between that and the original 217 fps is mostly the submit, but QueryInterruptTime() does hurt a bit, probably because of the loop? Directly accessing user_shared_data->InterruptTime.LowPart (which is the only part we need) is better; I'm not seeing any measurable difference between that and upstream Wine.
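To illustrate why the low 32 bits are enough: the interrupt time counts in 100 ns units, so the low part wraps roughly every 429 seconds, and unsigned subtraction still gives the right elapsed time across a wrap as long as the interval being measured is much shorter than that. A minimal sketch, assuming a 1 ms submit interval; `struct submit_timer`, `SUBMIT_INTERVAL_TICKS` and `submit_interval_elapsed()` are hypothetical names, not the code in this merge request, and the `now` argument stands in for the value read from `user_shared_data->InterruptTime.LowPart`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed interval: 1 ms expressed in 100 ns interrupt-time units. */
#define SUBMIT_INTERVAL_TICKS 10000u

/* Hypothetical per-context state: low 32 bits of the last submit time. */
struct submit_timer
{
    uint32_t last_submit;
};

/* Returns true, and restarts the interval, once at least
 * SUBMIT_INTERVAL_TICKS have passed since the last submit.
 * "now" is the low 32 bits of the current interrupt time. */
bool submit_interval_elapsed(struct submit_timer *timer, uint32_t now)
{
    assert(timer != NULL);

    /* Modulo-2^32 subtraction: correct even if "now" has wrapped
     * past zero since last_submit was recorded. */
    if ((uint32_t)(now - timer->last_submit) < SUBMIT_INTERVAL_TICKS)
        return false;

    timer->last_submit = now;
    return true;
}
```

Reading a single 32-bit value this way avoids the retry loop that a consistent 64-bit read needs, which is the part suspected above of making QueryInterruptTime() slower.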
So I see three potential ways forward:
* Decide we don't care about an artificial benchmark, and just use this merge request as-is.
* Decide the server is okay with raising the tick count interval to 1 ms after all, at least if guarded by timeBeginPeriod()/timeEndPeriod(), and then use the interrupt time or tick count here.
* Try an approach that uses a separate thread. This may be a good idea anyway, since my benchmark shows that submitting does have some nonnegligible overhead.