On 03/22/2011 05:02 AM, Henri Verbeet wrote:
> On 21 March 2011 21:56, Adam Martinson <amartinson@codeweavers.com> wrote:
>> Cuts CPU time in context_apply_fbo_state() in half.
> This is meaningless. Which applications, and how much time of the total is spent in context_apply_fbo_state()?

In the 3DMark06 batch size 8 test, as of Friday's tip, it was 5.5% of wined3d; 1.2% of total wine CPU time. Searching the FBO list is almost all of that. It's called every time drawPrimitive() is (assuming ORM_FBO), so that shouldn't be much of a shock.

> How does this translate into concrete things like frame time?

It's an improvement, but I can't give you very meaningful FPS numbers there for individual patches. In my testing raw FPS is less reliable than profile results; it's dependent on GPU time as well, which I can't profile. The range of normal FPS variability is larger than the effect of most individual patches, including this one.

> More importantly, why does this change come before the one for tracking FBO dirty state?

Really? I sent both at the same time, and the order bothers you? Really??

> Avoiding redundant FBO entry comparisons completely is likely to have a much more significant effect than making the comparisons themselves slightly cheaper in very specific applications.
Actually, it would take a much more specific FBO configuration for this to make no difference. The whole list would have to have the same number of render targets, and almost all of them full. memcmp() is expensive. Avoiding unnecessary comparisons is cheap. This makes both successful and unsuccessful FBO entry comparisons cheaper. The other patch helps too, that's why I submitted both.
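
To make the point about cheap rejection concrete, here is a rough sketch of the idea, not the actual patch. The struct layout, field names, and MAX_RENDER_TARGETS are hypothetical and do not match wined3d's real fbo_entry; the assumption is simply that each cached entry records how many render targets it uses, so a mismatch can be rejected before memcmp() runs, and a match only compares the slots in use.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical limit, for illustration only. */
#define MAX_RENDER_TARGETS 8

/* Hypothetical cache entry; not wined3d's actual struct fbo_entry. */
struct fbo_entry
{
    unsigned int rt_count;                           /* slots actually in use */
    const void *render_targets[MAX_RENDER_TARGETS];  /* attached colour targets */
    const void *depth_stencil;                       /* attached depth/stencil, or NULL */
    struct fbo_entry *next;                          /* next entry in the cache list */
};

/* Returns non-zero if the entry describes exactly the requested attachments. */
static int fbo_entry_matches(const struct fbo_entry *entry,
        const void *const *targets, unsigned int rt_count, const void *depth_stencil)
{
    /* Cheap scalar checks first: most non-matching entries are rejected here. */
    if (entry->rt_count != rt_count)
        return 0;
    if (entry->depth_stencil != depth_stencil)
        return 0;

    /* Only compare the slots in use, not the whole fixed-size array. */
    return !memcmp(entry->render_targets, targets, rt_count * sizeof(*targets));
}

/* Linear search of the cache list; returns NULL if no entry matches. */
static struct fbo_entry *find_fbo_entry(struct fbo_entry *head,
        const void *const *targets, unsigned int rt_count, const void *depth_stencil)
{
    struct fbo_entry *entry;

    for (entry = head; entry; entry = entry->next)
    {
        if (fbo_entry_matches(entry, targets, rt_count, depth_stencil))
            return entry;
    }
    return NULL;
}
```

With a layout like this, an unsuccessful comparison usually costs one or two integer or pointer compares, and a successful one touches rt_count pointers instead of the full array. The search would only degrade to a full memcmp() per entry if every cached entry had the same target count and depth/stencil attachment.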