http://bugs.winehq.org/show_bug.cgi?id=11674
--- Comment #266 from Pierre-Loup Griffais bugs.winehq.org@plagman.net 2012-12-21 11:59:47 CST --- (In reply to comment #265)
> The specific implementation would probably need some work even if it did go in, but as far as I'm concerned it won't go in at all unless it either results in a general improvement on other drivers as well, or NVIDIA can be bothered to at least answer questions. I'm still particularly unhappy about the botched RandR implementation, and NVIDIA's unresponsiveness in general.
Hi Henri,
Thanks for your interest; I'm sorry for the confusion, I didn't realize Stefan didn't keep you in the loop following our discussion on that topic. The point I was making is that BufferSubData inherently maps better to dynamic buffer workloads than MapBuffer-based updating in threaded use cases. Since BufferSubData requests can be immediately queued in-band in both command streams, overhead is kept to a minimum and maximal GPU throughput can be achieved. Stefan pointed out that memory usage was a big concern for Wine, and I also believe this approach maps best to keeping memory usage low. Having a fast implementation of MapBufferRange in threaded mode would require either more memory/VM or more overhead than directly using BufferSubData, depending on the approach used.
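To make the in-band queueing idea a bit more concrete, here is a toy, single-threaded model in C. It is only a sketch of the general principle, not GL code and not a description of how any actual driver works; CommandQueue, SubDataCmd, queue_sub_data and drain are made-up names, and the sizes are arbitrary. The point it illustrates is that the update data is copied into the command stream at submission time, so the caller never has to wait for the consumer or keep its own memory alive.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: every name here is hypothetical, not a driver or GL API. */
enum { QUEUE_BYTES = 256, BUFFER_BYTES = 64, MAX_CMDS = 8 };

typedef struct {
    size_t offset;   /* destination offset inside the buffer object */
    size_t size;     /* number of bytes to write */
    size_t payload;  /* where the copied bytes start in the queue storage */
} SubDataCmd;

typedef struct {
    unsigned char storage[QUEUE_BYTES]; /* payload travels with the commands */
    size_t used;
    SubDataCmd cmds[MAX_CMDS];
    size_t count;
} CommandQueue;

/* Application side: copy the data in-band and return immediately.
 * Returns 0 on success, -1 if the queue is full or the range is invalid. */
static int queue_sub_data(CommandQueue *q, size_t offset,
                          const void *data, size_t size)
{
    if (q->count == MAX_CMDS || size > QUEUE_BYTES - q->used)
        return -1;
    if (offset > BUFFER_BYTES || size > BUFFER_BYTES - offset)
        return -1;
    memcpy(q->storage + q->used, data, size);
    q->cmds[q->count++] = (SubDataCmd){ offset, size, q->used };
    q->used += size;
    return 0;
}

/* Consumer side: replay the queued updates in submission order. */
static void drain(CommandQueue *q, unsigned char *buffer)
{
    for (size_t i = 0; i < q->count; i++) {
        const SubDataCmd *c = &q->cmds[i];
        memcpy(buffer + c->offset, q->storage + c->payload, c->size);
    }
    q->count = 0;
    q->used = 0;
}
```

In this model the caller can overwrite its source array right after queue_sub_data returns, and the buffer contents only change once drain runs. A map-style update has no equivalent of that copy: the caller needs a pointer it can write through, so the implementation has to either synchronize with the consumer or hand out extra memory.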
For reference, here are the numbers I get from Stefan's test case which maps to D3D dynamic buffer updating patterns, modified with my recommended path:
                Not Threaded    Threaded
MapBuffer       124             23-67 (very bursty)
BufferSubData   156             550
Don't hesitate to contact me should you have any more questions; thanks!
- Pierre-Loup