-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 2013-11-25 18:21, Henri Verbeet wrote:
I haven't quite decided yet. One consideration is that if we want the usage hint to glBufferDataARB() to make sense, we can't necessarily use the same PBO both for downloading and uploading data.
The usage hint is gone from GL_ARB_buffer_storage. Do you really think it matters? Do we have any evidence that suggests that using one PBO to download and upload data causes problems? Even if it does cause problems we should check if it causes more problems than using one surface to download and upload in d3d9.
You know the AMD GPU and r600g driver pretty well. Is there anything in the hw or driver that suggests that the usage really matters and/or that using a PBO for PACK_BUFFER and UNPACK_BUFFER causes problems?
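To make the usage-hint concern concrete, here is a minimal sketch of how the hint and the binding target would differ by transfer direction. The enum and function names are hypothetical and are not wined3d code; the GL constants are repeated from the extension headers so the sketch builds without a GL context. The chosen hint is what would be passed as the last argument to glBufferDataARB().

```c
/* Values from the ARB_vertex_buffer_object / ARB_pixel_buffer_object
 * extension headers, repeated here so the sketch builds without GL. */
#define GL_STREAM_DRAW_ARB          0x88E0
#define GL_STREAM_READ_ARB          0x88E1
#define GL_PIXEL_PACK_BUFFER_ARB    0x88EB
#define GL_PIXEL_UNPACK_BUFFER_ARB  0x88EC

/* Hypothetical enum for illustration; not a wined3d type. */
enum pbo_direction
{
    PBO_DOWNLOAD, /* GL writes the buffer, the application reads it back */
    PBO_UPLOAD,   /* the application writes the buffer, GL reads from it */
};

/* Binding target used while the transfer is issued. */
unsigned int pbo_binding_target(enum pbo_direction dir)
{
    return dir == PBO_DOWNLOAD ? GL_PIXEL_PACK_BUFFER_ARB : GL_PIXEL_UNPACK_BUFFER_ARB;
}

/* Usage hint that matches the transfer direction. */
unsigned int pbo_usage_hint(enum pbo_direction dir)
{
    return dir == PBO_DOWNLOAD ? GL_STREAM_READ_ARB : GL_STREAM_DRAW_ARB;
}
```

A single PBO used in both directions can only carry one of the two hints, so the hint would be wrong for one of the directions. Whether a driver acts on that mismatch is exactly the open question above.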
Another is essentially the various variants of d3d10's UpdateSubresource(), and d3d9's GetRenderTargetData() / GetFrontBufferData(). We'd obviously like to avoid stalling on those, but it's not clear to me if that's always possible. For example, always creating PBOs for sysmem surfaces and then uploading from there probably isn't going to work very well if the application just creates a single sysmem surface and uses that for updating different default pool surfaces with UpdateSurface(), unless perhaps it also uses DISCARD on the sysmem surface. Perhaps that's ok, but I don't think we really know at this point.
Most applications that use UpdateSurface / UpdateTexture that I've worked with in the command stream development use DISCARD on the sysmem surface. I'd expect the application to stall on native d3d as well if it doesn't use DISCARD, but it is difficult to test.
The exceptions here are intro videos. Apps do all sorts of stupid things, including just using a D3DPOOL_MANAGED texture and calling it a day. It doesn't really matter because pure video playback doesn't queue up lots of commands anyway.
The one thing that matters performance-wise is not stalling the pipeline on map. I have not seen a single application that uses D3DPOOL_DEFAULT D3DUSAGE_DYNAMIC textures with DISCARD, except for one case of intro video playback. I'm not sure which game this was; I'd have to re-check them all. Even the applications that use UpdateSurface / UpdateTexture in real rendering don't stream a lot of data compared to the use of dynamic buffers. Usually it's a texture update every other frame.
According to my testing, the only applications that profit from PBOs (in their current state, and in my cs tree) are Warcraft 3 and UT2004. They abuse backbuffer->map(READONLY) as a glFinish replacement, and loading the data into a PBO instead of sysmem gives them a ~10% performance boost.
At some point I considered a scheme where the device has a pool of PBOs and the resource update function grabs one from there when needed. That probably has its own problems, but I haven't thought it all the way through either.
I plan to add a BO pool to the command stream patches to handle DISCARD, but I think that's only partially related to what you've considered.
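As a rough illustration of the pool idea, the following sketch models a fixed-size pool in which a DISCARD map swaps the resource's buffer for an idle one instead of waiting for the GPU. Everything here is an assumption made for illustration: the struct and function names are hypothetical, the fence is a plain counter rather than a GL sync object, and no actual buffer objects are created. It is not the scheme from the command stream patches.

```c
#include <stddef.h>

#define BO_POOL_MAX 8

struct pool_bo
{
    unsigned int id;          /* stand-in for a GL buffer name */
    unsigned long busy_until; /* fence value after which the GPU is done with it */
    int in_use;               /* currently owned by a resource */
};

struct bo_pool
{
    struct pool_bo bos[BO_POOL_MAX];
    size_t count;
    unsigned long completed;  /* last fence the GPU is known to have passed */
};

/* Return an idle buffer, growing the pool if every existing one is busy.
 * NULL means the pool is exhausted and the caller would have to wait. */
struct pool_bo *bo_pool_grab(struct bo_pool *pool)
{
    size_t i;

    for (i = 0; i < pool->count; ++i)
    {
        struct pool_bo *bo = &pool->bos[i];
        if (!bo->in_use && bo->busy_until <= pool->completed)
        {
            bo->in_use = 1;
            return bo;
        }
    }
    if (pool->count == BO_POOL_MAX)
        return NULL;
    pool->bos[pool->count].id = (unsigned int)pool->count + 1;
    pool->bos[pool->count].busy_until = 0;
    pool->bos[pool->count].in_use = 1;
    return &pool->bos[pool->count++];
}

/* Hand a buffer back; it becomes reusable once the GPU passes 'fence'. */
void bo_pool_release(struct bo_pool *pool, struct pool_bo *bo, unsigned long fence)
{
    (void)pool;
    bo->busy_until = fence;
    bo->in_use = 0;
}

/* DISCARD map: take an idle buffer first, then retire the current one. */
struct pool_bo *bo_pool_discard(struct bo_pool *pool, struct pool_bo *current,
        unsigned long pending_fence)
{
    struct pool_bo *fresh = bo_pool_grab(pool);

    if (!fresh)
        return NULL; /* caller keeps 'current' and has to wait */
    if (current)
        bo_pool_release(pool, current, pending_fence);
    return fresh;
}
```

In this model a resource that maps with DISCARD on every update never waits as long as the pool has an idle entry or room to grow, while a map without DISCARD would still have to wait for the fence on its current buffer.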