IDirect3DDevice9::StretchRect is used to stretch-blit between video memory surfaces. It's implemented by calling IWineD3DSurfaceImpl_Blt, which in turn tries IWineD3DSurfaceImpl_BltOverride to accelerate the blit. However, BltOverride will not accelerate the blit if neither surface is a swapchain or active render target 0, and falls back to a sysmem->sysmem blit with possible conversions. This seems to happen between every post-processing (screen-space shader) pass with Oblivion Graphics Extender and Morrowind Graphics Extender, which causes a severe loss in performance.
After some testing and talking with Henri, it seems a number of checks aren't needed if FBO blits are available, particularly the swapchain/active render target checks. The attached patch makes BltOverride check for FBO blits earlier, which helps it catch more cases where blits can be accelerated. It provides a significant improvement with the aforementioned programs.
I'm not sure of the full consequences of this move, however, particularly with earlier DX versions. Hence a request for comments.
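To make the reordering concrete, here is a minimal sketch of the check order before and after the change. It is a simplified model, not the actual wined3d code: every type, field, and function name below is hypothetical, and the condition under which an FBO blit is usable (compatible formats, no color key) is an assumption made for illustration.

```c
#include <stdbool.h>

/* Hypothetical model of the blit selection; none of these names exist in wined3d. */
enum blit_path {
    BLIT_PATH_FBO,             /* accelerated framebuffer blit */
    BLIT_PATH_OTHER_GL,        /* some other accelerated path in BltOverride */
    BLIT_PATH_SYSMEM_FALLBACK  /* sysmem->sysmem blit with possible conversions */
};

struct blit_request {
    bool fbo_blit_available;      /* FBO blits are supported */
    bool fbo_can_handle_formats;  /* assumption: formats are usable with an FBO blit */
    bool uses_color_key;          /* assumption: color-keyed blits cannot use an FBO blit */
    bool src_is_swapchain_or_rt0; /* source is a swapchain or active render target 0 */
    bool dst_is_swapchain_or_rt0; /* destination is a swapchain or active render target 0 */
};

/* Shared helper: can this request be served by an FBO blit at all? */
bool can_fbo_blit(const struct blit_request *r)
{
    return r->fbo_blit_available && r->fbo_can_handle_formats && !r->uses_color_key;
}

/* Old order: the swapchain/render target check runs first, so an
 * offscreen->offscreen blit never reaches the FBO check. */
enum blit_path select_path_before(const struct blit_request *r)
{
    if (!r->src_is_swapchain_or_rt0 && !r->dst_is_swapchain_or_rt0)
        return BLIT_PATH_SYSMEM_FALLBACK;
    if (can_fbo_blit(r))
        return BLIT_PATH_FBO;
    return BLIT_PATH_OTHER_GL;
}

/* New order: the FBO check runs first and catches the offscreen case too. */
enum blit_path select_path_after(const struct blit_request *r)
{
    if (can_fbo_blit(r))
        return BLIT_PATH_FBO;
    if (!r->src_is_swapchain_or_rt0 && !r->dst_is_swapchain_or_rt0)
        return BLIT_PATH_SYSMEM_FALLBACK;
    return BLIT_PATH_OTHER_GL;
}
```

In this model, a blit between two offscreen render targets (the post-processing case) takes the sysmem fallback with the old order and the FBO path with the new one. Requests that cannot use an FBO blit behave the same in both versions.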
On Sunday 27 March 2011 06:26:47 Chris Robinson wrote:
> After some testing and talking with Henri, it seems a number of checks aren't needed if FBO blits are available, particularly the swapchain/active render target checks. The attached patch makes BltOverride check for FBO blits earlier, which helps it catch more cases where blits can be accelerated. It provides a significant improvement with the aforementioned programs.
BltOverride is a horrible mess, and I'm afraid the patch doesn't make it any better. A long time ago Roderick started a cleanup to give the blit selection routine a more structured approach, but he never finished the work. Unfortunately a proper fix is not a weekend task :-(
Besides simple unconverted blits, we should also be able to do converted blits via FBOs + shader draws, e.g. to blit a YUV surface to an offscreen render target. This is needed for HW-accelerated video playback with QuickTime. And I am sure there are many other things that need to be considered.
On Sunday, March 27, 2011 1:57:17 AM Stefan Dösinger wrote:
> BltOverride is a horrible mess, and I'm afraid the patch doesn't make it any better. A long time ago Roderick started a cleanup to give the blit selection routine a more structured approach, but he never finished the work. Unfortunately a proper fix is not a weekend task :-(
Ultimately, I think the best approach would be to properly split up the blitters. Currently, WineD3D uses a catch-all Blt method that handles almost all blitting possibilities (including color-keying), with a fallback to do manual conversions and blits if an "accelerated" path is missed or not available, and also handling color-fill (and patterned drawing?).
The way D3D9 does it, there are different methods to handle blitting to and from various types of surfaces (e.g., StretchRect is for vidmem->vidmem only, has no color-keying, and acts as a resolver for multisampled surfaces). If I could, I'd add methods to IWineD3DDevice to handle it that way: different methods for different purposes, with the d3d/ddraw dlls selecting the appropriate blitter. That would result in a bit of duplication until the monolithic Blt method can be removed, though, and I'm not all too familiar with wined3d's resource handling, nor do I know how to properly add new methods to wined3d interfaces.
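As a rough illustration of what "different methods for different purposes" could look like, here is a minimal sketch of a frontend picking a dedicated blitter. Everything in it is hypothetical: the enum, the descriptor struct, and select_blitter are made-up names, not existing or proposed wined3d interfaces, and the selection rules only mirror the StretchRect restrictions mentioned above.

```c
#include <stdbool.h>

/* Hypothetical blitter kinds; not real wined3d identifiers. */
enum blitter {
    BLITTER_STRETCH_RECT, /* vidmem->vidmem, no color-keying */
    BLITTER_COLOR_FILL,   /* no source surface, just a fill */
    BLITTER_GENERIC       /* the existing catch-all Blt, kept until it can be removed */
};

/* Hypothetical description of a blit as the d3d/ddraw frontend sees it. */
struct blit_desc {
    bool has_source;     /* false for a color fill */
    bool src_in_vidmem;
    bool dst_in_vidmem;
    bool uses_color_key;
};

/* The frontend selects the narrowest blitter whose restrictions the request
 * satisfies, and only falls back to the catch-all for everything else. */
enum blitter select_blitter(const struct blit_desc *d)
{
    if (!d->has_source)
        return BLITTER_COLOR_FILL;
    if (d->src_in_vidmem && d->dst_in_vidmem && !d->uses_color_key)
        return BLITTER_STRETCH_RECT;
    return BLITTER_GENERIC;
}
```

The point of the split is that each dedicated path only has to validate and implement its own narrow case, while color-keyed or sysmem blits keep going through the generic method in the meantime.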
Given how daunting a proper cleanup is, I figured it'd still be a good idea to handle more cases for accelerated blits with a "simple" patch, instead of letting the blit go vidmem->sysmem[->conversion->sysmem]->stretch->sysmem->vidmem. That fallback path also necessitates more converters that aren't otherwise needed: without this patch, it needs argb8->xrgb8 and rgba16f->xrgb8 converters. The latter doesn't exist, and adding it would be just as much of a hack as playing with BltOverride, IMO, not to mention that we'd still be left with really crappy performance.
On 27 March 2011 10:57, Stefan Dösinger <stefandoesinger@gmx.at> wrote:
> BltOverride is a horrible mess, and I'm afraid the patch doesn't make it any better.
Yeah, I'm not so convinced this is really an improvement either.