Hi,
For a while we have been discussing on how to clean up / rewrite some parts of the surface code in WineD3D. I did submit a number of patches to initiate a rewrite of some parts but not all patches were correct and it wasn't clear what final design I had in mind. I'm posting my ideas to here to get some feedback on how to proceed. For people outside wined3d this might not be that interesting but I'll give a short overview of the problems, so that you could also understand it a little bit.
The main issue is that pieces of modern code and legacy DirectDraw code are entangled in a way which is not easy to untangle. There are various places which can be improved but for now I want to limit myself to 'BltOverride'. It is a big function shared between DirectDraw/D3D7 blitting and D3D8 CopyRects / D3D9 StretchRect and over the years it has become large and very ugly.
The way I thought about simplifying this call is to move some parts to helper functions and to remove legacy functionality (e.g. move colorfill back to Blt). A lot of the work which has to be done is not just about moving code around. Some functionality can these days be implemented in a nicer way thanks to functionality like shaders.
Below I have some pseudo code on how I think a chunk of BltOverride and the blit_shader calls it uses, should be implemented. The main idea is to move the majority of the code to the blit_shader. Unless color keying / 8-bit is required, I expect that no actual shader is needed. In case we need a shader, there are two cases namely when blit_shader is implemented using GPU shaders and the case we need a fall back. In case of the fall back, I think we need big chunks of the 'old' BltOverride code which needs to be moved over to ffp_blit.
BltOverride (color keyed) offscreen surface -> render target blitting { if(orm_fbo && can_stretch) stretch_rect_fbo <-- this check is around now; but I guess we ultimately want a IWineD3DDevice_StretchRect which contains this code and which can fall back to BltOverride) context = context_acquire(CTXUSAGE_BLIT) blit_shader->set_shader(src_surface) blit_shader->draw_quad(src_surface, src_rect, dst_rect) blit_shader->unset_shader(); }
arb_blit_shader based implementation { blit_shader->set_shader(src_surface) { generate_shader (if needed)
activate_shader if(color_keying_active) upload_colorkey_to_shader if(p8_active) load_p8_palette } blit_shader->draw(src_surface, src_rect, dst_rect) { surface_internal_preload(src_surface) { under the hood, LoadTexture/LoadLocation should not perform surface conversion for us basically the issue is how to distinguish between 2d and 3d color keying and how to separete this }
/* drawing */ if(fbo_blit) fbo_blit() else draw_quad() } blit_shader->unset_shader(); }
ffp_blit implementation { blit_shader->set_shader(src_surface) { if(p8_active) load_palette else not_much_todo } blit_shader->draw(src_surface, src_rect, dst_rect) { if(need_color_keying) play_with_color_keying_flags (like current BltOverride does) surface_internal_preload(src_surface); if(need_color_keying) restore_color_keying flags activate_alpha_test_magic
draw_quad() } blit_shader->unset_shader(); }
I think the design described above could work (perhaps I missed something, please correct me then). One of the issues is color keying. Right now color key conversion is performed in LoadLocation based on some flag magic. In case of a 2d blit using shaders this is not needed but for Direct3D purposes it is needed. I'm not sure how to handle this in a clear way. (I fear that we also get some new issues when a game alternates between 2d and 3d color keying).
I don't expect issues regarding 8-bit as long as we implement them in software and use opengl only during the upload (UnlockRect). If we would allow hardware blitting, it would need to be 8-bit -> 8-bit blits and no converted blits else we need hacks like paletteOverride which I would like to eliminate.
Roderick
On 23 March 2010 18:41, Roderick Colenbrander thunderbird2k@gmail.com wrote:
Hi,
...
How about something along these general lines:
blitter_setup_context(src, dst, rop, ...) { context = context_acquire(src/dst/NULL); select shader / setup ffp state / etc. return context; }
blitter_release_context(context); { /* add some cleanup here */ context_release(context); }
blitter->blit(...) { context = blitter_setup_context(); /* potential color key conversions go somewhere around here. */ LoadLocation(); draw rect / fbo blit / etc. ModifyLocation(); blitter_release_context(context); }
blitter->color_fill() { ... }
blitter->can_blit() { /* check e.g. surface pool (CPU/GPU/both), color fixups, rop support. */ }
device->blitters[] = { fbo_blit, shader_blit, ffp_blit, cpu_blit, };
surface_select_blitter(gl_info, op, src, dst, rop, ...) { foreach (device->blitters) { if (blitter->can_blit(...)) return blitter; } }
surface_blit(gl_info, src, dst, rop, ...) { blitter = select_blitter(...); blitter->blit(...); }
surface_color_fill(gl_info, dst, color, rop, ...) { blitter = select_blitter(...); blitter->color_fill(...); }
I think the design described above could work (perhaps I missed something, please correct me then). One of the issues is color keying. Right now color key conversion is performed in LoadLocation based on some flag magic. In case of a 2d blit using shaders this is not needed but for Direct3D purposes it is needed. I'm not sure how to handle this in a clear way. (I fear that we also get some new issues when a game alternates between 2d and 3d color keying).
D3D color keying is something that should be handled by the fragment pipe.
On Tue, Mar 23, 2010 at 11:05 PM, Henri Verbeet hverbeet@gmail.com wrote:
On 23 March 2010 18:41, Roderick Colenbrander thunderbird2k@gmail.com wrote:
Hi,
...
How about something along these general lines:
blitter_setup_context(src, dst, rop, ...) { context = context_acquire(src/dst/NULL); select shader / setup ffp state / etc. return context; } .. .. surface_color_fill(gl_info, dst, color, rop, ...) { blitter = select_blitter(...); blitter->color_fill(...); }
This is indeed more drastic than I thought about but it seems a nicer approach. I was already thinking about how to handle different 'fallbacks' since the current blitter isn't switchable and this approach looks like a good idea. I'll try to work it out a bit and experiment with it. When it starts working a bit, I will send some patches to this thread (unpolished ones likely though).
I think the design described above could work (perhaps I missed something, please correct me then). One of the issues is color keying. Right now color key conversion is performed in LoadLocation based on some flag magic. In case of a 2d blit using shaders this is not needed but for Direct3D purposes it is needed. I'm not sure how to handle this in a clear way. (I fear that we also get some new issues when a game alternates between 2d and 3d color keying).
D3D color keying is something that should be handled by the fragment pipe.
Sure color keying needs to be handled by the fragment pipe. Though I would like to avoid touching that area right now since I don't know anything about it and I had a feeling when I talked about it with Stefan that this code is a bit fragile. So in the meantime I would like to have a way to disable color key fixups in LoadLocation when we perform a 2d blit.
Roderick
On 24 March 2010 11:55, Roderick Colenbrander thunderbird2k@gmail.com wrote:
Sure color keying needs to be handled by the fragment pipe. Though I would like to avoid touching that area right now since I don't know anything about it and I had a feeling when I talked about it with Stefan that this code is a bit fragile. So in the meantime I would like to have a way to disable color key fixups in LoadLocation when we perform a 2d blit.
Perhaps it's possible to implement as a different surface location.
Am Mittwoch 24 März 2010 12:00:48 schrieb Henri Verbeet:
On 24 March 2010 11:55, Roderick Colenbrander thunderbird2k@gmail.com
wrote:
Sure color keying needs to be handled by the fragment pipe. Though I would like to avoid touching that area right now since I don't know anything about it and I had a feeling when I talked about it with Stefan that this code is a bit fragile. So in the meantime I would like to have a way to disable color key fixups in LoadLocation when we perform a 2d blit.
Perhaps it's possible to implement as a different surface location.
I'd leave color keying out of the blitter redesign and stick to the current alpha based color keying emulation. Fixing the blitter is complicated enough as it is.
Implementing it in the fragment pipeline is certainly possible, if we are prepared to drop support for d3d color keying on cards that don't at least support NV register combiners(texture shader is not needed), ati fragment shader or ARBfp. Requiring either of those is not unreasonable, support for that goes back to TNT2 cards, ATI radeon 8500 and I think all Intel GPUs.
However, implementing it in the fragment pipeline when possible and having a texture loading fallback like we currently have won't help a lot. All issues with the current setup know affect the code design. If we keep the fallback, we have to deal with the load time alpha replacement in some way anyway, and we don't gain anything substantial from the alternative nicer fragment pipeline based codepaths.
On Wed, Mar 24, 2010 at 12:35 PM, Stefan Dösinger stefan@codeweavers.com wrote:
Am Mittwoch 24 März 2010 12:00:48 schrieb Henri Verbeet:
On 24 March 2010 11:55, Roderick Colenbrander thunderbird2k@gmail.com
wrote:
Sure color keying needs to be handled by the fragment pipe. Though I would like to avoid touching that area right now since I don't know anything about it and I had a feeling when I talked about it with Stefan that this code is a bit fragile. So in the meantime I would like to have a way to disable color key fixups in LoadLocation when we perform a 2d blit.
Perhaps it's possible to implement as a different surface location.
I'd leave color keying out of the blitter redesign and stick to the current alpha based color keying emulation. Fixing the blitter is complicated enough as it is.
Implementing it in the fragment pipeline is certainly possible, if we are prepared to drop support for d3d color keying on cards that don't at least support NV register combiners(texture shader is not needed), ati fragment shader or ARBfp. Requiring either of those is not unreasonable, support for that goes back to TNT2 cards, ATI radeon 8500 and I think all Intel GPUs.
However, implementing it in the fragment pipeline when possible and having a texture loading fallback like we currently have won't help a lot. All issues with the current setup know affect the code design. If we keep the fallback, we have to deal with the load time alpha replacement in some way anyway, and we don't gain anything substantial from the alternative nicer fragment pipeline based codepaths.
If you are okay in keeping the current code, As a slight improvement the shader could do the 'alpha testing' which is nicer design wise.
Now back to the blitter design. Henri's proposal is to have several blitters for this discussion reduce it to shader_blit and cpu_blit. The cpu_blit would be the software fall back for a blit, colorfill or whatever operation. A potential (but perhaps ugly) design of a cpu fallback, could be to just call IWineD3DBaseSurface. A more drastic approach would basically be to kill the IWineD3DBaseSurface fallbacks and have ONE Blt and BltFast in surface.c which would just fall back to software when OpenGL isn't around. What do you think?
Roderick
On 24 March 2010 12:53, Roderick Colenbrander thunderbird2k@gmail.com wrote:
Now back to the blitter design. Henri's proposal is to have several blitters for this discussion reduce it to shader_blit and cpu_blit. The cpu_blit would be the software fall back for a blit, colorfill or whatever operation.
It's not necessarily a fallback, it's certainly possible that if e.g. both surfaces are in sysmem doing the blit on the CPU is going to be faster than doing a upload/blit/download sequence. (Which is only allowed because we're sloppy about resource pools in the first place.) A sysmem->GPU blit is essentially just an upload with an appropriate buffer, format flags, etc. There's also the possibility for e.g. an SSE based implementation, behind proper capability checking.
A potential (but perhaps ugly) design of a cpu fallback, could be to just call IWineD3DBaseSurface. A more drastic approach would basically be to kill the IWineD3DBaseSurface fallbacks and have ONE Blt and BltFast in surface.c which would just fall back to software when OpenGL isn't around. What do you think?
Calling IWineD3DBaseSurface is probably good enough for a start. I'd probably start with identifying the different blit functions inside BltOverride() and the logic for selecting them. Then you can separate those into functions, after which you can create separate blitter implementations. Once we have those we can work on cleaning up individual implementations, selection logic, etc.
On Wed, Mar 24, 2010 at 1:19 PM, Henri Verbeet hverbeet@gmail.com wrote:
On 24 March 2010 12:53, Roderick Colenbrander thunderbird2k@gmail.com wrote:
Now back to the blitter design. Henri's proposal is to have several blitters for this discussion reduce it to shader_blit and cpu_blit. The cpu_blit would be the software fall back for a blit, colorfill or whatever operation.
It's not necessarily a fallback, it's certainly possible that if e.g. both surfaces are in sysmem doing the blit on the CPU is going to be faster than doing a upload/blit/download sequence. (Which is only allowed because we're sloppy about resource pools in the first place.) A sysmem->GPU blit is essentially just an upload with an appropriate buffer, format flags, etc. There's also the possibility for e.g. an SSE based implementation, behind proper capability checking.
A potential (but perhaps ugly) design of a cpu fallback, could be to just call IWineD3DBaseSurface. A more drastic approach would basically be to kill the IWineD3DBaseSurface fallbacks and have ONE Blt and BltFast in surface.c which would just fall back to software when OpenGL isn't around. What do you think?
Calling IWineD3DBaseSurface is probably good enough for a start. I'd probably start with identifying the different blit functions inside BltOverride() and the logic for selecting them. Then you can separate those into functions, after which you can create separate blitter implementations. Once we have those we can work on cleaning up individual implementations, selection logic, etc.
I have experimented a bit with the design (intially only color fill since that's quite independent). Note this is just a proof of concept (some of the helper functions would become separate patches). This patch is the current code in the new form. In a later stage an FBO blit_shader can be added which can call the FBO based color fill but that's quite easy to extend later on. At this point no context setup is needed since the helpers already do that but after this is all over, they could be moved over to this framework to make it a bit nicer.
Roderick
On 24 March 2010 20:03, Roderick Colenbrander thunderbird2k@gmail.com wrote:
I have experimented a bit with the design (intially only color fill since that's quite independent). Note this is just a proof of concept (some of the helper functions would become separate patches). This patch is the current code in the new form. In a later stage an FBO blit_shader can be added which can call the FBO based color fill but that's quite easy to extend later on. At this point no context setup is needed since the helpers already do that but after this is all over, they could be moved over to this framework to make it a bit nicer.
The general direction looks ok. As a first step you probably don't even need to call the functions through struct blit_shader though.
On Wed, Mar 24, 2010 at 9:09 PM, Henri Verbeet hverbeet@gmail.com wrote:
On 24 March 2010 20:03, Roderick Colenbrander thunderbird2k@gmail.com wrote:
I have experimented a bit with the design (intially only color fill since that's quite independent). Note this is just a proof of concept (some of the helper functions would become separate patches). This patch is the current code in the new form. In a later stage an FBO blit_shader can be added which can call the FBO based color fill but that's quite easy to extend later on. At this point no context setup is needed since the helpers already do that but after this is all over, they could be moved over to this framework to make it a bit nicer.
The general direction looks ok. As a first step you probably don't even need to call the functions through struct blit_shader though.
I have worked on some more parts of the blit_shader code. Before I clean it up and submit it, I'm posting it here to see if the direction is ok. Below I summarize each patch and also mention when I think a patch is ready (if that's the case more thorough feedback is welcome). 0007 - Renames is_color_fixup_supported to blit_supported and adds some parameters. I think this patch is ready.
0008 - Adds a blit method to arbfp_blit and limits the arbfp shader to complex blits (yuv/p8); for now this disables hw accelerated 8-bit color keying but that can easily be readded once the transition is over by modifying the p8 shader. I think this patch is ready. (I'm not that happy with the name arbfp_blit_surface but I had to use something else because arbfp_blit won't work and arbfp_blit_blit is a bit silly I think)
0009 - This moves the remaining 'offscreen -> render target' code to ffp_blit. I have thought long about whether to move the color key flag voodoo over as well. You can reason that it is up to the blit_shader how and if it implements color keying. If you move the code over, you would need some of the DDBLTFX information in the blit_shader (either through a parameter or some surface variable). I'm not sure if we want to go that way. I think it is better to leave the voodoo in one place. Later on I want to use it for the p8 arbfp code as well. As mentioned in a next patch the moved code, should likely become a helper function, so that ffp_blit.blit can decide whether to call 'ffp_blit_offscreen_to_render_target' or the fb_copy* calls.
0010 - Adds an initial fbo_blit implementation which supports blitting. I think this patch is ready but I'm only not sure where we want to have the fbo_blit code. Right now stretch_rect_fbo is in device.c (and the same for fbo color fill but Henri said that we shouldn't use it). I would say that the code should be in surface.c unless it is fine to use stretch_rect_fbo (some types have to be unified to be able to get rid of casts) as a helper function.
0011 - This patch is work in progress and moves the fb_copy_* calls to ffp_blit. It adds initial selection logic to ffp_blit.blit to decide which helper function to call. The logic tries to use SFLAG_SWAPCHAIN and WINED3DUSAGE_RENDERTARGET which I think works to get rid of some of the swapchain/surface checks which are around in BltOverride. Note this patch patch won't compile, so I also haven't tested it ;)
If this work is in the right direction, I plan to work on the blitter selection next. I would apply it to 'render_target -> texture' and 'offscreen -> render_target'.
Thanks, Roderick
On 31 March 2010 15:07, Roderick Colenbrander thunderbird2k@gmail.com wrote:
I have worked on some more parts of the blit_shader code. Before I clean it up and submit it, I'm posting it here to see if the direction is ok.
I already commented on some issues with 0007 and 0008 since you already submitted those. As for the general direction, I think you're still trying to take fairly large steps. E.g., it's much easier to first introduce things like arbfp_blit_supported() and arbfp_blit_surface() as a separate functions, work them out properly for each implementation, and only then add them to the blitter interface. (I think I mentioned that before.) That's much nicer to review too, since I wouldn't be reviewing 3 or 4 different things at the same time.
0008 - Adds a blit method to arbfp_blit and limits the arbfp shader to complex blits (yuv/p8); for now this disables hw accelerated 8-bit color keying but that can easily be readded once the transition is over by modifying the p8 shader. I think this patch is ready. (I'm not that happy with the name arbfp_blit_surface but I had to use something else because arbfp_blit won't work and arbfp_blit_blit is a bit silly I think)
I didn't look too close at this one due to the reasons mentioned above, but my general feeling is that it may need some more work. (Beyond what I already mentioned in the reply to that patch.)
0009 - This moves the remaining 'offscreen -> render target' code to ffp_blit. I have thought long about whether to move the color key flag voodoo over as well. You can reason that it is up to the blit_shader how and if it implements color keying. If you move the code over, you would need some of the DDBLTFX information in the blit_shader (either through a parameter or some surface variable). I'm not sure if we want to go that way. I think it is better to leave the voodoo in one place.
You could also try something like introducing a function to the blitter interface to set color key information. I don't think this is the most important thing to worry about at this point though, so it's ok to leave it where it is for the moment. Note that that does imply that you can't have BLIT_OP_COLOR_KEYED_BLIT either yet.
0010 - Adds an initial fbo_blit implementation which supports blitting. I think this patch is ready but I'm only not sure where we want to have the fbo_blit code. Right now stretch_rect_fbo is in device.c (and the same for fbo color fill but Henri said that we shouldn't use it). I would say that the code should be in surface.c unless it is fine to use stretch_rect_fbo (some types have to be unified to be able to get rid of casts) as a helper function.
I think surface.c is ok. The reason it's in device.c is mostly historic.