Towards that aim, I've ported Stefan Dösinger's initial implementation of this from October last year [1] to what turned out to be 1.1.14.
I think they passed the last time I tested it. You could try to check out the old wine version and test with the original patch. I could be wrong though.
It fails two more D3D9 tests than 1.1.14:

  visual.c:914: Test failed: Transformed vertex with linear vertex fog has color 0000ff00
  visual.c:986: Test failed: Transformed vertex with linear vertex fog has color 0000ff00
(I don't know if these failed when it was first implemented, so I don't know if the problem is the patch, the port, or even the test. This is on my nVidia 8600 w/180.22 binary drivers, in case it makes a difference.)
I haven't run it with any games yet, or profiled it. Stefan's commented since that it was slower than the current system, albeit unoptimised, and that was the reason he didn't undertake any further work on it.
The main performance problem is, I think, the lighting code. It looks horrible and probably uses many more instructions than needed. One app that is hit pretty hard by drawStridedSlow is 3DMark 2000 (esp. the heli test in high detail). This app actually slows down instead of gaining speed.
I don't know what games use the fixed-function pipeline, so I'm open to suggestions and bug report references.
Basically everything <= d3d7 has to. But many newer games still do. I have seen heavy ffp drawing in even very new games. The D3D SDK samples are also very good for testing.
All but the last patch in this tarball are ports of Stefan's original, and were done using git-am so still retain his original headers. The sorta-nasty no_d3dcolor_swizzle code in get_color and its caller in the last patch is entirely my fault though. ^_^
There's a potential place for your fog test problems. You could disable the extension in directx.c for testing/development (just comment out the line in the ext table) and later on work on properly using it with the ffp code.
I'm not totally sure I've got all the hard parts of the port right. The two relevant changes since these patches were applied were the implementation of EXT_vertex_array_bgra and the rearrangement of the fog code.
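For anyone following along, here is roughly what the d3dcolor swizzle amounts to when EXT_vertex_array_bgra is not available. This is only a sketch, not the wined3d code: d3dcolor_to_rgba is a made-up name. A D3DCOLOR is a 32-bit 0xAARRGGBB value, so on a little-endian machine its bytes sit in memory as B, G, R, A, while a plain 4-component GL_UNSIGNED_BYTE colour array is read as R, G, B, A.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a wined3d function): unpack a D3DCOLOR
 * (0xAARRGGBB) into R, G, B, A byte order, which is what a plain
 * GL_UNSIGNED_BYTE colour array expects when the driver cannot take
 * BGRA input directly. */
static void d3dcolor_to_rgba(uint32_t d3dcolor, uint8_t rgba[4])
{
    assert(rgba != NULL);
    rgba[0] = (uint8_t)((d3dcolor >> 16) & 0xff); /* red   */
    rgba[1] = (uint8_t)((d3dcolor >> 8) & 0xff);  /* green */
    rgba[2] = (uint8_t)(d3dcolor & 0xff);         /* blue  */
    rgba[3] = (uint8_t)((d3dcolor >> 24) & 0xff); /* alpha */
}
```

With the extension, the array can be handed to GL with GL_BGRA as the size argument and no per-vertex conversion like this is needed.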
For EXT_vertex_array_bgra and vertex_pipe->can_convert_d3dcolor, I'm not sure if there are places that check one and should check the other, and I'm not sure if the test in state.c line 4280 (streamsrc) is doing the right thing. I think it is doing the right thing. The vertex program recognises and handles a position_transformed vertex declaration itself.
I think we should make these two different flags in the pipeline description, and only use those in the rest of the code (or functions that return TRUE and FALSE), e.g.:
can_convert_d3dcolor: fixed func: GL_SUPPORT(EXT_vertex_array_bgra); replacement: TRUE
can_handle_rhw: fixed func: FALSE; replacement: TRUE
Ideally only the fixed func's can_convert_d3dcolor() function will check GL_SUPPORT(EXT_vertex_array_bgra); the rest of the code just asks the selected pipeline implementation.
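A minimal sketch of how that could look with functions in the pipeline description. All the struct and function names here are hypothetical (suffixed _sketch where they stand in for real wined3d types), and the single bool replaces the real GL_SUPPORT(EXT_vertex_array_bgra) check, so treat it as an illustration of the idea rather than a patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the GL info structure; only the one flag
 * this sketch needs. */
struct gl_info_sketch {
    bool ext_vertex_array_bgra; /* stands in for GL_SUPPORT(EXT_vertex_array_bgra) */
};

/* Hypothetical vertex pipeline description with capability callbacks. */
struct vertex_pipeline_sketch {
    const char *name;
    bool (*can_convert_d3dcolor)(const struct gl_info_sketch *gl_info);
    bool (*can_handle_rhw)(const struct gl_info_sketch *gl_info);
};

/* Fixed function: colour conversion depends on the extension, and
 * pre-transformed (RHW) vertices are not handled by the pipeline. */
static bool ffp_can_convert_d3dcolor(const struct gl_info_sketch *gl_info)
{
    return gl_info->ext_vertex_array_bgra;
}

static bool ffp_can_handle_rhw(const struct gl_info_sketch *gl_info)
{
    (void)gl_info;
    return false;
}

/* Replacement (vertex program) pipeline: handles both by itself. */
static bool replacement_can_convert_d3dcolor(const struct gl_info_sketch *gl_info)
{
    (void)gl_info;
    return true;
}

static bool replacement_can_handle_rhw(const struct gl_info_sketch *gl_info)
{
    (void)gl_info;
    return true;
}

static const struct vertex_pipeline_sketch ffp_vertex_pipeline = {
    "fixed function", ffp_can_convert_d3dcolor, ffp_can_handle_rhw
};

static const struct vertex_pipeline_sketch replacement_vertex_pipeline = {
    "replacement", replacement_can_convert_d3dcolor, replacement_can_handle_rhw
};

/* Example caller: the rest of the code asks the selected pipeline and
 * never looks at the extension flag itself. */
static bool needs_color_fixup(const struct vertex_pipeline_sketch *pipe,
                              const struct gl_info_sketch *gl_info)
{
    assert(pipe != NULL && gl_info != NULL);
    return !pipe->can_convert_d3dcolor(gl_info);
}
```

The point of the callbacks over plain flags is that the extension check stays in exactly one place; a streamsrc-style caller would only ever see the answer from whichever pipeline is selected.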