Hi, I haven't reviewed in detail yet, because my feeling is that right now we're adding features to the shader runner without a clear idea of what it should eventually look like, so I'd like to have a discussion about that first.
My personal idea is that eventually all the components in the shader runner pipeline should be as configurable as possible. I had already written something in the [vkd3d-todo](https://wiki.winehq.org/Vkd3d-todo), but let me repeat that here. We basically have the following variables:

* the library used to compile the shaders: vkd3d-shader, d3dcompiler_xx.dll or dxcompiler.dll (in particular dxcompiler.dll would have no special treatment as it has now);
* the Shader Model to target;
* the API to use to run the shader (d3d9, d3d10, d3d11, d3d12, Vulkan, or even none if we just want to test the compiler);
* when applicable, whether to execute the shaders using the native implementation or vkd3d;
* possibly, one day, the library to use to recompile the shaders (it currently only makes sense on macOS once we support the Apple shader compiler for Metal; in all the other cases the choice is forced).

Ideally there would be a function in the shader runner that takes these options and runs a .shader_test file with that configuration. The shader runner executable would then either accept command line options that describe the desired configuration, or run a number of sensible configurations depending on which are available (e.g., whether this is a Linux or MinGW build: clearly native libraries aren't available on Linux, with the exception of dxcompiler.dll). The latter mode is what would be run on the CI.
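
To make the idea a bit more concrete, here is a rough sketch of how such a configuration could be represented and filtered by availability. Every name in it (`runner_config`, `runner_config_is_available`, `runner_select_available`, the enums) is hypothetical and does not exist in vkd3d today; the availability rule only encodes what I said above (native libraries are unavailable on Linux, except dxcompiler.dll), and the shader recompiler variable is left out for now.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; this is a sketch, not existing vkd3d code. */
enum runner_compiler
{
    COMPILER_VKD3D_SHADER,
    COMPILER_D3DCOMPILER,   /* d3dcompiler_xx.dll */
    COMPILER_DXCOMPILER,    /* dxcompiler.dll */
};

enum runner_api
{
    API_NONE,               /* Only test the compiler. */
    API_D3D9,
    API_D3D10,
    API_D3D11,
    API_D3D12,
    API_VULKAN,
};

enum runner_executor
{
    EXECUTOR_NOT_APPLICABLE,
    EXECUTOR_NATIVE,
    EXECUTOR_VKD3D,
};

struct runner_config
{
    enum runner_compiler compiler;
    unsigned int sm_major, sm_minor;    /* Shader Model to target. */
    enum runner_api api;
    enum runner_executor executor;
};

/* Assumed rule: without native libraries (e.g. a Linux build) we can use
 * neither d3dcompiler_xx.dll nor a native executor, but dxcompiler.dll
 * remains usable. */
bool runner_config_is_available(const struct runner_config *config, bool have_native_libs)
{
    assert(config);

    if (config->compiler == COMPILER_D3DCOMPILER && !have_native_libs)
        return false;
    if (config->executor == EXECUTOR_NATIVE && !have_native_libs)
        return false;
    return true;
}

/* Copy the available candidates into "out" (which must have room for
 * "count" entries) and return how many were copied. */
size_t runner_select_available(const struct runner_config *candidates, size_t count,
        bool have_native_libs, struct runner_config *out)
{
    size_t i, n = 0;

    assert(candidates && out);

    for (i = 0; i < count; ++i)
    {
        if (runner_config_is_available(&candidates[i], have_native_libs))
            out[n++] = candidates[i];
    }
    return n;
}
```

The CI mode would then amount to building a list of candidate configurations, passing it through something like `runner_select_available()`, and running each .shader_test file once per remaining configuration; the command line mode would build a single `runner_config` from its options instead.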
By multiplying the number of available configurations, the language to define which outcomes are expected might become somewhat more complicated. The usual convention should remain in force: the native implementations are assumed to be the "ground truth" against which `fail` and `notimpl` (and similar ones, if needed) are gauged, while `todo` is used to measure how any other configuration differs from native. It might at some point turn out to be necessary to evaluate `todo` on something more complicated than just the Shader Model, which is only one of the variables above, but I hope this complexity will be manageable.
This is more of a side thing, but if we eventually come to the point I'm describing, the MinGW tests will likely be strictly more powerful than the crosstests, so the crosstests could be dropped. At least for the shader runner, but similar considerations probably apply to the other crosstests.
What's your view on that? I guess @zfigura, @Mystral and @hverbeet might have something to say too.