Normalise the incoming vkd3d_shader_instruction IR to the shader model 6 pattern where only one patch constant function is emitted. This allows generation of a single patch constant function in SPIR-V.
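As a rough illustration of the normalisation described above, the following sketch keeps the first fork or join phase marker and turns later ones into NOPs, so that everything after the first marker reads as one patch constant phase. The opcode names and the flat opcode array are hypothetical stand-ins, not the actual vkd3d_shader_instruction IR, and the sketch ignores anything else a real pass would need to handle, such as per-phase declarations.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for illustration only; the real IR has many more. */
enum opcode
{
    OP_NOP,
    OP_CONTROL_POINT_PHASE,
    OP_FORK_PHASE,
    OP_JOIN_PHASE,
    OP_OTHER,
};

static int is_patch_constant_phase(enum opcode op)
{
    return op == OP_FORK_PHASE || op == OP_JOIN_PHASE;
}

/* Keep the first fork/join marker and replace every later one with a NOP.
 * Returns the number of markers that were replaced. */
static size_t merge_patch_constant_phases(enum opcode *ins, size_t count)
{
    size_t i, removed = 0;
    int seen = 0;

    assert(ins || !count);

    for (i = 0; i < count; ++i)
    {
        if (!is_patch_constant_phase(ins[i]))
            continue;
        if (seen)
        {
            ins[i] = OP_NOP;
            ++removed;
        }
        seen = 1;
    }

    return removed;
}
```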
--
v5: vkd3d-shader: Introduce an internal sm6 signature structure.
vkd3d-shader/spirv: Move the function declaration from spirv_compiler_begin_shader_phase() to spirv_compiler_enter_shader_phase().
vkd3d-shader/spirv: Remove the hull shader phase array.
vkd3d-shader/spirv: Merge all shader IR fork and join phases into a single phase.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/84
Sorry to let this stagnate for months, but I can confirm that an NVIDIA GPU which exposes 32768 as a limit for Vulkan will still refuse to create d3d12 textures with a dimension larger than 16384, even with feature level 12.1. (And it still exposes 16384 as a limit for d3d9.) Given that, I think the right thing to do is just add a test that the limit is no higher than 16384, and probably also cap this in wined3d rather than ddraw.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/126#note_23674
I took some time to look into this to see if there's extra overhead, and while I think there are some things we could do better in draw_primitive(), there's probably not very much. Depending on what's in frame, the CS spends the majority of its time in draw_primitive(). Probably about 20% of that is spent acquiring the GL context, 40% loading the RTVs, 20% in context_apply_draw_state(); the rest is difficult to measure. This is on a relatively powerful radeonsi machine, with the swap interval hacked to zero; the total frame time is probably about 9 ms in the scenes I'm testing.
I think we can potentially cut draw_primitive() down to 10% of its current overhead if none of the state changes, but when we're doing 5000 draw calls per frame, even that may be too much. We could potentially buffer in wined3d, perhaps making use of EXT_multi_draw_arrays, but as Henri pointed out on IRC, we'd have to do a fair amount of work to invalidate (less than in ddraw itself), and this sort of thing probably doesn't perform well in newer d3d versions on Windows anyway. So buffering in ddraw is probably the right way to go. I'll look at the patch itself anon.
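To make the buffering idea concrete, here is a minimal sketch of merging consecutive draws until the state changes. All names are hypothetical and do not correspond to anything in ddraw or wined3d. A single `state_id` stands in for the full set of render state, and the sketch assumes list-type primitives whose vertices are contiguous in one buffer, so that adjacent draws can simply be concatenated.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical batch structure, for illustration only. */
struct draw_batch
{
    unsigned int state_id;     /* stand-in for "all relevant render state" */
    unsigned int start_vertex;
    unsigned int vertex_count; /* 0 means the batch is empty */
    unsigned int flush_count;  /* number of real draw calls submitted so far */
};

/* Stand-in for submitting the accumulated vertices as one draw call. */
static void batch_flush(struct draw_batch *batch)
{
    if (!batch->vertex_count)
        return;
    ++batch->flush_count;
    batch->vertex_count = 0;
}

/* Queue a draw. The pending batch is only flushed when the state differs
 * or the new vertices do not directly follow the pending ones. */
static void batch_draw(struct draw_batch *batch, unsigned int state_id,
        unsigned int start_vertex, unsigned int vertex_count)
{
    bool can_merge;

    assert(vertex_count);

    can_merge = batch->vertex_count && batch->state_id == state_id
            && batch->start_vertex + batch->vertex_count == start_vertex;

    if (!can_merge)
    {
        batch_flush(batch);
        batch->state_id = state_id;
        batch->start_vertex = start_vertex;
    }
    batch->vertex_count += vertex_count;
}
```

With this shape, a run of draws that share state costs one submission instead of one per call. The invalidation work mentioned above corresponds to every place that would have to call `batch_flush()` before touching state or resources.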
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/2105#note_23673
First part of v2 of !27, which aims to:
* Allow allocation of variables of complex types that contain both numerics and objects across multiple register sets (regsets).
* Support the tex2D and tex3D intrinsics, inferring the dimension of generic samplers from usage, writing sampler declarations, and writing sample instructions.
* Support for arrays of resources for both SM1 and SM4 (not to be confused with the resource-arrays of SM 5.1, which can have non-constant indexes).
* Support for resources declared within structs.
* Support for synthetic combined samplers for SM1 and synthetic separated samplers for SM4, considering that they can be arrays or members of structs.
* Imitate the way the native compiler assigns the register indexes of the resources on allocation, which proved to be the most difficult thing.
* Support for object components within complex input parameters.
* Small fixes to corner cases.
This part consists of parsing the `tex2D()` and `tex3D()` intrinsics and beginning to support the allocation of variables across multiple regsets.
The whole series is on my [master6](https://gitlab.winehq.org/fcasas/vkd3d/-/commits/master6) branch.
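The idea of allocating one variable across multiple regsets can be sketched roughly as follows: a type records how many registers it needs in each register set, and a variable gets a separate allocation per set. The enum, struct, and function names here are hypothetical and simplified. In particular, the allocator just hands out indexes sequentially and makes no attempt to imitate the native compiler's assignment order.

```c
#include <assert.h>

/* Hypothetical register sets, loosely modelled on the description above. */
enum regset
{
    REGSET_NUMERIC,
    REGSET_SAMPLERS,
    REGSET_TEXTURES,
    REGSET_COUNT,
};

/* Number of registers a type needs in each register set. */
struct type_size
{
    unsigned int reg_size[REGSET_COUNT];
};

/* One allocation per register set; allocated[i] is 0 if the variable
 * uses no registers in set i. */
struct var_alloc
{
    unsigned int id[REGSET_COUNT];
    int allocated[REGSET_COUNT];
};

/* Next free register index in each register set. */
struct allocator
{
    unsigned int next[REGSET_COUNT];
};

/* Account for a struct field that occupies "count" registers in "regset". */
static void add_field(struct type_size *type, enum regset regset, unsigned int count)
{
    assert(regset < REGSET_COUNT);
    type->reg_size[regset] += count;
}

/* Reserve a contiguous range in every register set the type uses. */
static void allocate_var(struct allocator *a, const struct type_size *type,
        struct var_alloc *var)
{
    unsigned int i;

    for (i = 0; i < REGSET_COUNT; ++i)
    {
        var->id[i] = 0;
        var->allocated[i] = type->reg_size[i] > 0;
        if (!var->allocated[i])
            continue;
        var->id[i] = a->next[i];
        a->next[i] += type->reg_size[i];
    }
}
```

A struct containing two numeric registers and one sampler would then occupy a range in the numeric set and a separate range in the sampler set, while leaving the texture set untouched.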
--
v7: vkd3d-shader/hlsl: Allocate register reservations in a separate pass.
vkd3d-shader/hlsl: Respect object reservations even if the object is unused.
vkd3d-shader/hlsl: Allocate objects according to register set.
vkd3d-shader/hlsl: Keep an hlsl_reg for each register set in hlsl_ir_var.
vkd3d-shader/hlsl: Store the type's register size for each register set.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/66