There are two motivations for this:
* First, the structure *almost* corresponds to D3D12_SHADER_INPUT_BIND_DESC, and
if more elements were included, it could be used as-is for shader reflection.
There is the quirk that currently we return scan information based on the
shader instructions, whereas d3dcompiler shader reflection expects to get it
from the shader reflection data (i.e. the RDEF chunk), which is particularly
relevant in the case that the RDEF chunk is stripped.
That said, even if we have to introduce an extra scan API to account for this
difference, being able to reuse the same structure seems like a benefit.
In order to reuse this structure, we need to add the following elements:
- Register ID (added in part 1 of this series)
- Sample count (added in part 2 of this series)
- Flags or resource types to distinguish between typed, raw, and structured
buffers. I have not decided which representation makes the most sense;
opinions are welcome.
* Second, I think it makes sense to use this reflection information internally
in spirv.c (and potentially other compiler backends) to declare resources in
the target environment, instead of parsing DCL instructions. The idea here is
that this allows backends to be more agnostic as to how resources are declared
(or inferred) in the frontend, while avoiding the need to synthesize those DCL
instructions in the frontend either (especially since synthesizing
instructions is more expensive than converting them to NOPs).
In order to do that, we will need vkd3d_shader_scan_descriptor_info1 to cover
everything that is currently covered by DCL instructions. This needs the same
elements as above (register ID and sample count), but also:
- Structure stride (added in part 2 of this series)
- Constant buffer used width (added in part 2 of this series)
I don't currently have a proof-of-concept using these new elements. On the other
hand, since it's just an extension of an existing API, I figured that seemed
less critical.
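
For illustration only, the elements listed above could be carried in a record like the following C sketch. Every type and field name here is hypothetical and is not the actual vkd3d-shader definition, and the buffer-kind enum is just one of the two representations under discussion. The lookup helper shows how a backend might resolve a resource by register ID instead of relying on a DCL instruction.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; these are not the vkd3d-shader definitions. */
enum sketch_descriptor_type
{
    SKETCH_DESCRIPTOR_SRV,
    SKETCH_DESCRIPTOR_UAV,
    SKETCH_DESCRIPTOR_CBV,
    SKETCH_DESCRIPTOR_SAMPLER,
};

/* One possible way to distinguish typed, raw, and structured buffers. */
enum sketch_buffer_kind
{
    SKETCH_BUFFER_NONE,
    SKETCH_BUFFER_TYPED,
    SKETCH_BUFFER_RAW,
    SKETCH_BUFFER_STRUCTURED,
};

struct sketch_descriptor_info
{
    enum sketch_descriptor_type type;
    unsigned int register_id;       /* ID used by instructions to refer to the resource */
    unsigned int sample_count;      /* only meaningful for multisampled resources */
    enum sketch_buffer_kind buffer_kind;
    unsigned int structure_stride;  /* only meaningful for structured buffers */
    unsigned int cb_used_width;     /* only meaningful for constant buffers */
};

/* Return the descriptor matching (type, register_id), or NULL if none matches. */
static const struct sketch_descriptor_info *sketch_find_descriptor(
        const struct sketch_descriptor_info *infos, size_t count,
        enum sketch_descriptor_type type, unsigned int register_id)
{
    size_t i;

    assert(infos || !count);
    for (i = 0; i < count; ++i)
    {
        if (infos[i].type == type && infos[i].register_id == register_id)
            return &infos[i];
    }
    return NULL;
}
```

A backend holding such an array could call sketch_find_descriptor() when it first meets a resource reference, and read the stride or sample count from the result.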
This does conflict trivially with 280; I'm submitting it now since 280 is
accepted, but due to Alexandre's vacation may not be committed soon, and since
this is new API I'd rather get comments early anyway.
--
v2: vkd3d-shader: Get rid of the uav_ranges array.
vkd3d-shader: Add register ID to struct vkd3d_shader_descriptor_info1.
vkd3d-shader: Introduce struct vkd3d_shader_scan_descriptor_info1.
vkd3d-shader: Centralize cleanup on error in scan_with_parser().
vkd3d-shader: Factor more code into vkd3d_shader_scan_get_uav_descriptor_info().
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/295
--
v9: vkd3d-shader/tpf: Handle the swizzle type bitfield in dst param tokens.
vkd3d-shader/tpf: Handle the dimension bitfield in dst param tokens.
vkd3d-shader/tpf: Use the default vec4 swizzle if a src param contains a mask.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/225
Updates reported driver versions
Fixes complaints of old drivers in Diablo IV
Note: The Intel 31.0.101.4577 driver only actually supports UHD 7XX and above. Should I be trying to track down the last supported driver for each of the previous GPU generations or is it fine to just pretend that we have a 101.4577 driver that supports HD4000+?
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/3516
This is similar to https://gitlab.winehq.org/wine/wine/-/merge_requests/2684, https://gitlab.winehq.org/wine/wine/-/merge_requests/3004 or https://gitlab.winehq.org/wine/wine/-/merge_requests/3139 but it validates the session transform node behavior with tests.
The tests are added after the changes because they otherwise don't pass and making them pass would be unnecessarily complicated.
I also have some local tests for MF_TOPONODE_WORKQUEUE_ID attributes and how they are supposed to behave. It basically creates a new serialized work queue for every source node and assigns it to every node downstream. Any sample request or processing operation is done in the associated queue.
When joining streams, queues are assigned downstream one after another and the last assigned queue is used when requesting samples upstream, but when samples are received and processed downstream it looks like the current queue of the source node is used for every downstream operation.
The request behavior seems to be the same when work queues are used, with round-robin input requests and a single ProcessInput call followed by a ProcessOutput loop until it fails.
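
The call pattern described here (one ProcessInput call, then ProcessOutput in a loop until it fails) can be sketched in plain C against a mock transform. The mock_transform type and its functions are hypothetical stand-ins, not the IMFTransform interface, and the boolean failures only loosely mirror MF_E_NOTACCEPTING and MF_E_TRANSFORM_NEED_MORE_INPUT.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a transform; this is not IMFTransform. */
struct mock_transform
{
    int pending_outputs;      /* outputs still available from the last input */
    int outputs_per_input;    /* how many outputs each accepted input yields */
    int process_input_calls;  /* number of accepted inputs */
    int process_output_calls; /* number of output attempts, including failures */
};

/* Accept one input unless outputs are still pending (cf. MF_E_NOTACCEPTING). */
static bool mock_process_input(struct mock_transform *t, int sample)
{
    (void)sample;
    if (t->pending_outputs)
        return false;
    t->pending_outputs = t->outputs_per_input;
    ++t->process_input_calls;
    return true;
}

/* Produce one output, or fail when more input is needed
 * (cf. MF_E_TRANSFORM_NEED_MORE_INPUT). */
static bool mock_process_output(struct mock_transform *t, int *sample)
{
    ++t->process_output_calls;
    if (!t->pending_outputs)
        return false;
    *sample = --t->pending_outputs;
    return true;
}

/* Push a single input, then pull outputs until the transform fails.
 * Returns the number of outputs produced. */
static int push_and_drain(struct mock_transform *t, int input)
{
    int count = 0, sample;

    if (!mock_process_input(t, input))
        return 0;
    while (mock_process_output(t, &sample))
        ++count;
    return count;
}
```

With outputs_per_input set to 2, one push_and_drain() call makes one input call and three output calls, the last of which is the failing call that ends the loop.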
This is not yet optimally efficient, and could be improved, for the following reasons:
1) All session operations are serialized together, even for unrelated streams, and ProcessInput / ProcessOutput calls may be costly and stall the pipeline. I believe that native probably allows parallel processing of unrelated stream requests, but this needs to be confirmed.
2) The use of the MFT_MESSAGE_COMMAND_DRAIN message isn't ideal: the message forces the transform to process all queued input synchronously, which can take a long time. I haven't checked exactly what native does but I believe it instead uses MFT_MESSAGE_NOTIFY_END_OF_STREAM messages, which would allow us to notify and drain the GStreamer decoder asynchronously.
3) MFT_MESSAGE_COMMAND_DRAIN also doesn't distinguish between input streams and needs to be sent globally. It's unclear how it should be used when multiple input streams are involved, and when one stream ends its segment and then starts a new segment while other streams have not yet reached EOS. MFT_MESSAGE_NOTIFY_END_OF_STREAM messages have a stream ID parameter and would be more appropriate to handle separate input streams independently.
--
v3: mf/tests: Add more media session transform tests with multiple inputs.
mf/tests: Add more media session transform tests with multiple outputs.
mf/tests: Add some media session transform call pattern tests.
mf/session: Increase the stream request count when requests are already queued.
mf/session: Request more samples from upstream when necessary.
mf/session: Push transform input samples one by one to ProcessInput.
mf/session: Use helpers to push and pop samples for transform streams.
mf/session: Flush requests in transform_node_deliver_samples when drained.
mf/session: Use a helper to deliver transform node requested samples.
mf/session: Use local variables to access transform node streams.
https://gitlab.winehq.org/wine/wine/-/merge_requests/3245