Re: [PATCH v4 0/7] MR320: vkd3d-shader/dxil: Implement DX instruction LoadInput.

6 Sep 2023

      On Wed Sep  6 10:55:59 2023 +0000, Henri Verbeet wrote:
...
...
...
How valuable is that? If it's significant, we may want to consider
converting TPF to SSA form as well in the SPIR-V backend.
...
It seems to me quite likely that translating sm4 to SSA would be more
expensive than passing it to the driver (and letting the driver
translate it to SSA internally, probably), but translating sm6 to temps
would be more expensive than passing it through to the driver.
To be clear, my expectation would be that these redundant OpStore and
OpLoad instructions largely don't matter. I could very well be wrong
about that, of course. And perhaps there are other reasons to avoid
using temps here, but I haven't seen those.
So what I imagine we want to do, is to start with naively translating
DXIL SSA values to vsir temp assignments. That should unblock
upstreaming plenty of bits. We don't have control flow yet, we don't
have phi instructions yet either, and it's not clear to me that there's
a fundamental reason e.g. vsir phi instructions couldn't operate on vsir
temps anyway. I think there are a few scenarios that could happen from there:

It's fine.
We introduce vsir SSA registers for some other reason, e.g. because

the HLSL compiler wants to do optimisations in that form; DXIL/SPIR-V
could then just take advantage of that.

We encounter some issue that can't reasonably be resolved with vsir

temps. We discuss it when we get there, instead of now in the abstract.

We do some benchmarking/profiling and find that there are either

advantages in compile time or run time to introducing a separate
register type. We discuss it when we get there, with the hard data to
back it up, instead of speculating about it now.
And perhaps this is also a good time to reflect on the broader
upstreaming strategy. On a very broad level, the sequence that I would
have hoped/expected to see would be something along these lines:

Some shader_runner infrastructure to compile and run DXIL shaders.
The most basic, straightforward implementation of the bits required

to make a fairly minimal test pass. E.g., tests/hlsl/swizzles.shader_test.

Basic, straightforward implementations of features required to make

the rest of the tests pass.

Features required by applications, but not covered by the tests.

Writing tests for these as they're implemented.

Optimisations and other complications.

Also, I'd like to stress this because it has come up before for other
MRs, please don't pre-emptively add complications for issues that will
only come up later; whether that's in a later patch in the same series
or 400 patches later in the branch you're upstreaming from. Reviewers
generally can't or don't want to look that far ahead, and it just ends
up slowing the entire process down. Patches should make sense in
isolation, at the point in time where they're introduced.
Just to explain my acceptance, I think too that the ideal flow would be to first convert SSA to temporary load and store, and introduce later SSA if there is an advantage. Dealing with loads and stores is something I expect downstream compilers to do pretty well (even the HLSL compiler does that, to a certain extent!), so I wouldn't stress too much over that. Phi nodes shouldn't be too much of a hassle either: just before emitting a jump you look at the phi nodes of the target block and convert them to assignments, unless I am missing something.
That said, I don't think it is unreasonable to go with SSA from the get-go, so I accept this MR and leave to Henri to do the actual software architecture part.
-- 
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/320#note_44409

2025

2024

2023

2022

Re: [PATCH v4 0/7] MR320: vkd3d-shader/dxil: Implement DX instruction LoadInput.