This patch series includes an implementation of the long-pending `transpose` intrinsic and the `smoothstep` intrinsic.
While implementing `smoothstep` I realized that some intrinsics follow different rules than expressions for the allowed argument data types:
- Mixing vectors and matrices is not allowed, regardless of their dimensions, even if they have the same number of components.
- Any combination of matrices is allowed, even when neither matrix fits inside the other, e.g.:
`float2x3` is compatible with `float3x2`, resulting in `float2x2`.
The common data type takes the minimum of each dimension.
This is the case for `max`, `pow`, `ldexp`, `clamp` and `smoothstep`, which suggests that it holds for all intrinsics where the operation is applied element-wise, so this series corrects the common type conversion for them.
A minor fix in `pow`'s type conversion is also included.
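As a rough illustration of the two rules above, here is a minimal standalone sketch in C. It is not the vkd3d code: `struct num_type`, `enum common_result` and `elementwise_common_type()` are hypothetical names made up for this example, and only the two cases described above (vector mixed with matrix, matrix with matrix) are modelled.

```c
#include <assert.h>

/* Hypothetical, simplified description of an HLSL numeric type. */
enum type_kind { KIND_VECTOR, KIND_MATRIX };

struct num_type
{
    enum type_kind kind;
    unsigned int rows, cols; /* floatRxC: rows = R, cols = C */
};

enum common_result
{
    COMMON_OK,          /* *out holds the common type */
    COMMON_REJECTED,    /* a vector was mixed with a matrix */
    COMMON_NOT_COVERED, /* combination not described in the text above */
};

static unsigned int min_uint(unsigned int a, unsigned int b)
{
    return a < b ? a : b;
}

/* Common type of two arguments of an element-wise intrinsic, following
 * only the two rules listed in the description. */
static enum common_result elementwise_common_type(const struct num_type *a,
        const struct num_type *b, struct num_type *out)
{
    assert(a && b && out);

    /* With only two kinds modelled, differing kinds means vector + matrix. */
    if (a->kind != b->kind)
        return COMMON_REJECTED;

    if (a->kind == KIND_MATRIX)
    {
        /* Take the minimum on each dimension, e.g. 2x3 and 3x2 give 2x2. */
        out->kind = KIND_MATRIX;
        out->rows = min_uint(a->rows, b->rows);
        out->cols = min_uint(a->cols, b->cols);
        return COMMON_OK;
    }

    return COMMON_NOT_COVERED;
}
```

Passing a `float2x3` and a `float3x2` description yields a 2x2 matrix, while a `float4` together with any matrix is rejected.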
--
v2: vkd3d-shader/hlsl: Use add_unary_arithmetic_expr() in intrinsic_pow().
vkd3d-shader/hlsl: Convert elementwise intrinsics args to the proper common type.
tests: Test for common type conversion for element-wise intrinsics.
vkd3d-shader/hlsl: Support smoothstep() intrinsic.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/53
Matteo Bruni (@Mystral) commented about dlls/d3d10/effect.c:
> + unsigned int i;
> +
> + *retval = 0.0f;
> + for (i = 0; i < instr->comp_count; ++i)
> + *retval += args[0][instr->scalar ? 0 : i] * args[1][i];
> +}
> +
> +static void pres_dotswiz(float **args, unsigned int n, const struct preshader_instr *instr)
> +{
> + float *retval = args[n];
> + unsigned int i;
> +
> + *retval = 0.0f;
> + for (i = 0; i < n; ++i)
> + *retval += *args[i] * *args[i + n / 2];
> +}
This doesn't look right to me either.
In any case, I think we want a test for this right away.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/1684#note_19578
Matteo Bruni (@Mystral) commented about dlls/d3d10/effect.c:
>
> typedef void (*pres_op_func)(float **args, unsigned int n, const struct preshader_instr *instr);
>
> +static void pres_mov(float **args, unsigned int n, const struct preshader_instr *instr)
> +{
> + *args[1] = *args[0];
> +}
> +
Does this do what you want? Right now I think it just copies a pointer around.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/1684#note_19577
Matteo Bruni (@Mystral) commented about dlls/d3d10/effect.c:
> { 0x211, "bige", pres_bige },
> { 0x212, "bieq", pres_bieq },
> { 0x213, "bine", pres_bine },
> + { 0x214, "buge", pres_buge },
> + { 0x215, "bult", pres_bult },
> + { 0x219, "imul", pres_imul },
> { 0x21a, "udiv", pres_udiv },
> { 0x21e, "imax", pres_imax },
> + { 0x21f, "umin", pres_umin },
> + { 0x220, "umax", pres_umax },
Same for these.
I guess you have some tests queued up after this MR. Maybe they could go in early with some temporary todo_wine and error handling, if it's not too messy.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/1684#note_19576
If a hlsl_ir_load loads a variable whose components are stored by different
instructions, copy propagation doesn't replace it.
But if all the stored values are constants (which is currently the case
for value constructors), the load can be replaced with a single constant value,
which is what the first patch of this series does.
For instance, this shader:
```
sampler s;
Texture2D t;
float4 main() : sv_target
{
return t.Gather(s, float2(0.6, 0.6), int2(0, 0));
}
```
results in the following IR before applying the patch:
```
float | 6.00000024e-01
float | 6.00000024e-01
uint | 0
| = (<constructor-2>[@4].x @2)
uint | 1
| = (<constructor-2>[@6].x @3)
float2 | <constructor-2>
int | 0
int | 0
uint | 0
| = (<constructor-5>[@11].x @9)
uint | 1
| = (<constructor-5>[@13].x @10)
int2 | <constructor-5>
float4 | gather_red(resource = t, sampler = s, coords = @8, offset = @15)
| return
| = (<output-sv_target0> @16)
```
and this IR afterwards:
```
float2 | {6.00000024e-01 6.00000024e-01 }
int2 | {0 0 }
float4 | gather_red(resource = t, sampler = s, coords = @2, offset = @3)
| return
| = (<output-sv_target0> @4)
```
This is required to write texel_offsets as aoffimmi modifiers in the sm4 backend, since it expects the texel_offset arguments to be hlsl_ir_constant.
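The idea behind replacing such a load can be sketched in isolation. The following C fragment is a simplified model, not the actual copy propagation pass: `struct component_store` and `fold_load_to_constant()` are hypothetical names, and the bookkeeping of which instruction last wrote each component is assumed to exist already.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of the last store to one component of a variable. */
struct component_store
{
    bool is_constant; /* true if the stored value came from a constant */
    float value;      /* only meaningful when is_constant is set */
};

/* Tries to fold a load of `count` components into one constant vector.
 * Returns false, leaving `out` untouched, if any component was last
 * written by something other than a constant. */
static bool fold_load_to_constant(const struct component_store *stores,
        size_t count, float *out)
{
    size_t i;

    assert(stores && out);

    for (i = 0; i < count; ++i)
    {
        if (!stores[i].is_constant)
            return false;
    }

    for (i = 0; i < count; ++i)
        out[i] = stores[i].value;

    return true;
}
```

In the shader above, both components of `<constructor-2>` are stored from constants, so the load corresponds to the single `float2` constant seen in the second IR dump.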
This series also:
* Allows Gather() methods to use aoffimmi modifiers instead of an additional source register when possible (aoffimmi modifiers are the only option allowed for shader model 4.1).
* Adds support for texel_offsets in the Load() method via aoffimmi modifiers (the only allowed method).
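A sketch of what "when possible" could mean in practice: besides being constant, an offset has to be representable in the modifier. The check below assumes that each aoffimmi component is a 4-bit signed value, i.e. lies in the range [-8, 7]; that range is not stated in this description, and `offset_fits_aoffimmi()` is a hypothetical helper rather than a function from the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed limits of one aoffimmi component (4-bit signed integer). */
#define AOFFIMMI_MIN (-8)
#define AOFFIMMI_MAX 7

/* Returns true if every component of a constant texel offset fits in an
 * aoffimmi modifier under the assumption above. */
static bool offset_fits_aoffimmi(const int *offset, unsigned int count)
{
    unsigned int i;

    assert(offset);

    for (i = 0; i < count; ++i)
    {
        if (offset[i] < AOFFIMMI_MIN || offset[i] > AOFFIMMI_MAX)
            return false;
    }

    return true;
}
```

For Gather(), an offset that fails such a check would have to fall back to the additional source register where the shader model permits it. For Load(), where only the modifier is allowed, it would presumably have to be reported as an error.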
--
v4: vkd3d-shader/hlsl: Propagate swizzle chains in copy propagation.
vkd3d-shader/hlsl: Replace swizzles with constants in copy prop.
tests: Test constant propagation through swizzles.
vkd3d-shader/hlsl: Support offset argument for the texture Load() method.
tests: Test offset argument for the texture Load() method.
vkd3d-shader/hlsl: Use aoffimmis when writing gather methods.
vkd3d-shader/hlsl: Replace loads with constants in copy prop.
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/51