[PATCH 0/4] MR718: vkd3d-shader/dxil: Implement miscellaneous arithmetic DX intrinsics.


Conor McCarthy (＠cmccarthy)

14 Mar 2024

4:54 a.m.

-- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/718


Conor McCarthy

14 Mar

4:54 a.m.

New subject: [PATCH 1/4] vkd3d-shader/spirv: Use dst register data type in spirv_compiler_emit_imad().

From: Conor McCarthy cmccarthy@codeweavers.com

---
 libs/vkd3d-shader/spirv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libs/vkd3d-shader/spirv.c b/libs/vkd3d-shader/spirv.c
index bd0bfb8e0..34de9b9fb 100644
--- a/libs/vkd3d-shader/spirv.c
+++ b/libs/vkd3d-shader/spirv.c
@@ -7474,7 +7474,7 @@ static void spirv_compiler_emit_imad(struct spirv_compiler *compiler,
     unsigned int i, component_count;

     component_count = vsir_write_mask_component_count(dst->write_mask);
-    type_id = vkd3d_spirv_get_type_id(builder, VKD3D_SHADER_COMPONENT_INT, component_count);
+    type_id = vkd3d_spirv_get_type_id_for_data_type(builder, dst->reg.data_type, component_count);

     for (i = 0; i < ARRAY_SIZE(src_ids); ++i)
         src_ids[i] = spirv_compiler_emit_load_src(compiler, &src[i], dst->write_mask);

-- GitLab https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/718

Conor McCarthy

4:54 a.m.

New subject: [PATCH 2/4] vkd3d-shader/dxil: Implement DX intrinsics FMa, FMad, IMad and UMad.

From: Conor McCarthy cmccarthy@codeweavers.com

---
 libs/vkd3d-shader/dxil.c                | 39 +++++++++++++++++++++++++
 tests/hlsl/majority-pragma.shader_test  |  9 +++---
 tests/hlsl/majority-typedef.shader_test |  2 +-
 3 files changed, 45 insertions(+), 5 deletions(-)

diff --git a/libs/vkd3d-shader/dxil.c b/libs/vkd3d-shader/dxil.c
index de51588b5..baf3341b0 100644
--- a/libs/vkd3d-shader/dxil.c
+++ b/libs/vkd3d-shader/dxil.c
@@ -374,6 +374,10 @@ enum dx_intrinsic_opcode
     DX_IMIN = 38,
     DX_UMAX = 39,
     DX_UMIN = 40,
+    DX_FMAD = 46,
+    DX_FMA = 47,
+    DX_IMAD = 48,
+    DX_UMAD = 49,
     DX_IBFE = 51,
     DX_UBFE = 52,
     DX_CREATE_HANDLE = 57,
@@ -4080,6 +4084,37 @@ static void sm6_parser_emit_dx_create_handle(struct sm6_parser *sm6, enum dx_int
     ins->handler_idx = VKD3DSIH_NOP;
 }

+static enum vkd3d_shader_opcode sm6_dx_map_ma_op(enum dx_intrinsic_opcode op, const struct sm6_type *type)
+{
+    switch (op)
+    {
+        case DX_FMA:
+        case DX_FMAD:
+            return (type->u.width == 64) ? VKD3DSIH_DFMA : VKD3DSIH_MAD;
+        case DX_IMAD:
+        case DX_UMAD:
+            return VKD3DSIH_IMAD;
+        default:
+            vkd3d_unreachable();
+    }
+}
+
+static void sm6_parser_emit_dx_ma(struct sm6_parser *sm6, enum dx_intrinsic_opcode op,
+        const struct sm6_value **operands, struct function_emission_state *state)
+{
+    struct vkd3d_shader_instruction *ins = state->ins;
+    struct vkd3d_shader_src_param *src_params;
+    unsigned int i;
+
+    vsir_instruction_init(ins, &sm6->p.location, sm6_dx_map_ma_op(op, operands[0]->type));
+    if (!(src_params = instruction_src_params_alloc(ins, 3, sm6)))
+        return;
+    for (i = 0; i < 3; ++i)
+        src_param_init_from_value(&src_params[i], operands[i]);
+
+    instruction_dst_param_init_ssa_scalar(ins, sm6);
+}
+
 static void sm6_parser_emit_dx_get_dimensions(struct sm6_parser *sm6, enum dx_intrinsic_opcode op,
         const struct sm6_value **operands, struct function_emission_state *state)
 {
@@ -4833,6 +4868,8 @@ static const struct sm6_dx_opcode_info sm6_dx_op_table[] =
     [DX_FIRST_BIT_HI ] = {"i", "m", sm6_parser_emit_dx_unary},
     [DX_FIRST_BIT_LO ] = {"i", "m", sm6_parser_emit_dx_unary},
     [DX_FIRST_BIT_SHI ] = {"i", "m", sm6_parser_emit_dx_unary},
+    [DX_FMA ] = {"g", "RRR", sm6_parser_emit_dx_ma},
+    [DX_FMAD ] = {"g", "RRR", sm6_parser_emit_dx_ma},
     [DX_FMAX ] = {"g", "RR", sm6_parser_emit_dx_binary},
     [DX_FMIN ] = {"g", "RR", sm6_parser_emit_dx_binary},
     [DX_FRC ] = {"g", "R", sm6_parser_emit_dx_unary},
@@ -4841,6 +4878,7 @@ static const struct sm6_dx_opcode_info sm6_dx_op_table[] =
     [DX_HCOS ] = {"g", "R", sm6_parser_emit_dx_unary},
     [DX_HSIN ] = {"g", "R", sm6_parser_emit_dx_unary},
     [DX_HTAN ] = {"g", "R", sm6_parser_emit_dx_unary},
+    [DX_IMAD ] = {"m", "RRR", sm6_parser_emit_dx_ma},
     [DX_IMAX ] = {"m", "RR", sm6_parser_emit_dx_binary},
     [DX_IMIN ] = {"m", "RR", sm6_parser_emit_dx_binary},
     [DX_ISFINITE ] = {"1", "g", sm6_parser_emit_dx_unary},
@@ -4873,6 +4911,7 @@ static const struct sm6_dx_opcode_info sm6_dx_op_table[] =
     [DX_TEXTURE_LOAD ] = {"o", "HiiiiCCC", sm6_parser_emit_dx_texture_load},
     [DX_TEXTURE_STORE ] = {"v", "Hiiiooooc", sm6_parser_emit_dx_texture_store},
     [DX_UBFE ] = {"m", "iiR", sm6_parser_emit_dx_tertiary},
+    [DX_UMAD ] = {"m", "RRR", sm6_parser_emit_dx_ma},
     [DX_UMAX ] = {"m", "RR", sm6_parser_emit_dx_binary},
     [DX_UMIN ] = {"m", "RR", sm6_parser_emit_dx_binary},
 };
diff --git a/tests/hlsl/majority-pragma.shader_test b/tests/hlsl/majority-pragma.shader_test
index 84dff63e0..4d40d8f60 100644
--- a/tests/hlsl/majority-pragma.shader_test
+++ b/tests/hlsl/majority-pragma.shader_test
@@ -17,7 +17,7 @@ uniform 0 float4 0.1 0.2 0.0 0.0
 uniform 4 float4 0.3 0.4 0.0 0.0
 uniform 8 float4 0.1 0.3 0.0 0.0
 uniform 12 float4 0.2 0.4 0.0 0.0
-todo(sm>=6) draw quad
+draw quad
 probe all rgba (0.17, 0.39, 0.17, 0.39) 1

@@ -66,7 +66,7 @@ probe all rgba (0.5, 0.6, 0.7, 0.8)

 % The documentation claims these strings are subject to macro expansion.
-% They are not.
+% In SM < 6.0 they are not.

 [pixel shader]

@@ -90,8 +90,9 @@ float4 main() : sv_target
 [test]
 uniform 0 float4 0.1 0.2 0.0 0.0
 uniform 4 float4 0.3 0.4 0.0 0.0
-todo(sm>=6) draw quad
-probe all rgba (0.23, 0.34, 0.5, 0.5) 1
+draw quad
+if(sm<6) probe all rgba (0.23, 0.34, 0.5, 0.5) 1
+if(sm>=6) probe all rgba (0.17, 0.39, 0.5, 0.5) 1

 % The majority that applies to a typedef is the latent majority at the time
diff --git a/tests/hlsl/majority-typedef.shader_test b/tests/hlsl/majority-typedef.shader_test
index fa62dd5f7..1460e9a08 100644
--- a/tests/hlsl/majority-typedef.shader_test
+++ b/tests/hlsl/majority-typedef.shader_test
@@ -18,5 +18,5 @@ uniform 0 float4 0.1 0.2 0.0 0.0
 uniform 4 float4 0.3 0.4 0.0 0.0
 uniform 8 float4 0.1 0.3 0.0 0.0
 uniform 12 float4 0.2 0.4 0.0 0.0
-todo(sm>=6) draw quad
+draw quad
 probe all rgba (0.17, 0.39, 0.17, 0.39) 1

-- GitLab https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/718

Conor McCarthy

4:54 a.m.

New subject: [PATCH 3/4] vkd3d-shader/dxil: Implement DX intrinsic FAbs.

From: Conor McCarthy cmccarthy@codeweavers.com

diff --git a/libs/vkd3d-shader/dxil.c b/libs/vkd3d-shader/dxil.c
index baf3341b0..8b2f5a08d 100644
--- a/libs/vkd3d-shader/dxil.c
+++ b/libs/vkd3d-shader/dxil.c
@@ -342,6 +342,7 @@ enum dx_intrinsic_opcode
 {
     DX_LOAD_INPUT = 4,
     DX_STORE_OUTPUT = 5,
+    DX_FABS = 6,
     DX_ISNAN = 8,
     DX_ISINF = 9,
     DX_ISFINITE = 10,
@@ -4084,6 +4085,21 @@ static void sm6_parser_emit_dx_create_handle(struct sm6_parser *sm6, enum dx_int
     ins->handler_idx = VKD3DSIH_NOP;
 }

+static void sm6_parser_emit_dx_fabs(struct sm6_parser *sm6, enum dx_intrinsic_opcode op,
+        const struct sm6_value **operands, struct function_emission_state *state)
+{
+    struct vkd3d_shader_instruction *ins = state->ins;
+    struct vkd3d_shader_src_param *src_param;
+
+    vsir_instruction_init(ins, &sm6->p.location, VKD3DSIH_MOV);
+    if (!(src_param = instruction_src_params_alloc(ins, 1, sm6)))
+        return;
+    src_param_init_from_value(src_param, operands[0]);
+    src_param->modifiers = VKD3DSPSM_ABS;
+
+    instruction_dst_param_init_ssa_scalar(ins, sm6);
+}
+
 static enum vkd3d_shader_opcode sm6_dx_map_ma_op(enum dx_intrinsic_opcode op, const struct sm6_type *type)
 {
     switch (op)
@@ -4865,6 +4881,7 @@ static const struct sm6_dx_opcode_info sm6_dx_op_table[] =
     [DX_DERIV_FINEX ] = {"e", "R", sm6_parser_emit_dx_unary},
     [DX_DERIV_FINEY ] = {"e", "R", sm6_parser_emit_dx_unary},
     [DX_EXP ] = {"g", "R", sm6_parser_emit_dx_unary},
+    [DX_FABS ] = {"g", "R", sm6_parser_emit_dx_fabs},
     [DX_FIRST_BIT_HI ] = {"i", "m", sm6_parser_emit_dx_unary},
     [DX_FIRST_BIT_LO ] = {"i", "m", sm6_parser_emit_dx_unary},
     [DX_FIRST_BIT_SHI ] = {"i", "m", sm6_parser_emit_dx_unary},
diff --git a/tests/hlsl/abs.shader_test b/tests/hlsl/abs.shader_test
index 4d1d1e33e..46acdea85 100644
--- a/tests/hlsl/abs.shader_test
+++ b/tests/hlsl/abs.shader_test
@@ -8,8 +8,8 @@ float4 main() : sv_target

 [test]
 uniform 0 float4 0.1 0.7 0.0 0.0
-todo(sm<4 | sm>=6) draw quad
+todo(sm<4) draw quad
 probe all rgba (0.1, 0.7, 0.4, 0.4)
 uniform 0 float4 -0.7 0.1 0.0 0.0
-todo(sm<4 | sm>=6) draw quad
+todo(sm<4) draw quad
 probe all rgba (0.7, 0.1, 1.2, 0.4)
diff --git a/tests/hlsl/fmod.shader_test b/tests/hlsl/fmod.shader_test
index d21301fee..62f7573de 100644
--- a/tests/hlsl/fmod.shader_test
+++ b/tests/hlsl/fmod.shader_test
@@ -8,10 +8,10 @@ float4 main() : sv_target

 [test]
 uniform 0 float4 -0.5 6.5 0.0 0.0
-todo(sm<4 | sm>=6) draw quad
+todo(sm<4) draw quad
 probe all rgba (-0.5, 0.0, 0.0, 0.0) 4
 uniform 0 float4 1.1 0.3 0.0 0.0
-todo(sm<4 | sm>=6) draw quad
+todo(sm<4) draw quad
 probe all rgba (0.2, 0.0, 0.0, 0.0) 4

 [pixel shader todo(sm<4)]
@@ -24,8 +24,8 @@ float4 main() : sv_target

 [test]
 uniform 0 float4 -0.5 6.5 2.0 0.0
-todo(sm<4 | sm>=6) draw quad
+todo(sm<4) draw quad
 probe all rgba (-0.5, 0.5, 0.0, 0.0) 4
 uniform 0 float4 1.1 0.3 3.0 0.0
-todo(sm<4 | sm>=6) draw quad
+todo(sm<4) draw quad
 probe all rgba (1.1, 0.3, 0.0, 0.0) 4
diff --git a/tests/hlsl/fwidth.shader_test b/tests/hlsl/fwidth.shader_test
index 10ed712d2..99fb1421d 100644
--- a/tests/hlsl/fwidth.shader_test
+++ b/tests/hlsl/fwidth.shader_test
@@ -18,7 +18,7 @@ float4 main(float4 pos : sv_position) : sv_target
 }

 [test]
-todo(sm<4 | sm>=6) draw quad
+todo(sm<4) draw quad
 probe (10, 10) rgba (8.0, 8.0, 8.0, 8.0)
 probe (11, 10) rgba (8.0, 8.0, 8.0, 8.0)
 probe (12, 10) rgba (10.0, 10.0, 10.0, 10.0)
diff --git a/tests/hlsl/length.shader_test b/tests/hlsl/length.shader_test
index ec48300f6..4080ff406 100644
--- a/tests/hlsl/length.shader_test
+++ b/tests/hlsl/length.shader_test
@@ -47,7 +47,7 @@ float4 main() : SV_TARGET

 [test]
 uniform 0 float4 2.0 0.0 0.0 0.0
-todo(sm>=6) draw quad
+draw quad
 probe all rgba (2.0, 2.0, 2.0, 2.0)

 [pixel shader]
@@ -60,7 +60,7 @@ float4 main() : SV_TARGET

 [test]
 uniform 0 float4 2.0 0.0 0.0 0.0
-todo(sm>=6) draw quad
+draw quad
 probe all rgba (2.0, 2.0, 2.0, 2.0)

 [pixel shader fail]

-- GitLab https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/718
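
Patch 3/4 expresses FAbs as a plain move whose source operand carries an absolute-value modifier. The self-contained sketch below illustrates that idea only. The enum and function names are hypothetical stand-ins and not the vkd3d types, and the modifier is assumed to act by clearing the IEEE 754 sign bit of a 32-bit float.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a source-operand modifier; not the vkd3d enum. */
enum src_modifier
{
    SRC_MOD_NONE,
    SRC_MOD_ABS,
};

/* Clear the IEEE 754 sign bit, leaving exponent and mantissa untouched. */
static float clear_sign_bit(float x)
{
    uint32_t bits;

    memcpy(&bits, &x, sizeof(bits));
    bits &= UINT32_C(0x7fffffff);
    memcpy(&x, &bits, sizeof(x));
    return x;
}

/* Read a source operand, applying its modifier before the value is used. */
static float load_src(float value, enum src_modifier modifier)
{
    return modifier == SRC_MOD_ABS ? clear_sign_bit(value) : value;
}

/* A plain move: the result is whatever the (possibly modified) source yields. */
static float emit_mov(float src, enum src_modifier modifier)
{
    return load_src(src, modifier);
}
```

In this model no dedicated "abs" operation is needed: the move stays generic and the modifier on the operand does the work.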

Conor McCarthy

4:54 a.m.

New subject: [PATCH 4/4] vkd3d-shader/dxil: Implement DX intrinsic Saturate.

From: Conor McCarthy cmccarthy@codeweavers.com

---
 libs/vkd3d-shader/dxil.c        | 25 +++++++++++++++++++++++--
 tests/hlsl/saturate.shader_test |  4 ++--
 2 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/libs/vkd3d-shader/dxil.c b/libs/vkd3d-shader/dxil.c
index 8b2f5a08d..ae9cd83e8 100644
--- a/libs/vkd3d-shader/dxil.c
+++ b/libs/vkd3d-shader/dxil.c
@@ -343,6 +343,7 @@ enum dx_intrinsic_opcode
     DX_LOAD_INPUT = 4,
     DX_STORE_OUTPUT = 5,
     DX_FABS = 6,
+    DX_SATURATE = 7,
     DX_ISNAN = 8,
     DX_ISINF = 9,
     DX_ISFINITE = 10,
@@ -2356,14 +2357,18 @@ static void register_index_address_init(struct vkd3d_shader_register_index *idx,
     }
 }

-static void instruction_dst_param_init_ssa_scalar(struct vkd3d_shader_instruction *ins, struct sm6_parser *sm6)
+static bool instruction_dst_param_init_ssa_scalar(struct vkd3d_shader_instruction *ins, struct sm6_parser *sm6)
 {
-    struct vkd3d_shader_dst_param *param = instruction_dst_params_alloc(ins, 1, sm6);
     struct sm6_value *dst = sm6_parser_get_current_value(sm6);
+    struct vkd3d_shader_dst_param *param;
+
+    if (!(param = instruction_dst_params_alloc(ins, 1, sm6)))
+        return false;

     dst_param_init_ssa_scalar(param, dst->type, dst, sm6);
     param->write_mask = VKD3DSP_WRITEMASK_0;
     dst->u.reg = param->reg;
+    return true;
 }

 static void instruction_dst_param_init_ssa_vector(struct vkd3d_shader_instruction *ins,
@@ -4587,6 +4592,21 @@ static void sm6_parser_emit_dx_sample(struct sm6_parser *sm6, enum dx_intrinsic_
     instruction_dst_param_init_ssa_vector(ins, component_count, sm6);
 }

+static void sm6_parser_emit_dx_saturate(struct sm6_parser *sm6, enum dx_intrinsic_opcode op,
+        const struct sm6_value **operands, struct function_emission_state *state)
+{
+    struct vkd3d_shader_instruction *ins = state->ins;
+    struct vkd3d_shader_src_param *src_param;
+
+    vsir_instruction_init(ins, &sm6->p.location, VKD3DSIH_MOV);
+    if (!(src_param = instruction_src_params_alloc(ins, 1, sm6)))
+        return;
+    src_param_init_from_value(src_param, operands[0]);
+
+    if (instruction_dst_param_init_ssa_scalar(ins, sm6))
+        ins->dst->modifiers = VKD3DSPDM_SATURATE;
+}
+
 static void sm6_parser_emit_dx_sincos(struct sm6_parser *sm6, enum dx_intrinsic_opcode op,
         const struct sm6_value **operands, struct function_emission_state *state)
 {
@@ -4918,6 +4938,7 @@ static const struct sm6_dx_opcode_info sm6_dx_op_table[] =
     [DX_SAMPLE_C_LZ ] = {"o", "HHffffiiif", sm6_parser_emit_dx_sample},
     [DX_SAMPLE_GRAD ] = {"o", "HHffffiiifffffff", sm6_parser_emit_dx_sample},
     [DX_SAMPLE_LOD ] = {"o", "HHffffiiif", sm6_parser_emit_dx_sample},
+    [DX_SATURATE ] = {"g", "R", sm6_parser_emit_dx_saturate},
     [DX_SIN ] = {"g", "R", sm6_parser_emit_dx_sincos},
     [DX_SPLIT_DOUBLE ] = {"S", "d", sm6_parser_emit_dx_split_double},
     [DX_SQRT ] = {"g", "R", sm6_parser_emit_dx_unary},
diff --git a/tests/hlsl/saturate.shader_test b/tests/hlsl/saturate.shader_test
index 2ed83cf66..e3ccce768 100644
--- a/tests/hlsl/saturate.shader_test
+++ b/tests/hlsl/saturate.shader_test
@@ -8,7 +8,7 @@ float4 main() : sv_target

 [test]
 uniform 0 float4 0.7 -0.1 0.0 0.0
-todo(sm>=6) draw quad
+draw quad
 probe all rgba (0.7, 0.0, 1.0, 0.0)

 [pixel shader]
@@ -22,5 +22,5 @@ float4 main() : sv_target

 [test]
 uniform 0 float4 -2 0 2 -1
-todo(sm>=6) draw quad
+draw quad
 probe all rgba (0.0, 0.0, 1.0, 0.0)

-- GitLab https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/718
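
Patch 4/4 takes the mirror-image approach to FAbs: Saturate is again a plain move, but the modifier sits on the destination, so the clamp happens as the result is written. A minimal sketch of that model follows. All names are hypothetical and not vkd3d identifiers, and treating NaN as 0 is simply what this particular comparison order produces, not a claim about the vkd3d backends.

```c
/* Hypothetical stand-in for a destination modifier; not the vkd3d enum. */
enum dst_modifier
{
    DST_MOD_NONE,
    DST_MOD_SATURATE,
};

/* Clamp to [0, 1]. A NaN input fails the first comparison and yields 0 here. */
static float clamp01(float x)
{
    if (!(x > 0.0f))
        return 0.0f;
    return x < 1.0f ? x : 1.0f;
}

/* Write a result, applying the destination modifier on the way out. */
static float store_dst(float value, enum dst_modifier modifier)
{
    return modifier == DST_MOD_SATURATE ? clamp01(value) : value;
}

/* A plain move whose only effect comes from the destination modifier. */
static float emit_mov_to(float src, enum dst_modifier modifier)
{
    return store_dst(src, modifier);
}
```

With this sketch, the values -2, 0, 2 and -1 clamp to 0, 0, 1 and 0 respectively.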

Giovanni Mascellani (＠giomasce)

10:51 a.m.

Giovanni Mascellani (@giomasce) commented about libs/vkd3d-shader/dxil.c:

...

     ins->handler_idx = VKD3DSIH_NOP;
 }

+static enum vkd3d_shader_opcode sm6_dx_map_ma_op(enum dx_intrinsic_opcode op, const struct sm6_type *type)
+{
+    switch (op)
+    {
+        case DX_FMA:
+        case DX_FMAD:
+            return (type->u.width == 64) ? VKD3DSIH_DFMA : VKD3DSIH_MAD;

So both these two can be used interchangeably for both float and doubles?

-- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/718#note_64706

Giovanni Mascellani (＠giomasce)

10:52 a.m.

Giovanni Mascellani (@giomasce) commented about libs/vkd3d-shader/dxil.c:

...

     ins->handler_idx = VKD3DSIH_NOP;
 }

+static enum vkd3d_shader_opcode sm6_dx_map_ma_op(enum dx_intrinsic_opcode op, const struct sm6_type *type)
+{
+    switch (op)
+    {
+        case DX_FMA:
+        case DX_FMAD:
+            return (type->u.width == 64) ? VKD3DSIH_DFMA : VKD3DSIH_MAD;
+        case DX_IMAD:
+        case DX_UMAD:
+            return VKD3DSIH_IMAD;

Mmh, given your earlier patch I guess those are meant to be the signed and unsigned version and then the backend is expected to extract the right sign from the operand type. However, it seems that DXIL always emits unsigned integers, doesn't it? So the typing information here is lost anyway.

To be honest I don't even understand why here signed vs unsigned is important, given that we're truncating to the lower bits anyway. Do you have an idea of what is meant to happen here?

-- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/718#note_64707

Conor McCarthy (＠cmccarthy)

2:12 p.m.

On Thu Mar 14 10:48:59 2024 +0000, Giovanni Mascellani wrote:

...

> So both these two can be used interchangeably for both float and doubles?

Fma is fused multiply-add, with a single rounding, and Fmad is with two roundings, but it's unclear if the latter is only when marked precise. The proton fork uses Fma unless it's marked precise. We have no VSIR instruction for Fmad, but maybe that should be added now.

-- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/718#note_64733
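
The single-rounding versus two-rounding distinction discussed in this thread can be observed with ordinary floats. The sketch below is illustrative only and the function names are hypothetical. The "unrounded product" variant is a stand-in for a fused operation rather than a true one: it relies on the fact that a float times a float is exact in double, whereas real code would normally call `fmaf()` from `<math.h>`.

```c
/* Multiply-add with two roundings: the product is rounded to float first.
 * The volatile store forces that rounding and prevents the compiler from
 * contracting the expression into a fused operation. */
static float mad_two_roundings(float a, float b, float c)
{
    volatile float product = a * b;

    return product + c;
}

/* Stand-in for a fused multiply-add: the product of two floats is exact in
 * double, so it reaches the addition unrounded. The final conversion back to
 * float can still round, so this is not a true fma in every case. */
static float mad_unrounded_product(float a, float b, float c)
{
    return (float)((double)a * (double)b + (double)c);
}
```

For a = 1 + 2^-13, b = 1 - 2^-13 and c = -1 the exact product is 1 - 2^-26, which rounds to 1.0 in float. The two-rounding version therefore returns 0, while the version that keeps the product unrounded returns -2^-26.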

Conor McCarthy (＠cmccarthy)

2:12 p.m.

On Thu Mar 14 10:49:00 2024 +0000, Giovanni Mascellani wrote:

...

> Mmh, given your earlier patch I guess those are meant to be the signed and unsigned version and then the backend is expected to extract the right sign from the operand type. However, it seems that DXIL always emits unsigned integers, doesn't it? So the typing information here is lost anyway.
>
> To be honest I don't even understand why here signed vs unsigned is important, given that we're truncating to the lower bits anyway. Do you have an idea of what is meant to happen here?

I have no idea why there are signed and unsigned instructions, since it has no effect on the multiply or the add.

-- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/718#note_64734
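
The point that signedness has no effect on the multiply or the add, once the result is truncated to 32 bits, can be checked directly. In the sketch below the function names are hypothetical. The signed variant widens to 64 bits only to avoid signed overflow, which is undefined behaviour in C, and then keeps the low 32 bits.

```c
#include <stdint.h>

/* Signed multiply-add, truncated to the low 32 bits of the result.
 * Widening to 64 bits keeps the signed arithmetic free of overflow. */
static uint32_t imad_low_bits(int32_t a, int32_t b, int32_t c)
{
    int64_t wide = (int64_t)a * (int64_t)b + (int64_t)c;

    return (uint32_t)wide;
}

/* Unsigned multiply-add; unsigned arithmetic wraps modulo 2^32 by definition. */
static uint32_t umad_low_bits(uint32_t a, uint32_t b, uint32_t c)
{
    return a * b + c;
}
```

Feeding both functions the same bit patterns gives the same bits back, for example 0xfffffff0 for -3 * 7 + 5. A difference would only appear if the high half of the product were requested, which neither function computes.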

Conor McCarthy (＠cmccarthy)

2:23 p.m.

On Thu Mar 14 14:12:38 2024 +0000, Conor McCarthy wrote:

...

> Fma is fused multiply-add, with a single rounding, and Fmad is with two roundings, but it's unclear if the latter is only when marked precise. The proton fork uses Fma unless it's marked precise. We have no VSIR instruction for Fmad, but maybe that should be added now.

Fma is only used for double, which makes sense, and Fmad is for float, and is not fused if marked precise.

-- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/718#note_64737
