[PATCH 0/1] MR723: vkd3d-shader/spirv: Implement MAD in two operations if flagged as precise.
This would eliminate the todo for the precise mad() test in !718. Maybe we need test results on NVIDIA and Intel to decide if we actually want this.

-- 
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/723
From: Conor McCarthy <cmccarthy(a)codeweavers.com>

---
 libs/vkd3d-shader/spirv.c | 34 ++++++++++++++++++++++++++++++++--
 1 file changed, 32 insertions(+), 2 deletions(-)

diff --git a/libs/vkd3d-shader/spirv.c b/libs/vkd3d-shader/spirv.c
index 5403e5443..06408919a 100644
--- a/libs/vkd3d-shader/spirv.c
+++ b/libs/vkd3d-shader/spirv.c
@@ -1447,6 +1447,20 @@ static uint32_t vkd3d_spirv_build_op_isub(struct vkd3d_spirv_builder *builder,
             SpvOpISub, result_type, operand0, operand1);
 }
 
+static uint32_t vkd3d_spirv_build_op_fadd(struct vkd3d_spirv_builder *builder,
+        uint32_t result_type, uint32_t operand0, uint32_t operand1)
+{
+    return vkd3d_spirv_build_op_tr2(builder, &builder->function_stream,
+            SpvOpFAdd, result_type, operand0, operand1);
+}
+
+static uint32_t vkd3d_spirv_build_op_fmul(struct vkd3d_spirv_builder *builder,
+        uint32_t result_type, uint32_t operand0, uint32_t operand1)
+{
+    return vkd3d_spirv_build_op_tr2(builder, &builder->function_stream,
+            SpvOpFMul, result_type, operand0, operand1);
+}
+
 static uint32_t vkd3d_spirv_build_op_fdiv(struct vkd3d_spirv_builder *builder,
         uint32_t result_type, uint32_t operand0, uint32_t operand1)
 {
@@ -7202,8 +7216,24 @@ static void spirv_compiler_emit_ext_glsl_instruction(struct spirv_compiler *comp
     for (i = 0; i < instruction->src_count; ++i)
         src_id[i] = spirv_compiler_emit_load_src(compiler, &src[i], dst->write_mask);
 
-    val_id = vkd3d_spirv_build_op_ext_inst(builder, type_id,
-            instr_set_id, glsl_inst, src_id, instruction->src_count);
+    if (instruction->handler_idx == VKD3DSIH_MAD && (instruction->flags & VKD3DSI_PRECISE_XYZW))
+    {
+        /* The HLSL docs state: "If components of a mad instruction are tagged as precise, the
+         * hardware must execute a mad instruction or the exact equivalent, and it cannot split
+         * it into a multiply followed by an add."
+         * But DXIL.rst states the opposite: "Floating point multiply & add. This operation is
+         * not fused for "precise" operations."
+         * Windows drivers seem to conform with the latter, for SM 4-5 and SM 6. */
+        val_id = vkd3d_spirv_build_op_fmul(builder, type_id, src_id[0], src_id[1]);
+        vkd3d_spirv_build_op_decorate(builder, val_id, SpvDecorationNoContraction, NULL, 0);
+        val_id = vkd3d_spirv_build_op_fadd(builder, type_id, val_id, src_id[2]);
+        vkd3d_spirv_build_op_decorate(builder, val_id, SpvDecorationNoContraction, NULL, 0);
+    }
+    else
+    {
+        val_id = vkd3d_spirv_build_op_ext_inst(builder, type_id,
+                instr_set_id, glsl_inst, src_id, instruction->src_count);
+    }
 
     if (instruction->handler_idx == VKD3DSIH_FIRSTBIT_HI
             || instruction->handler_idx == VKD3DSIH_FIRSTBIT_SHI)
-- 
GitLab

https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/723
That todo succeeds on all the hardware I managed to test with once I cherry-pick this on top of !718. I tested:

* RADV,
* Intel,
* NVIDIA proprietary,
* llvmpipe.

So I would accept this once !718 is in and the todo is removed.

-- 
https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/723#note_64881
On Fri Mar 15 12:14:27 2024 +0000, Giovanni Mascellani wrote:
> That todo succeeds on all the hardware I managed to test with once I cherry-pick this on top of !718. I tested:
>
> * RADV,
> * Intel,
> * NVIDIA proprietary,
> * llvmpipe.
>
> So I would accept this once !718 is in and the todo is removed.

I was also thinking of Windows, because if implementations vary then a subtle difference in ours is not so important.

-- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/723#note_64885