[PATCH 0/3] MR269: vkd3d-shader/tpf: Combine struct sm4_dst_register and struct sm4_src_register.

Francisco Casas (＠fcasas)

6 Jul 2023

5:47 p.m.

This series is essentially a single patch, but it depends on @cmccarthy's !225. The two patches from !225 are therefore included as 1/3 and 2/3.

As !225 implies, in SM4 bytecode all the information about whether a register uses a writemask, a swizzle, a dimension index, or none of these is encoded in the register itself; it does not depend on the instruction or on the argument position in which the register is used.

The possible register encodings are given by the following diagram:

![diagram3.drawio](/uploads/db330c2f7cc47801e3c3672828ae23b9/diagram3.drawio.png)

Here the swizzle_type (MASK4, VEC4, or SCALAR) only matters when the dim is VEC4.
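
As a rough illustration of that decision tree, the following sketch classifies a register token by what its component-selection bits hold. The bit positions of the dimension and swizzle-type fields, the numeric values of the dimension enum, and every name here (`classify_register_token`, `enum operand_kind`, and so on) are assumptions made for this example; the authoritative definitions are the `VKD3D_SM4_*` macros and enums in tpf.c.

```c
#include <stdint.h>

/* Field layout assumed for this sketch; check tpf.c for the real values. */
#define DIMENSION_SHIFT     0
#define DIMENSION_MASK      (0x3u << DIMENSION_SHIFT)
#define SWIZZLE_TYPE_SHIFT  2
#define SWIZZLE_TYPE_MASK   (0x3u << SWIZZLE_TYPE_SHIFT)

enum dimension { DIM_NONE = 0, DIM_SCALAR = 1, DIM_VEC4 = 2 };
enum swizzle_type { SWIZZLE_MASK4 = 0, SWIZZLE_VEC4 = 1, SWIZZLE_SCALAR = 2 };

/* What the component-selection bits of the token mean (hypothetical enum). */
enum operand_kind
{
    KIND_NONE,          /* dim is NONE or SCALAR: no selection bits are used */
    KIND_WRITEMASK,     /* VEC4 + MASK4: 4-bit mask */
    KIND_SWIZZLE,       /* VEC4 + VEC4: 8-bit swizzle */
    KIND_DIMENSION_IDX, /* VEC4 + SCALAR: 2-bit component index */
    KIND_UNKNOWN,       /* encoding not covered by the diagram */
};

static enum operand_kind classify_register_token(uint32_t token)
{
    uint32_t dim = (token & DIMENSION_MASK) >> DIMENSION_SHIFT;
    uint32_t swizzle_type = (token & SWIZZLE_TYPE_MASK) >> SWIZZLE_TYPE_SHIFT;

    /* The swizzle type is only consulted for VEC4 registers. */
    if (dim == DIM_NONE || dim == DIM_SCALAR)
        return KIND_NONE;
    if (dim != DIM_VEC4)
        return KIND_UNKNOWN;

    switch (swizzle_type)
    {
        case SWIZZLE_MASK4:
            return KIND_WRITEMASK;
        case SWIZZLE_VEC4:
            return KIND_SWIZZLE;
        case SWIZZLE_SCALAR:
            return KIND_DIMENSION_IDX;
        default:
            return KIND_UNKNOWN;
    }
}
```

Note that the classification uses only the token itself, which is the point of the paragraph above: nothing about the instruction or the operand position is needed.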

Thus, it makes sense to merge these two types of registers into a single data type, as 3/3 does. This has the added benefit of removing the extra writemask and swizzle_type arguments that several helper functions had to initialize through pointers.
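
To make the idea concrete, here is a minimal standalone sketch of a single operand type that can describe both destinations and sources, plus an encoder that picks the right field from the dimension and swizzle type alone. It is not the code from 3/3: `struct operand`, the `make_*` helpers and `encode_operand` are hypothetical names, the union is named `u` only to stay within C99 (the patch uses an anonymous union), and the field positions other than the shift of 4 for the swizzle and the scalar component index are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions assumed for this sketch; see tpf.c for the real macros. */
#define DIMENSION_SHIFT    0
#define SWIZZLE_TYPE_SHIFT 2
#define WRITEMASK_SHIFT    4
#define SWIZZLE_SHIFT      4
#define SCALAR_DIM_SHIFT   4

enum dimension { DIM_NONE = 0, DIM_SCALAR = 1, DIM_VEC4 = 2 };
enum swizzle_type { SWIZZLE_MASK4 = 0, SWIZZLE_VEC4 = 1, SWIZZLE_SCALAR = 2 };

/* Hypothetical stand-in for the merged register type. */
struct operand
{
    enum dimension dim;
    enum swizzle_type swizzle_type;
    union
    {
        unsigned int swizzle;       /* VEC4 + SWIZZLE_VEC4 */
        unsigned int writemask;     /* VEC4 + SWIZZLE_MASK4 */
        unsigned int dimension_idx; /* VEC4 + SWIZZLE_SCALAR */
    } u;
};

/* A destination-style operand: 4-bit writemask. */
static struct operand make_dst(unsigned int writemask)
{
    struct operand op = {DIM_VEC4, SWIZZLE_MASK4, {0}};
    op.u.writemask = writemask;
    return op;
}

/* A source-style operand: 8-bit swizzle, 2 bits per component. */
static struct operand make_src(unsigned int swizzle)
{
    struct operand op = {DIM_VEC4, SWIZZLE_VEC4, {0}};
    op.u.swizzle = swizzle;
    return op;
}

/* An operand that selects a single component of a vec4 register. */
static struct operand make_component(unsigned int idx)
{
    struct operand op = {DIM_VEC4, SWIZZLE_SCALAR, {0}};
    op.u.dimension_idx = idx;
    return op;
}

/* Encode only the dimension and component-selection bits of the token. */
static uint32_t encode_operand(struct operand op)
{
    uint32_t ret = (uint32_t)op.dim << DIMENSION_SHIFT;

    if (op.dim != DIM_VEC4)
        return ret; /* NONE and SCALAR carry no selection bits. */

    ret |= (uint32_t)op.swizzle_type << SWIZZLE_TYPE_SHIFT;
    switch (op.swizzle_type)
    {
        case SWIZZLE_MASK4:
            assert(op.u.writemask <= 0xf);
            ret |= op.u.writemask << WRITEMASK_SHIFT;
            break;
        case SWIZZLE_VEC4:
            assert(op.u.swizzle <= 0xff);
            ret |= op.u.swizzle << SWIZZLE_SHIFT;
            break;
        case SWIZZLE_SCALAR:
            assert(op.u.dimension_idx <= 0x3);
            ret |= op.u.dimension_idx << SCALAR_DIM_SHIFT;
            break;
    }
    return ret;
}
```

With this shape, a caller that builds a destination and a caller that builds a source both hand the same type to the encoder, so no helper needs separate writemask or swizzle_type out-parameters.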

Also, this would help me simplify a new version of !229.

-- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/269


Conor McCarthy

6 Jul

5:47 p.m.

New subject: [PATCH 1/3] vkd3d-shader/tpf: Read complete swizzle/mask info for src params.

From: Conor McCarthy <cmccarthy@codeweavers.com>

VKD3D_SM4_SWIZZLE_NONE is incorrect; the token contains a mask in this case. Reading the mask allows the correct swizzle to be set and unexpected mask values to be detected, and factors out shader_sm4_is_scalar_register(), which is somewhat of a hack.
---
 libs/vkd3d-shader/spirv.c |  2 +-
 libs/vkd3d-shader/tpf.c   | 79 +++++++++++++++++++++++++--------------
 2 files changed, 51 insertions(+), 30 deletions(-)

diff --git a/libs/vkd3d-shader/spirv.c b/libs/vkd3d-shader/spirv.c index 5535a650..bfa8e3f5 100644 --- a/libs/vkd3d-shader/spirv.c +++ b/libs/vkd3d-shader/spirv.c @@ -7394,7 +7394,7 @@ static int spirv_compiler_emit_control_flow_instruction(struct spirv_compiler *c assert(compiler->control_flow_depth); assert(cf_info->current_block == VKD3D_BLOCK_SWITCH);

- assert(src->swizzle == VKD3D_SHADER_NO_SWIZZLE && src->reg.type == VKD3DSPR_IMMCONST); + assert(src->swizzle == VKD3D_SHADER_SWIZZLE(X, X, X, X) && src->reg.type == VKD3DSPR_IMMCONST); value = *src->reg.u.immconst_uint;

if (!vkd3d_array_reserve((void **)&cf_info->u.switch_.case_blocks, &cf_info->u.switch_.case_blocks_size, diff --git a/libs/vkd3d-shader/tpf.c b/libs/vkd3d-shader/tpf.c index 290fdcb3..72a55fc8 100644 --- a/libs/vkd3d-shader/tpf.c +++ b/libs/vkd3d-shader/tpf.c @@ -505,7 +505,7 @@ enum vkd3d_sm4_input_primitive_type

enum vkd3d_sm4_swizzle_type { - VKD3D_SM4_SWIZZLE_NONE = 0x0, + VKD3D_SM4_SWIZZLE_MASK4 = 0x0, VKD3D_SM4_SWIZZLE_VEC4 = 0x1, VKD3D_SM4_SWIZZLE_SCALAR = 0x2, }; @@ -1955,6 +1955,7 @@ static bool shader_sm4_validate_input_output_register(struct vkd3d_shader_sm4_pa static bool shader_sm4_read_src_param(struct vkd3d_shader_sm4_parser *priv, const uint32_t **ptr, const uint32_t *end, enum vkd3d_data_type data_type, struct vkd3d_shader_src_param *src_param) { + unsigned int dimension, mask; DWORD token;

if (*ptr >= end) @@ -1970,37 +1971,57 @@ static bool shader_sm4_read_src_param(struct vkd3d_shader_sm4_parser *priv, cons return false; }

- if (src_param->reg.type == VKD3DSPR_IMMCONST || src_param->reg.type == VKD3DSPR_IMMCONST64) + switch ((dimension = (token & VKD3D_SM4_DIMENSION_MASK) >> VKD3D_SM4_DIMENSION_SHIFT)) { - src_param->swizzle = VKD3D_SHADER_NO_SWIZZLE; - } - else - { - enum vkd3d_sm4_swizzle_type swizzle_type = - (token & VKD3D_SM4_SWIZZLE_TYPE_MASK) >> VKD3D_SM4_SWIZZLE_TYPE_SHIFT; + case VKD3D_SM4_DIMENSION_NONE: + src_param->swizzle = 0; + break;

- switch (swizzle_type) + case VKD3D_SM4_DIMENSION_SCALAR: + src_param->swizzle = VKD3D_SHADER_SWIZZLE(X, X, X, X); + break; + + case VKD3D_SM4_DIMENSION_VEC4: { - case VKD3D_SM4_SWIZZLE_NONE: - if (shader_sm4_is_scalar_register(&src_param->reg)) - src_param->swizzle = VKD3D_SHADER_SWIZZLE(X, X, X, X); - else + enum vkd3d_sm4_swizzle_type swizzle_type = + (token & VKD3D_SM4_SWIZZLE_TYPE_MASK) >> VKD3D_SM4_SWIZZLE_TYPE_SHIFT; + + switch (swizzle_type) + { + case VKD3D_SM4_SWIZZLE_MASK4: + mask = (token & VKD3D_SM4_WRITEMASK_MASK) >> VKD3D_SM4_WRITEMASK_SHIFT; + src_param->swizzle = VKD3D_SHADER_NO_SWIZZLE; - break; + if (mask == VKD3DSP_WRITEMASK_0) + src_param->swizzle = VKD3D_SHADER_SWIZZLE(X, X, X, X); + else if (!mask) + src_param->swizzle = 0; + else if (mask != VKD3DSP_WRITEMASK_ALL) + FIXME("Unhandled mask %#x.\n", mask); + + if (!mask && (src_param->reg.type == VKD3DSPR_IMMCONST || src_param->reg.type == VKD3DSPR_IMMCONST64)) + src_param->swizzle = VKD3D_SHADER_NO_SWIZZLE; + break;

- case VKD3D_SM4_SWIZZLE_SCALAR: - src_param->swizzle = (token & VKD3D_SM4_SWIZZLE_MASK) >> VKD3D_SM4_SWIZZLE_SHIFT; - src_param->swizzle = (src_param->swizzle & 0x3) * 0x01010101; - break; + case VKD3D_SM4_SWIZZLE_SCALAR: + src_param->swizzle = (token & VKD3D_SM4_SWIZZLE_MASK) >> VKD3D_SM4_SWIZZLE_SHIFT; + src_param->swizzle = (src_param->swizzle & 0x3) * 0x01010101; + break;

- case VKD3D_SM4_SWIZZLE_VEC4: - src_param->swizzle = swizzle_from_sm4((token & VKD3D_SM4_SWIZZLE_MASK) >> VKD3D_SM4_SWIZZLE_SHIFT); - break; + case VKD3D_SM4_SWIZZLE_VEC4: + src_param->swizzle = swizzle_from_sm4((token & VKD3D_SM4_SWIZZLE_MASK) >> VKD3D_SM4_SWIZZLE_SHIFT); + break;

- default: - FIXME("Unhandled swizzle type %#x.\n", swizzle_type); - break; + default: + FIXME("Unhandled swizzle type %#x.\n", swizzle_type); + break; + } + break; } + + default: + FIXME("Unhandled dimension %#x.\n", dimension); + break; }

if (register_is_input_output(&src_param->reg) && !shader_sm4_validate_input_output_register(priv, @@ -2538,7 +2559,7 @@ bool hlsl_sm4_register_from_semantic(struct hlsl_ctx *ctx, const struct hlsl_sem {"sv_groupid", false, VKD3D_SHADER_TYPE_COMPUTE, VKD3D_SM4_SWIZZLE_VEC4, VKD3D_SM5_RT_THREAD_GROUP_ID, false}, {"sv_groupthreadid", false, VKD3D_SHADER_TYPE_COMPUTE, VKD3D_SM4_SWIZZLE_VEC4, VKD3D_SM5_RT_LOCAL_THREAD_ID, false},

- {"sv_primitiveid", false, VKD3D_SHADER_TYPE_GEOMETRY, VKD3D_SM4_SWIZZLE_NONE, VKD3D_SM4_RT_PRIMID, false}, + {"sv_primitiveid", false, VKD3D_SHADER_TYPE_GEOMETRY, VKD3D_SM4_SWIZZLE_MASK4, VKD3D_SM4_RT_PRIMID, false},

/* Put sv_target in this table, instead of letting it fall through to * default varying allocation, so that the register index matches the @@ -3386,7 +3407,7 @@ static void sm4_register_from_deref(struct hlsl_ctx *ctx, struct sm4_register *r reg->type = VKD3D_SM4_RT_SAMPLER; reg->dim = VKD3D_SM4_DIMENSION_NONE; if (swizzle_type) - *swizzle_type = VKD3D_SM4_SWIZZLE_NONE; + *swizzle_type = VKD3D_SM4_SWIZZLE_MASK4; reg->idx[0] = var->regs[HLSL_REGSET_SAMPLERS].id; reg->idx[0] += hlsl_offset_from_deref_safe(ctx, deref); assert(deref->offset_regset == HLSL_REGSET_SAMPLERS); @@ -3518,7 +3539,7 @@ static void sm4_dst_from_node(struct sm4_dst_register *dst, const struct hlsl_ir static void sm4_src_from_constant_value(struct sm4_src_register *src, const struct hlsl_constant_value *value, unsigned int width, unsigned int map_writemask) { - src->swizzle_type = VKD3D_SM4_SWIZZLE_NONE; + src->swizzle_type = VKD3D_SM4_SWIZZLE_MASK4; src->reg.type = VKD3D_SM4_RT_IMMCONST; if (width == 1) { @@ -4069,7 +4090,7 @@ static void write_sm4_ld(struct hlsl_ctx *ctx, struct vkd3d_bytecode_buffer *buf index = hlsl_ir_constant(sample_index);

memset(&instr.srcs[2], 0, sizeof(instr.srcs[2])); - instr.srcs[2].swizzle_type = VKD3D_SM4_SWIZZLE_NONE; + instr.srcs[2].swizzle_type = VKD3D_SM4_SWIZZLE_MASK4; reg->type = VKD3D_SM4_RT_IMMCONST; reg->dim = VKD3D_SM4_DIMENSION_SCALAR; reg->immconst_uint[0] = index->value.u[0].u; @@ -4189,7 +4210,7 @@ static void write_sm4_cast_from_bool(struct hlsl_ctx *ctx, instr.dst_count = 1;

sm4_src_from_node(&instr.srcs[0], arg, instr.dsts[0].writemask); - instr.srcs[1].swizzle_type = VKD3D_SM4_SWIZZLE_NONE; + instr.srcs[1].swizzle_type = VKD3D_SM4_SWIZZLE_MASK4; instr.srcs[1].reg.type = VKD3D_SM4_RT_IMMCONST; instr.srcs[1].reg.dim = VKD3D_SM4_DIMENSION_SCALAR; instr.srcs[1].reg.immconst_uint[0] = mask;

-- GitLab https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/269

Conor McCarthy

5:47 p.m.

New subject: [PATCH 2/3] vkd3d-shader/tpf: Read complete swizzle/mask info for dst params.

From: Conor McCarthy <cmccarthy@codeweavers.com>

For some resources the token contains a swizzle instead of a mask, resulting in a garbage value in write_mask. Read and validate the swizzle, which is relevant for future use of vector formats in image_format_for_image_read().
---
 libs/vkd3d-shader/tpf.c | 69 +++++++++++++++++++++++++++--------------
 1 file changed, 45 insertions(+), 24 deletions(-)

diff --git a/libs/vkd3d-shader/tpf.c b/libs/vkd3d-shader/tpf.c index 72a55fc8..783b9f84 100644 --- a/libs/vkd3d-shader/tpf.c +++ b/libs/vkd3d-shader/tpf.c @@ -1842,26 +1842,6 @@ static bool shader_sm4_read_param(struct vkd3d_shader_sm4_parser *priv, const ui return true; }

-static bool shader_sm4_is_scalar_register(const struct vkd3d_shader_register *reg) -{ - switch (reg->type) - { - case VKD3DSPR_COVERAGE: - case VKD3DSPR_DEPTHOUT: - case VKD3DSPR_DEPTHOUTGE: - case VKD3DSPR_DEPTHOUTLE: - case VKD3DSPR_GSINSTID: - case VKD3DSPR_LOCALTHREADINDEX: - case VKD3DSPR_OUTPOINTID: - case VKD3DSPR_PRIMID: - case VKD3DSPR_SAMPLEMASK: - case VKD3DSPR_OUTSTENCILREF: - return true; - default: - return false; - } -} - static uint32_t swizzle_from_sm4(uint32_t s) { return vkd3d_shader_create_swizzle(s & 0x3, (s >> 2) & 0x3, (s >> 4) & 0x3, (s >> 6) & 0x3); @@ -2034,7 +2014,9 @@ static bool shader_sm4_read_src_param(struct vkd3d_shader_sm4_parser *priv, cons static bool shader_sm4_read_dst_param(struct vkd3d_shader_sm4_parser *priv, const uint32_t **ptr, const uint32_t *end, enum vkd3d_data_type data_type, struct vkd3d_shader_dst_param *dst_param) { + enum vkd3d_sm4_swizzle_type swizzle_type; enum vkd3d_shader_src_modifier modifier; + unsigned int dimension, swizzle; DWORD token;

if (*ptr >= end) @@ -2056,12 +2038,51 @@ static bool shader_sm4_read_dst_param(struct vkd3d_shader_sm4_parser *priv, cons return false; }

-    dst_param->write_mask = (token & VKD3D_SM4_WRITEMASK_MASK) >> VKD3D_SM4_WRITEMASK_SHIFT;
+    switch ((dimension = (token & VKD3D_SM4_DIMENSION_MASK) >> VKD3D_SM4_DIMENSION_SHIFT))
+    {
+        case VKD3D_SM4_DIMENSION_NONE:
+            dst_param->write_mask = 0;
+            break;
+
+        case VKD3D_SM4_DIMENSION_SCALAR:
+            dst_param->write_mask = VKD3DSP_WRITEMASK_0;
+            break;
+
+        case VKD3D_SM4_DIMENSION_VEC4:
+            swizzle_type = (token & VKD3D_SM4_SWIZZLE_TYPE_MASK) >> VKD3D_SM4_SWIZZLE_TYPE_SHIFT;
+            switch (swizzle_type)
+            {
+                case VKD3D_SM4_SWIZZLE_MASK4:
+                    dst_param->write_mask = (token & VKD3D_SM4_WRITEMASK_MASK) >> VKD3D_SM4_WRITEMASK_SHIFT;
+                    break;
+
+                case VKD3D_SM4_SWIZZLE_VEC4:
+                    swizzle = swizzle_from_sm4((token & VKD3D_SM4_SWIZZLE_MASK) >> VKD3D_SM4_SWIZZLE_SHIFT);
+                    if (swizzle != VKD3D_SHADER_NO_SWIZZLE)
+                        FIXME("Unhandled swizzle %#x.\n", swizzle);
+                    dst_param->write_mask = VKD3DSP_WRITEMASK_ALL;
+                    break;
+
+                case VKD3D_SM4_SWIZZLE_SCALAR:
+                    swizzle = (token & VKD3D_SM4_SWIZZLE_MASK) >> VKD3D_SM4_SWIZZLE_SHIFT;
+                    FIXME("Making mask from component %#x.\n", swizzle);
+                    dst_param->write_mask = VKD3DSP_WRITEMASK_0 << (swizzle & 3);
+                    break;
+
+                default:
+                    FIXME("Unhandled swizzle type %#x.\n", swizzle_type);
+                    break;
+            }
+            break;
+
+        default:
+            FIXME("Unhandled dimension %#x.\n", dimension);
+            break;
+    }
+
     if (data_type == VKD3D_DATA_DOUBLE)
         dst_param->write_mask = vkd3d_write_mask_64_from_32(dst_param->write_mask);
-    /* Scalar registers are declared with no write mask in shader bytecode. */
-    if (!dst_param->write_mask && shader_sm4_is_scalar_register(&dst_param->reg))
-        dst_param->write_mask = VKD3DSP_WRITEMASK_0;
+
     dst_param->modifiers = 0;
     dst_param->shift = 0;

-- GitLab https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/269

Francisco Casas

5:47 p.m.

New subject: [PATCH 3/3] vkd3d-shader/tpf: Combine struct sm4_dst_register and struct sm4_src_register.

From: Francisco Casas <fcasas@codeweavers.com>

In the SM4 bytecode, all the information about whether a register uses a writemask, a swizzle, a dimension index, or none of these is encoded in the register itself; it does not depend on the instruction or on the argument position in which the register is used.

Thus, it makes sense to merge these two types of registers into a single data type. This has the added benefit of removing the extra writemask and swizzle_type arguments that many helper functions had to initialize through pointers.
---
 libs/vkd3d-shader/tpf.c | 343 +++++++++++++++++++++++-----------------
 1 file changed, 202 insertions(+), 141 deletions(-)

diff --git a/libs/vkd3d-shader/tpf.c b/libs/vkd3d-shader/tpf.c
index 783b9f84..5e6679ab 100644
--- a/libs/vkd3d-shader/tpf.c
+++ b/libs/vkd3d-shader/tpf.c
@@ -146,6 +146,9 @@ STATIC_ASSERT(SM4_MAX_SRC_COUNT <= SPIRV_MAX_SRC_COUNT);
 #define VKD3D_SM4_SWIZZLE_SHIFT 4
 #define VKD3D_SM4_SWIZZLE_MASK (0xffu << VKD3D_SM4_SWIZZLE_SHIFT)

+#define VKD3D_SM4_SCALAR_DIM_SHIFT 4
+#define VKD3D_SM4_SCALAR_DIM_MASK (0x3u << VKD3D_SM4_SCALAR_DIM_SHIFT)
+
 #define VKD3D_SM4_VERSION_MAJOR(version) (((version) >> 4) & 0xf)
 #define VKD3D_SM4_VERSION_MINOR(version) (((version) >> 0) & 0xf)

@@ -3359,6 +3362,14 @@ struct sm4_register
     enum vkd3d_sm4_dimension dim;
     uint32_t immconst_uint[4];
     unsigned int mod;
+
+    enum vkd3d_sm4_swizzle_type swizzle_type;
+    union
+    {
+        unsigned int swizzle;
+        unsigned int writemask;
+        unsigned int dimension_idx;
+    };
 };

 struct sm4_instruction
@@ -3368,19 +3379,10 @@ struct sm4_instruction
     struct sm4_instruction_modifier modifiers[1];
     unsigned int modifier_count;

-    struct sm4_dst_register
-    {
-        struct sm4_register reg;
-        unsigned int writemask;
-    } dsts[2];
+    struct sm4_register dsts[2];
     unsigned int dst_count;

-    struct sm4_src_register
-    {
-        struct sm4_register reg;
-        enum vkd3d_sm4_swizzle_type swizzle_type;
-        unsigned int swizzle;
-    } srcs[5];
+    struct sm4_register srcs[5];
     unsigned int src_count;

     unsigned int byte_stride;
@@ -3390,8 +3392,7 @@ struct sm4_instruction
 };

static void sm4_register_from_deref(struct hlsl_ctx *ctx, struct sm4_register *reg, - unsigned int *writemask, enum vkd3d_sm4_swizzle_type *swizzle_type, - const struct hlsl_deref *deref, const struct hlsl_type *data_type) + const struct hlsl_deref *deref, const struct hlsl_type *data_type, bool is_dst) { const struct hlsl_ir_var *var = deref->var;

@@ -3402,61 +3403,80 @@ static void sm4_register_from_deref(struct hlsl_ctx *ctx, struct sm4_register *r if (regset == HLSL_REGSET_TEXTURES) { reg->type = VKD3D_SM4_RT_RESOURCE; - reg->dim = VKD3D_SM4_DIMENSION_VEC4; - if (swizzle_type) - *swizzle_type = VKD3D_SM4_SWIZZLE_VEC4; reg->idx[0] = var->regs[HLSL_REGSET_TEXTURES].id; reg->idx[0] += hlsl_offset_from_deref_safe(ctx, deref); assert(deref->offset_regset == HLSL_REGSET_TEXTURES); reg->idx_count = 1; - *writemask = VKD3DSP_WRITEMASK_ALL; + + assert(!is_dst); + reg->dim = VKD3D_SM4_DIMENSION_VEC4; + reg->swizzle_type = VKD3D_SM4_SWIZZLE_VEC4; + reg->swizzle = HLSL_SWIZZLE(X, Y, Z, W); } else if (regset == HLSL_REGSET_UAVS) { reg->type = VKD3D_SM5_RT_UAV; - reg->dim = VKD3D_SM4_DIMENSION_VEC4; - if (swizzle_type) - *swizzle_type = VKD3D_SM4_SWIZZLE_VEC4; reg->idx[0] = var->regs[HLSL_REGSET_UAVS].id; reg->idx[0] += hlsl_offset_from_deref_safe(ctx, deref); assert(deref->offset_regset == HLSL_REGSET_UAVS); reg->idx_count = 1; - *writemask = VKD3DSP_WRITEMASK_ALL; + + reg->dim = VKD3D_SM4_DIMENSION_VEC4; + if (is_dst) + { + reg->swizzle_type = VKD3D_SM4_SWIZZLE_MASK4; + reg->writemask = VKD3DSP_WRITEMASK_ALL; + } + else + { + reg->swizzle_type = VKD3D_SM4_SWIZZLE_VEC4; + reg->swizzle = HLSL_SWIZZLE(X, Y, Z, W); + } } else if (regset == HLSL_REGSET_SAMPLERS) { reg->type = VKD3D_SM4_RT_SAMPLER; - reg->dim = VKD3D_SM4_DIMENSION_NONE; - if (swizzle_type) - *swizzle_type = VKD3D_SM4_SWIZZLE_MASK4; reg->idx[0] = var->regs[HLSL_REGSET_SAMPLERS].id; reg->idx[0] += hlsl_offset_from_deref_safe(ctx, deref); assert(deref->offset_regset == HLSL_REGSET_SAMPLERS); reg->idx_count = 1; - *writemask = VKD3DSP_WRITEMASK_ALL; + + assert(!is_dst); + reg->dim = VKD3D_SM4_DIMENSION_NONE; } else { unsigned int offset = hlsl_offset_from_deref_safe(ctx, deref) + var->buffer_offset; + unsigned int writemask = ((1u << data_type->dimx) - 1) << (offset & 3);

assert(data_type->class <= HLSL_CLASS_VECTOR); reg->type = VKD3D_SM4_RT_CONSTBUFFER; - reg->dim = VKD3D_SM4_DIMENSION_VEC4; - if (swizzle_type) - *swizzle_type = VKD3D_SM4_SWIZZLE_VEC4; reg->idx[0] = var->buffer->reg.id; reg->idx[1] = offset / 4; reg->idx_count = 2; - *writemask = ((1u << data_type->dimx) - 1) << (offset & 3); + + reg->dim = VKD3D_SM4_DIMENSION_VEC4; + if (is_dst) + { + reg->swizzle_type = VKD3D_SM4_SWIZZLE_MASK4; + reg->writemask = writemask; + } + else + { + reg->swizzle_type = VKD3D_SM4_SWIZZLE_VEC4; + reg->swizzle = hlsl_swizzle_from_writemask(writemask); + } } } else if (var->is_input_semantic) { + enum vkd3d_sm4_swizzle_type swizzle_type; bool has_idx;

- if (hlsl_sm4_register_from_semantic(ctx, &var->semantic, false, &reg->type, swizzle_type, &has_idx)) + if (hlsl_sm4_register_from_semantic(ctx, &var->semantic, false, &reg->type, &swizzle_type, &has_idx)) { unsigned int offset = hlsl_offset_from_deref_safe(ctx, deref); + unsigned int writemask = ((1u << data_type->dimx) - 1) << (offset % 4);

if (has_idx) { @@ -3464,8 +3484,10 @@ static void sm4_register_from_deref(struct hlsl_ctx *ctx, struct sm4_register *r reg->idx_count = 1; }

+ assert(!is_dst); reg->dim = VKD3D_SM4_DIMENSION_VEC4; - *writemask = ((1u << data_type->dimx) - 1) << (offset % 4); + reg->swizzle_type = VKD3D_SM4_SWIZZLE_VEC4; + reg->swizzle = hlsl_swizzle_from_writemask(writemask); } else { @@ -3473,21 +3495,24 @@ static void sm4_register_from_deref(struct hlsl_ctx *ctx, struct sm4_register *r

assert(hlsl_reg.allocated); reg->type = VKD3D_SM4_RT_INPUT; - reg->dim = VKD3D_SM4_DIMENSION_VEC4; - if (swizzle_type) - *swizzle_type = VKD3D_SM4_SWIZZLE_VEC4; reg->idx[0] = hlsl_reg.id; reg->idx_count = 1; - *writemask = hlsl_reg.writemask; + + assert(!is_dst); + reg->dim = VKD3D_SM4_DIMENSION_VEC4; + reg->swizzle_type = VKD3D_SM4_SWIZZLE_VEC4; + reg->swizzle = hlsl_swizzle_from_writemask(hlsl_reg.writemask); } } else if (var->is_output_semantic) { + enum vkd3d_sm4_swizzle_type swizzle_type; bool has_idx;

- if (hlsl_sm4_register_from_semantic(ctx, &var->semantic, true, &reg->type, swizzle_type, &has_idx)) + if (hlsl_sm4_register_from_semantic(ctx, &var->semantic, true, &reg->type, &swizzle_type, &has_idx)) { unsigned int offset = hlsl_offset_from_deref_safe(ctx, deref); + unsigned int writemask = ((1u << data_type->dimx) - 1) << (offset % 4);

if (has_idx) { @@ -3495,11 +3520,17 @@ static void sm4_register_from_deref(struct hlsl_ctx *ctx, struct sm4_register *r reg->idx_count = 1; }

+ assert(is_dst); if (reg->type == VKD3D_SM4_RT_DEPTHOUT) + { reg->dim = VKD3D_SM4_DIMENSION_SCALAR; + } else + { reg->dim = VKD3D_SM4_DIMENSION_VEC4; - *writemask = ((1u << data_type->dimx) - 1) << (offset % 4); + reg->swizzle_type = VKD3D_SM4_SWIZZLE_MASK4; + reg->writemask = writemask; + } } else { @@ -3507,10 +3538,13 @@ static void sm4_register_from_deref(struct hlsl_ctx *ctx, struct sm4_register *r

assert(hlsl_reg.allocated); reg->type = VKD3D_SM4_RT_OUTPUT; - reg->dim = VKD3D_SM4_DIMENSION_VEC4; reg->idx[0] = hlsl_reg.id; reg->idx_count = 1; - *writemask = hlsl_reg.writemask; + + assert(is_dst); + reg->dim = VKD3D_SM4_DIMENSION_VEC4; + reg->swizzle_type = VKD3D_SM4_SWIZZLE_MASK4; + reg->writemask = hlsl_reg.writemask; } } else @@ -3519,72 +3553,83 @@ static void sm4_register_from_deref(struct hlsl_ctx *ctx, struct sm4_register *r

assert(hlsl_reg.allocated); reg->type = VKD3D_SM4_RT_TEMP; - reg->dim = VKD3D_SM4_DIMENSION_VEC4; - if (swizzle_type) - *swizzle_type = VKD3D_SM4_SWIZZLE_VEC4; reg->idx[0] = hlsl_reg.id; reg->idx_count = 1; - *writemask = hlsl_reg.writemask; + + reg->dim = VKD3D_SM4_DIMENSION_VEC4; + if (is_dst) + { + reg->swizzle_type = VKD3D_SM4_SWIZZLE_MASK4; + reg->writemask = hlsl_reg.writemask; + } + else + { + reg->swizzle_type = VKD3D_SM4_SWIZZLE_VEC4; + reg->swizzle = hlsl_swizzle_from_writemask(hlsl_reg.writemask); + } } }

-static void sm4_src_from_deref(struct hlsl_ctx *ctx, struct sm4_src_register *src, +static void sm4_src_from_deref(struct hlsl_ctx *ctx, struct sm4_register *reg, const struct hlsl_deref *deref, const struct hlsl_type *data_type, unsigned int map_writemask) { - unsigned int writemask; - - sm4_register_from_deref(ctx, &src->reg, &writemask, &src->swizzle_type, deref, data_type); - if (src->swizzle_type == VKD3D_SM4_SWIZZLE_VEC4) - src->swizzle = hlsl_map_swizzle(hlsl_swizzle_from_writemask(writemask), map_writemask); + sm4_register_from_deref(ctx, reg, deref, data_type, false); + if (reg->dim == VKD3D_SM4_DIMENSION_VEC4 && reg->swizzle_type == VKD3D_SM4_SWIZZLE_VEC4) + reg->swizzle = hlsl_map_swizzle(reg->swizzle, map_writemask); }

-static void sm4_register_from_node(struct sm4_register *reg, unsigned int *writemask, - enum vkd3d_sm4_swizzle_type *swizzle_type, const struct hlsl_ir_node *instr) +static void sm4_register_from_node(struct sm4_register *reg, const struct hlsl_ir_node *instr, bool is_dst) { assert(instr->reg.allocated); reg->type = VKD3D_SM4_RT_TEMP; - reg->dim = VKD3D_SM4_DIMENSION_VEC4; - *swizzle_type = VKD3D_SM4_SWIZZLE_VEC4; reg->idx[0] = instr->reg.id; reg->idx_count = 1; - *writemask = instr->reg.writemask; + + reg->dim = VKD3D_SM4_DIMENSION_VEC4; + if (is_dst) + { + reg->swizzle_type = VKD3D_SM4_SWIZZLE_MASK4; + reg->writemask = instr->reg.writemask; + } + else + { + reg->swizzle_type = VKD3D_SM4_SWIZZLE_VEC4; + reg->swizzle = hlsl_swizzle_from_writemask(instr->reg.writemask); + } }

-static void sm4_dst_from_node(struct sm4_dst_register *dst, const struct hlsl_ir_node *instr) +static void sm4_dst_from_node(struct sm4_register *dst, const struct hlsl_ir_node *instr) { - unsigned int swizzle_type; - - sm4_register_from_node(&dst->reg, &dst->writemask, &swizzle_type, instr); + sm4_register_from_node(dst, instr, true); }

-static void sm4_src_from_constant_value(struct sm4_src_register *src, +static void sm4_src_from_constant_value(struct sm4_register *src, const struct hlsl_constant_value *value, unsigned int width, unsigned int map_writemask) { - src->swizzle_type = VKD3D_SM4_SWIZZLE_MASK4; - src->reg.type = VKD3D_SM4_RT_IMMCONST; + src->type = VKD3D_SM4_RT_IMMCONST; if (width == 1) { - src->reg.dim = VKD3D_SM4_DIMENSION_SCALAR; - src->reg.immconst_uint[0] = value->u[0].u; + src->dim = VKD3D_SM4_DIMENSION_SCALAR; + src->immconst_uint[0] = value->u[0].u; } else { unsigned int i, j = 0;

- src->reg.dim = VKD3D_SM4_DIMENSION_VEC4; + src->dim = VKD3D_SM4_DIMENSION_VEC4; + src->swizzle_type = VKD3D_SM4_SWIZZLE_MASK4; + src->writemask = 0; for (i = 0; i < 4; ++i) { if (map_writemask & (1u << i)) - src->reg.immconst_uint[i] = value->u[j++].u; + src->immconst_uint[i] = value->u[j++].u; } } }

-static void sm4_src_from_node(struct sm4_src_register *src, +static void sm4_src_from_node(struct sm4_register *src, const struct hlsl_ir_node *instr, unsigned int map_writemask) { - unsigned int writemask; - if (instr->type == HLSL_IR_CONSTANT) { struct hlsl_ir_constant *constant = hlsl_ir_constant(instr); @@ -3593,16 +3638,31 @@ static void sm4_src_from_node(struct sm4_src_register *src, return; }

- sm4_register_from_node(&src->reg, &writemask, &src->swizzle_type, instr); - if (src->swizzle_type == VKD3D_SM4_SWIZZLE_VEC4) - src->swizzle = hlsl_map_swizzle(hlsl_swizzle_from_writemask(writemask), map_writemask); + sm4_register_from_node(src, instr, false); + if (src->dim == VKD3D_SM4_DIMENSION_VEC4 && src->swizzle_type == VKD3D_SM4_SWIZZLE_VEC4) + src->swizzle = hlsl_map_swizzle(src->swizzle, map_writemask); }

 static uint32_t sm4_encode_register(const struct sm4_register *reg)
 {
-    return (reg->type << VKD3D_SM4_REGISTER_TYPE_SHIFT)
-            | (reg->idx_count << VKD3D_SM4_REGISTER_ORDER_SHIFT)
-            | (reg->dim << VKD3D_SM4_DIMENSION_SHIFT);
+    uint32_t ret = 0;
+    ret |= reg->type << VKD3D_SM4_REGISTER_TYPE_SHIFT;
+    ret |= reg->idx_count << VKD3D_SM4_REGISTER_ORDER_SHIFT;
+    ret |= reg->dim << VKD3D_SM4_DIMENSION_SHIFT;
+
+    if (reg->dim == VKD3D_SM4_DIMENSION_VEC4)
+    {
+        ret |= reg->swizzle_type << VKD3D_SM4_SWIZZLE_TYPE_SHIFT;
+
+        if (reg->swizzle_type == VKD3D_SM4_SWIZZLE_MASK4)
+            ret |= reg->writemask << VKD3D_SM4_WRITEMASK_SHIFT;
+        else if (reg->swizzle_type == VKD3D_SM4_SWIZZLE_VEC4)
+            ret |= reg->swizzle << VKD3D_SM4_SWIZZLE_SHIFT;
+        else if (reg->swizzle_type == VKD3D_SM4_SWIZZLE_SCALAR)
+            ret |= reg->dimension_idx << VKD3D_SM4_SCALAR_DIM_SHIFT;
+    }
+
+    return ret;
 }

static uint32_t sm4_register_order(const struct sm4_register *reg) @@ -3623,9 +3683,9 @@ static void write_sm4_instruction(struct vkd3d_bytecode_buffer *buffer, const st

size += instr->modifier_count; for (i = 0; i < instr->dst_count; ++i) - size += sm4_register_order(&instr->dsts[i].reg); + size += sm4_register_order(&instr->dsts[i]); for (i = 0; i < instr->src_count; ++i) - size += sm4_register_order(&instr->srcs[i].reg); + size += sm4_register_order(&instr->srcs[i]); size += instr->idx_count; if (instr->byte_stride) ++size; @@ -3646,39 +3706,40 @@ static void write_sm4_instruction(struct vkd3d_bytecode_buffer *buffer, const st

for (i = 0; i < instr->dst_count; ++i) { - token = sm4_encode_register(&instr->dsts[i].reg); - if (instr->dsts[i].reg.dim == VKD3D_SM4_DIMENSION_VEC4) - token |= instr->dsts[i].writemask << VKD3D_SM4_WRITEMASK_SHIFT; + token = sm4_encode_register(&instr->dsts[i]); put_u32(buffer, token); + assert(instr->dsts[i].dim != VKD3D_SM4_DIMENSION_VEC4 + || instr->dsts[i].swizzle_type != VKD3D_SM4_SWIZZLE_VEC4);

- for (j = 0; j < instr->dsts[i].reg.idx_count; ++j) - put_u32(buffer, instr->dsts[i].reg.idx[j]); + for (j = 0; j < instr->dsts[i].idx_count; ++j) + put_u32(buffer, instr->dsts[i].idx[j]); }

for (i = 0; i < instr->src_count; ++i) { - token = sm4_encode_register(&instr->srcs[i].reg); - token |= (uint32_t)instr->srcs[i].swizzle_type << VKD3D_SM4_SWIZZLE_TYPE_SHIFT; - token |= instr->srcs[i].swizzle << VKD3D_SM4_SWIZZLE_SHIFT; - if (instr->srcs[i].reg.mod) + token = sm4_encode_register(&instr->srcs[i]); + if (instr->srcs[i].mod) token |= VKD3D_SM4_EXTENDED_OPERAND; put_u32(buffer, token); + assert(instr->srcs[i].type == VKD3D_SM4_RT_IMMCONST + || instr->srcs[i].dim != VKD3D_SM4_DIMENSION_VEC4 + || instr->srcs[i].swizzle_type != VKD3D_SM4_SWIZZLE_MASK4);

- if (instr->srcs[i].reg.mod) - put_u32(buffer, (instr->srcs[i].reg.mod << VKD3D_SM4_REGISTER_MODIFIER_SHIFT) + if (instr->srcs[i].mod) + put_u32(buffer, (instr->srcs[i].mod << VKD3D_SM4_REGISTER_MODIFIER_SHIFT) | VKD3D_SM4_EXTENDED_OPERAND_MODIFIER);

- for (j = 0; j < instr->srcs[i].reg.idx_count; ++j) - put_u32(buffer, instr->srcs[i].reg.idx[j]); + for (j = 0; j < instr->srcs[i].idx_count; ++j) + put_u32(buffer, instr->srcs[i].idx[j]);

- if (instr->srcs[i].reg.type == VKD3D_SM4_RT_IMMCONST) + if (instr->srcs[i].type == VKD3D_SM4_RT_IMMCONST) { - put_u32(buffer, instr->srcs[i].reg.immconst_uint[0]); - if (instr->srcs[i].reg.dim == VKD3D_SM4_DIMENSION_VEC4) + put_u32(buffer, instr->srcs[i].immconst_uint[0]); + if (instr->srcs[i].dim == VKD3D_SM4_DIMENSION_VEC4) { - put_u32(buffer, instr->srcs[i].reg.immconst_uint[1]); - put_u32(buffer, instr->srcs[i].reg.immconst_uint[2]); - put_u32(buffer, instr->srcs[i].reg.immconst_uint[3]); + put_u32(buffer, instr->srcs[i].immconst_uint[1]); + put_u32(buffer, instr->srcs[i].immconst_uint[2]); + put_u32(buffer, instr->srcs[i].immconst_uint[3]); } } } @@ -3723,10 +3784,11 @@ static void write_sm4_dcl_constant_buffer(struct vkd3d_bytecode_buffer *buffer, { .opcode = VKD3D_SM4_OP_DCL_CONSTANT_BUFFER,

- .srcs[0].reg.dim = VKD3D_SM4_DIMENSION_VEC4, - .srcs[0].reg.type = VKD3D_SM4_RT_CONSTBUFFER, - .srcs[0].reg.idx = {cbuffer->reg.id, (cbuffer->used_size + 3) / 4}, - .srcs[0].reg.idx_count = 2, + .srcs[0].type = VKD3D_SM4_RT_CONSTBUFFER, + .srcs[0].idx = {cbuffer->reg.id, (cbuffer->used_size + 3) / 4}, + .srcs[0].idx_count = 2, + + .srcs[0].dim = VKD3D_SM4_DIMENSION_VEC4, .srcs[0].swizzle_type = VKD3D_SM4_SWIZZLE_VEC4, .srcs[0].swizzle = HLSL_SWIZZLE(X, Y, Z, W), .src_count = 1, @@ -3741,8 +3803,8 @@ static void write_sm4_dcl_samplers(struct vkd3d_bytecode_buffer *buffer, const s { .opcode = VKD3D_SM4_OP_DCL_SAMPLER,

- .dsts[0].reg.type = VKD3D_SM4_RT_SAMPLER, - .dsts[0].reg.idx_count = 1, + .dsts[0].type = VKD3D_SM4_RT_SAMPLER, + .dsts[0].idx_count = 1, .dst_count = 1, };

@@ -3754,7 +3816,7 @@ static void write_sm4_dcl_samplers(struct vkd3d_bytecode_buffer *buffer, const s if (!var->objects_usage[HLSL_REGSET_SAMPLERS][i].used) continue;

- instr.dsts[0].reg.idx[0] = var->regs[HLSL_REGSET_SAMPLERS].id + i; + instr.dsts[0].idx[0] = var->regs[HLSL_REGSET_SAMPLERS].id + i; write_sm4_instruction(buffer, &instr); } } @@ -3776,9 +3838,9 @@ static void write_sm4_dcl_textures(struct hlsl_ctx *ctx, struct vkd3d_bytecode_b

instr = (struct sm4_instruction) { - .dsts[0].reg.type = uav ? VKD3D_SM5_RT_UAV : VKD3D_SM4_RT_RESOURCE, - .dsts[0].reg.idx = {var->regs[regset].id + i}, - .dsts[0].reg.idx_count = 1, + .dsts[0].type = uav ? VKD3D_SM5_RT_UAV : VKD3D_SM4_RT_RESOURCE, + .dsts[0].idx = {var->regs[regset].id + i}, + .dsts[0].idx_count = 1, .dst_count = 1,

.idx[0] = sm4_resource_format(component_type) * 0x1111, @@ -3823,33 +3885,34 @@ static void write_sm4_dcl_semantic(struct hlsl_ctx *ctx, struct vkd3d_bytecode_b

struct sm4_instruction instr = { - .dsts[0].reg.dim = VKD3D_SM4_DIMENSION_VEC4, + .dsts[0].dim = VKD3D_SM4_DIMENSION_VEC4, + .dsts[0].swizzle_type = VKD3D_SM4_SWIZZLE_MASK4, .dst_count = 1, };

- if (hlsl_sm4_register_from_semantic(ctx, &var->semantic, output, &instr.dsts[0].reg.type, NULL, &has_idx)) + if (hlsl_sm4_register_from_semantic(ctx, &var->semantic, output, &instr.dsts[0].type, NULL, &has_idx)) { if (has_idx) { - instr.dsts[0].reg.idx[0] = var->semantic.index; - instr.dsts[0].reg.idx_count = 1; + instr.dsts[0].idx[0] = var->semantic.index; + instr.dsts[0].idx_count = 1; } else { - instr.dsts[0].reg.idx_count = 0; + instr.dsts[0].idx_count = 0; } instr.dsts[0].writemask = (1 << var->data_type->dimx) - 1; } else { - instr.dsts[0].reg.type = output ? VKD3D_SM4_RT_OUTPUT : VKD3D_SM4_RT_INPUT; - instr.dsts[0].reg.idx[0] = var->regs[HLSL_REGSET_NUMERIC].id; - instr.dsts[0].reg.idx_count = 1; + instr.dsts[0].type = output ? VKD3D_SM4_RT_OUTPUT : VKD3D_SM4_RT_INPUT; + instr.dsts[0].idx[0] = var->regs[HLSL_REGSET_NUMERIC].id; + instr.dsts[0].idx_count = 1; instr.dsts[0].writemask = var->regs[HLSL_REGSET_NUMERIC].writemask; }

- if (instr.dsts[0].reg.type == VKD3D_SM4_RT_DEPTHOUT) - instr.dsts[0].reg.dim = VKD3D_SM4_DIMENSION_SCALAR; + if (instr.dsts[0].type == VKD3D_SM4_RT_DEPTHOUT) + instr.dsts[0].dim = VKD3D_SM4_DIMENSION_SCALAR;

hlsl_sm4_usage_from_semantic(ctx, &var->semantic, output, &usage); if (usage == ~0u) @@ -3962,7 +4025,7 @@ static void write_sm4_unary_op(struct vkd3d_bytecode_buffer *buffer, enum vkd3d_ instr.dst_count = 1;

sm4_src_from_node(&instr.srcs[0], src, instr.dsts[0].writemask); - instr.srcs[0].reg.mod = src_mod; + instr.srcs[0].mod = src_mod; instr.src_count = 1;

write_sm4_instruction(buffer, &instr); @@ -3980,9 +4043,9 @@ static void write_sm4_unary_op_with_two_destinations(struct vkd3d_bytecode_buffe assert(dst_idx < ARRAY_SIZE(instr.dsts)); sm4_dst_from_node(&instr.dsts[dst_idx], dst); assert(1 - dst_idx >= 0); - instr.dsts[1 - dst_idx].reg.type = VKD3D_SM4_RT_NULL; - instr.dsts[1 - dst_idx].reg.dim = VKD3D_SM4_DIMENSION_NONE; - instr.dsts[1 - dst_idx].reg.idx_count = 0; + instr.dsts[1 - dst_idx].type = VKD3D_SM4_RT_NULL; + instr.dsts[1 - dst_idx].dim = VKD3D_SM4_DIMENSION_NONE; + instr.dsts[1 - dst_idx].idx_count = 0; instr.dst_count = 2;

sm4_src_from_node(&instr.srcs[0], src, instr.dsts[dst_idx].writemask); @@ -4040,9 +4103,9 @@ static void write_sm4_binary_op_with_two_destinations(struct vkd3d_bytecode_buff assert(dst_idx < ARRAY_SIZE(instr.dsts)); sm4_dst_from_node(&instr.dsts[dst_idx], dst); assert(1 - dst_idx >= 0); - instr.dsts[1 - dst_idx].reg.type = VKD3D_SM4_RT_NULL; - instr.dsts[1 - dst_idx].reg.dim = VKD3D_SM4_DIMENSION_NONE; - instr.dsts[1 - dst_idx].reg.idx_count = 0; + instr.dsts[1 - dst_idx].type = VKD3D_SM4_RT_NULL; + instr.dsts[1 - dst_idx].dim = VKD3D_SM4_DIMENSION_NONE; + instr.dsts[1 - dst_idx].idx_count = 0; instr.dst_count = 2;

     sm4_src_from_node(&instr.srcs[0], src1, instr.dsts[dst_idx].writemask);
@@ -4105,7 +4168,7 @@ static void write_sm4_ld(struct hlsl_ctx *ctx, struct vkd3d_bytecode_buffer *buf
     {
         if (sample_index->type == HLSL_IR_CONSTANT)
         {
-            struct sm4_register *reg = &instr.srcs[2].reg;
+            struct sm4_register *reg = &instr.srcs[2];
             struct hlsl_ir_constant *index;

             index = hlsl_ir_constant(sample_index);
@@ -4232,9 +4295,9 @@ static void write_sm4_cast_from_bool(struct hlsl_ctx *ctx,

     sm4_src_from_node(&instr.srcs[0], arg, instr.dsts[0].writemask);
     instr.srcs[1].swizzle_type = VKD3D_SM4_SWIZZLE_MASK4;
-    instr.srcs[1].reg.type = VKD3D_SM4_RT_IMMCONST;
-    instr.srcs[1].reg.dim = VKD3D_SM4_DIMENSION_SCALAR;
-    instr.srcs[1].reg.immconst_uint[0] = mask;
+    instr.srcs[1].type = VKD3D_SM4_RT_IMMCONST;
+    instr.srcs[1].dim = VKD3D_SM4_DIMENSION_SCALAR;
+    instr.srcs[1].immconst_uint[0] = mask;
     instr.src_count = 2;

     write_sm4_instruction(buffer, &instr);
@@ -4358,7 +4421,7 @@ static void write_sm4_store_uav_typed(struct hlsl_ctx *ctx, struct vkd3d_bytecod
     memset(&instr, 0, sizeof(instr));
     instr.opcode = VKD3D_SM5_OP_STORE_UAV_TYPED;

-    sm4_register_from_deref(ctx, &instr.dsts[0].reg, &instr.dsts[0].writemask, NULL, dst, dst->var->data_type);
+    sm4_register_from_deref(ctx, &instr.dsts[0], dst, dst->var->data_type, true);
     instr.dst_count = 1;

     sm4_src_from_node(&instr.srcs[0], coords, VKD3DSP_WRITEMASK_ALL);
@@ -4936,9 +4999,9 @@ static void write_sm4_loop(struct hlsl_ctx *ctx,
 static void write_sm4_gather(struct hlsl_ctx *ctx, struct vkd3d_bytecode_buffer *buffer,
         const struct hlsl_type *resource_type, const struct hlsl_ir_node *dst,
         const struct hlsl_deref *resource, const struct hlsl_deref *sampler,
-        const struct hlsl_ir_node *coords, unsigned int swizzle, const struct hlsl_ir_node *texel_offset)
+        const struct hlsl_ir_node *coords, unsigned int dimension_idx, const struct hlsl_ir_node *texel_offset)
 {
-    struct sm4_src_register *src;
+    struct sm4_register *src;
     struct sm4_instruction instr;

     memset(&instr, 0, sizeof(instr));
@@ -4969,9 +5032,9 @@ static void write_sm4_gather(struct hlsl_ctx *ctx, struct vkd3d_bytecode_buffer

     src = &instr.srcs[instr.src_count++];
     sm4_src_from_deref(ctx, src, sampler, sampler->var->data_type, VKD3DSP_WRITEMASK_ALL);
-    src->reg.dim = VKD3D_SM4_DIMENSION_VEC4;
+    src->dim = VKD3D_SM4_DIMENSION_VEC4;
     src->swizzle_type = VKD3D_SM4_SWIZZLE_SCALAR;
-    src->swizzle = swizzle;
+    src->dimension_idx = dimension_idx;

     write_sm4_instruction(buffer, &instr);
 }
@@ -5036,22 +5099,22 @@ static void write_sm4_resource_load(struct hlsl_ctx *ctx,

         case HLSL_RESOURCE_GATHER_RED:
             write_sm4_gather(ctx, buffer, resource_type, &load->node, &load->resource,
-                    &load->sampler, coords, HLSL_SWIZZLE(X, X, X, X), texel_offset);
+                    &load->sampler, coords, 0, texel_offset);
             break;

         case HLSL_RESOURCE_GATHER_GREEN:
             write_sm4_gather(ctx, buffer, resource_type, &load->node, &load->resource,
-                    &load->sampler, coords, HLSL_SWIZZLE(Y, Y, Y, Y), texel_offset);
+                    &load->sampler, coords, 1, texel_offset);
             break;

         case HLSL_RESOURCE_GATHER_BLUE:
             write_sm4_gather(ctx, buffer, resource_type, &load->node, &load->resource,
-                    &load->sampler, coords, HLSL_SWIZZLE(Z, Z, Z, Z), texel_offset);
+                    &load->sampler, coords, 2, texel_offset);
             break;

         case HLSL_RESOURCE_GATHER_ALPHA:
             write_sm4_gather(ctx, buffer, resource_type, &load->node, &load->resource,
-                    &load->sampler, coords, HLSL_SWIZZLE(W, W, W, W), texel_offset);
+                    &load->sampler, coords, 3, texel_offset);
             break;
     }
 }
@@ -5087,13 +5150,12 @@ static void write_sm4_store(struct hlsl_ctx *ctx,
 {
     const struct hlsl_ir_node *rhs = store->rhs.node;
     struct sm4_instruction instr;
-    unsigned int writemask;

     memset(&instr, 0, sizeof(instr));
     instr.opcode = VKD3D_SM4_OP_MOV;

-    sm4_register_from_deref(ctx, &instr.dsts[0].reg, &writemask, NULL, &store->lhs, rhs->data_type);
-    instr.dsts[0].writemask = hlsl_combine_writemasks(writemask, store->writemask);
+    sm4_register_from_deref(ctx, &instr.dsts[0], &store->lhs, rhs->data_type, true);
+    instr.dsts[0].writemask = hlsl_combine_writemasks(instr.dsts[0].writemask, store->writemask);
     instr.dst_count = 1;

     sm4_src_from_node(&instr.srcs[0], rhs, instr.dsts[0].writemask);
@@ -5106,7 +5168,6 @@ static void write_sm4_swizzle(struct hlsl_ctx *ctx,
         struct vkd3d_bytecode_buffer *buffer, const struct hlsl_ir_swizzle *swizzle)
 {
     struct sm4_instruction instr;
-    unsigned int writemask;

     memset(&instr, 0, sizeof(instr));
     instr.opcode = VKD3D_SM4_OP_MOV;
@@ -5114,8 +5175,8 @@ static void write_sm4_swizzle(struct hlsl_ctx *ctx,
     sm4_dst_from_node(&instr.dsts[0], &swizzle->node);
     instr.dst_count = 1;

-    sm4_register_from_node(&instr.srcs[0].reg, &writemask, &instr.srcs[0].swizzle_type, swizzle->val.node);
-    instr.srcs[0].swizzle = hlsl_map_swizzle(hlsl_combine_swizzles(hlsl_swizzle_from_writemask(writemask),
+    sm4_register_from_node(&instr.srcs[0], swizzle->val.node, false);
+    instr.srcs[0].swizzle = hlsl_map_swizzle(hlsl_combine_swizzles(instr.srcs[0].swizzle,
             swizzle->swizzle, swizzle->node.data_type->dimx), instr.dsts[0].writemask);
     instr.src_count = 1;

-- GitLab https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/269
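
For readers skimming the diff, the core idea of 3/3 can be sketched in isolation: one register struct carries the writemask, the swizzle, and the single-component index, and swizzle_type says which of them is meaningful. The following is only a minimal illustration under simplifying assumptions. Every name prefixed with sketch_ is hypothetical and is not part of vkd3d; the real struct sm4_register has more fields (type, idx, dim, mod, immediate constants).

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for the merged register in patch 3/3.
 * None of these names exist in vkd3d. */
enum sketch_swizzle_type
{
    SKETCH_SWIZZLE_MASK4,  /* register carries a writemask */
    SKETCH_SWIZZLE_VEC4,   /* register carries a 4-component swizzle */
    SKETCH_SWIZZLE_SCALAR, /* register selects a single component by index */
};

struct sketch_register
{
    enum sketch_swizzle_type swizzle_type;
    unsigned int writemask;     /* meaningful for MASK4 */
    unsigned int swizzle;       /* meaningful for VEC4 */
    unsigned int dimension_idx; /* meaningful for SCALAR */
};

/* Same arithmetic as "(1 << var->data_type->dimx) - 1" in the diff:
 * a mask covering the first dimx components. */
static unsigned int sketch_writemask_from_width(unsigned int dimx)
{
    assert(dimx >= 1 && dimx <= 4);
    return (1u << dimx) - 1;
}

/* Destination use: the writemask lives in the register itself, so no
 * separate out-parameter is needed. */
static void sketch_init_dst(struct sketch_register *reg, unsigned int dimx)
{
    reg->swizzle_type = SKETCH_SWIZZLE_MASK4;
    reg->writemask = sketch_writemask_from_width(dimx);
    reg->swizzle = 0;
    reg->dimension_idx = 0;
}

/* Gather-style sampler use: a plain component index (0 to 3) replaces a
 * replicated swizzle such as XXXX or YYYY. */
static void sketch_init_component_select(struct sketch_register *reg, unsigned int component)
{
    assert(component < 4);
    reg->swizzle_type = SKETCH_SWIZZLE_SCALAR;
    reg->writemask = 0;
    reg->swizzle = 0;
    reg->dimension_idx = component;
}
```

With this layout a caller initialises a single object, for example sketch_init_dst(&reg, 3) for a three-component destination or sketch_init_component_select(&reg, 2) for the blue channel of a gather, instead of passing separate writemask and swizzle_type pointers.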

Henri Verbeet (＠hverbeet)

10 Jul 2023

2:59 p.m.

This seems like the wrong way to go about it. In particular, I think this is what we should do:

- Make struct sm4_register a proper subset of struct vkd3d_shader_register. E.g. by changing the "type" field from vkd3d_sm4_register_type to vkd3d_shader_register_type. It probably also implies moving some fields from struct sm4_register to struct sm4_dst_register/sm4_src_register.

- Replace usage of struct sm4_register with usage of struct vkd3d_shader_register. That should be straightforward after the previous step.

- Move up from there. I.e., doing the same thing for struct sm4_dst_register, struct sm4_src_register, and struct sm4_instruction. It's possible that this will require adding things to the vkd3d_shader_instruction structures; that's fine.
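
One way to picture the first step above is a writer-side register that stores a format-independent type and is translated to the SM4 encoding only when the bytecode is emitted. The following is a rough sketch under that reading, not the actual vkd3d code. All sketch_ names are hypothetical, the enum members are a small illustrative subset, and the numeric values are placeholders.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the shared, format-independent register type
 * (the role vkd3d_shader_register_type plays in the discussion). */
enum sketch_ir_register_type
{
    SKETCH_IR_TEMP,
    SKETCH_IR_INPUT,
    SKETCH_IR_OUTPUT,
    SKETCH_IR_IMMCONST,
    SKETCH_IR_LABEL, /* deliberately left without an SM4 counterpart here */
};

/* Hypothetical stand-in for the SM4-specific encoding; values are
 * illustrative only. */
enum sketch_sm4_register_type
{
    SKETCH_SM4_TEMP = 0x0,
    SKETCH_SM4_INPUT = 0x1,
    SKETCH_SM4_OUTPUT = 0x2,
    SKETCH_SM4_IMMCONST = 0x4,
};

struct sketch_type_map
{
    enum sketch_ir_register_type ir_type;
    enum sketch_sm4_register_type sm4_type;
};

static const struct sketch_type_map sketch_type_table[] =
{
    {SKETCH_IR_TEMP,     SKETCH_SM4_TEMP},
    {SKETCH_IR_INPUT,    SKETCH_SM4_INPUT},
    {SKETCH_IR_OUTPUT,   SKETCH_SM4_OUTPUT},
    {SKETCH_IR_IMMCONST, SKETCH_SM4_IMMCONST},
};

/* Translate at write time, so the in-memory register can keep the shared
 * type. Returns false when the type has no SM4 encoding in this sketch. */
static bool sketch_sm4_type_from_ir(enum sketch_ir_register_type ir_type,
        enum sketch_sm4_register_type *sm4_type)
{
    size_t i;

    for (i = 0; i < sizeof(sketch_type_table) / sizeof(*sketch_type_table); ++i)
    {
        if (sketch_type_table[i].ir_type == ir_type)
        {
            *sm4_type = sketch_type_table[i].sm4_type;
            return true;
        }
    }
    return false;
}
```

Keeping the translation in one table means the rest of the writer only ever sees the shared type, which is what would make replacing struct sm4_register with the shared register struct a mostly mechanical follow-up.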

-- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/269#note_38537

Francisco Casas (＠fcasas)

5:45 p.m.

On Mon Jul 10 17:45:18 2023 +0000, Henri Verbeet wrote:

...

> This seems like the wrong way to go about it. In particular, I think this is what we should do:
>
> - Make struct sm4_register a proper subset of struct vkd3d_shader_register. E.g. by changing the "type" field from vkd3d_sm4_register_type to vkd3d_shader_register_type. It probably also implies moving some fields from struct sm4_register to struct sm4_dst_register/sm4_src_register.
>
> - Replace usage of struct sm4_register with usage of struct vkd3d_shader_register. That should be straightforward after the previous step.
>
> - Move up from there. I.e., doing the same thing for struct sm4_dst_register, struct sm4_src_register, and struct sm4_instruction. It's possible that this will require adding things to the vkd3d_shader_instruction structures; that's fine.

Roger!

-- https://gitlab.winehq.org/wine/vkd3d/-/merge_requests/269#note_38564