First part of v2 of !27, which aims to:
* Allow allocation of variables of complex types that contain both numerics and objects across multiple register sets (regsets).
* Support the tex2D and tex3D intrinsics, inferring generic sampler dimensions from usage, writing sampler declarations, and writing sample instructions.
* Support arrays of resources for both SM1 and SM4 (not to be confused with the resource arrays of SM 5.1, which can have non-constant indexes).
* Support resources declared within structs.
* Support synthetic combined samplers for SM1 and synthetic separated samplers for SM4, considering that they can be arrays or members of structs.
* Imitate the way the native compiler assigns the register indexes of resources on allocation, which proved to be the most difficult part.
* Support object components within complex input parameters.
* Small fixes to corner cases.
This part consists of parsing the `tex2D()` and `tex3D()` intrinsics and of beginning to support the allocation of variables across multiple regsets.
The whole series is on my [master6](https://gitlab.winehq.org/fcasas/vkd3d/-/commits/master6) branch.
From: Francisco Casas fcasas@codeweavers.com
---
 libs/vkd3d-shader/hlsl_sm4.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/libs/vkd3d-shader/hlsl_sm4.c b/libs/vkd3d-shader/hlsl_sm4.c
index e06f4b15..9232ea78 100644
--- a/libs/vkd3d-shader/hlsl_sm4.c
+++ b/libs/vkd3d-shader/hlsl_sm4.c
@@ -2209,7 +2209,10 @@ static void write_sm4_resource_load(struct hlsl_ctx *ctx,
         case HLSL_RESOURCE_SAMPLE:
             if (!load->sampler.var)
+            {
                 hlsl_fixme(ctx, &load->node.loc, "SM4 combined sample expression.");
+                return;
+            }
             write_sm4_sample(ctx, buffer, resource_type, &load->node,
                     &load->resource, &load->sampler, coords, texel_offset);
             break;
From: Zebediah Figura zfigura@codeweavers.com
---
Modifications:
* Using new hlsl_resource_load_params struct.
* Removed `HLSL_OP2_SAMPLE` from enum hlsl_ir_expr_op, since it is not used.
---
 libs/vkd3d-shader/hlsl.y | 60 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 60 insertions(+)

diff --git a/libs/vkd3d-shader/hlsl.y b/libs/vkd3d-shader/hlsl.y
index b3e38d1f..10ff3f23 100644
--- a/libs/vkd3d-shader/hlsl.y
+++ b/libs/vkd3d-shader/hlsl.y
@@ -2846,6 +2846,65 @@ static bool intrinsic_step(struct hlsl_ctx *ctx,
     return !!add_implicit_conversion(ctx, params->instrs, ge, type, loc);
 }

+static bool intrinsic_tex(struct hlsl_ctx *ctx, const struct parse_initializer *params,
+        const struct vkd3d_shader_location *loc, const char *name, enum hlsl_sampler_dim dim)
+{
+    struct hlsl_resource_load_params load_params = {.type = HLSL_RESOURCE_SAMPLE};
+    const struct hlsl_type *sampler_type;
+    struct hlsl_ir_resource_load *load;
+    struct hlsl_ir_load *sampler_load;
+    struct hlsl_ir_node *coords;
+
+    if (params->args_count != 2 && params->args_count != 4)
+    {
+        hlsl_error(ctx, loc, VKD3D_SHADER_ERROR_HLSL_WRONG_PARAMETER_COUNT,
+                "Wrong number of arguments to function '%s': expected 2 or 4, but got %u.", name, params->args_count);
+        return false;
+    }
+
+    if (params->args_count == 4)
+    {
+        hlsl_fixme(ctx, loc, "Samples with gradients are not implemented.\n");
+        return false;
+    }
+
+    sampler_type = params->args[0]->data_type;
+    if (sampler_type->type != HLSL_CLASS_OBJECT || sampler_type->base_type != HLSL_TYPE_SAMPLER
+            || (sampler_type->sampler_dim != dim && sampler_type->sampler_dim != HLSL_SAMPLER_DIM_GENERIC))
+    {
+        struct vkd3d_string_buffer *string;
+
+        if ((string = hlsl_type_to_string(ctx, sampler_type)))
+            hlsl_error(ctx, loc, VKD3D_SHADER_ERROR_HLSL_INVALID_TYPE,
+                    "Wrong type for argument 1 of '%s': expected 'sampler' or '%s', but got '%s'.",
+                    name, ctx->builtin_types.sampler[dim]->name, string->buffer);
+        hlsl_release_string_buffer(ctx, string);
+        return false;
+    }
+
+    /* Only HLSL_IR_LOAD can return an object. */
+    sampler_load = hlsl_ir_load(params->args[0]);
+
+    if (!(coords = add_implicit_conversion(ctx, params->instrs, params->args[1],
+            hlsl_get_vector_type(ctx, HLSL_TYPE_FLOAT, hlsl_sampler_dim_count(dim)), loc)))
+        coords = params->args[1];
+
+    load_params.format = hlsl_get_vector_type(ctx, HLSL_TYPE_FLOAT, 4);
+    load_params.resource = sampler_load->src;
+    load_params.coords = coords;
+
+    if (!(load = hlsl_new_resource_load(ctx, &load_params, loc)))
+        return false;
+    list_add_tail(params->instrs, &load->node.entry);
+    return true;
+}
+
+static bool intrinsic_tex2D(struct hlsl_ctx *ctx,
+        const struct parse_initializer *params, const struct vkd3d_shader_location *loc)
+{
+    return intrinsic_tex(ctx, params, loc, "tex2D", HLSL_SAMPLER_DIM_2D);
+}
+
 static bool intrinsic_transpose(struct hlsl_ctx *ctx,
         const struct parse_initializer *params, const struct vkd3d_shader_location *loc)
 {
@@ -2936,6 +2995,7 @@ intrinsic_functions[] =
     {"smoothstep", 3, true, intrinsic_smoothstep},
     {"sqrt", 1, true, intrinsic_sqrt},
     {"step", 2, true, intrinsic_step},
+    {"tex2D", -1, false, intrinsic_tex2D},
     {"transpose", 1, true, intrinsic_transpose},
 };
From: Zebediah Figura zfigura@codeweavers.com
---
 libs/vkd3d-shader/hlsl.y | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/libs/vkd3d-shader/hlsl.y b/libs/vkd3d-shader/hlsl.y
index 10ff3f23..3b7ba9bd 100644
--- a/libs/vkd3d-shader/hlsl.y
+++ b/libs/vkd3d-shader/hlsl.y
@@ -2905,6 +2905,12 @@ static bool intrinsic_tex2D(struct hlsl_ctx *ctx,
     return intrinsic_tex(ctx, params, loc, "tex2D", HLSL_SAMPLER_DIM_2D);
 }

+static bool intrinsic_tex3D(struct hlsl_ctx *ctx,
+        const struct parse_initializer *params, const struct vkd3d_shader_location *loc)
+{
+    return intrinsic_tex(ctx, params, loc, "tex3D", HLSL_SAMPLER_DIM_3D);
+}
+
 static bool intrinsic_transpose(struct hlsl_ctx *ctx,
         const struct parse_initializer *params, const struct vkd3d_shader_location *loc)
 {
@@ -2996,6 +3002,7 @@ intrinsic_functions[] =
     {"sqrt", 1, true, intrinsic_sqrt},
     {"step", 2, true, intrinsic_step},
     {"tex2D", -1, false, intrinsic_tex2D},
+    {"tex3D", -1, false, intrinsic_tex3D},
     {"transpose", 1, true, intrinsic_transpose},
 };
From: Francisco Casas fcasas@codeweavers.com
---
 libs/vkd3d-shader/hlsl.c         | 78 +++++++++++++++++++++++---------
 libs/vkd3d-shader/hlsl.h         | 32 +++++++++----
 libs/vkd3d-shader/hlsl_codegen.c | 64 ++++++++++++++------------
 libs/vkd3d-shader/hlsl_sm1.c     |  2 +-
 libs/vkd3d-shader/hlsl_sm4.c     |  4 +-
 5 files changed, 117 insertions(+), 63 deletions(-)
diff --git a/libs/vkd3d-shader/hlsl.c b/libs/vkd3d-shader/hlsl.c
index 13aac3d4..97d4f09f 100644
--- a/libs/vkd3d-shader/hlsl.c
+++ b/libs/vkd3d-shader/hlsl.c
@@ -164,6 +164,22 @@ static unsigned int get_array_size(const struct hlsl_type *type)
     return 1;
 }

+enum hlsl_regset hlsl_type_get_regset(struct hlsl_ctx *ctx, const struct hlsl_type *type)
+{
+    unsigned int k;
+
+    if (type->type == HLSL_CLASS_OBJECT)
+    {
+        for (k = 0; k <= HLSL_REGSET_LAST_OBJECT; ++k)
+        {
+            if (type->reg_size[k] > 0)
+                return k;
+        }
+        vkd3d_unreachable();
+    }
+    return HLSL_REGSET_NUM;
+}
+
 unsigned int hlsl_type_get_sm4_offset(const struct hlsl_type *type, unsigned int offset)
 {
     /* Align to the next vec4 boundary if:
@@ -171,7 +187,7 @@ unsigned int hlsl_type_get_sm4_offset(const struct hlsl_type *type, unsigned int
      * (b) the type would cross a vec4 boundary; i.e. a vec3 and a
      * vec1 can be packed together, but not a vec3 and a vec2. */
-    if (type->type > HLSL_CLASS_LAST_NUMERIC || (offset & 3) + type->reg_size > 4)
+    if (type->type > HLSL_CLASS_LAST_NUMERIC || (offset & 3) + type->reg_size[HLSL_REGSET_NUM] > 4)
         return align(offset, 4);
     return offset;
 }
@@ -179,31 +195,40 @@ unsigned int hlsl_type_get_sm4_offset(const struct hlsl_type *type, unsigned int
 static void hlsl_type_calculate_reg_size(struct hlsl_ctx *ctx, struct hlsl_type *type)
 {
     bool is_sm4 = (ctx->profile->major_version >= 4);
+    unsigned int k;
+
+    for (k = 0; k <= HLSL_REGSET_LAST; ++k)
+        type->reg_size[k] = 0;

     switch (type->type)
     {
         case HLSL_CLASS_SCALAR:
         case HLSL_CLASS_VECTOR:
-            type->reg_size = is_sm4 ? type->dimx : 4;
+            type->reg_size[HLSL_REGSET_NUM] = is_sm4 ? type->dimx : 4;
             break;

         case HLSL_CLASS_MATRIX:
             if (hlsl_type_is_row_major(type))
-                type->reg_size = is_sm4 ? (4 * (type->dimy - 1) + type->dimx) : (4 * type->dimy);
+                type->reg_size[HLSL_REGSET_NUM] = is_sm4 ? (4 * (type->dimy - 1) + type->dimx) : (4 * type->dimy);
             else
-                type->reg_size = is_sm4 ? (4 * (type->dimx - 1) + type->dimy) : (4 * type->dimx);
+                type->reg_size[HLSL_REGSET_NUM] = is_sm4 ? (4 * (type->dimx - 1) + type->dimy) : (4 * type->dimx);
             break;

         case HLSL_CLASS_ARRAY:
         {
-            unsigned int element_size = type->e.array.type->reg_size;
-
             if (type->e.array.elements_count == HLSL_ARRAY_ELEMENTS_COUNT_IMPLICIT)
-                type->reg_size = 0;
-            else if (is_sm4)
-                type->reg_size = (type->e.array.elements_count - 1) * align(element_size, 4) + element_size;
-            else
-                type->reg_size = type->e.array.elements_count * element_size;
+                break;
+
+            for (k = 0; k <= HLSL_REGSET_LAST; ++k)
+            {
+                unsigned int element_size = type->e.array.type->reg_size[k];
+
+                if (is_sm4 && k == HLSL_REGSET_NUM)
+                    type->reg_size[k] = (type->e.array.elements_count - 1) * align(element_size, 4) + element_size;
+                else
+                    type->reg_size[k] = type->e.array.elements_count * element_size;
+            }
+            break;
         }
@@ -212,16 +237,17 @@ static void hlsl_type_calculate_reg_size(struct hlsl_ctx *ctx, struct hlsl_type
             unsigned int i;

             type->dimx = 0;
-            type->reg_size = 0;
-
             for (i = 0; i < type->e.record.field_count; ++i)
             {
                 struct hlsl_struct_field *field = &type->e.record.fields[i];
-                unsigned int field_size = field->type->reg_size;

-                type->reg_size = hlsl_type_get_sm4_offset(field->type, type->reg_size);
-                field->reg_offset = type->reg_size;
-                type->reg_size += field_size;
+                for (k = 0; k <= HLSL_REGSET_LAST; ++k)
+                {
+                    if (k == HLSL_REGSET_NUM)
+                        type->reg_size[k] = hlsl_type_get_sm4_offset(field->type, type->reg_size[k]);
+                    field->reg_offset[k] = type->reg_size[k];
+                    type->reg_size[k] += field->type->reg_size[k];
+                }

                 type->dimx += field->type->dimx * field->type->dimy * get_array_size(field->type);
             }
@@ -229,16 +255,24 @@ static void hlsl_type_calculate_reg_size(struct hlsl_ctx *ctx, struct hlsl_type
         }

         case HLSL_CLASS_OBJECT:
-            type->reg_size = 0;
+            if (type->base_type == HLSL_TYPE_SAMPLER)
+                type->reg_size[HLSL_REGSET_S] = 1;
+            if (type->base_type == HLSL_TYPE_TEXTURE)
+                type->reg_size[HLSL_REGSET_T] = 1;
+            if (type->base_type == HLSL_TYPE_UAV)
+                type->reg_size[HLSL_REGSET_U] = 1;
+            break;
     }
 }

-/* Returns the size of a type, considered as part of an array of that type.
- * As such it includes padding after the type. */
-unsigned int hlsl_type_get_array_element_reg_size(const struct hlsl_type *type)
+/* Returns the size of a type, considered as part of an array of that type, within a specific
+ * register set. As such it includes padding after the type, when applicable. */
+unsigned int hlsl_type_get_array_element_reg_size(const struct hlsl_type *type, enum hlsl_regset rset)
 {
-    return align(type->reg_size, 4);
+    if (rset == HLSL_REGSET_NUM)
+        return align(type->reg_size[rset], 4);
+    return type->reg_size[rset];
 }
 static struct hlsl_type *hlsl_new_type(struct hlsl_ctx *ctx, const char *name, enum hlsl_type_class type_class,
diff --git a/libs/vkd3d-shader/hlsl.h b/libs/vkd3d-shader/hlsl.h
index 059abe18..83f0076d 100644
--- a/libs/vkd3d-shader/hlsl.h
+++ b/libs/vkd3d-shader/hlsl.h
@@ -114,6 +114,15 @@ enum hlsl_matrix_majority
     HLSL_ROW_MAJOR
 };

+enum hlsl_regset
+{
+    HLSL_REGSET_S,
+    HLSL_REGSET_T,
+    HLSL_REGSET_U,
+    HLSL_REGSET_LAST_OBJECT = HLSL_REGSET_U,
+    HLSL_REGSET_NUM,
+    HLSL_REGSET_LAST = HLSL_REGSET_NUM,
+};
+
 /* An HLSL source-level data type, including anonymous structs and typedefs. */
 struct hlsl_type
 {
@@ -175,12 +184,12 @@ struct hlsl_type
         struct hlsl_type *resource_format;
     } e;

-    /* Number of numeric register components used by one value of this type (4 components make 1
-     * register).
-     * If type is HLSL_CLASS_STRUCT or HLSL_CLASS_ARRAY, this value includes the reg_size of
-     * their elements and padding (which varies according to the backend).
-     * This value is 0 for types without numeric components, like objects. */
-    unsigned int reg_size;
+    /* Number of numeric register components used by one value of this type, for each regset.
+     * For HLSL_REGSET_NUM, 4 components make 1 register, while for other regsets 1 component makes
+     * 1 register.
+     * If type is HLSL_CLASS_STRUCT or HLSL_CLASS_ARRAY, the reg_size of their elements and padding
+     * (which varies according to the backend) is also included. */
+    unsigned int reg_size[HLSL_REGSET_LAST + 1];
     /* Offset where the type's description starts in the output bytecode, in bytes. */
     size_t bytecode_offset;
 };
@@ -205,8 +214,8 @@ struct hlsl_struct_field
      * type->modifiers instead) and that also are specific to the field and not the whole variable.
      * In particular, interpolation modifiers. */
     unsigned int storage_modifiers;
-    /* Offset of the field within the type it belongs to, in numeric register components. */
-    unsigned int reg_offset;
+    /* Offset of the field within the type it belongs to, in register components, for each regset. */
+    unsigned int reg_offset[HLSL_REGSET_LAST + 1];

     /* Offset where the fields's name starts in the output bytecode, in bytes. */
     size_t name_bytecode_offset;
@@ -536,10 +545,12 @@ struct hlsl_deref
     struct hlsl_src *path;

     /* Single instruction node of data type uint used to represent the register offset (in register
-     * components), from the start of the variable, of the part referenced.
+     * components, within the pertaining regset), from the start of the variable, of the part
+     * referenced.
      * The path is lowered to this single offset -- whose value may vary between SM1 and SM4 --
      * before writing the bytecode. */
     struct hlsl_src offset;
+    enum hlsl_regset offset_regset;
 };

 struct hlsl_ir_load
@@ -1055,13 +1066,14 @@ bool hlsl_scope_add_type(struct hlsl_scope *scope, struct hlsl_type *type);
 struct hlsl_type *hlsl_type_clone(struct hlsl_ctx *ctx, struct hlsl_type *old,
         unsigned int default_majority, unsigned int modifiers);
 unsigned int hlsl_type_component_count(const struct hlsl_type *type);
-unsigned int hlsl_type_get_array_element_reg_size(const struct hlsl_type *type);
+unsigned int hlsl_type_get_array_element_reg_size(const struct hlsl_type *type, enum hlsl_regset rset);
 struct hlsl_type *hlsl_type_get_component_type(struct hlsl_ctx *ctx, struct hlsl_type *type,
         unsigned int index);
 bool hlsl_type_is_row_major(const struct hlsl_type *type);
 unsigned int hlsl_type_minor_size(const struct hlsl_type *type);
 unsigned int hlsl_type_major_size(const struct hlsl_type *type);
 unsigned int hlsl_type_element_count(const struct hlsl_type *type);
+enum hlsl_regset hlsl_type_get_regset(struct hlsl_ctx *ctx, const struct hlsl_type *type);
 unsigned int hlsl_type_get_sm4_offset(const struct hlsl_type *type, unsigned int offset);
 bool hlsl_types_are_equal(const struct hlsl_type *t1, const struct hlsl_type *t2);
diff --git a/libs/vkd3d-shader/hlsl_codegen.c b/libs/vkd3d-shader/hlsl_codegen.c
index 9b644d1b..489f791d 100644
--- a/libs/vkd3d-shader/hlsl_codegen.c
+++ b/libs/vkd3d-shader/hlsl_codegen.c
@@ -24,7 +24,7 @@
 /* TODO: remove when no longer needed, only used for new_offset_instr_from_deref() */
 static struct hlsl_ir_node *new_offset_from_path_index(struct hlsl_ctx *ctx, struct hlsl_block *block,
         struct hlsl_type *type, struct hlsl_ir_node *offset, struct hlsl_ir_node *idx,
-        const struct vkd3d_shader_location *loc)
+        enum hlsl_regset rset, const struct vkd3d_shader_location *loc)
 {
     struct hlsl_ir_node *idx_offset = NULL;
     struct hlsl_ir_constant *c;
@@ -52,7 +52,7 @@ static struct hlsl_ir_node *new_offset_from_path_index(struct hlsl_ctx *ctx, str
         case HLSL_CLASS_ARRAY:
         {
-            unsigned int size = hlsl_type_get_array_element_reg_size(type->e.array.type);
+            unsigned int size = hlsl_type_get_array_element_reg_size(type->e.array.type, rset);

             if (!(c = hlsl_new_uint_constant(ctx, size, loc)))
                 return NULL;
@@ -70,7 +70,7 @@ static struct hlsl_ir_node *new_offset_from_path_index(struct hlsl_ctx *ctx, str
             unsigned int field_idx = hlsl_ir_constant(idx)->value[0].u;
             struct hlsl_struct_field *field = &type->e.record.fields[field_idx];

-            if (!(c = hlsl_new_uint_constant(ctx, field->reg_offset, loc)))
+            if (!(c = hlsl_new_uint_constant(ctx, field->reg_offset[rset], loc)))
                 return NULL;
             list_add_tail(&block->instrs, &c->node.entry);
@@ -110,7 +110,8 @@ static struct hlsl_ir_node *new_offset_instr_from_deref(struct hlsl_ctx *ctx, st
     {
         struct hlsl_block idx_block;

-        if (!(offset = new_offset_from_path_index(ctx, &idx_block, type, offset, deref->path[i].node, loc)))
+        if (!(offset = new_offset_from_path_index(ctx, &idx_block, type, offset, deref->path[i].node,
+                deref->offset_regset, loc)))
             return NULL;

         list_move_tail(&block->instrs, &idx_block.instrs);
@@ -134,6 +135,8 @@ static void replace_deref_path_with_offset(struct hlsl_ctx *ctx, struct hlsl_der
     /* register offsets shouldn't be used before this point is reached. */
     assert(!deref->offset.node);

+    deref->offset_regset = hlsl_type_get_regset(ctx, hlsl_deref_get_type(ctx, deref));
+
     if (!(offset = new_offset_instr_from_deref(ctx, &block, deref, &instr->loc)))
         return;
     list_move_before(&instr->entry, &block.instrs);
@@ -1982,32 +1985,33 @@ static struct hlsl_reg allocate_range(struct hlsl_ctx *ctx, struct liveness *liv
 static const char *debug_register(char class, struct hlsl_reg reg, const struct hlsl_type *type)
 {
     static const char writemask_offset[] = {'w','x','y','z'};
+    unsigned int reg_size = type->reg_size[HLSL_REGSET_NUM];
-    if (type->reg_size > 4)
+    if (reg_size > 4)
     {
-        if (type->reg_size & 3)
-            return vkd3d_dbg_sprintf("%c%u-%c%u.%c", class, reg.id, class,
-                    reg.id + (type->reg_size / 4), writemask_offset[type->reg_size & 3]);
+        if (reg_size & 3)
+            return vkd3d_dbg_sprintf("%c%u-%c%u.%c", class, reg.id, class, reg.id + (reg_size / 4),
+                    writemask_offset[reg_size & 3]);

-        return vkd3d_dbg_sprintf("%c%u-%c%u", class, reg.id, class,
-                reg.id + (type->reg_size / 4) - 1);
+        return vkd3d_dbg_sprintf("%c%u-%c%u", class, reg.id, class, reg.id + (reg_size / 4) - 1);
     }
     return vkd3d_dbg_sprintf("%c%u%s", class, reg.id, debug_hlsl_writemask(reg.writemask));
 }

 static void allocate_variable_temp_register(struct hlsl_ctx *ctx, struct hlsl_ir_var *var, struct liveness *liveness)
 {
+    unsigned int reg_size = var->data_type->reg_size[HLSL_REGSET_NUM];
+
     if (var->is_input_semantic || var->is_output_semantic || var->is_uniform)
         return;

     if (!var->reg.allocated && var->last_read)
     {
-        if (var->data_type->reg_size > 4)
-            var->reg = allocate_range(ctx, liveness, var->first_write,
-                    var->last_read, var->data_type->reg_size);
+        if (reg_size > 4)
+            var->reg = allocate_range(ctx, liveness, var->first_write, var->last_read, reg_size);
         else
-            var->reg = allocate_register(ctx, liveness, var->first_write,
-                    var->last_read, var->data_type->reg_size);
+            var->reg = allocate_register(ctx, liveness, var->first_write, var->last_read, reg_size);
+
         TRACE("Allocated %s to %s (liveness %u-%u).\n", var->name,
                 debug_register('r', var->reg, var->data_type), var->first_write, var->last_read);
     }
@@ -2021,12 +2025,12 @@ static void allocate_temp_registers_recurse(struct hlsl_ctx *ctx, struct hlsl_bl
     {
         if (!instr->reg.allocated && instr->last_read)
         {
-            if (instr->data_type->reg_size > 4)
-                instr->reg = allocate_range(ctx, liveness, instr->index,
-                        instr->last_read, instr->data_type->reg_size);
+            unsigned int reg_size = instr->data_type->reg_size[HLSL_REGSET_NUM];
+
+            if (reg_size > 4)
+                instr->reg = allocate_range(ctx, liveness, instr->index, instr->last_read, reg_size);
             else
-                instr->reg = allocate_register(ctx, liveness, instr->index,
-                        instr->last_read, instr->data_type->reg_size);
+                instr->reg = allocate_register(ctx, liveness, instr->index, instr->last_read, reg_size);
             TRACE("Allocated anonymous expression @%u to %s (liveness %u-%u).\n", instr->index,
                     debug_register('r', instr->reg, instr->data_type), instr->index, instr->last_read);
         }
@@ -2084,7 +2088,7 @@ static void allocate_const_registers_recurse(struct hlsl_ctx *ctx, struct hlsl_b
             struct hlsl_ir_constant *constant = hlsl_ir_constant(instr);
             const struct hlsl_type *type = instr->data_type;
             unsigned int x, y, i, writemask, end_reg;
-            unsigned int reg_size = type->reg_size;
+            unsigned int reg_size = type->reg_size[HLSL_REGSET_NUM];

             if (reg_size > 4)
                 constant->reg = allocate_range(ctx, liveness, 1, UINT_MAX, reg_size);
@@ -2183,15 +2187,15 @@ static void allocate_const_registers(struct hlsl_ctx *ctx, struct hlsl_ir_functi
     {
         if (var->is_uniform && var->last_read)
         {
-            if (var->data_type->reg_size == 0)
+            if (var->data_type->reg_size[HLSL_REGSET_NUM] == 0)
                 continue;

-            if (var->data_type->reg_size > 4)
-                var->reg = allocate_range(ctx, &liveness, 1, UINT_MAX, var->data_type->reg_size);
+            if (var->data_type->reg_size[HLSL_REGSET_NUM] > 4)
+                var->reg = allocate_range(ctx, &liveness, 1, UINT_MAX, var->data_type->reg_size[HLSL_REGSET_NUM]);
             else
             {
                 var->reg = allocate_register(ctx, &liveness, 1, UINT_MAX, 4);
-                var->reg.writemask = (1u << var->data_type->reg_size) - 1;
+                var->reg.writemask = (1u << var->data_type->reg_size[HLSL_REGSET_NUM]) - 1;
             }
             TRACE("Allocated %s to %s.\n", var->name, debug_register('c', var->reg, var->data_type));
         }
@@ -2308,7 +2312,7 @@ static void calculate_buffer_offset(struct hlsl_ir_var *var)

     var->buffer_offset = buffer->size;
     TRACE("Allocated buffer offset %u to %s.\n", var->buffer_offset, var->name);
-    buffer->size += var->data_type->reg_size;
+    buffer->size += var->data_type->reg_size[HLSL_REGSET_NUM];
     if (var->last_read)
         buffer->used_size = buffer->size;
 }
@@ -2563,6 +2567,7 @@ bool hlsl_component_index_range_from_deref(struct hlsl_ctx *ctx, const struct hl
 bool hlsl_offset_from_deref(struct hlsl_ctx *ctx, const struct hlsl_deref *deref, unsigned int *offset)
 {
     struct hlsl_ir_node *offset_node = deref->offset.node;
+    unsigned int size;
     if (!offset_node)
     {
@@ -2579,10 +2584,11 @@ bool hlsl_offset_from_deref(struct hlsl_ctx *ctx, const struct hlsl_deref *deref

     *offset = hlsl_ir_constant(offset_node)->value[0].u;

-    if (*offset >= deref->var->data_type->reg_size)
+    size = deref->var->data_type->reg_size[deref->offset_regset];
+    if (*offset >= size)
     {
         hlsl_error(ctx, &deref->offset.node->loc, VKD3D_SHADER_ERROR_HLSL_OFFSET_OUT_OF_BOUNDS,
-                "Dereference is out of bounds. %u/%u", *offset, deref->var->data_type->reg_size);
+                "Dereference is out of bounds. %u/%u", *offset, size);
         return false;
     }
@@ -2608,6 +2614,8 @@ struct hlsl_reg hlsl_reg_from_deref(struct hlsl_ctx *ctx, const struct hlsl_dere
     struct hlsl_reg ret = var->reg;
     unsigned int offset = hlsl_offset_from_deref_safe(ctx, deref);

+    assert(deref->offset_regset == HLSL_REGSET_NUM);
+
     ret.id += offset / 4;

     ret.writemask = 0xf & (0xf << (offset % 4));
diff --git a/libs/vkd3d-shader/hlsl_sm1.c b/libs/vkd3d-shader/hlsl_sm1.c
index ba22925e..6d45208f 100644
--- a/libs/vkd3d-shader/hlsl_sm1.c
+++ b/libs/vkd3d-shader/hlsl_sm1.c
@@ -366,7 +366,7 @@ static void write_sm1_uniforms(struct hlsl_ctx *ctx, struct vkd3d_bytecode_buffe
         else
         {
             put_u32(buffer, vkd3d_make_u32(D3DXRS_FLOAT4, var->reg.id));
-            put_u32(buffer, var->data_type->reg_size / 4);
+            put_u32(buffer, var->data_type->reg_size[HLSL_REGSET_NUM] / 4);
         }
         put_u32(buffer, 0); /* type */
         put_u32(buffer, 0); /* FIXME: default value */
diff --git a/libs/vkd3d-shader/hlsl_sm4.c b/libs/vkd3d-shader/hlsl_sm4.c
index 9232ea78..c394ba55 100644
--- a/libs/vkd3d-shader/hlsl_sm4.c
+++ b/libs/vkd3d-shader/hlsl_sm4.c
@@ -383,7 +383,7 @@ static void write_sm4_type(struct hlsl_ctx *ctx, struct vkd3d_bytecode_buffer *b
         put_u32(buffer, field->name_bytecode_offset);
         put_u32(buffer, field->type->bytecode_offset);
-        put_u32(buffer, field->reg_offset);
+        put_u32(buffer, field->reg_offset[HLSL_REGSET_NUM]);
     }
 }
@@ -699,7 +699,7 @@ static void write_sm4_rdef(struct hlsl_ctx *ctx, struct dxbc_writer *dxbc)
         put_u32(&buffer, 0); /* name */
         put_u32(&buffer, var->buffer_offset * sizeof(float));
-        put_u32(&buffer, var->data_type->reg_size * sizeof(float));
+        put_u32(&buffer, var->data_type->reg_size[HLSL_REGSET_NUM] * sizeof(float));
         put_u32(&buffer, flags);
         put_u32(&buffer, 0); /* type */
         put_u32(&buffer, 0); /* FIXME: default value */
From: Francisco Casas fcasas@codeweavers.com
---
 libs/vkd3d-shader/hlsl.h         | 16 +++-----
 libs/vkd3d-shader/hlsl_codegen.c | 55 +++++++++++++++-----------
 libs/vkd3d-shader/hlsl_sm1.c     | 22 +++++++----
 libs/vkd3d-shader/hlsl_sm4.c     | 67 ++++++++++++++++++++------------
 4 files changed, 96 insertions(+), 64 deletions(-)
diff --git a/libs/vkd3d-shader/hlsl.h b/libs/vkd3d-shader/hlsl.h
index 83f0076d..4608203b 100644
--- a/libs/vkd3d-shader/hlsl.h
+++ b/libs/vkd3d-shader/hlsl.h
@@ -371,19 +371,13 @@ struct hlsl_ir_var
     /* Offset where the variable's value is stored within its buffer in numeric register components.
      * This in case the variable is uniform. */
     unsigned int buffer_offset;
-    /* Register to which the variable is allocated during its lifetime.
-     * In case that the variable spans multiple registers, this is set to the start of the register
-     * range.
-     * The register type is inferred from the data type and the storage of the variable.
+    /* Register to which the variable is allocated during its lifetime, for each register set.
+     * In case that the variable spans multiple registers in one regset, this is set to the
+     * start of the register range.
      * Builtin semantics don't use the field.
      * In SM4, uniforms don't use the field because they are located using the buffer's hlsl_reg
-     * and the buffer_offset instead.
-     * If the variable is an input semantic copy, the register is 'v'.
-     * If the variable is an output semantic copy, the register is 'o'.
-     * Textures are stored on 's' registers in SM1, and 't' registers in SM4.
-     * Samplers are stored on 's' registers.
-     * UAVs are stored on 'u' registers. */
-    struct hlsl_reg reg;
+     * and the buffer_offset instead. */
+    struct hlsl_reg regs[HLSL_REGSET_LAST + 1];

     uint32_t is_input_semantic : 1;
     uint32_t is_output_semantic : 1;
diff --git a/libs/vkd3d-shader/hlsl_codegen.c b/libs/vkd3d-shader/hlsl_codegen.c
index 489f791d..a39ddd1b 100644
--- a/libs/vkd3d-shader/hlsl_codegen.c
+++ b/libs/vkd3d-shader/hlsl_codegen.c
@@ -2005,15 +2005,17 @@ static void allocate_variable_temp_register(struct hlsl_ctx *ctx, struct hlsl_ir
     if (var->is_input_semantic || var->is_output_semantic || var->is_uniform)
         return;

-    if (!var->reg.allocated && var->last_read)
+    if (!var->regs[HLSL_REGSET_NUM].allocated && var->last_read)
     {
         if (reg_size > 4)
-            var->reg = allocate_range(ctx, liveness, var->first_write, var->last_read, reg_size);
+            var->regs[HLSL_REGSET_NUM] = allocate_range(ctx, liveness, var->first_write,
+                    var->last_read, reg_size);
         else
-            var->reg = allocate_register(ctx, liveness, var->first_write, var->last_read, reg_size);
+            var->regs[HLSL_REGSET_NUM] = allocate_register(ctx, liveness, var->first_write,
+                    var->last_read, reg_size);

-        TRACE("Allocated %s to %s (liveness %u-%u).\n", var->name,
-                debug_register('r', var->reg, var->data_type), var->first_write, var->last_read);
+        TRACE("Allocated %s to %s (liveness %u-%u).\n", var->name, debug_register('r',
+                var->regs[HLSL_REGSET_NUM], var->data_type), var->first_write, var->last_read);
     }
 }
@@ -2187,17 +2189,21 @@ static void allocate_const_registers(struct hlsl_ctx *ctx, struct hlsl_ir_functi
     {
         if (var->is_uniform && var->last_read)
         {
-            if (var->data_type->reg_size[HLSL_REGSET_NUM] == 0)
+            unsigned int reg_size = var->data_type->reg_size[HLSL_REGSET_NUM];
+
+            if (reg_size == 0)
                 continue;

-            if (var->data_type->reg_size[HLSL_REGSET_NUM] > 4)
-                var->reg = allocate_range(ctx, &liveness, 1, UINT_MAX, var->data_type->reg_size[HLSL_REGSET_NUM]);
+            if (reg_size > 4)
+            {
+                var->regs[HLSL_REGSET_NUM] = allocate_range(ctx, &liveness, 1, UINT_MAX, reg_size);
+            }
             else
             {
-                var->reg = allocate_register(ctx, &liveness, 1, UINT_MAX, 4);
-                var->reg.writemask = (1u << var->data_type->reg_size[HLSL_REGSET_NUM]) - 1;
+                var->regs[HLSL_REGSET_NUM] = allocate_register(ctx, &liveness, 1, UINT_MAX, 4);
+                var->regs[HLSL_REGSET_NUM].writemask = (1u << reg_size) - 1;
             }
-            TRACE("Allocated %s to %s.\n", var->name, debug_register('c', var->reg, var->data_type));
+            TRACE("Allocated %s to %s.\n", var->name, debug_register('c', var->regs[HLSL_REGSET_NUM], var->data_type));
         }
     }
 }
@@ -2271,10 +2277,11 @@ static void allocate_semantic_register(struct hlsl_ctx *ctx, struct hlsl_ir_var
     }
     else
     {
-        var->reg.allocated = true;
-        var->reg.id = (*counter)++;
-        var->reg.writemask = (1 << var->data_type->dimx) - 1;
-        TRACE("Allocated %s to %s.\n", var->name, debug_register(output ? 'o' : 'v', var->reg, var->data_type));
+        var->regs[HLSL_REGSET_NUM].allocated = true;
+        var->regs[HLSL_REGSET_NUM].id = (*counter)++;
+        var->regs[HLSL_REGSET_NUM].writemask = (1 << var->data_type->dimx) - 1;
+        TRACE("Allocated %s to %s.\n", var->name, debug_register(output ? 'o' : 'v',
+                var->regs[HLSL_REGSET_NUM], var->data_type));
     }
 }
@@ -2437,10 +2444,14 @@ static void allocate_objects(struct hlsl_ctx *ctx, enum hlsl_base_type type)

     LIST_FOR_EACH_ENTRY(var, &ctx->extern_vars, struct hlsl_ir_var, extern_entry)
     {
+        enum hlsl_regset rset;
+
         if (!var->last_read || var->data_type->type != HLSL_CLASS_OBJECT
                 || var->data_type->base_type != type)
             continue;

+        rset = hlsl_type_get_regset(ctx, var->data_type);
+
         if (var->reg_reservation.type == type_info->reg_name)
         {
             const struct hlsl_ir_var *reserved_object = get_reserved_object(ctx, type_info->reg_name,
@@ -2462,8 +2473,8 @@ static void allocate_objects(struct hlsl_ctx *ctx, enum hlsl_base_type type)
                         type_info->reg_name, var->reg_reservation.index);
             }

-            var->reg.id = var->reg_reservation.index;
-            var->reg.allocated = true;
+            var->regs[rset].id = var->reg_reservation.index;
+            var->regs[rset].allocated = true;
             TRACE("Allocated reserved %s to %c%u.\n", var->name, type_info->reg_name, var->reg_reservation.index);
         }
         else if (!var->reg_reservation.type)
@@ -2471,8 +2482,8 @@ static void allocate_objects(struct hlsl_ctx *ctx, enum hlsl_base_type type)
             while (get_reserved_object(ctx, type_info->reg_name, index))
                 ++index;

-            var->reg.id = index;
-            var->reg.allocated = true;
+            var->regs[rset].id = index;
+            var->regs[rset].allocated = true;
             TRACE("Allocated object to %c%u.\n", type_info->reg_name, index);
             ++index;
         }
@@ -2611,7 +2622,7 @@ unsigned int hlsl_offset_from_deref_safe(struct hlsl_ctx *ctx, const struct hlsl
 struct hlsl_reg hlsl_reg_from_deref(struct hlsl_ctx *ctx, const struct hlsl_deref *deref)
 {
     const struct hlsl_ir_var *var = deref->var;
-    struct hlsl_reg ret = var->reg;
+    struct hlsl_reg ret = var->regs[HLSL_REGSET_NUM];
     unsigned int offset = hlsl_offset_from_deref_safe(ctx, deref);

     assert(deref->offset_regset == HLSL_REGSET_NUM);
@@ -2619,8 +2630,8 @@ struct hlsl_reg hlsl_reg_from_deref(struct hlsl_ctx *ctx, const struct hlsl_dere
     ret.id += offset / 4;

     ret.writemask = 0xf & (0xf << (offset % 4));
-    if (var->reg.writemask)
-        ret.writemask = hlsl_combine_writemasks(var->reg.writemask, ret.writemask);
+    if (var->regs[HLSL_REGSET_NUM].writemask)
+        ret.writemask = hlsl_combine_writemasks(var->regs[HLSL_REGSET_NUM].writemask, ret.writemask);
     return ret;
 }
diff --git a/libs/vkd3d-shader/hlsl_sm1.c b/libs/vkd3d-shader/hlsl_sm1.c
index 6d45208f..65af2ea5 100644
--- a/libs/vkd3d-shader/hlsl_sm1.c
+++ b/libs/vkd3d-shader/hlsl_sm1.c
@@ -315,7 +315,9 @@ static void write_sm1_uniforms(struct hlsl_ctx *ctx, struct vkd3d_bytecode_buffe

     LIST_FOR_EACH_ENTRY(var, &ctx->extern_vars, struct hlsl_ir_var, extern_entry)
     {
-        if (!var->semantic.name && var->reg.allocated)
+        enum hlsl_regset rset = hlsl_type_get_regset(ctx, var->data_type);
+
+        if (!var->semantic.name && var->regs[rset].allocated)
         {
             ++uniform_count;
@@ -353,20 +355,24 @@ static void write_sm1_uniforms(struct hlsl_ctx *ctx, struct vkd3d_bytecode_buffe

     LIST_FOR_EACH_ENTRY(var, &ctx->extern_vars, struct hlsl_ir_var, extern_entry)
     {
-        if (!var->semantic.name && var->reg.allocated)
+        enum hlsl_regset rset = hlsl_type_get_regset(ctx, var->data_type);
+
+        if (!var->semantic.name && var->regs[rset].allocated)
         {
             put_u32(buffer, 0); /* name */
             if (var->data_type->type == HLSL_CLASS_OBJECT
                     && (var->data_type->base_type == HLSL_TYPE_SAMPLER
                     || var->data_type->base_type == HLSL_TYPE_TEXTURE))
             {
-                put_u32(buffer, vkd3d_make_u32(D3DXRS_SAMPLER, var->reg.id));
+                assert(rset == HLSL_REGSET_S);
+                put_u32(buffer, vkd3d_make_u32(D3DXRS_SAMPLER, var->regs[rset].id));
                 put_u32(buffer, 1);
             }
             else
             {
-                put_u32(buffer, vkd3d_make_u32(D3DXRS_FLOAT4, var->reg.id));
-                put_u32(buffer, var->data_type->reg_size[HLSL_REGSET_NUM] / 4);
+                assert(rset == HLSL_REGSET_NUM);
+                put_u32(buffer, vkd3d_make_u32(D3DXRS_FLOAT4, var->regs[rset].id));
+                put_u32(buffer, var->data_type->reg_size[rset] / 4);
             }
             put_u32(buffer, 0); /* type */
             put_u32(buffer, 0); /* FIXME: default value */
@@ -377,7 +383,9 @@ static void write_sm1_uniforms(struct hlsl_ctx *ctx, struct vkd3d_bytecode_buffe

     LIST_FOR_EACH_ENTRY(var, &ctx->extern_vars, struct hlsl_ir_var, extern_entry)
     {
-        if (!var->semantic.name && var->reg.allocated)
+        enum hlsl_regset rset = hlsl_type_get_regset(ctx, var->data_type);
+
+        if (!var->semantic.name && var->regs[rset].allocated)
         {
             size_t var_offset = vars_start + (uniform_count * 5 * sizeof(uint32_t));
             size_t name_offset;
@@ -547,7 +555,7 @@ static void write_sm1_semantic_dcl(struct hlsl_ctx *ctx, struct vkd3d_bytecode_b
     ret = hlsl_sm1_usage_from_semantic(&var->semantic, &usage, &usage_idx);
     assert(ret);
     reg.type = output ? D3DSPR_OUTPUT : D3DSPR_INPUT;
-    reg.reg = var->reg.id;
+    reg.reg = var->regs[HLSL_REGSET_NUM].id;
 }
token = D3DSIO_DCL; diff --git a/libs/vkd3d-shader/hlsl_sm4.c b/libs/vkd3d-shader/hlsl_sm4.c index c394ba55..8498522f 100644 --- a/libs/vkd3d-shader/hlsl_sm4.c +++ b/libs/vkd3d-shader/hlsl_sm4.c @@ -172,9 +172,9 @@ static void write_sm4_signature(struct hlsl_ctx *ctx, struct dxbc_writer *dxbc, } else { - assert(var->reg.allocated); + assert(var->regs[HLSL_REGSET_NUM].allocated); type = VKD3D_SM4_RT_INPUT; - reg_idx = var->reg.id; + reg_idx = var->regs[HLSL_REGSET_NUM].id; }
use_mask = width; /* FIXME: accurately report use mask */ @@ -468,16 +468,27 @@ static D3D_SRV_DIMENSION sm4_rdef_resource_dimension(const struct hlsl_type *typ } }
-static int sm4_compare_externs(const struct hlsl_ir_var *a, const struct hlsl_ir_var *b) +static int sm4_compare_externs(struct hlsl_ctx *ctx, const struct hlsl_ir_var *a, const struct hlsl_ir_var *b) { - if (a->data_type->base_type != b->data_type->base_type) - return a->data_type->base_type - b->data_type->base_type; - if (a->reg.allocated && b->reg.allocated) - return a->reg.id - b->reg.id; + enum hlsl_regset a_rset = hlsl_type_get_regset(ctx, a->data_type); + enum hlsl_regset b_rset = hlsl_type_get_regset(ctx, b->data_type); + unsigned int a_id, b_id; + bool a_allocated, b_allocated; + + a_allocated = a->regs[a_rset].allocated; + a_id = a->regs[a_rset].id; + + b_allocated = b->regs[b_rset].allocated; + b_id = b->regs[b_rset].id; + + if (a_rset != b_rset) + return a_rset - b_rset; + if (a_allocated && b_allocated) + return a_id - b_id; return strcmp(a->name, b->name); }
-static void sm4_sort_extern(struct list *sorted, struct hlsl_ir_var *to_sort) +static void sm4_sort_extern(struct hlsl_ctx *ctx, struct list *sorted, struct hlsl_ir_var *to_sort) { struct hlsl_ir_var *var;
@@ -485,7 +496,7 @@ static void sm4_sort_extern(struct list *sorted, struct hlsl_ir_var *to_sort)
LIST_FOR_EACH_ENTRY(var, sorted, struct hlsl_ir_var, extern_entry) { - if (sm4_compare_externs(to_sort, var) < 0) + if (sm4_compare_externs(ctx, to_sort, var) < 0) { list_add_before(&var->extern_entry, &to_sort->extern_entry); return; @@ -503,7 +514,7 @@ static void sm4_sort_externs(struct hlsl_ctx *ctx) LIST_FOR_EACH_ENTRY_SAFE(var, next, &ctx->extern_vars, struct hlsl_ir_var, extern_entry) { if (var->data_type->type == HLSL_CLASS_OBJECT) - sm4_sort_extern(&sorted, var); + sm4_sort_extern(ctx, &sorted, var); } list_move_tail(&ctx->extern_vars, &sorted); } @@ -532,8 +543,11 @@ static void write_sm4_rdef(struct hlsl_ctx *ctx, struct dxbc_writer *dxbc)
LIST_FOR_EACH_ENTRY(var, &ctx->extern_vars, struct hlsl_ir_var, extern_entry) { - if (var->reg.allocated && var->data_type->type == HLSL_CLASS_OBJECT) - ++resource_count; + enum hlsl_regset rset = hlsl_type_get_regset(ctx, var->data_type); + + if (rset > HLSL_REGSET_LAST_OBJECT || !var->regs[rset].allocated) + continue; + ++resource_count; }
LIST_FOR_EACH_ENTRY(cbuffer, &ctx->buffers, struct hlsl_buffer, entry) @@ -573,9 +587,10 @@ static void write_sm4_rdef(struct hlsl_ctx *ctx, struct dxbc_writer *dxbc)
LIST_FOR_EACH_ENTRY(var, &ctx->extern_vars, struct hlsl_ir_var, extern_entry) { + enum hlsl_regset rset = hlsl_type_get_regset(ctx, var->data_type); uint32_t flags = 0;
- if (!var->reg.allocated || var->data_type->type != HLSL_CLASS_OBJECT) + if (rset > HLSL_REGSET_LAST_OBJECT || !var->regs[rset].allocated) continue;
if (var->reg_reservation.type) @@ -583,7 +598,7 @@ static void write_sm4_rdef(struct hlsl_ctx *ctx, struct dxbc_writer *dxbc)
put_u32(&buffer, 0); /* name */ put_u32(&buffer, sm4_resource_type(var->data_type)); - if (var->data_type->base_type == HLSL_TYPE_SAMPLER) + if (rset == HLSL_REGSET_S) { put_u32(&buffer, 0); put_u32(&buffer, 0); @@ -596,7 +611,7 @@ static void write_sm4_rdef(struct hlsl_ctx *ctx, struct dxbc_writer *dxbc) put_u32(&buffer, ~0u); /* FIXME: multisample count */ flags |= (var->data_type->e.resource_format->dimx - 1) << VKD3D_SM4_SIF_TEXTURE_COMPONENTS_SHIFT; } - put_u32(&buffer, var->reg.id); + put_u32(&buffer, var->regs[rset].id); put_u32(&buffer, 1); /* bind count */ put_u32(&buffer, flags); } @@ -625,7 +640,9 @@ static void write_sm4_rdef(struct hlsl_ctx *ctx, struct dxbc_writer *dxbc)
LIST_FOR_EACH_ENTRY(var, &ctx->extern_vars, struct hlsl_ir_var, extern_entry) { - if (!var->reg.allocated || var->data_type->type != HLSL_CLASS_OBJECT) + enum hlsl_regset rset = hlsl_type_get_regset(ctx, var->data_type); + + if (rset > HLSL_REGSET_LAST_OBJECT || !var->regs[rset].allocated) continue;
string_offset = put_string(&buffer, var->name); @@ -851,7 +868,7 @@ static void sm4_register_from_deref(struct hlsl_ctx *ctx, struct sm4_register *r reg->dim = VKD3D_SM4_DIMENSION_VEC4; if (swizzle_type) *swizzle_type = VKD3D_SM4_SWIZZLE_VEC4; - reg->idx[0] = var->reg.id; + reg->idx[0] = var->regs[HLSL_REGSET_T].id; reg->idx_count = 1; *writemask = VKD3DSP_WRITEMASK_ALL; } @@ -861,7 +878,7 @@ static void sm4_register_from_deref(struct hlsl_ctx *ctx, struct sm4_register *r reg->dim = VKD3D_SM4_DIMENSION_VEC4; if (swizzle_type) *swizzle_type = VKD3D_SM4_SWIZZLE_VEC4; - reg->idx[0] = var->reg.id; + reg->idx[0] = var->regs[HLSL_REGSET_U].id; reg->idx_count = 1; *writemask = VKD3DSP_WRITEMASK_ALL; } @@ -871,7 +888,7 @@ static void sm4_register_from_deref(struct hlsl_ctx *ctx, struct sm4_register *r reg->dim = VKD3D_SM4_DIMENSION_NONE; if (swizzle_type) *swizzle_type = VKD3D_SM4_SWIZZLE_NONE; - reg->idx[0] = var->reg.id; + reg->idx[0] = var->regs[HLSL_REGSET_S].id; reg->idx_count = 1; *writemask = VKD3DSP_WRITEMASK_ALL; } @@ -1141,7 +1158,7 @@ static void write_sm4_dcl_sampler(struct vkd3d_bytecode_buffer *buffer, const st .opcode = VKD3D_SM4_OP_DCL_SAMPLER,
.dsts[0].reg.type = VKD3D_SM4_RT_SAMPLER, - .dsts[0].reg.idx = {var->reg.id}, + .dsts[0].reg.idx = {var->regs[HLSL_REGSET_S].id}, .dsts[0].reg.idx_count = 1, .dst_count = 1, }; @@ -1157,7 +1174,7 @@ static void write_sm4_dcl_texture(struct vkd3d_bytecode_buffer *buffer, const st | (sm4_resource_dimension(var->data_type) << VKD3D_SM4_RESOURCE_TYPE_SHIFT),
.dsts[0].reg.type = uav ? VKD3D_SM5_RT_UAV : VKD3D_SM4_RT_RESOURCE, - .dsts[0].reg.idx = {var->reg.id}, + .dsts[0].reg.idx = {uav ? var->regs[HLSL_REGSET_U].id : var->regs[HLSL_REGSET_T].id}, .dsts[0].reg.idx_count = 1, .dst_count = 1,
@@ -1196,9 +1213,9 @@ static void write_sm4_dcl_semantic(struct hlsl_ctx *ctx, struct vkd3d_bytecode_b else { instr.dsts[0].reg.type = output ? VKD3D_SM4_RT_OUTPUT : VKD3D_SM4_RT_INPUT; - instr.dsts[0].reg.idx[0] = var->reg.id; + instr.dsts[0].reg.idx[0] = var->regs[HLSL_REGSET_NUM].id; instr.dsts[0].reg.idx_count = 1; - instr.dsts[0].writemask = var->reg.writemask; + instr.dsts[0].writemask = var->regs[HLSL_REGSET_NUM].writemask; }
if (instr.dsts[0].reg.type == VKD3D_SM4_RT_DEPTHOUT) @@ -2404,7 +2421,9 @@ static void write_sm4_shdr(struct hlsl_ctx *ctx,
LIST_FOR_EACH_ENTRY(var, &ctx->extern_vars, const struct hlsl_ir_var, extern_entry) { - if (!var->reg.allocated || var->data_type->type != HLSL_CLASS_OBJECT) + enum hlsl_regset rset = hlsl_type_get_regset(ctx, var->data_type); + + if (rset > HLSL_REGSET_LAST_OBJECT || !var->regs[rset].allocated) continue;
if (var->data_type->base_type == HLSL_TYPE_SAMPLER)
From: Francisco Casas fcasas@codeweavers.com
--- libs/vkd3d-shader/hlsl.h | 17 +++++ libs/vkd3d-shader/hlsl_codegen.c | 123 ++++++++++++++++--------------- 2 files changed, 79 insertions(+), 61 deletions(-)
diff --git a/libs/vkd3d-shader/hlsl.h b/libs/vkd3d-shader/hlsl.h index 4608203b..228c440a 100644 --- a/libs/vkd3d-shader/hlsl.h +++ b/libs/vkd3d-shader/hlsl.h @@ -765,6 +765,23 @@ struct hlsl_resource_load_params struct hlsl_ir_node *coords, *lod, *texel_offset; };
+static inline char hlsl_regset_name(enum hlsl_regset rset) +{ + switch (rset) + { + case HLSL_REGSET_S: + return 's'; + case HLSL_REGSET_T: + return 't'; + case HLSL_REGSET_U: + return 'u'; + case HLSL_REGSET_NUM: + vkd3d_unreachable(); + break; + } + return '?'; +} + static inline struct hlsl_ir_call *hlsl_ir_call(const struct hlsl_ir_node *node) { assert(node->type == HLSL_IR_CALL); diff --git a/libs/vkd3d-shader/hlsl_codegen.c b/libs/vkd3d-shader/hlsl_codegen.c index a39ddd1b..f5c15768 100644 --- a/libs/vkd3d-shader/hlsl_codegen.c +++ b/libs/vkd3d-shader/hlsl_codegen.c @@ -1734,6 +1734,39 @@ static void dump_function(struct rb_entry *entry, void *context) rb_for_each_entry(&func->overloads, dump_function_decl, ctx); }
+static void allocate_register_reservations(struct hlsl_ctx *ctx) +{ + struct hlsl_ir_var *var; + + LIST_FOR_EACH_ENTRY(var, &ctx->extern_vars, struct hlsl_ir_var, extern_entry) + { + enum hlsl_regset rset = hlsl_type_get_regset(ctx, var->data_type); + + if (rset == HLSL_REGSET_NUM) + continue; + + if (var->reg_reservation.type) + { + if (var->reg_reservation.type != hlsl_regset_name(rset)) + { + struct vkd3d_string_buffer *type_string; + + type_string = hlsl_type_to_string(ctx, var->data_type); + hlsl_error(ctx, &var->loc, VKD3D_SHADER_ERROR_HLSL_INVALID_RESERVATION, + "Object of type '%s' must be bound to register type '%c'.", + type_string->buffer, hlsl_regset_name(rset)); + hlsl_release_string_buffer(ctx, type_string); + } + else + { + var->regs[rset].allocated = true; + var->regs[rset].id = var->reg_reservation.index; + TRACE("Allocated reserved %s to %c%u.\n", var->name, var->reg_reservation.type, var->reg_reservation.index); + } + } + } +} + /* Compute the earliest and latest liveness for each variable. In the case that * a variable is accessed inside of a loop, we promote its liveness to extend * to at least the range of the entire loop. Note that we don't need to do this @@ -2387,50 +2420,30 @@ static void allocate_buffers(struct hlsl_ctx *ctx) } }
-static const struct hlsl_ir_var *get_reserved_object(struct hlsl_ctx *ctx, char type, uint32_t index) +static const struct hlsl_ir_var *get_allocated_object(struct hlsl_ctx *ctx, enum hlsl_regset rset, + uint32_t index) { const struct hlsl_ir_var *var;
LIST_FOR_EACH_ENTRY(var, &ctx->extern_vars, const struct hlsl_ir_var, extern_entry) { - if (var->last_read && var->reg_reservation.type == type && var->reg_reservation.index == index) + if (!var->regs[rset].allocated) + continue; + + if (index == var->regs[rset].id) return var; } return NULL; }
-static const struct object_type_info +static void allocate_objects(struct hlsl_ctx *ctx, enum hlsl_regset rset) { - enum hlsl_base_type type; - char reg_name; -} -object_types[] = -{ - { HLSL_TYPE_SAMPLER, 's' }, - { HLSL_TYPE_TEXTURE, 't' }, - { HLSL_TYPE_UAV, 'u' }, -}; - -static const struct object_type_info *get_object_type_info(enum hlsl_base_type type) -{ - unsigned int i; - - for (i = 0; i < ARRAY_SIZE(object_types); ++i) - if (type == object_types[i].type) - return &object_types[i]; - - WARN("No type info for object type %u.\n", type); - return NULL; -} - -static void allocate_objects(struct hlsl_ctx *ctx, enum hlsl_base_type type) -{ - const struct object_type_info *type_info = get_object_type_info(type); + char rset_name = hlsl_regset_name(rset); struct hlsl_ir_var *var; uint32_t min_index = 0; uint32_t index;
- if (type == HLSL_TYPE_UAV) + if (rset == HLSL_REGSET_U) { LIST_FOR_EACH_ENTRY(var, &ctx->extern_vars, struct hlsl_ir_var, extern_entry) { @@ -2444,59 +2457,46 @@ static void allocate_objects(struct hlsl_ctx *ctx, enum hlsl_base_type type)
LIST_FOR_EACH_ENTRY(var, &ctx->extern_vars, struct hlsl_ir_var, extern_entry) { - enum hlsl_regset rset; - - if (!var->last_read || var->data_type->type != HLSL_CLASS_OBJECT - || var->data_type->base_type != type) + if (!var->last_read || !var->data_type->reg_size[rset]) continue;
- rset = hlsl_type_get_regset(ctx, var->data_type); - - if (var->reg_reservation.type == type_info->reg_name) + if (var->regs[rset].allocated) { - const struct hlsl_ir_var *reserved_object = get_reserved_object(ctx, type_info->reg_name, - var->reg_reservation.index); + const struct hlsl_ir_var *reserved_object; + unsigned int index = var->regs[rset].id; + + reserved_object = get_allocated_object(ctx, rset, index);
- if (var->reg_reservation.index < min_index) + if (var->regs[rset].id < min_index) { + assert(rset == HLSL_REGSET_U); hlsl_error(ctx, &var->loc, VKD3D_SHADER_ERROR_HLSL_OVERLAPPING_RESERVATIONS, "UAV index (%u) must be higher than the maximum render target index (%u).", - var->reg_reservation.index, min_index - 1); + var->regs[rset].id, min_index - 1); } else if (reserved_object && reserved_object != var) { hlsl_error(ctx, &var->loc, VKD3D_SHADER_ERROR_HLSL_OVERLAPPING_RESERVATIONS, - "Multiple objects bound to %c%u.", type_info->reg_name, - var->reg_reservation.index); + "Multiple variables bound to %c%u.", rset_name, index); hlsl_note(ctx, &reserved_object->loc, VKD3D_SHADER_LOG_ERROR, - "Object '%s' is already bound to %c%u.", reserved_object->name, - type_info->reg_name, var->reg_reservation.index); + "Variable '%s' is already bound to %c%u.", reserved_object->name, + rset_name, index); }
var->regs[rset].id = var->reg_reservation.index; var->regs[rset].allocated = true; - TRACE("Allocated reserved %s to %c%u.\n", var->name, type_info->reg_name, var->reg_reservation.index); + TRACE("Allocated reserved %s to %c%u.\n", var->name, rset_name, var->regs[rset].id); } - else if (!var->reg_reservation.type) + else { - while (get_reserved_object(ctx, type_info->reg_name, index)) + while (get_allocated_object(ctx, rset, index)) ++index;
var->regs[rset].id = index; var->regs[rset].allocated = true; - TRACE("Allocated object to %c%u.\n", type_info->reg_name, index); + TRACE("Allocated object to %c%u.\n", rset_name, index); ++index; } - else - { - struct vkd3d_string_buffer *type_string; - - type_string = hlsl_type_to_string(ctx, var->data_type); - hlsl_error(ctx, &var->loc, VKD3D_SHADER_ERROR_HLSL_INVALID_RESERVATION, - "Object of type '%s' must be bound to register type '%c'.", - type_string->buffer, type_info->reg_name); - hlsl_release_string_buffer(ctx, type_string); - } } }
@@ -2791,6 +2791,7 @@ int hlsl_emit_bytecode(struct hlsl_ctx *ctx, struct hlsl_ir_function_decl *entry if (TRACE_ON()) rb_for_each_entry(&ctx->functions, dump_function, ctx);
+ allocate_register_reservations(ctx); allocate_temp_registers(ctx, entry_func); if (profile->major_version < 4) { @@ -2799,11 +2800,11 @@ int hlsl_emit_bytecode(struct hlsl_ctx *ctx, struct hlsl_ir_function_decl *entry else { allocate_buffers(ctx); - allocate_objects(ctx, HLSL_TYPE_TEXTURE); - allocate_objects(ctx, HLSL_TYPE_UAV); + allocate_objects(ctx, HLSL_REGSET_T); + allocate_objects(ctx, HLSL_REGSET_U); } allocate_semantic_registers(ctx); - allocate_objects(ctx, HLSL_TYPE_SAMPLER); + allocate_objects(ctx, HLSL_REGSET_S);
if (ctx->result) return ctx->result;
Zebediah Figura (@zfigura) commented about libs/vkd3d-shader/hlsl.h:
struct hlsl_src *path; /* Single instruction node of data type uint used to represent the register offset (in register
* components), from the start of the variable, of the part referenced.
* components, within the pertaining regset), from the start of the variable, of the part
struct hlsl_src offset;* referenced. * The path is lowered to this single offset -- whose value may vary between SM1 and SM4 -- * before writing the bytecode. */
- enum hlsl_regset offset_regset;
It seems suspicious that we are storing this, since it can be purely calculated from the hlsl_deref (and also since it's kind of backend-specific).
Zebediah Figura (@zfigura) commented about libs/vkd3d-shader/hlsl.y:
hlsl_fixme(ctx, loc, "Samples with gradients are not implemented.\n");
return false;
- }
- sampler_type = params->args[0]->data_type;
- if (sampler_type->type != HLSL_CLASS_OBJECT || sampler_type->base_type != HLSL_TYPE_SAMPLER
|| (sampler_type->sampler_dim != dim && sampler_type->sampler_dim != HLSL_SAMPLER_DIM_GENERIC))
- {
struct vkd3d_string_buffer *string;
if ((string = hlsl_type_to_string(ctx, sampler_type)))
hlsl_error(ctx, loc, VKD3D_SHADER_ERROR_HLSL_INVALID_TYPE,
"Wrong type for argument 1 of '%s': expected 'sampler' or '%s', but got '%s'.",
name, ctx->builtin_types.sampler[dim]->name, string->buffer);
hlsl_release_string_buffer(ctx, string);
return false;
We don't need to abort here.
Zebediah Figura (@zfigura) commented about libs/vkd3d-shader/hlsl.y:
- const struct hlsl_type *sampler_type;
- struct hlsl_ir_resource_load *load;
- struct hlsl_ir_load *sampler_load;
- struct hlsl_ir_node *coords;
- if (params->args_count != 2 && params->args_count != 4)
- {
hlsl_error(ctx, loc, VKD3D_SHADER_ERROR_HLSL_WRONG_PARAMETER_COUNT,
"Wrong number of arguments to function '%s': expected 2 or 4, but got %u.", name, params->args_count);
return false;
- }
- if (params->args_count == 4)
- {
hlsl_fixme(ctx, loc, "Samples with gradients are not implemented.\n");
return false;
We don't really need to abort here either.
Zebediah Figura (@zfigura) commented about libs/vkd3d-shader/hlsl.c:
return 1;
}
+enum hlsl_regset hlsl_type_get_regset(struct hlsl_ctx *ctx, const struct hlsl_type *type) +{
- unsigned int k;
- if (type->type == HLSL_CLASS_OBJECT)
- {
for (k = 0; k <= HLSL_REGSET_LAST_OBJECT; ++k)
{
if (type->reg_size[k] > 0)
return k;
}
vkd3d_unreachable();
This feels like the wrong way to go about this. Shouldn't we be deriving this information from the base type?
(Note also that type->reg_size should eventually go away...)
Zebediah Figura (@zfigura) commented about libs/vkd3d-shader/hlsl.h:
HLSL_ROW_MAJOR
};
+enum hlsl_regset {
- HLSL_REGSET_S,
- HLSL_REGSET_T,
- HLSL_REGSET_U,
I would still prefer actual words here. (Also, opening brace placement.)
Zebediah Figura (@zfigura) commented about libs/vkd3d-shader/hlsl.h:
HLSL_ROW_MAJOR
};
+enum hlsl_regset {
- HLSL_REGSET_S,
- HLSL_REGSET_T,
- HLSL_REGSET_U,
- HLSL_REGSET_LAST_OBJECT = HLSL_REGSET_U,
- HLSL_REGSET_NUM,
Probably spell out "NUMERIC"; this makes it look like it means "number of HLSL_REGSET values".
Zebediah Figura (@zfigura) commented about libs/vkd3d-shader/hlsl.h:
struct hlsl_type *hlsl_type_clone(struct hlsl_ctx *ctx, struct hlsl_type *old, unsigned int default_majority, unsigned int modifiers); unsigned int hlsl_type_component_count(const struct hlsl_type *type); -unsigned int hlsl_type_get_array_element_reg_size(const struct hlsl_type *type); +unsigned int hlsl_type_get_array_element_reg_size(const struct hlsl_type *type, enum hlsl_regset rset);
Let's write "regset", not "rset", please. No need to abbreviate something already that short.
Zebediah Figura (@zfigura) commented about libs/vkd3d-shader/hlsl.c:
}
+enum hlsl_regset hlsl_type_get_regset(struct hlsl_ctx *ctx, const struct hlsl_type *type) +{
- unsigned int k;
- if (type->type == HLSL_CLASS_OBJECT)
- {
for (k = 0; k <= HLSL_REGSET_LAST_OBJECT; ++k)
{
if (type->reg_size[k] > 0)
return k;
}
vkd3d_unreachable();
- }
- return HLSL_REGSET_NUM;
What about structs and arrays?
Zebediah Figura (@zfigura) commented about libs/vkd3d-shader/hlsl_codegen.c:
type_string = hlsl_type_to_string(ctx, var->data_type);
hlsl_error(ctx, &var->loc, VKD3D_SHADER_ERROR_HLSL_INVALID_RESERVATION,
"Object of type '%s' must be bound to register type '%c'.",
type_string->buffer, hlsl_regset_name(rset));
hlsl_release_string_buffer(ctx, type_string);
}
else
{
var->regs[rset].allocated = true;
var->regs[rset].id = var->reg_reservation.index;
TRACE("Allocated reserved %s to %c%u.\n", var->name, var->reg_reservation.type, var->reg_reservation.index);
}
}
- }
+}
Why separate this?
On Mon Jan 23 19:32:26 2023 +0000, Zebediah Figura wrote:
This feels like the wrong way to go about this. Shouldn't we be deriving this information from the base type? (Note also that type->reg_size should eventually go away...)
It is technically possible, but then we would be introducing some redundancy, repeating code between the calculation of reg_size and this function.
For instance, we would have to state twice that textures use either regset S or regset T, depending on whether we are in SM1 or SM4.
That last thing is also a reason why I am not too inclined to name the regsets HLSL_REGSET_SAMPLERS and HLSL_REGSET_TEXTURES instead of using a single letter (HLSL_REGSET_S and HLSL_REGSET_T) as they are now.
On Mon Jan 23 19:17:06 2023 +0000, Zebediah Figura wrote:
It seems suspicious that we are storing this, since it can be purely calculated from the hlsl_deref (and also since it's kind of backend-specific).
This field is set at the moment the path is lowered into a single offset. This is necessary because, without it, after that point the deref would no longer be able to retrieve the type (and thus the regset) from the offset alone. (At least not without a fairly complex mechanism.)
Granted, if we manage to get rid of offsets and use paths everywhere, this field could go away. But better to go one step at a time.
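To make the concern above concrete, here is a minimal sketch of why the regset has to be recorded when the path is lowered. The structure and function names are illustrative only, not vkd3d's actual code: once the path collapses into a plain numeric offset, the offset alone no longer says which register set it counts within.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sketch (not vkd3d's actual structures): a deref's path is
 * lowered into a single numeric offset, and the regset that the offset is
 * relative to must be stored alongside it. */
enum regset { REGSET_S, REGSET_T, REGSET_U, REGSET_NUMERIC };

struct deref
{
    uint32_t offset;            /* in register components, within offset_regset */
    enum regset offset_regset;  /* recorded at lowering time */
};

static void lower_path_to_offset(struct deref *deref, const uint32_t *path,
        unsigned int path_len, uint32_t element_size, enum regset regset)
{
    uint32_t offset = 0;
    unsigned int i;

    /* Toy lowering: treat every path step as an array index into elements
     * of a fixed register-component size. */
    for (i = 0; i < path_len; ++i)
        offset += path[i] * element_size;

    deref->offset = offset;
    deref->offset_regset = regset;
}
```

The point of the sketch is only that `offset_regset` is the single piece of type information that survives the lowering.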
On Mon Jan 23 19:32:27 2023 +0000, Zebediah Figura wrote:
I would still prefer actual words here. (Also, opening brace placement.)
I think using a single letter is better for the object regsets because, as I said in the other comment, textures use HLSL_REGSET_S or HLSL_REGSET_T depending on the SM.
On Mon Jan 23 19:32:27 2023 +0000, Zebediah Figura wrote:
Probably spell out "NUMERIC"; this makes it look like it means "number of HLSL_REGSET values".
Okay, this makes sense.
On Mon Jan 23 19:32:28 2023 +0000, Zebediah Figura wrote:
What about structs and arrays?
Until later patches, structs and arrays are indeed expected to retrieve HLSL_REGSET_NUM.
The exception, which is introduced in later patches, is for arrays (or multi-dimensional arrays) of objects, which have to have the same regset as their components.
Structs (or arrays of structs) that contain object components are only allowed in SM5, but SM5 also promotes each object component to a separate variable. So, in that case, later patches make a new uniform for each object (or array, or multi-dimensional array of objects), and the original variable is only in charge of handling the numeric components, and thus should retrieve HLSL_REGSET_NUM.
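The rule being described can be sketched as a small recursive helper. This is a hypothetical illustration of the behavior discussed (arrays inherit the regset of their element type, objects map to their own regset, and structs fall back to numeric); the type and enum names are invented here and are not vkd3d's:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: how a type's regset could be derived recursively,
 * per the discussion above. Names are illustrative, not vkd3d's. */
enum regset { REGSET_S, REGSET_T, REGSET_U, REGSET_NUMERIC };
enum type_class { CLASS_SCALAR, CLASS_OBJECT, CLASS_ARRAY, CLASS_STRUCT };

struct type
{
    enum type_class cls;
    enum regset object_regset;   /* meaningful when cls == CLASS_OBJECT */
    const struct type *element;  /* meaningful when cls == CLASS_ARRAY */
};

static enum regset type_get_regset(const struct type *type)
{
    switch (type->cls)
    {
        case CLASS_OBJECT:
            return type->object_regset;
        case CLASS_ARRAY:
            /* Arrays (including multi-dimensional ones) take the regset
             * of their innermost element type. */
            return type_get_regset(type->element);
        default:
            /* Numerics, and structs as a fallback. */
            return REGSET_NUMERIC;
    }
}
```

An array of textures thus resolves to the texture regset, while a struct resolves to the numeric fallback, matching the behavior described above.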
On Mon Jan 23 19:32:28 2023 +0000, Zebediah Figura wrote:
Why separate this?
In later patches there is a need to insert an intermediate step between allocating register reservations and performing the register allocation per se:
```c
allocate_register_reservations(ctx);
request_object_registers_for_allocation(ctx);
allocate_temp_registers(ctx, entry_func);
```
This `request_object_registers_for_allocation()` takes care of setting the count of reserved registers for object arrays. In particular, for sampler arrays, this count is given by the last sampler of the array that is actually used in the shader. So, all these register counts have to be calculated beforehand to keep `get_allocated_object()` simple.
Also, I think it is a nice way of taking a little bit of complexity away from `allocate_objects()`.
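The "last used element determines the count" rule mentioned above can be sketched in a few lines. This is a hypothetical helper, not code from the series; it only illustrates that the required bind count for a sampler array comes from the highest element actually read, not from the declared array size:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sketch: the number of registers an object array needs is
 * one past the index of the last element actually used in the shader. */
static uint32_t required_register_count(const bool *element_used, uint32_t array_size)
{
    uint32_t count = 0;
    uint32_t i;

    for (i = 0; i < array_size; ++i)
    {
        if (element_used[i])
            count = i + 1;
    }
    return count;
}
```

For a `sampler samp[4]` where only `samp[0]` and `samp[2]` are read, this yields 3, so registers s0 through s2 would be reserved even though `samp[1]` is unused.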
On Tue Jan 24 16:28:41 2023 +0000, Francisco Casas wrote:
It is technically possible, but then we would be introducing some redundancy, repeating code between the calculation of reg_size and this function. For instance, we would have to state twice that textures use either regset S or regset T, depending on whether we are in SM1 or SM4. That last thing is also a reason why I am not too inclined to name the regsets HLSL_REGSET_SAMPLERS and HLSL_REGSET_TEXTURES instead of using a single letter (HLSL_REGSET_S and HLSL_REGSET_T) as they are now.
If we're concerned about duplication we can deduplicate the other way, i.e. use hlsl_type_get_regset() in hlsl_type_calculate_reg_size().
I'm not convinced we want to be tying the register set to the letter used in disassembly, though. We could use HLSL_REGSET_RESOURCE regardless of shader version. We get rid of all the HLSL_TYPE_SAMPLER variables for sm1 anyway.
On Tue Jan 24 18:48:06 2023 +0000, Zebediah Figura wrote:
We don't need to abort here.
But if we don't abort here, there is the possibility that `params->args[0]` is not an hlsl_ir_load, which would fail the assertion within the following `hlsl_ir_load()` call.
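For context, this is the checked-downcast pattern at issue, reduced to a minimal sketch. The names here are illustrative stand-ins, not vkd3d's actual definitions: the cast helper asserts on the node type, so a caller that reaches it with a non-load argument (for example after reporting an error without returning) trips the assertion.

```c
#include <assert.h>

/* Illustrative sketch of the checked-downcast pattern: the helper asserts
 * on the node type, so callers must validate (or abort) before calling it. */
enum ir_node_type { IR_CONSTANT, IR_LOAD };

struct ir_node
{
    enum ir_node_type type;
};

struct ir_load
{
    struct ir_node node;  /* first member, so the downcast is a plain cast */
    /* ... source variable, path, etc. ... */
};

static struct ir_load *ir_load(struct ir_node *node)
{
    assert(node->type == IR_LOAD);
    return (struct ir_load *)node;
}
```

This is why the error path either has to abort or, as suggested below, avoid initializing the field that requires the cast.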
On Tue Jan 24 17:00:41 2023 +0000, Francisco Casas wrote:
Until later patches, structs and arrays are indeed expected to retrieve HLSL_REGSET_NUM. The exception, which is introduced in later patches, is for arrays (or multi-dimensional arrays) of objects, which have to have the same regset as their components. Structs (or arrays of structs) that contain object components are only allowed in SM5, but SM5 also promotes each object component to a separate variable, so, in that case, later patches make a new uniform for each object (or array, or multi-dimensional array of objects) and the original variable is only in charge of handling the numeric components, and thus should retrieve HLSL_REGSET_NUM.
Hmm, do we definitely want to split the variables? We don't have to split up the hlsl_ir_var just to write it as multiple variables in the RDEF section.
On the other hand, it lets a variable only have one regset. But if a variable can only have one regset, do we really need the bulk of this patch series?
On Tue Jan 24 17:54:14 2023 +0000, Francisco Casas wrote:
In later patches there is the need to insert an intermediate step between allocating register reservations and performing the register allocation perse:
```c
allocate_register_reservations(ctx);
request_object_registers_for_allocation(ctx);
allocate_temp_registers(ctx, entry_func);
```
This `request_object_registers_for_allocation()` takes care of setting the count of reserved registers for object arrays. In particular, for sampler arrays, this count is given by the last sampler of the array that is actually used in the shader. So, all these register counts have to be calculated previously to make `get_allocated_object()` simple. Also, I think it is a nice way of taking a little bit of complexity away from `allocate_objects()`.
I still don't understand why there's a need to refactor anything? (And, if we do, it doesn't belong in this patch). What about register allocation can't be done with the loop we have?
On Tue Jan 24 18:51:20 2023 +0000, Francisco Casas wrote:
But if we don't abort here, there is the possibility of `params->args[0]` not being an hlsl_ir_load and failing the assertion within the following `hlsl_ir_load()`.
Oh, annoying. I guess we could just not initialize load_params.resource, then.
On Tue Jan 24 18:51:54 2023 +0000, Zebediah Figura wrote:
Hmm, do we definitely want to split the variables? We don't have to split up the hlsl_ir_var just to write it as multiple variables in the RDEF section. On the other hand, it lets a variable only have one regset. But if a variable can only have one regset, do we really need the bulk of this patch series?
Sorry, I made an inaccuracy in my explanation:
In later patches, the promotion of each object component to a separate variable will only be done at the beginning of `hlsl_sm4_write()` for SM5, so before that, the variables are still tied together.
I probably got confused because my memory is bad and I was promoting components much sooner in previous versions of this patch series, but that didn't work (because component usage must be tracked at the variable level for the correct allocation of registers).
So the correct answer would be:
HLSL_REGSET_NUM is just expected to be retrieved as a fallback for structs, because the callers either only work with derefs to simple types (after struct copies have been lowered), or perform a recursive search through a variable's type and check for

```c
(hlsl_type_get_regset(ctx, type) <= HLSL_REGSET_LAST_OBJECT)
```

base cases.
Maybe I should include an `HLSL_REGSET_UNDETERMINED` enum value for structs?
On Tue Jan 24 18:56:26 2023 +0000, Zebediah Figura wrote:
I still don't understand why there's a need to refactor anything? (And, if we do, it doesn't belong in this patch). What about register allocation can't be done with the loop we have?
Well, I think that allocation must be done regset-wise instead of hlsl_base_type-wise, given that that is what defines the regsets. And once we replace the argument in `allocate_objects()` with a regset, and considering we have `reg_size[regset]` for each variable, it doesn't make much sense to have `get_object_type_info()` anymore.
The other refactorings may indeed be premature, but I think I remember why those are there:
Given the possibility of multiple-register variables (which are introduced in the patch just after this MR), the allocation strategy needs to be "smarter". Assigning indexes sequentially in a greedy way no longer produces the correct result; consider the following scenario:
- Variable A is reserved to t2 and uses 3 t registers.
- We need to allocate variable B, which uses 3 t registers.
- We need to allocate variable C, which uses 1 t register.
In that case we need to get the allocation like this:
```
C - A A A B B B
```
instead of like this:
```
- - A A A B B B C
```
So, some of the refactoring changes are useful for the new allocation strategy; for instance, `get_reserved_object()` was changed into `get_allocated_object()` in order not only to get reserved objects but also to query objects already allocated by the allocation strategy itself. Also, I probably moved `allocate_register_reservations()` to a previous step to make the strategy simpler.
TL;DR: Okay, I will try to make an intermediate patch, separating the switch from `hlsl_base_type` to regsets from the refactoring associated with the change in the allocation strategy.
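The strategy described above amounts to a first-fit allocator over a register file where reserved ranges are marked before anything else is placed. This is a hypothetical sketch of that idea, not code from the series; it assumes the register file is large enough for all requests:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_REGS 64

/* Hypothetical first-fit sketch of the strategy described above: reserved
 * ranges are marked first, then each remaining variable takes the lowest
 * gap large enough for its register count. Assumes requests fit in
 * MAX_REGS; a real implementation would bound-check and report errors. */
static bool range_is_free(const bool *used, uint32_t start, uint32_t count)
{
    uint32_t i;

    for (i = 0; i < count; ++i)
    {
        if (used[start + i])
            return false;
    }
    return true;
}

static uint32_t allocate_range(bool *used, uint32_t count)
{
    uint32_t index = 0;
    uint32_t i;

    while (!range_is_free(used, index, count))
        ++index;
    for (i = 0; i < count; ++i)
        used[index + i] = true;
    return index;
}
```

With A reserved at t2..t4, allocating B (3 registers) lands at t5 and C (1 register) fills the gap at t0, reproducing the `C - A A A B B B` layout from the example.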
On Tue Jan 24 20:13:57 2023 +0000, Francisco Casas wrote:
Sorry, I made an inaccuracy in my explanation: In later patches, the promotion of each object component to a separate variable will only be done at the beginning of `hlsl_sm4_write()` for SM5, so before that, the variables are still tied together. I probably got confused because my memory is bad and I was promoting components much sooner in previous versions of this patch series, but that didn't work (because component usage must be tracked at the variable level for the correct allocation of registers). So the correct answer would be: HLSL_REGSET_NUM is just expected to be retrieved as a fallback for structs, because the callers either only work with derefs to simple types (after struct copies have been lowered), or perform a recursive search through a variable's type and check for
(hlsl_type_get_regset(ctx, type) <= HLSL_REGSET_LAST_OBJECT)
base cases. Maybe I should include an `HLSL_REGSET_UNDETERMINED` enum value for structs?
It feels awkward to have a helper that doesn't always do a well-defined thing and relies on its callers never to care about the poorly-defined case.
I'd suggest a specific solution, but I've forgotten by now why we ever need an hlsl_ir_var to be allocated to multiple register sets at the same time.
On Tue Jan 24 21:12:04 2023 +0000, Francisco Casas wrote:
Well, I think that allocation must be done regset-wise instead of hlsl_base_type-wise, given that that is what defines the regsets. And once we replace the argument in `allocate_objects()` with a regset, and considering we have `reg_size[regset]` for each variable, it doesn't make much sense to have `get_object_type_info()` anymore. The other refactorings may indeed be premature, but I think I remember why those are there: Given the possibility of multiple-register variables (which are introduced in the patch just after this MR), the allocation strategy needs to be "smarter". Assigning indexes sequentially in a greedy way is no longer possible to get the correct result, consider the following scenario:
- Variable A is reserved to t2 and uses 3 t registers.
- We need to allocate variable B, which uses 3 t registers.
- We need to allocate variable C, which uses 1 t register.
In that case we need to get the allocation like this:
C - A A A B B B
instead of like this:
- - A A A B B B C
So, some of the refactoring changes are useful for the new allocation strategy; for instance, `get_reserved_object()` was changed into `get_allocated_object()` in order not only to get reserved objects but also to query objects already allocated by the allocation strategy itself. Also, I probably moved `allocate_register_reservations()` to a previous step to make the strategy simpler. TL;DR: Okay, I will try to make an intermediate patch, separating the switch from `hlsl_base_type` to regsets from the refactoring associated with the change in the allocation strategy.
Ah, that makes sense, thanks.