First, we have to distinguish between the "bind count" and the "allocation size" of variables.
The "allocation size" affects the starting register id for the resource to be allocated next, while the "bind count" is determined by the last field actually used. The former may be larger than the latter.
What we currently call `hlsl_reg.bind_count` is actually the "allocation size", so 2/4 renames it to `hlsl_reg.allocation_size`.
The proper "bind count" (now computed when needed in 3/4) is important because it is the value that should appear in the RDEF table, and because some resource allocation rules depend on it.
For instance, for this shader:
```
texture2D texs[3];
texture2D tex;

float4 main() : sv_target
{
    return texs[0].Load(int3(0, 0, 0)) + tex.Load(int3(0, 0, 0));
}
```
the variable "texs" should show a "bind count" of 1, even though its "allocation size" is 3:
```
// Resource Bindings:
//
// Name                                 Type  Format         Dim      HLSL Bind  Count
// ------------------------------ ---------- ------- ----------- -------------- ------
// texs                              texture  float4          2d             t0      1
// tex                               texture  float4          2d             t3      1
```
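The difference between the two quantities can be sketched in plain C. This is only an illustration under stated assumptions: `struct texture_var`, `get_bind_count()`, `get_allocation_size()` and `assign_register_ids()` are hypothetical names that do not exist in vkd3d-shader, and the sketch only models declared texture arrays, for which the whole array is allocated as soon as any element is used.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ELEMENTS 8

/* Hypothetical stand-in for a declared texture variable; not a vkd3d type. */
struct texture_var
{
    const char *name;
    unsigned int array_size;   /* Number of declared elements. */
    bool used[MAX_ELEMENTS];   /* Which elements the shader actually reads. */
};

/* Bind count: index of the last used element plus one, or 0 if none is used. */
static unsigned int get_bind_count(const struct texture_var *var)
{
    unsigned int i, count = 0;

    assert(var->array_size <= MAX_ELEMENTS);
    for (i = 0; i < var->array_size; ++i)
    {
        if (var->used[i])
            count = i + 1;
    }
    return count;
}

/* Allocation size of a declared texture: the whole array if anything is used. */
static unsigned int get_allocation_size(const struct texture_var *var)
{
    return get_bind_count(var) ? var->array_size : 0;
}

/* Consecutive register ids: each variable starts where the previous
 * allocation ended, regardless of the previous bind count. */
static void assign_register_ids(const struct texture_var *vars, unsigned int count, unsigned int *ids)
{
    unsigned int i, next_id = 0;

    for (i = 0; i < count; ++i)
    {
        ids[i] = next_id;
        next_id += get_allocation_size(&vars[i]);
    }
}
```

With `texs` declared with 3 elements and only element 0 used, followed by `tex`, the sketch gives `texs` a bind count of 1 and an allocation size of 3, and places `tex` at register 3, which matches the table above.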
In particular, as shown in the tests in 1/4, textures go in this order:
1. Textures created from SM1-style samples. Those whose "bind count" is larger than 1, in the order of the tex1D/tex2D/tex3D/texCube instructions that create them.
2. Textures created from SM1-style samples. Those whose "bind count" is equal to 1, in the order of the tex1D/tex2D/tex3D/texCube instructions that create them.
3. Regular textures in order of declaration.
Note that the difference between 1 and 2 is determined by the "bind count", not by the "allocation size". This order is enforced in 4/4.
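As a rough illustration of this ordering, here is a self-contained C sketch. It follows the "decreasing bind count" wording used in the test comments of 1/4, which refines rules 1 and 2 above. `struct texture_res`, `goes_before()` and `sort_textures()` are hypothetical names, not vkd3d-shader code, and the input array is assumed to already list sampler-created textures in the order of the instructions that create them and regular textures in declaration order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of a texture resource; not a vkd3d type. */
struct texture_res
{
    const char *name;
    bool from_sm1_sample;    /* Created by a tex1D/tex2D/tex3D/texCube call. */
    unsigned int bind_count;
};

/* Returns true if "a" must be allocated strictly before "b". */
static bool goes_before(const struct texture_res *a, const struct texture_res *b)
{
    /* Textures created from SM1-style samples precede regular textures. */
    if (a->from_sm1_sample != b->from_sm1_sample)
        return a->from_sm1_sample;
    /* Among sampler-created textures, a larger bind count goes first. */
    if (a->from_sm1_sample)
        return a->bind_count > b->bind_count;
    /* Regular textures keep their declaration order. */
    return false;
}

/* Stable insertion sort: entries that tie keep their incoming order. */
static void sort_textures(struct texture_res *res, size_t count)
{
    size_t i, j;

    assert(res || !count);
    for (i = 1; i < count; ++i)
    {
        struct texture_res tmp = res[i];

        for (j = i; j > 0 && goes_before(&tmp, &res[j - 1]); --j)
            res[j] = res[j - 1];
        res[j] = tmp;
    }
}
```

A stable sort is used so that ties fall back to the incoming order, which is how the tie-breaking by order of appearance is modelled here.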
--
v2: vkd3d-shader/d3dbc: Use the bind count instead of the allocation size in d3dbc.c.
    vkd3d-shader/hlsl: Simplify computation of allocation size.
    vkd3d-shader/hlsl: Sort synthetic separated samplers first for SM4.
    vkd3d-shader/tpf: Put the actual bind count in the RDEF table.
    vkd3d-shader/hlsl: Rename hlsl_reg.bind_count to hlsl_reg.allocation_size.
    tests: Test texture allocation ordering in complex scenarios.
From: Francisco Casas fcasas@codeweavers.com
---
 Makefile.am                             |   1 +
 tests/hlsl/texture-ordering.shader_test | 303 ++++++++++++++++++++++++
 tests/shader_runner_d3d12.c             |   4 +-
 3 files changed, 306 insertions(+), 2 deletions(-)
 create mode 100644 tests/hlsl/texture-ordering.shader_test
diff --git a/Makefile.am b/Makefile.am index ecb7c7e2..a4f46b85 100644 --- a/Makefile.am +++ b/Makefile.am @@ -168,6 +168,7 @@ vkd3d_shader_tests = \ tests/hlsl/texture-load-offset.shader_test \ tests/hlsl/texture-load-typed.shader_test \ tests/hlsl/texture-load.shader_test \ + tests/hlsl/texture-ordering.shader_test \ tests/hlsl/transpose.shader_test \ tests/hlsl/trigonometry.shader_test \ tests/hlsl/trunc.shader_test \ diff --git a/tests/hlsl/texture-ordering.shader_test b/tests/hlsl/texture-ordering.shader_test new file mode 100644 index 00000000..b28fea5d --- /dev/null +++ b/tests/hlsl/texture-ordering.shader_test @@ -0,0 +1,303 @@ +[require] +shader model >= 4.0 + +[sampler 0] +filter linear linear linear +address clamp clamp clamp + +[sampler 1] +filter linear linear linear +address clamp clamp clamp + +[sampler 2] +filter linear linear linear +address clamp clamp clamp + +[sampler 3] +filter linear linear linear +address clamp clamp clamp + +[sampler 4] +filter linear linear linear +address clamp clamp clamp + +[sampler 5] +filter linear linear linear +address clamp clamp clamp + +[sampler 6] +filter linear linear linear +address clamp clamp clamp + +[texture 0] +size (1, 1) +0.0 0.0 0.0 1.0 + +[texture 1] +size (1, 1) +1.0 1.0 1.0 1.0 + +[texture 2] +size (1, 1) +2.0 2.0 2.0 1.0 + +[texture 3] +size (1, 1) +3.0 3.0 3.0 1.0 + +[texture 4] +size (1, 1) +4.0 4.0 4.0 1.0 + +[texture 5] +size (1, 1) +5.0 5.0 5.0 1.0 + +[texture 6] +size (1, 1) +6.0 6.0 6.0 1.0 + +[texture 7] +size (1, 1) +7.0 7.0 7.0 1.0 + +[texture 8] +size (1, 1) +8.0 8.0 8.0 1.0 + +[texture 9] +size (1, 1) +9.0 9.0 9.0 1.0 + + +% Regarding resource allocation ordering in SM4, textures go in this order: +% 1. Textures created from SM1-style samples, in decreasing "bind count". +% In case there is a tie in the "bind count", the order is given by the order of appearance of +% the tex1D/tex2D/tex3D/texCube calls that create them. +% 2. Regular textures in order of declaration. 
+% +% Note that the "bind count" should not be confused with the "allocation size". +% +% The "bind count" appears in the RDEF table ("Count" row), and is determined by the last field +% actually used. +% The "allocation size" for textures affects the starting register id for the next resource in the +% table, and may be larger than the "bind count". + +[pixel shader] +// Name Type Format Dim HLSL Bind Count +// ------------------------------ ---------- ------- ----------- -------------- ------ +// sam_arr_10 sampler NA NA s0 1 +// sam_arr_01 sampler NA NA s1 2 +// sam_arr_11 sampler NA NA s3 2 +// samA sampler NA NA s5 1 +// samB sampler NA NA s6 1 +// sam_arr_11 texture float4 2d t0 2 +// sam_arr_01 texture float4 2d t2 2 +// samB texture float4 2d t4 1 +// sam_arr_10 texture float4 2d t5 1 +// samA texture float4 2d t6 1 +// tex texture float4 2d t7 1 +// texs texture float4 2d t8 2 + +Texture2D tex; +Texture2D texs[2]; +sampler samA; +sampler samB; +sampler sam_arr_10[2]; +sampler sam_arr_01[2]; +sampler sam_arr_11[2]; + +float4 main() : sv_target +{ + float4 f = 0, g = 0, h = 0, res; + + f += 100 * tex2D(samB, float2(0, 0)); + f += 10 * tex2D(sam_arr_10[0], float2(0, 0)); + f += 1 * tex2D(sam_arr_11[0], float2(0, 0)); + g += 100 * tex2D(sam_arr_11[1], float2(0, 0)); + g += 10 * tex2D(sam_arr_01[1], float2(0, 0)); + g += 1 * texs[1].Load(int3(0, 0, 0)); + h += 100 * texs[0].Load(int3(0, 0, 0)); + h += 10 * tex.Load(int3(0, 0, 0)); + h += 1 * tex2D(samA, float2(0, 0)); + + res.x = f.x; + res.y = g.x; + res.z = h.x; + res.w = f.w + g.w + h.w; + + return res; +} + +[test] +draw quad +todo probe all rgba (450, 139, 876, 333) + + +% Same as the first test, but inverting the declaration order. +% Regarding textures, only the allocation of those that are not created from samplers is affected. 
+[pixel shader] +// Name Type Format Dim HLSL Bind Count +// ------------------------------ ---------- ------- ----------- -------------- ------ +// sam_arr_11 sampler NA NA s0 2 +// sam_arr_01 sampler NA NA s2 2 +// sam_arr_10 sampler NA NA s4 1 +// samB sampler NA NA s5 1 +// samA sampler NA NA s6 1 +// sam_arr_11 texture float4 2d t0 2 +// sam_arr_01 texture float4 2d t2 2 +// samB texture float4 2d t4 1 +// sam_arr_10 texture float4 2d t5 1 +// samA texture float4 2d t6 1 +// texs texture float4 2d t7 2 +// tex texture float4 2d t9 1 + +sampler sam_arr_11[2]; +sampler sam_arr_01[2]; +sampler sam_arr_10[2]; +sampler samB; +sampler samA; +Texture2D texs[2]; +Texture2D tex; + +float4 main() : sv_target +{ + float4 f = 0, g = 0, h = 0, res; + + f += 100 * tex2D(samB, float2(0, 0)); + f += 10 * tex2D(sam_arr_10[0], float2(0, 0)); + f += 1 * tex2D(sam_arr_11[0], float2(0, 0)); + g += 100 * tex2D(sam_arr_11[1], float2(0, 0)); + g += 10 * tex2D(sam_arr_01[1], float2(0, 0)); + g += 1 * texs[1].Load(int3(0, 0, 0)); + h += 100 * texs[0].Load(int3(0, 0, 0)); + h += 10 * tex.Load(int3(0, 0, 0)); + h += 1 * tex2D(samA, float2(0, 0)); + + res.x = f.x; + res.y = g.x; + res.z = h.x; + res.w = f.w + g.w + h.w; + + return res; +} + +[test] +draw quad +todo probe all rgba (450, 138, 796, 333) + + +% Same as the first test, but inverting the resource loads order. +% Regarding textures, only the allocation of those that are created from samplers is affected. 
+[pixel shader] +// Name Type Format Dim HLSL Bind Count +// ------------------------------ ---------- ------- ----------- -------------- ------ +// sam_arr_10 sampler NA NA s0 1 +// sam_arr_01 sampler NA NA s1 2 +// sam_arr_11 sampler NA NA s3 2 +// samA sampler NA NA s5 1 +// samB sampler NA NA s6 1 +// sam_arr_01 texture float4 2d t0 2 +// sam_arr_11 texture float4 2d t2 2 +// samA texture float4 2d t4 1 +// sam_arr_10 texture float4 2d t5 1 +// samB texture float4 2d t6 1 +// tex texture float4 2d t7 1 +// texs texture float4 2d t8 2 + +Texture2D tex; +Texture2D texs[2]; +sampler samA; +sampler samB; +sampler sam_arr_10[2]; +sampler sam_arr_01[2]; +sampler sam_arr_11[2]; + +float4 main() : sv_target +{ + float4 f = 0, g = 0, h = 0, res; + + f += 100 * tex2D(samA, float2(0, 0)); + f += 10 * tex.Load(int3(0, 0, 0)); + f += 1 * texs[0].Load(int3(0, 0, 0)); + g += 100 * texs[1].Load(int3(0, 0, 0)); + g += 10 * tex2D(sam_arr_01[1], float2(0, 0)); + g += 1 * tex2D(sam_arr_11[1], float2(0, 0)); + h += 100 * tex2D(sam_arr_11[0], float2(0, 0)); + h += 10 * tex2D(sam_arr_10[0], float2(0, 0)); + h += 1 * tex2D(samB, float2(0, 0)); + + res.x = f.x; + res.y = g.x; + res.z = h.x; + res.w = f.w + g.w + h.w; + + return res; +} + +[test] +draw quad +todo probe all rgba (478, 913, 256, 333) + + +% We can conclude that for declared texture arrays, if they are used, the "allocation size" is the +% whole array. +% On the other hand, for textures generated from samplers. the "allocation size" is the "bind count". 
+[pixel shader] +// Name Type Format Dim HLSL Bind Count +// ------------------------------ ---------- ------- ----------- -------------- ------ +// sam_arr sampler NA NA s0 2 +// sam sampler NA NA s2 1 +// sam_arr texture float4 2d t0 2 +// texs texture float4 2d t2 1 +// tex texture float4 2d t5 1 + +sampler sam; + +Texture2D texs[3]; +sampler sam_arr[3]; +Texture2D tex; + +float4 main() : sv_target +{ + float4 res = 0; + + res += 100 * texs[0].Sample(sam, float2(0, 0)); + res += 10 * tex2D(sam_arr[1], float2(0, 0)); + res += 1 * tex.Sample(sam, float2(0, 0)); + return res; +} + +[test] +draw quad +todo probe all rgba (215, 215, 215, 111) + + +% Test that textures created from SM1-style samples allocation order is in decreasing "bind count". +[pixel shader] +// Name Type Format Dim HLSL Bind Count +// ------------------------------ ---------- ------- ----------- -------------- ------ +// tex_100 sampler NA NA s0 1 +// tex_010 sampler NA NA s1 2 +// tex_001 sampler NA NA s3 3 +// tex_001 texture float4 2d t0 3 +// tex_010 texture float4 2d t3 2 +// tex_100 texture float4 2d t5 1 +sampler tex_100[3]; +sampler tex_010[3]; +sampler tex_001[3]; + +float4 main() : sv_target +{ + float4 res; + + res.x = tex2D(tex_100[0], float2(0, 0)).x; + res.y = tex2D(tex_010[1], float2(0, 0)).x; + res.z = tex2D(tex_001[2], float2(0, 0)).x; + res.w = 0; + return res; +} + +[test] +draw quad +todo probe all rgba (5, 4, 2, 0) diff --git a/tests/shader_runner_d3d12.c b/tests/shader_runner_d3d12.c index daeb11e8..bdd47087 100644 --- a/tests/shader_runner_d3d12.c +++ b/tests/shader_runner_d3d12.c @@ -206,8 +206,8 @@ static ID3D12RootSignature *d3d12_runner_create_root_signature(struct d3d12_shad ID3D12GraphicsCommandList *command_list, unsigned int *uniform_index) { D3D12_ROOT_SIGNATURE_DESC root_signature_desc = {0}; - D3D12_ROOT_PARAMETER root_params[8], *root_param; - D3D12_STATIC_SAMPLER_DESC static_samplers[5]; + D3D12_ROOT_PARAMETER root_params[17], *root_param; + 
D3D12_STATIC_SAMPLER_DESC static_samplers[7]; ID3D12RootSignature *root_signature; HRESULT hr; size_t i;
From: Francisco Casas fcasas@codeweavers.com
We have to distinguish between the "bind count" and the "allocation size" of variables.
The "allocation size" affects the starting register id for the resource to be allocated next, while the "bind count" is determined by the last field actually used. The former may be larger than the latter.
What we are currently calling hlsl_reg.bind_count is actually the "allocation size", so a rename is in order.
The real "bind count", which will be introduced in following patches, is important because it is what should be shown in the RDEF table and some resource allocation rules depend on it.
For instance, for this shader:
    texture2D texs[3];
    texture2D tex;

    float4 main() : sv_target
    {
        return texs[0].Load(int3(0, 0, 0)) + tex.Load(int3(0, 0, 0));
    }
the variable "texs" has a "bind count" of 1, but an "allocation size" of 3:
    // Resource Bindings:
    //
    // Name                                 Type  Format         Dim      HLSL Bind  Count
    // ------------------------------ ---------- ------- ----------- -------------- ------
    // texs                              texture  float4          2d             t0      1
    // tex                               texture  float4          2d             t3      1
---
 libs/vkd3d-shader/d3dbc.c        |  4 ++--
 libs/vkd3d-shader/hlsl.h         |  2 +-
 libs/vkd3d-shader/hlsl_codegen.c | 20 ++++++++++----------
 libs/vkd3d-shader/tpf.c          |  4 ++--
 4 files changed, 15 insertions(+), 15 deletions(-)
diff --git a/libs/vkd3d-shader/d3dbc.c b/libs/vkd3d-shader/d3dbc.c index fe739339..fdb71805 100644 --- a/libs/vkd3d-shader/d3dbc.c +++ b/libs/vkd3d-shader/d3dbc.c @@ -1680,7 +1680,7 @@ static void write_sm1_uniforms(struct hlsl_ctx *ctx, struct vkd3d_bytecode_buffe else { put_u32(buffer, vkd3d_make_u32(D3DXRS_SAMPLER, var->regs[r].id)); - put_u32(buffer, var->regs[r].bind_count); + put_u32(buffer, var->regs[r].allocation_size); } put_u32(buffer, 0); /* type */ put_u32(buffer, 0); /* FIXME: default value */ @@ -2027,7 +2027,7 @@ static void write_sm1_sampler_dcls(struct hlsl_ctx *ctx, struct vkd3d_bytecode_b if (!var->regs[HLSL_REGSET_SAMPLERS].allocated) continue;
- count = var->regs[HLSL_REGSET_SAMPLERS].bind_count; + count = var->regs[HLSL_REGSET_SAMPLERS].allocation_size;
for (i = 0; i < count; ++i) { diff --git a/libs/vkd3d-shader/hlsl.h b/libs/vkd3d-shader/hlsl.h index b1427c1d..fb4bdfa6 100644 --- a/libs/vkd3d-shader/hlsl.h +++ b/libs/vkd3d-shader/hlsl.h @@ -257,7 +257,7 @@ struct hlsl_reg /* Number of registers to be allocated. * Unlike the variable's type's regsize, it is not expressed in register components, but rather * in whole registers, and may depend on which components are used within the shader. */ - uint32_t bind_count; + uint32_t allocation_size; /* For numeric registers, a writemask can be provided to indicate the reservation of only some * of the 4 components. */ unsigned int writemask; diff --git a/libs/vkd3d-shader/hlsl_codegen.c b/libs/vkd3d-shader/hlsl_codegen.c index 4f5a5b02..11ebe275 100644 --- a/libs/vkd3d-shader/hlsl_codegen.c +++ b/libs/vkd3d-shader/hlsl_codegen.c @@ -2868,7 +2868,7 @@ static void allocate_register_reservations(struct hlsl_ctx *ctx) continue; regset = hlsl_type_get_regset(var->data_type);
- if (var->reg_reservation.reg_type && var->regs[regset].bind_count) + if (var->reg_reservation.reg_type && var->regs[regset].allocation_size) { if (var->reg_reservation.reg_type != get_regset_name(regset)) { @@ -2886,7 +2886,7 @@ static void allocate_register_reservations(struct hlsl_ctx *ctx) var->regs[regset].id = var->reg_reservation.reg_index; TRACE("Allocated reserved %s to %c%u-%c%u.\n", var->name, var->reg_reservation.reg_type, var->reg_reservation.reg_index, var->reg_reservation.reg_type, - var->reg_reservation.reg_index + var->regs[regset].bind_count); + var->reg_reservation.reg_index + var->regs[regset].allocation_size); } } } @@ -3144,7 +3144,7 @@ static struct hlsl_reg allocate_register(struct hlsl_ctx *ctx, struct register_a record_allocation(ctx, allocator, reg_idx, writemask, first_write, last_read);
ret.id = reg_idx; - ret.bind_count = 1; + ret.allocation_size = 1; ret.writemask = hlsl_combine_writemasks(writemask, (1u << component_count) - 1); ret.allocated = true; return ret; @@ -3180,7 +3180,7 @@ static struct hlsl_reg allocate_range(struct hlsl_ctx *ctx, struct register_allo record_allocation(ctx, allocator, reg_idx + i, VKD3DSP_WRITEMASK_ALL, first_write, last_read);
ret.id = reg_idx; - ret.bind_count = align(reg_size, 4) / 4; + ret.allocation_size = align(reg_size, 4) / 4; ret.allocated = true; return ret; } @@ -3306,7 +3306,7 @@ static void calculate_resource_register_counts(struct hlsl_ctx *ctx) /* Samplers (and textures separated from them) are only allocated until the last * used one. */ if (var->objects_usage[k][i].used) - var->regs[k].bind_count = (k == HLSL_REGSET_SAMPLERS || is_separated) ? i + 1 : type->reg_size[k]; + var->regs[k].allocation_size = (k == HLSL_REGSET_SAMPLERS || is_separated) ? i + 1 : type->reg_size[k]; } } } @@ -3613,7 +3613,7 @@ static void allocate_semantic_register(struct hlsl_ctx *ctx, struct hlsl_ir_var { var->regs[HLSL_REGSET_NUMERIC].allocated = true; var->regs[HLSL_REGSET_NUMERIC].id = (*counter)++; - var->regs[HLSL_REGSET_NUMERIC].bind_count = 1; + var->regs[HLSL_REGSET_NUMERIC].allocation_size = 1; var->regs[HLSL_REGSET_NUMERIC].writemask = (1 << var->data_type->dimx) - 1; TRACE("Allocated %s to %s.\n", var->name, debug_register(output ? 'o' : 'v', var->regs[HLSL_REGSET_NUMERIC], var->data_type)); @@ -3792,7 +3792,7 @@ static void allocate_buffers(struct hlsl_ctx *ctx) }
buffer->reg.id = buffer->reservation.reg_index; - buffer->reg.bind_count = 1; + buffer->reg.allocation_size = 1; buffer->reg.allocated = true; TRACE("Allocated reserved %s to cb%u.\n", buffer->name, index); } @@ -3802,7 +3802,7 @@ static void allocate_buffers(struct hlsl_ctx *ctx) ++index;
buffer->reg.id = index; - buffer->reg.bind_count = 1; + buffer->reg.allocation_size = 1; buffer->reg.allocated = true; TRACE("Allocated %s to cb%u.\n", buffer->name, index); ++index; @@ -3842,7 +3842,7 @@ static const struct hlsl_ir_var *get_allocated_object(struct hlsl_ctx *ctx, enum else if (var->regs[regset].allocated) { start = var->regs[regset].id; - count = var->regs[regset].bind_count; + count = var->regs[regset].allocation_size; } else { @@ -3873,7 +3873,7 @@ static void allocate_objects(struct hlsl_ctx *ctx, enum hlsl_regset regset)
LIST_FOR_EACH_ENTRY(var, &ctx->extern_vars, struct hlsl_ir_var, extern_entry) { - unsigned int count = var->regs[regset].bind_count; + unsigned int count = var->regs[regset].allocation_size;
if (count == 0) continue; diff --git a/libs/vkd3d-shader/tpf.c b/libs/vkd3d-shader/tpf.c index 351943e2..b1027428 100644 --- a/libs/vkd3d-shader/tpf.c +++ b/libs/vkd3d-shader/tpf.c @@ -3061,7 +3061,7 @@ static struct extern_resource *sm4_get_extern_resources(struct hlsl_ctx *ctx, un regset = hlsl_type_get_regset(component_type); regset_offset = hlsl_type_get_component_offset(ctx, var->data_type, regset, k);
- if (regset_offset > var->regs[regset].bind_count) + if (regset_offset > var->regs[regset].allocation_size) continue;
if (var->objects_usage[regset][regset_offset].used) @@ -3134,7 +3134,7 @@ static struct extern_resource *sm4_get_extern_resources(struct hlsl_ctx *ctx, un
extern_resources[*count].regset = regset; extern_resources[*count].id = var->regs[regset].id; - extern_resources[*count].bind_count = var->regs[regset].bind_count; + extern_resources[*count].bind_count = var->regs[regset].allocation_size;
++*count; }
From: Francisco Casas fcasas@codeweavers.com
---
 libs/vkd3d-shader/hlsl.h         | 3 +++
 libs/vkd3d-shader/hlsl_codegen.c | 2 ++
 libs/vkd3d-shader/tpf.c          | 2 +-
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/libs/vkd3d-shader/hlsl.h b/libs/vkd3d-shader/hlsl.h
index fb4bdfa6..01d2c8c4 100644
--- a/libs/vkd3d-shader/hlsl.h
+++ b/libs/vkd3d-shader/hlsl.h
@@ -417,6 +417,9 @@ struct hlsl_ir_var
         enum hlsl_sampler_dim sampler_dim;
         struct vkd3d_shader_location first_sampler_dim_loc;
     } *objects_usage[HLSL_REGSET_LAST_OBJECT + 1];
+    /* Minimum number of binds required to include all object components actually used in the shader.
+     * It may be less than the allocation size, e.g. for texture arrays. */
+    unsigned int bind_count[HLSL_REGSET_LAST_OBJECT + 1];
 
     uint32_t is_input_semantic : 1;
     uint32_t is_output_semantic : 1;
diff --git a/libs/vkd3d-shader/hlsl_codegen.c b/libs/vkd3d-shader/hlsl_codegen.c
index 11ebe275..d15acd72 100644
--- a/libs/vkd3d-shader/hlsl_codegen.c
+++ b/libs/vkd3d-shader/hlsl_codegen.c
@@ -3275,6 +3275,7 @@ static bool track_object_components_usage(struct hlsl_ctx *ctx, struct hlsl_ir_n
         return false;
 
     var->objects_usage[regset][index].used = true;
+    var->bind_count[regset] = max(var->bind_count[regset], index + 1);
     if (load->sampler.var)
     {
         var = load->sampler.var;
@@ -3282,6 +3283,7 @@ static bool track_object_components_usage(struct hlsl_ctx *ctx, struct hlsl_ir_n
             return false;
 
         var->objects_usage[HLSL_REGSET_SAMPLERS][index].used = true;
+        var->bind_count[HLSL_REGSET_SAMPLERS] = max(var->bind_count[HLSL_REGSET_SAMPLERS], index + 1);
     }
 
     return false;
diff --git a/libs/vkd3d-shader/tpf.c b/libs/vkd3d-shader/tpf.c
index b1027428..b985545c 100644
--- a/libs/vkd3d-shader/tpf.c
+++ b/libs/vkd3d-shader/tpf.c
@@ -3134,7 +3134,7 @@ static struct extern_resource *sm4_get_extern_resources(struct hlsl_ctx *ctx, un
 
             extern_resources[*count].regset = regset;
             extern_resources[*count].id = var->regs[regset].id;
-            extern_resources[*count].bind_count = var->regs[regset].allocation_size;
+            extern_resources[*count].bind_count = var->bind_count[regset];
 
             ++*count;
         }
From: Francisco Casas fcasas@codeweavers.com
---
 libs/vkd3d-shader/hlsl_codegen.c         | 39 ++++++++++++++++++++++++
 tests/hlsl/combined-samplers.shader_test |  6 ++--
 tests/hlsl/texture-ordering.shader_test  | 10 +++---
 3 files changed, 47 insertions(+), 8 deletions(-)
diff --git a/libs/vkd3d-shader/hlsl_codegen.c b/libs/vkd3d-shader/hlsl_codegen.c index d15acd72..21901548 100644 --- a/libs/vkd3d-shader/hlsl_codegen.c +++ b/libs/vkd3d-shader/hlsl_codegen.c @@ -2191,6 +2191,44 @@ static bool lower_combined_samples(struct hlsl_ctx *ctx, struct hlsl_ir_node *in return true; }
+static void insert_ensuring_decreasing_bind_count(struct list *list, struct hlsl_ir_var *to_add, + enum hlsl_regset regset) +{ + struct hlsl_ir_var *var; + + LIST_FOR_EACH_ENTRY(var, list, struct hlsl_ir_var, extern_entry) + { + if (var->bind_count[regset] < to_add->bind_count[regset]) + { + list_add_before(&var->extern_entry, &to_add->extern_entry); + return; + } + } + + list_add_tail(list, &to_add->extern_entry); +} + +static bool sort_synthetic_separated_samplers_first(struct hlsl_ctx *ctx) +{ + struct list separated_resources; + struct hlsl_ir_var *var, *next; + + list_init(&separated_resources); + + LIST_FOR_EACH_ENTRY_SAFE(var, next, &ctx->extern_vars, struct hlsl_ir_var, extern_entry) + { + if (var->is_separated_resource) + { + list_remove(&var->extern_entry); + insert_ensuring_decreasing_bind_count(&separated_resources, var, HLSL_REGSET_TEXTURES); + } + } + + list_move_head(&ctx->extern_vars, &separated_resources); + + return false; +} + /* Lower DIV to RCP + MUL. */ static bool lower_division(struct hlsl_ctx *ctx, struct hlsl_ir_node *instr, void *context) { @@ -4318,6 +4356,7 @@ int hlsl_emit_bytecode(struct hlsl_ctx *ctx, struct hlsl_ir_function_decl *entry if (profile->major_version >= 4) hlsl_transform_ir(ctx, lower_combined_samples, body, NULL); hlsl_transform_ir(ctx, track_object_components_usage, body, NULL); + sort_synthetic_separated_samplers_first(ctx);
if (profile->major_version < 4) { diff --git a/tests/hlsl/combined-samplers.shader_test b/tests/hlsl/combined-samplers.shader_test index 16b5438e..16db3129 100644 --- a/tests/hlsl/combined-samplers.shader_test +++ b/tests/hlsl/combined-samplers.shader_test @@ -60,7 +60,7 @@ float4 main() : sv_target
[test] draw quad -todo probe all rgba (10, 10, 10, 11) +probe all rgba (10, 10, 10, 11)
[pixel shader] @@ -74,7 +74,7 @@ float4 main() : sv_target
[test] draw quad -todo probe all rgba (21, 21, 21, 11) +probe all rgba (21, 21, 21, 11)
[pixel shader] @@ -105,7 +105,7 @@ float4 main() : sv_target
[test] draw quad -todo probe all rgba (104, 104, 104, 111) +probe all rgba (104, 104, 104, 111)
% Sampler arrays with components that have different usage dimensions are only forbidden in SM4 upwards. diff --git a/tests/hlsl/texture-ordering.shader_test b/tests/hlsl/texture-ordering.shader_test index b28fea5d..eb3d5c90 100644 --- a/tests/hlsl/texture-ordering.shader_test +++ b/tests/hlsl/texture-ordering.shader_test @@ -131,7 +131,7 @@ float4 main() : sv_target
[test] draw quad -todo probe all rgba (450, 139, 876, 333) +probe all rgba (450, 139, 876, 333)
% Same as the first test, but inverting the declaration order. @@ -184,7 +184,7 @@ float4 main() : sv_target
[test] draw quad -todo probe all rgba (450, 138, 796, 333) +probe all rgba (450, 138, 796, 333)
% Same as the first test, but inverting the resource loads order. @@ -237,7 +237,7 @@ float4 main() : sv_target
[test] draw quad -todo probe all rgba (478, 913, 256, 333) +probe all rgba (478, 913, 256, 333)
% We can conclude that for declared texture arrays, if they are used, the "allocation size" is the @@ -270,7 +270,7 @@ float4 main() : sv_target
[test] draw quad -todo probe all rgba (215, 215, 215, 111) +probe all rgba (215, 215, 215, 111)
% Test that textures created from SM1-style samples allocation order is in decreasing "bind count". @@ -300,4 +300,4 @@ float4 main() : sv_target
[test] draw quad -todo probe all rgba (5, 4, 2, 0) +probe all rgba (5, 4, 2, 0)
From: Francisco Casas fcasas@codeweavers.com
---
 libs/vkd3d-shader/hlsl_codegen.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/libs/vkd3d-shader/hlsl_codegen.c b/libs/vkd3d-shader/hlsl_codegen.c
index 21901548..e4550e85 100644
--- a/libs/vkd3d-shader/hlsl_codegen.c
+++ b/libs/vkd3d-shader/hlsl_codegen.c
@@ -3331,7 +3331,7 @@ static void calculate_resource_register_counts(struct hlsl_ctx *ctx)
 {
     struct hlsl_ir_var *var;
     struct hlsl_type *type;
-    unsigned int i, k;
+    unsigned int k;
 
     LIST_FOR_EACH_ENTRY(var, &ctx->extern_vars, struct hlsl_ir_var, extern_entry)
     {
@@ -3339,15 +3339,10 @@ static void calculate_resource_register_counts(struct hlsl_ctx *ctx)
 
         for (k = 0; k <= HLSL_REGSET_LAST_OBJECT; ++k)
         {
-            for (i = 0; i < type->reg_size[k]; ++i)
-            {
-                bool is_separated = var->is_separated_resource;
+            bool is_separated = var->is_separated_resource;
 
-                /* Samplers (and textures separated from them) are only allocated until the last
-                 * used one. */
-                if (var->objects_usage[k][i].used)
-                    var->regs[k].allocation_size = (k == HLSL_REGSET_SAMPLERS || is_separated) ? i + 1 : type->reg_size[k];
-            }
+            if (var->bind_count[k] > 0)
+                var->regs[k].allocation_size = (k == HLSL_REGSET_SAMPLERS || is_separated) ? var->bind_count[k] : type->reg_size[k];
         }
     }
 }
From: Francisco Casas fcasas@codeweavers.com
This should have no effect, since in SM1 the allocation size is the same as the bind count because there are no texture registers. It is just done for consistency.
---
 libs/vkd3d-shader/d3dbc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libs/vkd3d-shader/d3dbc.c b/libs/vkd3d-shader/d3dbc.c
index fdb71805..347c1dff 100644
--- a/libs/vkd3d-shader/d3dbc.c
+++ b/libs/vkd3d-shader/d3dbc.c
@@ -1680,7 +1680,7 @@ static void write_sm1_uniforms(struct hlsl_ctx *ctx, struct vkd3d_bytecode_buffe
             else
             {
                 put_u32(buffer, vkd3d_make_u32(D3DXRS_SAMPLER, var->regs[r].id));
-                put_u32(buffer, var->regs[r].allocation_size);
+                put_u32(buffer, var->bind_count[r]);
             }
             put_u32(buffer, 0); /* type */
             put_u32(buffer, 0); /* FIXME: default value */
@@ -2027,7 +2027,7 @@ static void write_sm1_sampler_dcls(struct hlsl_ctx *ctx, struct vkd3d_bytecode_b
             if (!var->regs[HLSL_REGSET_SAMPLERS].allocated)
                 continue;
 
-            count = var->regs[HLSL_REGSET_SAMPLERS].allocation_size;
+            count = var->bind_count[HLSL_REGSET_SAMPLERS];
 
             for (i = 0; i < count; ++i)
             {
I changed 4/4 (now 4/6) so that texture resources are indeed sorted in decreasing bind count, and added a test in 1/6 to check for that.
Because that required calling `hlsl_var_get_bind_count()` several times, I decided to store the bind count in `hlsl_ir_var.bind_count[]` instead. This also makes patch 5/6 possible.
I also added 6/6, which uses the bind count instead of the allocation size in d3dbc.c. This should have no effect, since both should be the same in SM1; it is done for consistency.
This merge request was approved by Zebediah Figura.
This merge request was approved by Giovanni Mascellani.
This merge request was approved by Henri Verbeet.