This patchset implements support for DXIL shaders (SM 6.0+) via dxil-spirv (github.com/HansKristian-Work/dxil-spirv).
There are three main parts of this patchset.
First, support for SM 5.1 register spaces is introduced. These patches have been submitted before, but they were only reviewed after a rebase, and that rebase broke them because of some uninitialized member variables introduced in the UAV clear patch set, which landed in between. I've fixed those and cleaned up the patches a little. I need SM 5.1 register spaces because the DXIL implementation depends on them as well. I also fixed some additional SM 5.1 cases which were not implemented in the last patch set, such as SM 5.1 root constants and root descriptors.
To aid debugging, I also added SPIR-V dumping to VKD3D_SHADER_DUMP_PATH, which was very useful for studying the differences between DXBC and DXIL outputs.
Second, we have the dxil-spirv integration. Various small refactors are needed to enable this, mostly moving a few helper functions in vkd3d-shader into the private header so they can be accessed by dxil.c. Vulkan 1.1 is enabled if available, because subgroup operations require it. SM 6.0 support is only activated if subgroup operations are sufficiently supported and DXIL is enabled. There are other features required for SM 6.0, such as 16-bit arithmetic and storage, but those are left for later. We will need to revisit the binding model to enable that properly.
The actual DXIL implementation lives in dxil.c. dxbc.c will detect that a shader blob is DXIL by looking for TAG_DXIL and dispatch the work to dxil.c if DXIL support is enabled. DXIL blobs are basically identical to DXBC blobs, except TAG_DXIL is used instead of TAG_SHDR, and ISG1 is used instead of ISGN, etc.
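The detection step described above can be sketched roughly as follows. This is a minimal illustration, not the code in dxbc.c: the helper names blob_has_chunk() and blob_is_dxil() are hypothetical, and the sketch assumes the usual container layout (a 32-byte header ending in a chunk count, followed by chunk offsets, with each chunk starting with a four-character tag).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAKE_TAG(a, b, c, d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))
#define TAG_DXBC MAKE_TAG('D', 'X', 'B', 'C')
#define TAG_DXIL MAKE_TAG('D', 'X', 'I', 'L')
#define TAG_SHDR MAKE_TAG('S', 'H', 'D', 'R')

/* Read a little-endian 32-bit word without assuming alignment. */
static uint32_t read_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
            | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: true if the container holds a chunk with the given tag. */
static bool blob_has_chunk(const uint8_t *data, size_t size, uint32_t tag)
{
    uint32_t chunk_count, i;

    assert(data != NULL);

    /* Header: magic (4), checksum (16), version (4), total size (4), chunk count (4). */
    if (size < 32 || read_u32(data) != TAG_DXBC)
        return false;

    chunk_count = read_u32(data + 28);
    if (chunk_count > (size - 32) / 4)
        return false;

    for (i = 0; i < chunk_count; ++i)
    {
        uint32_t offset = read_u32(data + 32 + 4 * i);

        /* Each chunk begins with a tag and a byte size. */
        if (offset > size - 8)
            return false;
        if (read_u32(data + offset) == tag)
            return true;
    }

    return false;
}

/* A blob is treated as DXIL when it carries a DXIL chunk instead of SHDR. */
static bool blob_is_dxil(const uint8_t *data, size_t size)
{
    return blob_has_chunk(data, size, TAG_DXIL);
}
```

A caller would check blob_is_dxil() first and only fall back to the DXBC path when it returns false.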
To make integration as smooth as possible, dxil-spirv uses callbacks to resolve resource bindings and similar, rather than having structures fed to the compiler. This way, we won't have to translate the vkd3d_shader_* structures.
Finally, most of the commits here are to add DXIL testing paths for many of the tests in tests/d3d12.c. The dxil-spirv repo has a lot of tests already to cover codegen, so these tests are mostly to verify that the integration works. Adding these tests *did* find some bugs in the dxil-spirv implementation as well, but it mostly went without any major problems.
When emitting DXIL, the blob sizes are not word-aligned, so they are embedded as byte arrays instead. They are also much larger than equivalent DXBC blobs, due to LLVM IR bloat. I ran -Qstrip_debug -Qstrip_reflect in DXC when compiling these shaders.
The main strategy I used to add DXIL tests was to turn the existing tests into a function à la static void test_something(bool use_dxil) and just select the right shader code as required.
I had to add some utility functions as well, because the D3D12 validation layers refuse to mix and match SM 5.1 and below with SM 6.0 and above. These utilities just use different default shader blobs. Another note is that, for the DXIL shaders to pass validation on native D3D12, the DXIL must be signed by Microsoft's validator.
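A rough sketch of that test pattern, under stated assumptions: struct shader_blob and select_shader() are hypothetical stand-ins for whatever bytecode descriptor the tests actually pass around, and the array contents are placeholder bytes, not real shaders. It only shows the DXBC blob as a word array and the DXIL blob as a byte array, with one flag choosing between them.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the bytecode descriptor used by the tests. */
struct shader_blob
{
    const void *code;
    size_t size;
};

/* Placeholder data only. DXBC is word-aligned, so it fits a uint32_t array. */
static const uint32_t ps_code_dxbc[] = {0x43425844};
/* DXIL blob sizes are not word-aligned, so a byte array is used instead. */
static const uint8_t ps_code_dxil[] = {0x44, 0x58, 0x42, 0x43, 0x00};

/* Pick the shader variant for the current test run. */
static struct shader_blob select_shader(bool use_dxil)
{
    struct shader_blob blob;

    if (use_dxil)
    {
        blob.code = ps_code_dxil;
        blob.size = sizeof(ps_code_dxil);
    }
    else
    {
        blob.code = ps_code_dxbc;
        blob.size = sizeof(ps_code_dxbc);
    }

    return blob;
}
```

A test body written as test_something(bool use_dxil) would call select_shader(use_dxil) once and share the rest of its setup between both variants.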
Hans-Kristian Arntzen (41):
  vkd3d: Deal correctly with SM 5.1 register spaces.
  vkd3d: Add test case for SM 5.1 register spaces.
  vkd3d: Add test case for root constants in SM 5.1.
  vkd3d-shader: Add path for debug dumping SPIR-V as well.
  vkd3d: Add dxil-spirv to autoconf
  vkd3d: Attempt to parse ISG1 as well when parsing input signatures.
  vkd3d-shader: Add entry point to query if DXIL is supported.
  vkd3d: Attempt to create a Vulkan 1.1 instance and device.
  vkd3d: Query subgroup properties and expose SM 6.0 if present.
  vkd3d: Move vkd3d_find_shader into private header.
  vkd3d: Add helper function to query if a blob is DXIL.
  vkd3d-shader: Expose debug shader dumping in private header.
  vkd3d-shader: Add integration for DXIL shaders.
  vkd3d: Add test helper function to determine if DXIL is supported.
  vkd3d: Add helper test function to set up a default pipeline with DXIL.
  vkd3d: Add DXIL test for geometry shader.
  vkd3d: Add DXIL test for layered rendering.
  vkd3d: Add DXIL test for ps_layer.
  vkd3d: Add DXIL test for quad_tessellation.
  vkd3d: Add DXIL test for tess control point phase.
  vkd3d: Add DXIL test for tess fork phase.
  vkd3d: Add DXIL test for line tessellation.
  vkd3d: Add DXIL test for stream output.
  vkd3d: Add DXIL test for bufinfo.
  vkd3d: Add DXIL test for register spaces.
  vkd3d: Add DXIL test for constant buffers (root const/desc).
  vkd3d: Add DXIL test for dual source blending.
  vkd3d: Add DXIL test for face culling.
  vkd3d: Add DXIL test for render_target_a8.
  vkd3d-shader: Hook up RT output swizzle path in DXIL.
  vkd3d: Add DXIL test for sample mask.
  vkd3d: Add DXIL test for coverage.
  vkd3d: Add create_pipeline_state_dxil test utility.
  vkd3d: Add DXIL test for shader_sample_position.
  vkd3d: Add DXIL test for rasterizer sample count.
  vkd3d-shader: Hook up RASTERIZER_SAMPLE_COUNT parameter.
  vkd3d: Add DXIL test for clip distance.
  vkd3d: Add DXIL test for combined ClipCull.
  vkd3d: Add DXIL test for eval attribute.
  vkd3d: Add DXIL test for instance_id.
  vkd3d: Add DXIL test for vertex ID.
 Makefile.am                              |    8 +-
 configure.ac                             |   11 +
 include/vkd3d_shader.h                   |   18 +
 libs/vkd3d-shader/dxbc.c                 |   58 +-
 libs/vkd3d-shader/dxil.c                 |  470 ++
 libs/vkd3d-shader/spirv.c                |  215 +-
 libs/vkd3d-shader/vkd3d_shader.map       |    1 +
 libs/vkd3d-shader/vkd3d_shader_main.c    |   99 +-
 libs/vkd3d-shader/vkd3d_shader_private.h |   38 +
 libs/vkd3d/command.c                     |   32 +-
 libs/vkd3d/device.c                      |   69 +-
 libs/vkd3d/state.c                       |   65 +-
 libs/vkd3d/utils.c                       |    3 +
 libs/vkd3d/vkd3d_private.h               |    5 +
 tests/d3d12.c                            | 5983 +++++++++++++++++++---
 tests/d3d12_test_utils.h                 |  193 +-
 16 files changed, 6544 insertions(+), 724 deletions(-)
 create mode 100644 libs/vkd3d-shader/dxil.c
The resource index is found in idx[0] in SM 5.0, but in idx[1] when using SM 5.1, and the register space is encoded separately. An rb_tree keeps track of the internal resource index idx[0] and can map that to space/binding as required when emitting SPIR-V.
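To illustrate the mapping, here is a simplified sketch. It deliberately swaps the rb_tree for a small linear table to stay self-contained, and every name in it (resource_table, resource_table_add, resource_table_resolve, the DESCRIPTOR_* enum) is hypothetical rather than taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified types; the patch uses vkd3d's own enums and an rb_tree. */
enum descriptor_type { DESCRIPTOR_CBV, DESCRIPTOR_SRV, DESCRIPTOR_UAV, DESCRIPTOR_SAMPLER };

struct resource_entry
{
    enum descriptor_type type;
    unsigned int id;             /* idx[0] of the declaration */
    unsigned int register_space;
    unsigned int register_index; /* idx[1] of the declaration in SM 5.1 */
};

#define MAX_RESOURCES 32

struct resource_table
{
    struct resource_entry entries[MAX_RESOURCES];
    size_t count;
};

/* Record a declaration; returns false when the table is full. */
static bool resource_table_add(struct resource_table *table, enum descriptor_type type,
        unsigned int id, unsigned int register_space, unsigned int register_index)
{
    struct resource_entry *e;

    assert(table != NULL);
    if (table->count == MAX_RESOURCES)
        return false;

    e = &table->entries[table->count++];
    e->type = type;
    e->id = id;
    e->register_space = register_space;
    e->register_index = register_index;
    return true;
}

/* Resolve the index an instruction uses (idx[0]) to a space/binding pair. */
static bool resource_table_resolve(const struct resource_table *table, bool is_sm_5_1,
        enum descriptor_type type, unsigned int idx0,
        unsigned int *register_space, unsigned int *register_index)
{
    size_t i;

    if (!is_sm_5_1)
    {
        /* SM 5.0 and below: idx[0] is the register index and the space is always 0. */
        *register_space = 0;
        *register_index = idx0;
        return true;
    }

    for (i = 0; i < table->count; ++i)
    {
        const struct resource_entry *e = &table->entries[i];

        if (e->type == type && e->id == idx0)
        {
            *register_space = e->register_space;
            *register_index = e->register_index;
            return true;
        }
    }

    return false;
}
```

Declarations populate the table once, and every later instruction that references a resource by idx[0] resolves through it.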
For this to work, we must also make UAV counters register space aware. In the earlier implementation, the UAV counter mask was assumed to correlate 1:1 with register_index, which breaks on SM 5.1.
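A minimal sketch of a register-space-aware counter lookup, loosely following the loop added to command.c: walk the set bits of the counter mask and compare both space and index, instead of indexing by register_index directly. The struct and function names are hypothetical, MAX_UAV_COUNT is an assumed limit, and the sketch assumes the bindings array holds one entry per set mask bit in ascending bit order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed limit for illustration; the patch uses VKD3D_SHADER_MAX_UNORDERED_ACCESS_VIEWS. */
#define MAX_UAV_COUNT 8

/* Hypothetical, trimmed-down counter binding. */
struct uav_counter_binding
{
    unsigned int register_space;
    unsigned int register_index;
};

/* Returns the counter slot (mask bit) bound to (space, index), or -1 if none matches. */
static int find_uav_counter_slot(uint32_t counter_mask,
        const struct uav_counter_binding *bindings,
        unsigned int register_space, unsigned int register_index)
{
    unsigned int k;

    assert(bindings != NULL || !counter_mask);

    for (k = 0; k < MAX_UAV_COUNT; ++k)
    {
        if (!(counter_mask & (1u << k)))
            continue;

        if (bindings->register_space == register_space
                && bindings->register_index == register_index)
            return (int)k;

        /* Only set bits consume an entry from the bindings array. */
        ++bindings;
    }

    return -1;
}
```

With this shape, two UAVs that share a register index but live in different spaces land in different counter slots.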
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
---
 include/vkd3d_shader.h                   |  16 ++
 libs/vkd3d-shader/dxbc.c                 |  29 ++++
 libs/vkd3d-shader/spirv.c                | 194 ++++++++++++++++++++---
 libs/vkd3d-shader/vkd3d_shader_private.h |   5 +
 libs/vkd3d/command.c                     |  32 ++--
 libs/vkd3d/state.c                       |  65 +++++---
 libs/vkd3d/vkd3d_private.h               |   1 +
 7 files changed, 288 insertions(+), 54 deletions(-)
diff --git a/include/vkd3d_shader.h b/include/vkd3d_shader.h index 6b4d3f5..ec52e26 100644 --- a/include/vkd3d_shader.h +++ b/include/vkd3d_shader.h @@ -35,6 +35,7 @@ enum vkd3d_shader_structure_type VKD3D_SHADER_STRUCTURE_TYPE_SCAN_INFO, VKD3D_SHADER_STRUCTURE_TYPE_TRANSFORM_FEEDBACK_INFO, VKD3D_SHADER_STRUCTURE_TYPE_DOMAIN_SHADER_COMPILE_ARGUMENTS, + VKD3D_SHADER_STRUCTURE_TYPE_EFFECTIVE_UAV_COUNTER_BINDING_INFO,
VKD3D_FORCE_32_BIT_ENUM(VKD3D_SHADER_STRUCTURE_TYPE), }; @@ -138,6 +139,7 @@ struct vkd3d_shader_parameter struct vkd3d_shader_resource_binding { enum vkd3d_shader_descriptor_type type; + unsigned int register_space; unsigned int register_index; enum vkd3d_shader_visibility shader_visibility; unsigned int flags; /* vkd3d_shader_binding_flags */ @@ -159,8 +161,10 @@ struct vkd3d_shader_combined_resource_sampler
struct vkd3d_shader_uav_counter_binding { + unsigned int register_space; unsigned int register_index; /* u# */ enum vkd3d_shader_visibility shader_visibility; + unsigned int counter_index;
struct vkd3d_shader_descriptor_binding binding; unsigned int offset; @@ -168,6 +172,7 @@ struct vkd3d_shader_uav_counter_binding
struct vkd3d_shader_push_constant_buffer { + unsigned int register_space; unsigned int register_index; enum vkd3d_shader_visibility shader_visibility;
@@ -215,6 +220,17 @@ struct vkd3d_shader_transform_feedback_info unsigned int buffer_stride_count; };
+/* Extends vkd3d_shader_interface_info. */ +struct vkd3d_shader_effective_uav_counter_binding_info +{ + enum vkd3d_shader_structure_type type; + const void *next; + + unsigned int *uav_register_spaces; + unsigned int *uav_register_bindings; + unsigned int uav_counter_count; +}; + enum vkd3d_shader_target { VKD3D_SHADER_TARGET_NONE, diff --git a/libs/vkd3d-shader/dxbc.c b/libs/vkd3d-shader/dxbc.c index 98c51e4..b3f53ab 100644 --- a/libs/vkd3d-shader/dxbc.c +++ b/libs/vkd3d-shader/dxbc.c @@ -624,6 +624,10 @@ static void shader_sm4_read_dcl_resource(struct vkd3d_shader_instruction *ins, ins->flags = (opcode_token & VKD3D_SM5_UAV_FLAGS_MASK) >> VKD3D_SM5_UAV_FLAGS_SHIFT;
shader_sm4_read_register_space(priv, &tokens, end, &ins->declaration.semantic.register_space); + if (shader_is_sm_5_1(priv)) + ins->declaration.semantic.register_index = ins->declaration.semantic.reg.reg.idx[1].offset; + else + ins->declaration.semantic.register_index = ins->declaration.semantic.reg.reg.idx[0].offset; }
static void shader_sm4_read_dcl_constant_buffer(struct vkd3d_shader_instruction *ins, @@ -647,9 +651,12 @@ static void shader_sm4_read_dcl_constant_buffer(struct vkd3d_shader_instruction return; }
+ ins->declaration.cb.register_index = ins->declaration.cb.src.reg.idx[1].offset; ins->declaration.cb.size = *tokens++; shader_sm4_read_register_space(priv, &tokens, end, &ins->declaration.cb.register_space); } + else + ins->declaration.cb.register_index = ins->declaration.cb.src.reg.idx[0].offset; }
static void shader_sm4_read_dcl_sampler(struct vkd3d_shader_instruction *ins, @@ -663,6 +670,10 @@ static void shader_sm4_read_dcl_sampler(struct vkd3d_shader_instruction *ins, FIXME("Unhandled sampler mode %#x.\n", ins->flags); shader_sm4_read_src_param(priv, &tokens, end, VKD3D_DATA_SAMPLER, &ins->declaration.sampler.src); shader_sm4_read_register_space(priv, &tokens, end, &ins->declaration.sampler.register_space); + if (shader_is_sm_5_1(priv)) + ins->declaration.sampler.register_index = ins->declaration.sampler.src.reg.idx[1].offset; + else + ins->declaration.sampler.register_index = ins->declaration.sampler.src.reg.idx[0].offset; }
static void shader_sm4_read_dcl_index_range(struct vkd3d_shader_instruction *ins, @@ -863,6 +874,10 @@ static void shader_sm5_read_dcl_uav_raw(struct vkd3d_shader_instruction *ins, shader_sm4_read_dst_param(priv, &tokens, end, VKD3D_DATA_UAV, &ins->declaration.raw_resource.dst); ins->flags = (opcode_token & VKD3D_SM5_UAV_FLAGS_MASK) >> VKD3D_SM5_UAV_FLAGS_SHIFT; shader_sm4_read_register_space(priv, &tokens, end, &ins->declaration.raw_resource.register_space); + if (shader_is_sm_5_1(priv)) + ins->declaration.raw_resource.register_index = ins->declaration.raw_resource.dst.reg.idx[1].offset; + else + ins->declaration.raw_resource.register_index = ins->declaration.raw_resource.dst.reg.idx[0].offset; }
static void shader_sm5_read_dcl_uav_structured(struct vkd3d_shader_instruction *ins, @@ -874,9 +889,14 @@ static void shader_sm5_read_dcl_uav_structured(struct vkd3d_shader_instruction * shader_sm4_read_dst_param(priv, &tokens, end, VKD3D_DATA_UAV, &ins->declaration.structured_resource.reg); ins->flags = (opcode_token & VKD3D_SM5_UAV_FLAGS_MASK) >> VKD3D_SM5_UAV_FLAGS_SHIFT; ins->declaration.structured_resource.byte_stride = *tokens; + tokens++; if (ins->declaration.structured_resource.byte_stride % 4) FIXME("Byte stride %u is not multiple of 4.\n", ins->declaration.structured_resource.byte_stride); shader_sm4_read_register_space(priv, &tokens, end, &ins->declaration.structured_resource.register_space); + if (shader_is_sm_5_1(priv)) + ins->declaration.structured_resource.register_index = ins->declaration.structured_resource.reg.reg.idx[1].offset; + else + ins->declaration.structured_resource.register_index = ins->declaration.structured_resource.reg.reg.idx[0].offset; }
static void shader_sm5_read_dcl_tgsm_raw(struct vkd3d_shader_instruction *ins, @@ -909,9 +929,14 @@ static void shader_sm5_read_dcl_resource_structured(struct vkd3d_shader_instruct
shader_sm4_read_dst_param(priv, &tokens, end, VKD3D_DATA_RESOURCE, &ins->declaration.structured_resource.reg); ins->declaration.structured_resource.byte_stride = *tokens; + tokens++; if (ins->declaration.structured_resource.byte_stride % 4) FIXME("Byte stride %u is not multiple of 4.\n", ins->declaration.structured_resource.byte_stride); shader_sm4_read_register_space(priv, &tokens, end, &ins->declaration.structured_resource.register_space); + if (shader_is_sm_5_1(priv)) + ins->declaration.structured_resource.register_index = ins->declaration.structured_resource.reg.reg.idx[1].offset; + else + ins->declaration.structured_resource.register_index = ins->declaration.structured_resource.reg.reg.idx[0].offset; }
static void shader_sm5_read_dcl_resource_raw(struct vkd3d_shader_instruction *ins, @@ -922,6 +947,10 @@ static void shader_sm5_read_dcl_resource_raw(struct vkd3d_shader_instruction *in
shader_sm4_read_dst_param(priv, &tokens, end, VKD3D_DATA_RESOURCE, &ins->declaration.dst); shader_sm4_read_register_space(priv, &tokens, end, &ins->declaration.raw_resource.register_space); + if (shader_is_sm_5_1(priv)) + ins->declaration.raw_resource.register_index = ins->declaration.dst.reg.idx[1].offset; + else + ins->declaration.raw_resource.register_index = ins->declaration.dst.reg.idx[0].offset; }
static void shader_sm5_read_sync(struct vkd3d_shader_instruction *ins, diff --git a/libs/vkd3d-shader/spirv.c b/libs/vkd3d-shader/spirv.c index 3d88be9..f286f8b 100644 --- a/libs/vkd3d-shader/spirv.c +++ b/libs/vkd3d-shader/spirv.c @@ -1890,6 +1890,20 @@ struct vkd3d_symbol } info; };
+struct vkd3d_sm51_symbol_key +{ + enum vkd3d_shader_descriptor_type descriptor_type; + unsigned int idx; +}; + +struct vkd3d_sm51_symbol +{ + struct rb_entry entry; + struct vkd3d_sm51_symbol_key key; + unsigned int register_space; + unsigned int resource_idx; +}; + static int vkd3d_symbol_compare(const void *key, const struct rb_entry *entry) { const struct vkd3d_symbol *a = key; @@ -1900,6 +1914,13 @@ static int vkd3d_symbol_compare(const void *key, const struct rb_entry *entry) return memcmp(&a->key, &b->key, sizeof(a->key)); }
+static int vkd3d_sm51_symbol_compare(const void *key, const struct rb_entry *entry) +{ + const struct vkd3d_sm51_symbol_key *a = key; + const struct vkd3d_sm51_symbol *b = RB_ENTRY_VALUE(entry, const struct vkd3d_sm51_symbol, entry); + return memcmp(a, &b->key, sizeof(*a)); +} + static void vkd3d_symbol_free(struct rb_entry *entry, void *context) { struct vkd3d_symbol *s = RB_ENTRY_VALUE(entry, struct vkd3d_symbol, entry); @@ -1907,6 +1928,13 @@ static void vkd3d_symbol_free(struct rb_entry *entry, void *context) vkd3d_free(s); }
+static void vkd3d_sm51_symbol_free(struct rb_entry *entry, void *context) +{ + struct vkd3d_sm51_symbol *s = RB_ENTRY_VALUE(entry, struct vkd3d_sm51_symbol, entry); + + vkd3d_free(s); +} + static void vkd3d_symbol_make_register(struct vkd3d_symbol *symbol, const struct vkd3d_shader_register *reg) { @@ -2052,6 +2080,7 @@ struct vkd3d_hull_shader_variables
struct vkd3d_dxbc_compiler { + struct vkd3d_shader_version shader_version; struct vkd3d_spirv_builder spirv_builder;
uint32_t options; @@ -2062,6 +2091,8 @@ struct vkd3d_dxbc_compiler struct vkd3d_hull_shader_variables hs; uint32_t sample_positions_id;
+ struct rb_tree sm51_resource_table; + enum vkd3d_shader_type shader_type;
unsigned int branch_id; @@ -2107,6 +2138,11 @@ struct vkd3d_dxbc_compiler size_t spec_constants_size; };
+static bool shader_is_sm_5_1(const struct vkd3d_dxbc_compiler *compiler) +{ + return (compiler->shader_version.major * 100 + compiler->shader_version.minor) >= 501; +} + static bool is_control_point_phase(const struct vkd3d_shader_phase *phase) { return phase && phase->type == VKD3DSIH_HS_CONTROL_POINT_PHASE; @@ -2131,6 +2167,8 @@ struct vkd3d_dxbc_compiler *vkd3d_dxbc_compiler_create(const struct vkd3d_shader
memset(compiler, 0, sizeof(*compiler));
+ compiler->shader_version = *shader_version; + max_element_count = max(output_signature->element_count, patch_constant_signature->element_count); if (!(compiler->output_info = vkd3d_calloc(max_element_count, sizeof(*compiler->output_info)))) { @@ -2142,6 +2180,7 @@ struct vkd3d_dxbc_compiler *vkd3d_dxbc_compiler_create(const struct vkd3d_shader compiler->options = compiler_options;
rb_init(&compiler->symbol_table, vkd3d_symbol_compare); + rb_init(&compiler->sm51_resource_table, vkd3d_sm51_symbol_compare);
compiler->shader_type = shader_version->type;
@@ -2227,9 +2266,10 @@ static bool vkd3d_dxbc_compiler_check_shader_visibility(const struct vkd3d_dxbc_ }
static struct vkd3d_push_constant_buffer_binding *vkd3d_dxbc_compiler_find_push_constant_buffer( - const struct vkd3d_dxbc_compiler *compiler, const struct vkd3d_shader_register *reg) + const struct vkd3d_dxbc_compiler *compiler, const struct vkd3d_shader_constant_buffer *cb) { - unsigned int reg_idx = reg->idx[0].offset; + unsigned int reg_idx = cb->register_index; + unsigned int reg_space = cb->register_space; unsigned int i;
for (i = 0; i < compiler->shader_interface.push_constant_buffer_count; ++i) @@ -2239,7 +2279,7 @@ static struct vkd3d_push_constant_buffer_binding *vkd3d_dxbc_compiler_find_push_ if (!vkd3d_dxbc_compiler_check_shader_visibility(compiler, current->pc.shader_visibility)) continue;
- if (current->pc.register_index == reg_idx) + if (current->pc.register_index == reg_idx && current->pc.register_space == reg_space) return current; }
@@ -2274,6 +2314,49 @@ static bool vkd3d_dxbc_compiler_has_combined_sampler(const struct vkd3d_dxbc_com return false; }
+static bool vkd3d_get_binding_info_for_register( + struct vkd3d_dxbc_compiler *compiler, + const struct vkd3d_shader_register *reg, + unsigned int *reg_space, unsigned int *reg_binding) +{ + const struct vkd3d_sm51_symbol *symbol; + struct vkd3d_sm51_symbol_key key; + const struct rb_entry *entry; + + if (shader_is_sm_5_1(compiler)) + { + key.descriptor_type = VKD3D_SHADER_DESCRIPTOR_TYPE_UNKNOWN; + if (reg->type == VKD3DSPR_CONSTBUFFER) + key.descriptor_type = VKD3D_SHADER_DESCRIPTOR_TYPE_CBV; + else if (reg->type == VKD3DSPR_RESOURCE) + key.descriptor_type = VKD3D_SHADER_DESCRIPTOR_TYPE_SRV; + else if (reg->type == VKD3DSPR_UAV) + key.descriptor_type = VKD3D_SHADER_DESCRIPTOR_TYPE_UAV; + else if (reg->type == VKD3DSPR_SAMPLER) + key.descriptor_type = VKD3D_SHADER_DESCRIPTOR_TYPE_SAMPLER; + else + FIXME("Unhandled register type %#x.\n", reg->type); + + key.idx = reg->idx[0].offset; + entry = rb_get(&compiler->sm51_resource_table, &key); + if (entry) + { + symbol = RB_ENTRY_VALUE(entry, const struct vkd3d_sm51_symbol, entry); + *reg_space = symbol->register_space; + *reg_binding = symbol->resource_idx; + return true; + } + else + return false; + } + else + { + *reg_space = 0; + *reg_binding = reg->idx[0].offset; + return true; + } +} + static struct vkd3d_shader_descriptor_binding vkd3d_dxbc_compiler_get_descriptor_binding( struct vkd3d_dxbc_compiler *compiler, const struct vkd3d_shader_register *reg, enum vkd3d_shader_resource_type resource_type, bool is_uav_counter) @@ -2282,8 +2365,9 @@ static struct vkd3d_shader_descriptor_binding vkd3d_dxbc_compiler_get_descriptor enum vkd3d_shader_descriptor_type descriptor_type; enum vkd3d_shader_binding_flag resource_type_flag; struct vkd3d_shader_descriptor_binding binding; - unsigned int reg_idx = reg->idx[0].offset; unsigned int i; + unsigned int reg_space = 0; + unsigned int reg_idx = 0;
descriptor_type = VKD3D_SHADER_DESCRIPTOR_TYPE_UNKNOWN; if (reg->type == VKD3DSPR_CONSTBUFFER) @@ -2300,6 +2384,11 @@ static struct vkd3d_shader_descriptor_binding vkd3d_dxbc_compiler_get_descriptor resource_type_flag = resource_type == VKD3D_SHADER_RESOURCE_BUFFER ? VKD3D_SHADER_BINDING_FLAG_BUFFER : VKD3D_SHADER_BINDING_FLAG_IMAGE;
+ if (!vkd3d_get_binding_info_for_register(compiler, reg, &reg_space, &reg_idx)) + { + ERR("Failed to find binding for resource type %#x.\n", reg->type); + } + if (is_uav_counter) { assert(descriptor_type == VKD3D_SHADER_DESCRIPTOR_TYPE_UAV); @@ -2313,8 +2402,19 @@ static struct vkd3d_shader_descriptor_binding vkd3d_dxbc_compiler_get_descriptor if (current->offset) FIXME("Atomic counter offsets are not supported yet.\n");
- if (current->register_index == reg_idx) + /* Do not use space/binding, but just the plain index here, since that's how the UAV counter mask is exposed. */ + if (current->counter_index == reg->idx[0].offset) + { + /* Let pipeline know what the actual space/bindings for the counter are. */ + const struct vkd3d_shader_effective_uav_counter_binding_info *binding_info = + vkd3d_find_struct(shader_interface->next, EFFECTIVE_UAV_COUNTER_BINDING_INFO); + if (binding_info && current->counter_index < binding_info->uav_counter_count) + { + binding_info->uav_register_spaces[current->counter_index] = reg_space; + binding_info->uav_register_bindings[current->counter_index] = reg_idx; + } return current->binding; + } } if (shader_interface->uav_counter_count) FIXME("Could not find descriptor binding for UAV counter %u.\n", reg_idx); @@ -2331,7 +2431,7 @@ static struct vkd3d_shader_descriptor_binding vkd3d_dxbc_compiler_get_descriptor if (!vkd3d_dxbc_compiler_check_shader_visibility(compiler, current->shader_visibility)) continue;
- if (current->type == descriptor_type && current->register_index == reg_idx) + if (current->type == descriptor_type && current->register_index == reg_idx && current->register_space == reg_space) return current->binding; } if (shader_interface->binding_count) @@ -2828,7 +2928,8 @@ static void vkd3d_dxbc_compiler_emit_dereference_register(struct vkd3d_dxbc_comp { assert(!reg->idx[0].rel_addr); indexes[index_count++] = vkd3d_dxbc_compiler_get_constant_uint(compiler, register_info->member_idx); - indexes[index_count++] = vkd3d_dxbc_compiler_emit_register_addressing(compiler, &reg->idx[1]); + indexes[index_count++] = vkd3d_dxbc_compiler_emit_register_addressing(compiler, + &reg->idx[shader_is_sm_5_1(compiler) ? 2 : 1]); } else if (reg->type == VKD3DSPR_IMMCONSTBUFFER) { @@ -2838,6 +2939,11 @@ static void vkd3d_dxbc_compiler_emit_dereference_register(struct vkd3d_dxbc_comp { indexes[index_count++] = vkd3d_dxbc_compiler_emit_register_addressing(compiler, &reg->idx[1]); } + else if (reg->type == VKD3DSPR_SAMPLER) + { + /* SM 5.1 will have an index here referring to something which we throw away. */ + index_count = 0; + } else if (register_info->is_aggregate) { struct vkd3d_shader_register_index reg_idx = reg->idx[0]; @@ -4914,7 +5020,8 @@ static void vkd3d_dxbc_compiler_emit_push_constant_buffers(struct vkd3d_dxbc_com if (!cb->reg.type) continue;
- cb_size = cb->reg.idx[1].offset; + cb_size = (cb->pc.size + 15) / 16; + length_id = vkd3d_dxbc_compiler_get_constant_uint(compiler, cb_size); member_ids[j] = vkd3d_spirv_build_op_type_array(builder, vec4_id, length_id); vkd3d_spirv_build_op_decorate1(builder, member_ids[j], SpvDecorationArrayStride, 16); @@ -4965,10 +5072,19 @@ static void vkd3d_dxbc_compiler_emit_dcl_constant_buffer(struct vkd3d_dxbc_compi
assert(!(instruction->flags & ~VKD3DSI_INDEXED_DYNAMIC));
- if (cb->register_space) - FIXME("Unhandled register space %u.\n", cb->register_space); + if (shader_is_sm_5_1(compiler)) + { + struct vkd3d_sm51_symbol *sym; + sym = vkd3d_calloc(1, sizeof(*sym)); + sym->key.idx = reg->idx[0].offset; + sym->key.descriptor_type = VKD3D_SHADER_DESCRIPTOR_TYPE_CBV; + sym->register_space = instruction->declaration.cb.register_space; + sym->resource_idx = instruction->declaration.cb.register_index; + if (rb_put(&compiler->sm51_resource_table, &sym->key, &sym->entry) == -1) + vkd3d_free(sym); + }
- if ((push_cb = vkd3d_dxbc_compiler_find_push_constant_buffer(compiler, reg))) + if ((push_cb = vkd3d_dxbc_compiler_find_push_constant_buffer(compiler, cb))) { /* Push constant buffers are handled in * vkd3d_dxbc_compiler_emit_push_constant_buffers(). @@ -5050,8 +5166,17 @@ static void vkd3d_dxbc_compiler_emit_dcl_sampler(struct vkd3d_dxbc_compiler *com uint32_t type_id, ptr_type_id, var_id; struct vkd3d_symbol reg_symbol;
- if (instruction->declaration.sampler.register_space) - FIXME("Unhandled register space %u.\n", instruction->declaration.sampler.register_space); + if (shader_is_sm_5_1(compiler)) + { + struct vkd3d_sm51_symbol *sym; + sym = vkd3d_calloc(1, sizeof(*sym)); + sym->key.idx = reg->idx[0].offset; + sym->key.descriptor_type = VKD3D_SHADER_DESCRIPTOR_TYPE_SAMPLER; + sym->register_space = instruction->declaration.sampler.register_space; + sym->resource_idx = instruction->declaration.sampler.register_index; + if (rb_put(&compiler->sm51_resource_table, &sym->key, &sym->entry) == -1) + vkd3d_free(sym); + }
if (vkd3d_dxbc_compiler_has_combined_sampler(compiler, NULL, reg)) return; @@ -5272,8 +5397,18 @@ static void vkd3d_dxbc_compiler_emit_dcl_resource(struct vkd3d_dxbc_compiler *co { const struct vkd3d_shader_semantic *semantic = &instruction->declaration.semantic;
- if (semantic->register_space) - FIXME("Unhandled register space %u.\n", semantic->register_space); + if (shader_is_sm_5_1(compiler)) + { + struct vkd3d_sm51_symbol *sym; + sym = vkd3d_calloc(1, sizeof(*sym)); + sym->key.idx = semantic->reg.reg.idx[0].offset; + sym->key.descriptor_type = semantic->reg.reg.type == VKD3DSPR_UAV ? VKD3D_SHADER_DESCRIPTOR_TYPE_UAV : VKD3D_SHADER_DESCRIPTOR_TYPE_SRV; + sym->register_space = semantic->register_space; + sym->resource_idx = semantic->register_index; + if (rb_put(&compiler->sm51_resource_table, &sym->key, &sym->entry) == -1) + vkd3d_free(sym); + } + if (instruction->flags) FIXME("Unhandled UAV flags %#x.\n", instruction->flags);
@@ -5286,8 +5421,18 @@ static void vkd3d_dxbc_compiler_emit_dcl_resource_raw(struct vkd3d_dxbc_compiler { const struct vkd3d_shader_raw_resource *resource = &instruction->declaration.raw_resource;
- if (resource->register_space) - FIXME("Unhandled register space %u.\n", resource->register_space); + if (shader_is_sm_5_1(compiler)) + { + struct vkd3d_sm51_symbol *sym; + sym = vkd3d_calloc(1, sizeof(*sym)); + sym->key.idx = resource->dst.reg.idx[0].offset; + sym->key.descriptor_type = resource->dst.reg.type == VKD3DSPR_UAV ? VKD3D_SHADER_DESCRIPTOR_TYPE_UAV : VKD3D_SHADER_DESCRIPTOR_TYPE_SRV; + sym->register_space = resource->register_space; + sym->resource_idx = resource->register_index; + if (rb_put(&compiler->sm51_resource_table, &sym->key, &sym->entry) == -1) + vkd3d_free(sym); + } + if (instruction->flags) FIXME("Unhandled UAV flags %#x.\n", instruction->flags);
@@ -5302,8 +5447,18 @@ static void vkd3d_dxbc_compiler_emit_dcl_resource_structured(struct vkd3d_dxbc_c const struct vkd3d_shader_register *reg = &resource->reg.reg; unsigned int stride = resource->byte_stride;
- if (resource->register_space) - FIXME("Unhandled register space %u.\n", resource->register_space); + if (shader_is_sm_5_1(compiler)) + { + struct vkd3d_sm51_symbol *sym; + sym = vkd3d_calloc(1, sizeof(*sym)); + sym->key.idx = resource->reg.reg.idx[0].offset; + sym->key.descriptor_type = resource->reg.reg.type == VKD3DSPR_UAV ? VKD3D_SHADER_DESCRIPTOR_TYPE_UAV : VKD3D_SHADER_DESCRIPTOR_TYPE_SRV; + sym->register_space = resource->register_space; + sym->resource_idx = resource->register_index; + if (rb_put(&compiler->sm51_resource_table, &sym->key, &sym->entry) == -1) + vkd3d_free(sym); + } + if (instruction->flags) FIXME("Unhandled UAV flags %#x.\n", instruction->flags);
@@ -8717,6 +8872,7 @@ void vkd3d_dxbc_compiler_destroy(struct vkd3d_dxbc_compiler *compiler) vkd3d_spirv_builder_free(&compiler->spirv_builder);
rb_destroy(&compiler->symbol_table, vkd3d_symbol_free, NULL); + rb_destroy(&compiler->sm51_resource_table, vkd3d_sm51_symbol_free, NULL);
vkd3d_free(compiler->shader_phases); vkd3d_free(compiler->spec_constants); diff --git a/libs/vkd3d-shader/vkd3d_shader_private.h b/libs/vkd3d-shader/vkd3d_shader_private.h index 100d515..135b48b 100644 --- a/libs/vkd3d-shader/vkd3d_shader_private.h +++ b/libs/vkd3d-shader/vkd3d_shader_private.h @@ -615,6 +615,7 @@ struct vkd3d_shader_semantic enum vkd3d_data_type resource_data_type; struct vkd3d_shader_dst_param reg; unsigned int register_space; + unsigned int register_index; };
enum vkd3d_shader_input_sysval_semantic @@ -662,6 +663,7 @@ struct vkd3d_shader_register_semantic struct vkd3d_shader_sampler { struct vkd3d_shader_src_param src; + unsigned int register_index; unsigned int register_space; };
@@ -669,6 +671,7 @@ struct vkd3d_shader_constant_buffer { struct vkd3d_shader_src_param src; unsigned int size; + unsigned int register_index; unsigned int register_space; };
@@ -676,12 +679,14 @@ struct vkd3d_shader_structured_resource { struct vkd3d_shader_dst_param reg; unsigned int byte_stride; + unsigned int register_index; unsigned int register_space; };
struct vkd3d_shader_raw_resource { struct vkd3d_shader_dst_param dst; + unsigned int register_index; unsigned int register_space; };
diff --git a/libs/vkd3d/command.c b/libs/vkd3d/command.c index 8a7ff66..7245802 100644 --- a/libs/vkd3d/command.c +++ b/libs/vkd3d/command.c @@ -2655,7 +2655,7 @@ static void d3d12_command_list_update_descriptor_table(struct d3d12_command_list const struct d3d12_root_descriptor_table *descriptor_table; const struct d3d12_root_descriptor_table_range *range; VkDevice vk_device = list->device->vk_device; - unsigned int i, j, descriptor_count; + unsigned int i, j, k, descriptor_count; struct d3d12_desc *descriptor;
descriptor_table = root_signature_get_descriptor_table(root_signature, index); @@ -2678,14 +2678,26 @@ static void d3d12_command_list_update_descriptor_table(struct d3d12_command_list unsigned int register_idx = range->base_register_idx + j;
/* Track UAV counters. */ - if (range->descriptor_magic == VKD3D_DESCRIPTOR_MAGIC_UAV - && register_idx < ARRAY_SIZE(bindings->vk_uav_counter_views)) + if (list->state->uav_counter_mask != 0 && range->descriptor_magic == VKD3D_DESCRIPTOR_MAGIC_UAV) { - VkBufferView vk_counter_view = descriptor->magic == VKD3D_DESCRIPTOR_MAGIC_UAV - ? descriptor->u.view->vk_counter_view : VK_NULL_HANDLE; - if (bindings->vk_uav_counter_views[register_idx] != vk_counter_view) - bindings->uav_counter_dirty_mask |= 1u << register_idx; - bindings->vk_uav_counter_views[register_idx] = vk_counter_view; + const struct vkd3d_shader_uav_counter_binding *counter_bindings = list->state->uav_counters; + for (k = 0; k < VKD3D_SHADER_MAX_UNORDERED_ACCESS_VIEWS; k++) + { + if (list->state->uav_counter_mask & (1u << k)) + { + if (counter_bindings->register_space == range->register_space && + counter_bindings->register_index == register_idx) + { + VkBufferView vk_counter_view = descriptor->magic == VKD3D_DESCRIPTOR_MAGIC_UAV + ? descriptor->u.view->vk_counter_view : VK_NULL_HANDLE; + if (bindings->vk_uav_counter_views[k] != vk_counter_view) + bindings->uav_counter_dirty_mask |= 1u << k; + bindings->vk_uav_counter_views[k] = vk_counter_view; + break; + } + counter_bindings++; + } + } }
if (!vk_write_descriptor_set_from_d3d12_desc(current_descriptor_write, @@ -2841,7 +2853,7 @@ static void d3d12_command_list_update_uav_counter_descriptors(struct d3d12_comma const struct vkd3d_shader_uav_counter_binding *uav_counter = &state->uav_counters[i]; const VkBufferView *vk_uav_counter_views = bindings->vk_uav_counter_views;
- assert(vk_uav_counter_views[uav_counter->register_index]); + assert(vk_uav_counter_views[uav_counter->counter_index]);
vk_descriptor_writes[i].sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET; vk_descriptor_writes[i].pNext = NULL; @@ -2852,7 +2864,7 @@ static void d3d12_command_list_update_uav_counter_descriptors(struct d3d12_comma vk_descriptor_writes[i].descriptorType = VK_DESCRIPTOR_TYPE_STORAGE_TEXEL_BUFFER; vk_descriptor_writes[i].pImageInfo = NULL; vk_descriptor_writes[i].pBufferInfo = NULL; - vk_descriptor_writes[i].pTexelBufferView = &vk_uav_counter_views[uav_counter->register_index]; + vk_descriptor_writes[i].pTexelBufferView = &vk_uav_counter_views[uav_counter->counter_index]; }
VK_CALL(vkUpdateDescriptorSets(vk_device, uav_counter_count, vk_descriptor_writes, 0, NULL)); diff --git a/libs/vkd3d/state.c b/libs/vkd3d/state.c index e1f7da9..9add56b 100644 --- a/libs/vkd3d/state.c +++ b/libs/vkd3d/state.c @@ -309,12 +309,6 @@ static bool vk_binding_from_d3d12_descriptor_range(struct VkDescriptorSetLayoutB = vk_descriptor_type_from_d3d12_range_type(descriptor_range->RangeType, is_buffer); binding_desc->descriptorCount = 1;
-    if (descriptor_range->RegisterSpace)
-    {
-        FIXME("Unhandled register space %u.\n", descriptor_range->RegisterSpace);
-        return false;
-    }
-
     binding_desc->stageFlags = stage_flags_from_visibility(shader_visibility);
     binding_desc->pImmutableSamplers = NULL;
@@ -495,12 +489,6 @@ static HRESULT d3d12_root_signature_init_push_constants(struct d3d12_root_signat if (p->ParameterType != D3D12_ROOT_PARAMETER_TYPE_32BIT_CONSTANTS) continue;
- if (p->u.Constants.RegisterSpace) - { - FIXME("Unhandled register space %u for parameter %u.\n", p->u.Constants.RegisterSpace, i); - return E_NOTIMPL; - } - idx = push_constant_count == 1 ? 0 : p->ShaderVisibility; offset = push_constants_offset[idx]; push_constants_offset[idx] += p->u.Constants.Num32BitValues * sizeof(uint32_t); @@ -510,6 +498,7 @@ static HRESULT d3d12_root_signature_init_push_constants(struct d3d12_root_signat ? push_constants[0].stageFlags : stage_flags_from_visibility(p->ShaderVisibility); root_constant->offset = offset;
+ root_signature->root_constants[j].register_space = p->u.Constants.RegisterSpace; root_signature->root_constants[j].register_index = p->u.Constants.ShaderRegister; root_signature->root_constants[j].shader_visibility = vkd3d_shader_visibility_from_d3d12(p->ShaderVisibility); @@ -533,7 +522,7 @@ struct vkd3d_descriptor_set_context };
static void d3d12_root_signature_append_vk_binding(struct d3d12_root_signature *root_signature, - enum vkd3d_shader_descriptor_type descriptor_type, unsigned int register_idx, + enum vkd3d_shader_descriptor_type descriptor_type, unsigned int register_space, unsigned int register_idx, bool buffer_descriptor, enum vkd3d_shader_visibility shader_visibility, struct vkd3d_descriptor_set_context *context) { @@ -541,6 +530,7 @@ static void d3d12_root_signature_append_vk_binding(struct d3d12_root_signature * = &root_signature->descriptor_mapping[context->descriptor_index++];
mapping->type = descriptor_type; + mapping->register_space = register_space; mapping->register_index = register_idx; mapping->shader_visibility = shader_visibility; mapping->flags = buffer_descriptor ? VKD3D_SHADER_BINDING_FLAG_BUFFER : VKD3D_SHADER_BINDING_FLAG_IMAGE; @@ -549,7 +539,7 @@ static void d3d12_root_signature_append_vk_binding(struct d3d12_root_signature * }
static uint32_t d3d12_root_signature_assign_vk_bindings(struct d3d12_root_signature *root_signature, - enum vkd3d_shader_descriptor_type descriptor_type, unsigned int base_register_idx, + enum vkd3d_shader_descriptor_type descriptor_type, unsigned int register_space, unsigned int base_register_idx, unsigned int binding_count, bool is_buffer_descriptor, bool duplicate_descriptors, enum vkd3d_shader_visibility shader_visibility, struct vkd3d_descriptor_set_context *context) { @@ -566,10 +556,10 @@ static uint32_t d3d12_root_signature_assign_vk_bindings(struct d3d12_root_signat { if (duplicate_descriptors) d3d12_root_signature_append_vk_binding(root_signature, descriptor_type, - base_register_idx + i, true, shader_visibility, context); + register_space, base_register_idx + i, true, shader_visibility, context);
d3d12_root_signature_append_vk_binding(root_signature, descriptor_type, - base_register_idx + i, is_buffer_descriptor, shader_visibility, context); + register_space, base_register_idx + i, is_buffer_descriptor, shader_visibility, context); } return first_binding; } @@ -625,7 +615,7 @@ static HRESULT d3d12_root_signature_init_root_descriptor_tables(struct d3d12_roo
vk_binding = d3d12_root_signature_assign_vk_bindings(root_signature, vkd3d_descriptor_type_from_d3d12_range_type(range->RangeType), - range->BaseShaderRegister, range->NumDescriptors, false, true, + range->RegisterSpace, range->BaseShaderRegister, range->NumDescriptors, false, true, vkd3d_shader_visibility_from_d3d12(p->ShaderVisibility), context);
/* Unroll descriptor range. */ @@ -658,6 +648,7 @@ static HRESULT d3d12_root_signature_init_root_descriptor_tables(struct d3d12_roo table->ranges[j].binding = vk_binding; table->ranges[j].descriptor_magic = vkd3d_descriptor_magic_from_d3d12(range->RangeType); table->ranges[j].base_register_idx = range->BaseShaderRegister; + table->ranges[j].register_space = range->RegisterSpace; } }
@@ -683,15 +674,9 @@ static HRESULT d3d12_root_signature_init_root_descriptors(struct d3d12_root_sign
root_signature->push_descriptor_mask |= 1u << i;
- if (p->u.Descriptor.RegisterSpace) - { - FIXME("Unhandled register space %u for parameter %u.\n", p->u.Descriptor.RegisterSpace, i); - return E_NOTIMPL; - } - cur_binding->binding = d3d12_root_signature_assign_vk_bindings(root_signature, vkd3d_descriptor_type_from_d3d12_root_parameter_type(p->ParameterType), - p->u.Descriptor.ShaderRegister, 1, true, false, + p->u.Descriptor.RegisterSpace, p->u.Descriptor.ShaderRegister, 1, true, false, vkd3d_shader_visibility_from_d3d12(p->ShaderVisibility), context); cur_binding->descriptorType = vk_descriptor_type_from_d3d12_root_parameter(p->ParameterType); cur_binding->descriptorCount = 1; @@ -728,7 +713,7 @@ static HRESULT d3d12_root_signature_init_static_samplers(struct d3d12_root_signa return hr;
cur_binding->binding = d3d12_root_signature_assign_vk_bindings(root_signature, - VKD3D_SHADER_DESCRIPTOR_TYPE_SAMPLER, s->ShaderRegister, 1, false, false, + VKD3D_SHADER_DESCRIPTOR_TYPE_SAMPLER, s->RegisterSpace, s->ShaderRegister, 1, false, false, vkd3d_shader_visibility_from_d3d12(s->ShaderVisibility), context); cur_binding->descriptorType = VK_DESCRIPTOR_TYPE_SAMPLER; cur_binding->descriptorCount = 1; @@ -1451,7 +1436,14 @@ static HRESULT d3d12_pipeline_state_init_compute_uav_counters(struct d3d12_pipel if (!(shader_info->uav_counter_mask & (1u << i))) continue;
+        /* UAV counters will look up Vulkan bindings based on the mask index directly.
+         * We currently don't know the actual space/binding for this UAV,
+         * but register_space/register_index are fixed up later after compilation is finished. */
+        state->uav_counters[j].register_space = 0;
         state->uav_counters[j].register_index = i;
+
+        state->uav_counters[j].counter_index = i;
+
         state->uav_counters[j].shader_visibility = VKD3D_SHADER_VISIBILITY_COMPUTE;
         state->uav_counters[j].binding.set = context.set_index;
         state->uav_counters[j].binding.binding = context.descriptor_binding;
@@ -1507,6 +1499,10 @@ static HRESULT d3d12_pipeline_state_init_compute(struct d3d12_pipeline_state *st
     struct vkd3d_shader_code dxbc;
     HRESULT hr;
     int ret;
+    unsigned int i, j;
+    unsigned int uav_counter_spaces[VKD3D_SHADER_MAX_UNORDERED_ACCESS_VIEWS] = { 0 };
+    unsigned int uav_counter_bindings[VKD3D_SHADER_MAX_UNORDERED_ACCESS_VIEWS] = { 0 };
+    struct vkd3d_shader_effective_uav_counter_binding_info uav_binding_info = { VKD3D_SHADER_STRUCTURE_TYPE_EFFECTIVE_UAV_COUNTER_BINDING_INFO };
state->ID3D12PipelineState_iface.lpVtbl = &d3d12_pipeline_state_vtbl; state->refcount = 1; @@ -1550,8 +1546,14 @@ static HRESULT d3d12_pipeline_state_init_compute(struct d3d12_pipeline_state *st shader_interface.uav_counters = state->uav_counters; shader_interface.uav_counter_count = vkd3d_popcount(state->uav_counter_mask);
+    shader_interface.next = &uav_binding_info;
+    uav_binding_info.uav_register_spaces = uav_counter_spaces;
+    uav_binding_info.uav_register_bindings = uav_counter_bindings;
+    uav_binding_info.uav_counter_count = VKD3D_SHADER_MAX_UNORDERED_ACCESS_VIEWS;
+
     vk_pipeline_layout = state->vk_pipeline_layout
             ? state->vk_pipeline_layout : root_signature->vk_pipeline_layout;
+
     if (FAILED(hr = vkd3d_create_compute_pipeline(device, &desc->CS, &shader_interface,
             vk_pipeline_layout, &state->u.compute.vk_pipeline)))
     {
@@ -1575,6 +1577,17 @@ static HRESULT d3d12_pipeline_state_init_compute(struct d3d12_pipeline_state *st
         return hr;
     }
+    /* Map back to actual space/bindings for the UAV counter now that we know. */
+    for (i = 0, j = 0; i < VKD3D_SHADER_MAX_UNORDERED_ACCESS_VIEWS; i++)
+    {
+        if (state->uav_counter_mask & (1u << i))
+        {
+            state->uav_counters[j].register_space = uav_counter_spaces[i];
+            state->uav_counters[j].register_index = uav_counter_bindings[i];
+            j++;
+        }
+    }
+
     state->vk_bind_point = VK_PIPELINE_BIND_POINT_COMPUTE;
     d3d12_device_add_ref(state->device = device);
@@ -2911,6 +2924,7 @@ HRESULT vkd3d_uav_clear_state_init(struct vkd3d_uav_clear_state *state, struct d
binding.type = VKD3D_SHADER_DESCRIPTOR_TYPE_UAV; binding.register_index = 0; + binding.register_space = 0; binding.shader_visibility = VKD3D_SHADER_VISIBILITY_COMPUTE; binding.binding.set = 0; binding.binding.binding = 0; @@ -2919,6 +2933,7 @@ HRESULT vkd3d_uav_clear_state_init(struct vkd3d_uav_clear_state *state, struct d push_constant_range.offset = 0; push_constant_range.size = sizeof(struct vkd3d_uav_clear_args);
+ push_constant.register_space = 0; push_constant.register_index = 0; push_constant.shader_visibility = VKD3D_SHADER_VISIBILITY_COMPUTE; push_constant.offset = 0; diff --git a/libs/vkd3d/vkd3d_private.h b/libs/vkd3d/vkd3d_private.h index 23abee7..86bef18 100644 --- a/libs/vkd3d/vkd3d_private.h +++ b/libs/vkd3d/vkd3d_private.h @@ -659,6 +659,7 @@ struct d3d12_root_descriptor_table_range
uint32_t descriptor_magic; unsigned int base_register_idx; + unsigned int register_space; };
struct d3d12_root_descriptor_table
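For reviewers, here is a minimal standalone sketch of the lookup the command list now performs when tracking UAV counters. It assumes, as the hunk above suggests, that state->uav_counters is packed in ascending order of the bits set in uav_counter_mask, so the binding pointer only advances on set bits. MAX_UAVS, struct counter_binding and find_counter_slot() are hypothetical names used for illustration only; they are not part of vkd3d.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for VKD3D_SHADER_MAX_UNORDERED_ACCESS_VIEWS. */
#define MAX_UAVS 8

/* Hypothetical struct: mirrors only the two fields the lookup compares. */
struct counter_binding
{
    unsigned int register_space;
    unsigned int register_index;
};

/* Walk the set bits of `mask`. `bindings` holds one entry per set bit, in
 * ascending bit order. Returns the bit position (the counter view slot) whose
 * entry matches (space, index), or -1 if no counter uses that register. */
static int find_counter_slot(uint32_t mask, const struct counter_binding *bindings,
        unsigned int space, unsigned int index)
{
    unsigned int k;

    for (k = 0; k < MAX_UAVS; ++k)
    {
        if (!(mask & (1u << k)))
            continue;
        if (bindings->register_space == space && bindings->register_index == index)
            return (int)k;
        /* Only advance on set bits, because the array is packed. */
        ++bindings;
    }
    return -1;
}
```

With a mask of 0x12 (bits 1 and 4) and packed bindings {space 7, u1} and {space 3, u4}, the first resolves to slot 1 and the second to slot 4. The returned slot is what the patch uses to index vk_uav_counter_views and uav_counter_dirty_mask.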
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
---
 tests/d3d12.c | 392 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 392 insertions(+)
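A note on the reference value checked by test_register_space_sm51: the shader multiplies the first component of every bound resource, and the test seeds those resources with 2 through 11, skipping 9 because that buffer is the output. The sketch below redoes the arithmetic on the CPU; register_space_reference() is a hypothetical helper, not something in tests/d3d12.c. Every partial product is an integer below 2^24, so it is exactly representable as a float, which is why the test can compare with == rather than a tolerance.

```c
#include <assert.h>

/* Hypothetical helper: recomputes on the CPU the value the compute shader is
 * expected to write to RWStructuredBufResult[0]. */
static float register_space_reference(void)
{
    /* One factor per resource read by the shader, in shader order:
     * CBV, Buffer, ByteAddressBuffer, StructuredBuffer, RWBuffer,
     * RWByteAddressBuffer, RWStructuredBuffer, Texture2D, RWTexture2D.
     * 9 is absent because that buffer receives the result. */
    static const float inputs[] = {2, 3, 4, 5, 6, 7, 8, 10, 11};
    float res = 1.0f;
    unsigned int i;

    for (i = 0; i < sizeof(inputs) / sizeof(inputs[0]); ++i)
        res *= inputs[i];
    return res;
}
```

The product is 4435200, the same value as the `2 * 3 * 4 * 5 * 6 * 7 * 8 * 10 * 11` expression in the test.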
diff --git a/tests/d3d12.c b/tests/d3d12.c index 5e1530a..969db3c 100644 --- a/tests/d3d12.c +++ b/tests/d3d12.c @@ -32813,6 +32813,397 @@ static void test_write_buffer_immediate(void) destroy_test_context(&context); }
+static void test_register_space_sm51(void) +{ + ID3D12DescriptorHeap *heap, *sampler_heap, *heaps[2]; + + D3D12_ROOT_SIGNATURE_DESC root_signature_desc; + D3D12_ROOT_PARAMETER root_parameters[2]; + + ID3D12Resource *input_buffers[8]; + ID3D12Resource* input_buffer_counter; + ID3D12Resource *textures[2]; + + struct resource_readback rb; + + D3D12_CONSTANT_BUFFER_VIEW_DESC cbv_desc; + D3D12_SHADER_RESOURCE_VIEW_DESC srv_desc; + D3D12_UNORDERED_ACCESS_VIEW_DESC uav_desc; + ID3D12GraphicsCommandList *command_list; + D3D12_CPU_DESCRIPTOR_HANDLE cpu_handle; + D3D12_GPU_DESCRIPTOR_HANDLE gpu_handle; + unsigned int i, descriptor_size; + D3D12_SAMPLER_DESC sampler_desc; + D3D12_SUBRESOURCE_DATA data; + struct test_context context; + ID3D12CommandQueue *queue; + HRESULT hr; + unsigned int counter_value; + + static const DWORD cs_code[] = + { +#if 0 + cbuffer CBuf : register(b1, space1) + { + float4 cbuffer_data; + }; + + Buffer<float4> Buf : register(t1, space2); + ByteAddressBuffer AddrBuf : register(t1, space3); + StructuredBuffer<float4> StructuredBuf : register(t1, space4); + RWBuffer<float4> RWBuf : register(u1, space5); + RWByteAddressBuffer RWAddrBuf : register(u1, space6); + RWStructuredBuffer<float4> RWStructuredBuf : register(u1, space7); + RWStructuredBuffer<float4> RWStructuredBufResult : register(u1, space8); + + Texture2D<float4> Tex : register(t1, space9); + RWTexture2D<float> RWTex : register(u1, space10); + SamplerState Samp : register(s1, space11); + + [numthreads(1, 1, 1)] + void main() + { + float4 res = 1.0.xxxx; + + res *= cbuffer_data; + res *= Buf[0]; + res *= asfloat(AddrBuf.Load4(0)); + res *= StructuredBuf[0]; + res *= RWBuf[0]; + res *= asfloat(RWAddrBuf.Load4(0)); + res *= RWStructuredBuf[0]; + + res *= Tex.SampleLevel(Samp, float2(0.5, 0.5), 0.0).xxxx; + res *= RWTex[int2(0, 0)].xxxx; + + RWStructuredBuf.IncrementCounter(); + RWStructuredBufResult[0] = res; + } +#endif + 0x43425844, 0x70f33bd3, 0x11527a3b, 0x08c5298b, 0x28a1f88e, 0x00000001, 
0x00000434, 0x00000004, + 0x00000030, 0x00000040, 0x00000050, 0x00000424, 0x4e475349, 0x00000008, 0x00000000, 0x00000008, + 0x4e47534f, 0x00000008, 0x00000000, 0x00000008, 0x58454853, 0x000003cc, 0x00050051, 0x000000f3, + 0x0100086a, 0x07000059, 0x00308e46, 0x00000000, 0x00000001, 0x00000001, 0x00000001, 0x00000001, + 0x0600005a, 0x00306e46, 0x00000000, 0x00000001, 0x00000001, 0x0000000b, 0x07000858, 0x00307e46, + 0x00000000, 0x00000001, 0x00000001, 0x00005555, 0x00000002, 0x060000a1, 0x00307e46, 0x00000001, + 0x00000001, 0x00000001, 0x00000003, 0x070000a2, 0x00307e46, 0x00000002, 0x00000001, 0x00000001, + 0x00000010, 0x00000004, 0x07001858, 0x00307e46, 0x00000003, 0x00000001, 0x00000001, 0x00005555, + 0x00000009, 0x0700089c, 0x0031ee46, 0x00000000, 0x00000001, 0x00000001, 0x00005555, 0x00000005, + 0x0600009d, 0x0031ee46, 0x00000001, 0x00000001, 0x00000001, 0x00000006, 0x0780009e, 0x0031ee46, + 0x00000002, 0x00000001, 0x00000001, 0x00000010, 0x00000007, 0x0700009e, 0x0031ee46, 0x00000003, + 0x00000001, 0x00000001, 0x00000010, 0x00000008, 0x0700189c, 0x0031ee46, 0x00000004, 0x00000001, + 0x00000001, 0x00005555, 0x0000000a, 0x02000068, 0x00000002, 0x0400009b, 0x00000001, 0x00000001, + 0x00000001, 0x0b00002d, 0x001000f2, 0x00000000, 0x00004002, 0x00000000, 0x00000000, 0x00000000, + 0x00000000, 0x00207e46, 0x00000000, 0x00000001, 0x09000038, 0x001000f2, 0x00000000, 0x00100e46, + 0x00000000, 0x00308e46, 0x00000000, 0x00000001, 0x00000000, 0x080000a5, 0x001000f2, 0x00000001, + 0x00004001, 0x00000000, 0x00207e46, 0x00000001, 0x00000001, 0x07000038, 0x001000f2, 0x00000000, + 0x00100e46, 0x00000000, 0x00100e46, 0x00000001, 0x0a0000a7, 0x001000f2, 0x00000001, 0x00004001, + 0x00000000, 0x00004001, 0x00000000, 0x00207e46, 0x00000002, 0x00000001, 0x07000038, 0x001000f2, + 0x00000000, 0x00100e46, 0x00000000, 0x00100e46, 0x00000001, 0x0b0000a3, 0x001000f2, 0x00000001, + 0x00004002, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x0021ee46, 0x00000000, 0x00000001, + 0x07000038, 
0x001000f2, 0x00000000, 0x00100e46, 0x00000000, 0x00100e46, 0x00000001, 0x080000a5, + 0x001000f2, 0x00000001, 0x00004001, 0x00000000, 0x0021ee46, 0x00000001, 0x00000001, 0x07000038, + 0x001000f2, 0x00000000, 0x00100e46, 0x00000000, 0x00100e46, 0x00000001, 0x0a0000a7, 0x001000f2, + 0x00000001, 0x00004001, 0x00000000, 0x00004001, 0x00000000, 0x0021ee46, 0x00000002, 0x00000001, + 0x07000038, 0x001000f2, 0x00000000, 0x00100e46, 0x00000000, 0x00100e46, 0x00000001, 0x10000048, + 0x00100012, 0x00000001, 0x00004002, 0x3f000000, 0x3f000000, 0x00000000, 0x00000000, 0x00207e46, + 0x00000003, 0x00000001, 0x00206000, 0x00000000, 0x00000001, 0x00004001, 0x00000000, 0x07000038, + 0x001000f2, 0x00000000, 0x00100e46, 0x00000000, 0x00100006, 0x00000001, 0x0b0000a3, 0x00100012, + 0x00000001, 0x00004002, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x0021ee46, 0x00000004, + 0x00000001, 0x07000038, 0x001000f2, 0x00000000, 0x00100e46, 0x00000000, 0x00100006, 0x00000001, + 0x060000b2, 0x00100012, 0x00000001, 0x0021e000, 0x00000002, 0x00000001, 0x0a0000a8, 0x0021e0f2, + 0x00000003, 0x00000001, 0x00004001, 0x00000000, 0x00004001, 0x00000000, 0x00100e46, 0x00000000, + 0x0100003e, 0x30494653, 0x00000008, 0x00000800, 0x00000000, + }; + + static const D3D12_DESCRIPTOR_RANGE_TYPE range_types[] = { + /* CBV<> */ + D3D12_DESCRIPTOR_RANGE_TYPE_CBV, + /* Buffer<> */ + D3D12_DESCRIPTOR_RANGE_TYPE_SRV, + /* ByteAddressBuffer<> */ + D3D12_DESCRIPTOR_RANGE_TYPE_SRV, + /* StructuredBuffer<> */ + D3D12_DESCRIPTOR_RANGE_TYPE_SRV, + /* RWBuffer<> */ + D3D12_DESCRIPTOR_RANGE_TYPE_UAV, + /* RWByteAddressBuffer<> */ + D3D12_DESCRIPTOR_RANGE_TYPE_UAV, + /* RWStructuredBuffer<> with atomic counter */ + D3D12_DESCRIPTOR_RANGE_TYPE_UAV, + /* RWStructuredBuffer<> */ + D3D12_DESCRIPTOR_RANGE_TYPE_UAV, + /* Texture<> */ + D3D12_DESCRIPTOR_RANGE_TYPE_SRV, + /* RWTexture<> */ + D3D12_DESCRIPTOR_RANGE_TYPE_UAV, + /* SamplerState */ + D3D12_DESCRIPTOR_RANGE_TYPE_SAMPLER, + }; + + static const float 
buffer_data[ARRAY_SIZE(range_types) - 1][D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT / sizeof(float)] = { + { 2.0f, 2.0f, 2.0f, 2.0f }, + { 3.0f, 3.0f, 3.0f, 3.0f }, + { 4.0f, 4.0f, 4.0f, 4.0f }, + { 5.0f, 5.0f, 5.0f, 5.0f }, + { 6.0f, 6.0f, 6.0f, 6.0f }, + { 7.0f, 7.0f, 7.0f, 7.0f }, + { 8.0f, 8.0f, 8.0f, 8.0f }, + { 9.0f, 9.0f, 9.0f, 9.0f }, + { 10.0f, 10.0f, 10.0f, 10.0f }, + { 11.0f, 11.0f, 11.0f, 11.0f }, + }; + + static const uint8_t zero_data[D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT] = { 0 }; + + D3D12_DESCRIPTOR_RANGE descriptor_range[ARRAY_SIZE(range_types)]; + + if (!init_compute_test_context(&context)) + return; + command_list = context.list; + queue = context.queue; + + root_signature_desc.NumParameters = 2; + root_signature_desc.Flags = 0; + root_signature_desc.NumStaticSamplers = 0; + root_signature_desc.pStaticSamplers = NULL; + root_signature_desc.pParameters = root_parameters; + + root_parameters[0].ParameterType = D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE; + root_parameters[0].ShaderVisibility = D3D12_SHADER_VISIBILITY_ALL; + root_parameters[0].DescriptorTable.NumDescriptorRanges = ARRAY_SIZE(range_types) - 1; + root_parameters[0].DescriptorTable.pDescriptorRanges = &descriptor_range[0]; + + root_parameters[1].ParameterType = D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE; + root_parameters[1].ShaderVisibility = D3D12_SHADER_VISIBILITY_ALL; + root_parameters[1].DescriptorTable.NumDescriptorRanges = 1; + root_parameters[1].DescriptorTable.pDescriptorRanges = &descriptor_range[ARRAY_SIZE(range_types) - 1]; + + memset(descriptor_range, 0, sizeof(descriptor_range)); + + for (i = 0; i < ARRAY_SIZE(range_types); i++) + { + descriptor_range[i].NumDescriptors = 1; + descriptor_range[i].BaseShaderRegister = 1; + descriptor_range[i].RegisterSpace = i + 1; + descriptor_range[i].OffsetInDescriptorsFromTableStart = (i != ARRAY_SIZE(range_types) - 1) ? 
i : 0; + descriptor_range[i].RangeType = range_types[i]; + } + + hr = create_root_signature(context.device, &root_signature_desc, &context.root_signature); + ok(SUCCEEDED(hr), "Failed to create root signature, hr %#x.\n", hr); + + context.pipeline_state = create_compute_pipeline_state(context.device, + context.root_signature, shader_bytecode(cs_code, sizeof(cs_code))); + + heap = create_gpu_descriptor_heap(context.device, D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV, ARRAY_SIZE(range_types) - 1); + sampler_heap = create_gpu_descriptor_heap(context.device, D3D12_DESCRIPTOR_HEAP_TYPE_SAMPLER, 1); + + memset(&sampler_desc, 0, sizeof(sampler_desc)); + sampler_desc.Filter = D3D12_FILTER_MIN_MAG_MIP_POINT; + sampler_desc.AddressU = D3D12_TEXTURE_ADDRESS_MODE_WRAP; + sampler_desc.AddressV = D3D12_TEXTURE_ADDRESS_MODE_WRAP; + sampler_desc.AddressW = D3D12_TEXTURE_ADDRESS_MODE_WRAP; + + cpu_handle = ID3D12DescriptorHeap_GetCPUDescriptorHandleForHeapStart(sampler_heap); + ID3D12Device_CreateSampler(context.device, &sampler_desc, cpu_handle); + + descriptor_size = ID3D12Device_GetDescriptorHandleIncrementSize(context.device, + D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV); + + /* CBV<> */ + input_buffers[0] = create_default_buffer(context.device, sizeof(buffer_data[0]), + D3D12_RESOURCE_FLAG_NONE, D3D12_RESOURCE_STATE_COPY_DEST); + /* Buffer<> */ + input_buffers[1] = create_default_buffer(context.device, sizeof(buffer_data[1]), + D3D12_RESOURCE_FLAG_NONE, D3D12_RESOURCE_STATE_COPY_DEST); + /* ByteAddressBuffer<> */ + input_buffers[2] = create_default_buffer(context.device, sizeof(buffer_data[2]), + D3D12_RESOURCE_FLAG_NONE, D3D12_RESOURCE_STATE_COPY_DEST); + /* StructuredBuffer<> */ + input_buffers[3] = create_default_buffer(context.device, sizeof(buffer_data[3]), + D3D12_RESOURCE_FLAG_NONE, D3D12_RESOURCE_STATE_COPY_DEST); + /* RWBuffer<> */ + input_buffers[4] = create_default_buffer(context.device, sizeof(buffer_data[4]), + D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS, 
D3D12_RESOURCE_STATE_COPY_DEST); + /* RWByteAddressBuffer<> */ + input_buffers[5] = create_default_buffer(context.device, sizeof(buffer_data[5]), + D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS, D3D12_RESOURCE_STATE_COPY_DEST); + /* RWStructuredBuffer<> with counter */ + input_buffers[6] = create_default_buffer(context.device, sizeof(buffer_data[6]), + D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS, D3D12_RESOURCE_STATE_COPY_DEST); + + input_buffer_counter = create_default_buffer(context.device, sizeof(buffer_data[6]), + D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS, D3D12_RESOURCE_STATE_COPY_DEST); + + /* RWStructuredBuffer<> without counter */ + input_buffers[7] = create_default_buffer(context.device, sizeof(buffer_data[7]), + D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS, D3D12_RESOURCE_STATE_COPY_DEST); + + textures[0] = create_default_texture2d(context.device, 1, 1, 1, 1, DXGI_FORMAT_R32_FLOAT, D3D12_RESOURCE_FLAG_NONE, D3D12_RESOURCE_STATE_COPY_DEST); + textures[1] = create_default_texture2d(context.device, 1, 1, 1, 1, DXGI_FORMAT_R32_FLOAT, D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS, D3D12_RESOURCE_STATE_COPY_DEST); + + cpu_handle = ID3D12DescriptorHeap_GetCPUDescriptorHandleForHeapStart(heap); + + /* CBV<> */ + cbv_desc.BufferLocation = ID3D12Resource_GetGPUVirtualAddress(input_buffers[0]); + cbv_desc.SizeInBytes = sizeof(buffer_data[0]); + ID3D12Device_CreateConstantBufferView(context.device, &cbv_desc, cpu_handle); + cpu_handle.ptr += descriptor_size; + + /* Buffer<> */ + srv_desc.Format = DXGI_FORMAT_R32G32B32A32_FLOAT; + srv_desc.ViewDimension = D3D12_SRV_DIMENSION_BUFFER; + srv_desc.Shader4ComponentMapping = D3D12_DEFAULT_SHADER_4_COMPONENT_MAPPING; + srv_desc.Buffer.Flags = D3D12_BUFFER_SRV_FLAG_NONE; + srv_desc.Buffer.FirstElement = 0; + srv_desc.Buffer.NumElements = 1; + srv_desc.Buffer.StructureByteStride = 0; + ID3D12Device_CreateShaderResourceView(context.device, input_buffers[1], &srv_desc, cpu_handle); + cpu_handle.ptr += descriptor_size; + + /* 
ByteAddressBuffer<> */ + srv_desc.Format = DXGI_FORMAT_R32_TYPELESS; + srv_desc.Buffer.Flags = D3D12_BUFFER_SRV_FLAG_RAW; + srv_desc.Buffer.FirstElement = 0; + srv_desc.Buffer.NumElements = 4; + srv_desc.Buffer.StructureByteStride = 0; + ID3D12Device_CreateShaderResourceView(context.device, input_buffers[2], &srv_desc, cpu_handle); + cpu_handle.ptr += descriptor_size; + + /* StructuredBuffer<> */ + srv_desc.Format = DXGI_FORMAT_UNKNOWN; + srv_desc.Buffer.Flags = D3D12_BUFFER_SRV_FLAG_NONE; + srv_desc.Buffer.FirstElement = 0; + srv_desc.Buffer.NumElements = 1; + srv_desc.Buffer.StructureByteStride = 16; + ID3D12Device_CreateShaderResourceView(context.device, input_buffers[3], &srv_desc, cpu_handle); + cpu_handle.ptr += descriptor_size; + + /* RWBuffer<> */ + uav_desc.Format = DXGI_FORMAT_R32G32B32A32_FLOAT; + uav_desc.ViewDimension = D3D12_UAV_DIMENSION_BUFFER; + uav_desc.Buffer.Flags = D3D12_BUFFER_UAV_FLAG_NONE; + uav_desc.Buffer.FirstElement = 0; + uav_desc.Buffer.NumElements = 1; + uav_desc.Buffer.StructureByteStride = 0; + uav_desc.Buffer.CounterOffsetInBytes = 0; + ID3D12Device_CreateUnorderedAccessView(context.device, input_buffers[4], NULL, &uav_desc, cpu_handle); + cpu_handle.ptr += descriptor_size; + + /* RWByteAddressBuffer<> */ + uav_desc.Format = DXGI_FORMAT_R32_TYPELESS; + uav_desc.Buffer.Flags = D3D12_BUFFER_UAV_FLAG_RAW; + uav_desc.Buffer.StructureByteStride = 0; + uav_desc.Buffer.NumElements = 4; + ID3D12Device_CreateUnorderedAccessView(context.device, input_buffers[5], NULL, &uav_desc, cpu_handle); + cpu_handle.ptr += descriptor_size; + + /* RWStructuredBuffer<> with counter */ + uav_desc.Format = DXGI_FORMAT_UNKNOWN; + uav_desc.Buffer.Flags = D3D12_BUFFER_UAV_FLAG_NONE; + uav_desc.Buffer.StructureByteStride = 16; + uav_desc.Buffer.NumElements = 1; + uav_desc.Buffer.CounterOffsetInBytes = 0; + ID3D12Device_CreateUnorderedAccessView(context.device, input_buffers[6], input_buffer_counter, &uav_desc, cpu_handle); + cpu_handle.ptr += descriptor_size; + 
+ /* RWStructuredBuffer<> without counter */ + uav_desc.Format = DXGI_FORMAT_UNKNOWN; + uav_desc.Buffer.Flags = D3D12_BUFFER_UAV_FLAG_NONE; + uav_desc.Buffer.StructureByteStride = 16; + uav_desc.Buffer.NumElements = 1; + uav_desc.Buffer.CounterOffsetInBytes = 0; + ID3D12Device_CreateUnorderedAccessView(context.device, input_buffers[7], NULL, &uav_desc, cpu_handle); + cpu_handle.ptr += descriptor_size; + + /* Texture */ + srv_desc.Format = DXGI_FORMAT_R32_FLOAT; + srv_desc.Shader4ComponentMapping = D3D12_DEFAULT_SHADER_4_COMPONENT_MAPPING; + srv_desc.ViewDimension = D3D12_SRV_DIMENSION_TEXTURE2D; + srv_desc.Texture2D.MipLevels = 1; + srv_desc.Texture2D.MostDetailedMip = 0; + srv_desc.Texture2D.PlaneSlice = 0; + srv_desc.Texture2D.ResourceMinLODClamp = 0; + ID3D12Device_CreateShaderResourceView(context.device, textures[0], &srv_desc, cpu_handle); + cpu_handle.ptr += descriptor_size; + + /* RWTexture */ + uav_desc.ViewDimension = D3D12_UAV_DIMENSION_TEXTURE2D; + uav_desc.Format = DXGI_FORMAT_R32_FLOAT; + uav_desc.Texture2D.MipSlice = 0; + uav_desc.Texture2D.PlaneSlice = 0; + ID3D12Device_CreateUnorderedAccessView(context.device, textures[1], NULL, &uav_desc, cpu_handle); + + for (i = 0; i < 8; i++) + { + upload_buffer_data(input_buffers[i], 0, sizeof(buffer_data[i]), buffer_data[i], queue, command_list); + reset_command_list(command_list, context.allocator); + + if (i != 0) + { + transition_resource_state(command_list, input_buffers[i], D3D12_RESOURCE_STATE_COPY_DEST, + i < 4 ? 
D3D12_RESOURCE_STATE_NON_PIXEL_SHADER_RESOURCE
+                    : D3D12_RESOURCE_STATE_UNORDERED_ACCESS);
+        }
+        else
+        {
+            transition_resource_state(command_list, input_buffers[i], D3D12_RESOURCE_STATE_COPY_DEST,
+                    D3D12_RESOURCE_STATE_VERTEX_AND_CONSTANT_BUFFER);
+        }
+    }
+
+    for (i = 0; i < 2; i++)
+    {
+        D3D12_SUBRESOURCE_DATA sub;
+        sub.pData = buffer_data[8 + i];
+        sub.RowPitch = D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT / 4;
+        sub.SlicePitch = 0;
+        upload_texture_data(textures[i], &sub, 1, queue, command_list);
+        reset_command_list(command_list, context.allocator);
+        transition_resource_state(command_list, textures[i], D3D12_RESOURCE_STATE_COPY_DEST,
+                i == 0 ? D3D12_RESOURCE_STATE_NON_PIXEL_SHADER_RESOURCE : D3D12_RESOURCE_STATE_UNORDERED_ACCESS);
+    }
+
+    upload_buffer_data(input_buffer_counter, 0, sizeof(zero_data), zero_data, queue, command_list);
+    reset_command_list(command_list, context.allocator);
+    transition_resource_state(command_list, input_buffer_counter, D3D12_RESOURCE_STATE_COPY_DEST,
+            D3D12_RESOURCE_STATE_UNORDERED_ACCESS);
+
+    ID3D12GraphicsCommandList_SetComputeRootSignature(command_list, context.root_signature);
+    ID3D12GraphicsCommandList_SetPipelineState(command_list, context.pipeline_state);
+    heaps[0] = heap; heaps[1] = sampler_heap;
+    ID3D12GraphicsCommandList_SetDescriptorHeaps(command_list, ARRAY_SIZE(heaps), heaps);
+    ID3D12GraphicsCommandList_SetComputeRootDescriptorTable(command_list, 0,
+            ID3D12DescriptorHeap_GetGPUDescriptorHandleForHeapStart(heap));
+    ID3D12GraphicsCommandList_SetComputeRootDescriptorTable(command_list, 1,
+            ID3D12DescriptorHeap_GetGPUDescriptorHandleForHeapStart(sampler_heap));
+    ID3D12GraphicsCommandList_Dispatch(command_list, 1, 1, 1);
+
+    transition_resource_state(command_list, input_buffers[7], D3D12_RESOURCE_STATE_UNORDERED_ACCESS,
+            D3D12_RESOURCE_STATE_COPY_SOURCE);
+    get_buffer_readback_with_command_list(input_buffers[7], DXGI_FORMAT_UNKNOWN, &rb, queue, command_list);
+    for (i = 0; i < 4; i++)
+    {
+        /* Start
value of 9 is for the StructuredBuffer we write to. */ + float reference = 2 * 3 * 4 * 5 * 6 * 7 * 8 * 10 * 11; + ok(get_readback_float(&rb, i, 0) == reference, "Readback value is: %f\n", get_readback_float(&rb, i, 0)); + } + release_resource_readback(&rb); + reset_command_list(command_list, context.allocator); + counter_value = read_uav_counter(&context, input_buffer_counter, 0); + ok(counter_value == 1, "Atomic counter is %u.\n", counter_value); + + for (i = 0; i < 8; i++) + ID3D12Resource_Release(input_buffers[i]); + for (i = 0; i < 2; i++) + ID3D12Resource_Release(textures[i]); + ID3D12Resource_Release(input_buffer_counter); + ID3D12DescriptorHeap_Release(heap); + ID3D12DescriptorHeap_Release(sampler_heap); + destroy_test_context(&context); +} + START_TEST(d3d12) { pfn_D3D12CreateDevice = get_d3d12_pfn(D3D12CreateDevice); @@ -32980,4 +33371,5 @@ START_TEST(d3d12) run_test(test_conditional_rendering); run_test(test_bufinfo_instruction); run_test(test_write_buffer_immediate); + run_test(test_register_space_sm51); }
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
---
 tests/d3d12.c | 187 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 187 insertions(+)
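A note on the reference values { 77, 88, 104, 120 } checked by test_constant_buffer_sm51: the same 256-byte buffer_data array backs the descriptor-table CBV, the root CBV and both sets of root constants, and only its first 16 floats are non-zero. The sketch below recomputes the shader's sum on the CPU; constant_buffer_reference() is a hypothetical helper, not part of the test. The 64-float size follows from D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT being 256 bytes.

```c
#include <assert.h>

/* Hypothetical helper: recomputes on the CPU what the compute shader is
 * expected to write to RWStructuredBuf[0]. */
static void constant_buffer_reference(float out[4])
{
    static const float base[4] = {35, 40, 50, 60};
    float data[64] = {0}; /* 256 bytes / sizeof(float) */
    const float *table_data, *root_data, *c1, *c2, *c3, *c4;
    unsigned int i;

    /* Same contents as buffer_data in the test: 1..16, the rest zero. */
    for (i = 0; i < 16; ++i)
        data[i] = (float)(i + 1);

    table_data = data;  /* descriptor-table CBV views the whole buffer */
    root_data = data;   /* root CBV points at the same buffer */
    c1 = &data[0];      /* root constants, parameter 2: &buffer_data[0], 8 values */
    c2 = &data[4];
    c3 = &data[8];      /* root constants, parameter 3: &buffer_data[8], 8 values */
    c4 = &data[12];

    for (i = 0; i < 4; ++i)
    {
        out[i] = base[i]
                + table_data[1 * 4 + i] + table_data[6 * 4 + i]
                + root_data[2 * 4 + i] + root_data[7 * 4 + i]
                + c1[i] + c2[i] + c3[i] + c4[i];
    }
}
```

table_data[6] and root_data[7] land in the zero-filled tail of the buffer, so they contribute nothing. The remaining terms give 35 + 5 + 9 + 1 + 5 + 9 + 13 = 77 for the first component, and likewise 88, 104 and 120 for the others.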
diff --git a/tests/d3d12.c b/tests/d3d12.c index 969db3c..0679b5e 100644 --- a/tests/d3d12.c +++ b/tests/d3d12.c @@ -33204,6 +33204,192 @@ static void test_register_space_sm51(void) destroy_test_context(&context); }
+static void test_constant_buffer_sm51(void) +{ + ID3D12DescriptorHeap *heap; + + D3D12_ROOT_SIGNATURE_DESC root_signature_desc; + D3D12_ROOT_PARAMETER root_parameters[5]; + D3D12_DESCRIPTOR_RANGE descriptor_range; + ID3D12Resource *input_buffer, *output_buffer; + struct resource_readback rb; + + D3D12_CONSTANT_BUFFER_VIEW_DESC cbv_desc; + ID3D12GraphicsCommandList *command_list; + D3D12_CPU_DESCRIPTOR_HANDLE cpu_handle; + D3D12_GPU_DESCRIPTOR_HANDLE gpu_handle; + unsigned int i, descriptor_size; + D3D12_SAMPLER_DESC sampler_desc; + D3D12_SUBRESOURCE_DATA data; + struct test_context context; + ID3D12CommandQueue *queue; + HRESULT hr; + unsigned int counter_value; + + static const DWORD cs_code[] = { +#if 0 + cbuffer DescriptorTableCBV : register(b2, space1) + { + float4 table_data[8]; + }; + + cbuffer RootCBV : register(b3, space2) + { + float4 root_data[8]; + }; + + cbuffer RootConstant1 : register(b4, space3) + { + float4 c1; + float4 c2; + }; + + cbuffer RootConstant2 : register(b5, space4) + { + float4 c3; + float4 c4; + }; + + RWStructuredBuffer<float4> RWStructuredBuf : register(u6, space5); + + [numthreads(1, 1, 1)] + void main() + { + float4 res = float4(35, 40, 50, 60); + res += table_data[1]; + res += table_data[6]; + res += root_data[2]; + res += root_data[7]; + res += c1; + res += c2; + res += c3; + res += c4; + RWStructuredBuf[0] = res; + } +#endif + 0x43425844, 0xb9b08cff, 0xb39daa33, 0x3d0264dc, 0x7c5a0155, 0x00000001, 0x0000025c, 0x00000003, + 0x0000002c, 0x0000003c, 0x0000004c, 0x4e475349, 0x00000008, 0x00000000, 0x00000008, 0x4e47534f, + 0x00000008, 0x00000000, 0x00000008, 0x58454853, 0x00000208, 0x00050051, 0x00000082, 0x0100086a, + 0x07000059, 0x00308e46, 0x00000000, 0x00000002, 0x00000002, 0x00000007, 0x00000001, 0x07000059, + 0x00308e46, 0x00000001, 0x00000003, 0x00000003, 0x00000008, 0x00000002, 0x07000059, 0x00308e46, + 0x00000002, 0x00000004, 0x00000004, 0x00000002, 0x00000003, 0x07000059, 0x00308e46, 0x00000003, + 0x00000005, 0x00000005, 
0x00000002, 0x00000004, 0x0700009e, 0x0031ee46, 0x00000000, 0x00000006, + 0x00000006, 0x00000010, 0x00000005, 0x02000068, 0x00000001, 0x0400009b, 0x00000001, 0x00000001, + 0x00000001, 0x0b000000, 0x001000f2, 0x00000000, 0x00308e46, 0x00000000, 0x00000002, 0x00000001, + 0x00308e46, 0x00000000, 0x00000002, 0x00000006, 0x09000000, 0x001000f2, 0x00000000, 0x00100e46, + 0x00000000, 0x00308e46, 0x00000001, 0x00000003, 0x00000002, 0x09000000, 0x001000f2, 0x00000000, + 0x00100e46, 0x00000000, 0x00308e46, 0x00000001, 0x00000003, 0x00000007, 0x09000000, 0x001000f2, + 0x00000000, 0x00100e46, 0x00000000, 0x00308e46, 0x00000002, 0x00000004, 0x00000000, 0x09000000, + 0x001000f2, 0x00000000, 0x00100e46, 0x00000000, 0x00308e46, 0x00000002, 0x00000004, 0x00000001, + 0x09000000, 0x001000f2, 0x00000000, 0x00100e46, 0x00000000, 0x00308e46, 0x00000003, 0x00000005, + 0x00000000, 0x09000000, 0x001000f2, 0x00000000, 0x00100e46, 0x00000000, 0x00308e46, 0x00000003, + 0x00000005, 0x00000001, 0x0a000000, 0x001000f2, 0x00000000, 0x00100e46, 0x00000000, 0x00004002, + 0x420c0000, 0x42200000, 0x42480000, 0x42700000, 0x0a0000a8, 0x0021e0f2, 0x00000000, 0x00000006, + 0x00004001, 0x00000000, 0x00004001, 0x00000000, 0x00100e46, 0x00000000, 0x0100003e, + }; + + static const float buffer_data[D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT / 4] = { + 1, 2, 3, 4, + 5, 6, 7, 8, + 9, 10, 11, 12, + 13, 14, 15, 16, + }; + + if (!init_compute_test_context(&context)) + return; + command_list = context.list; + queue = context.queue; + + root_signature_desc.NumParameters = 5; + root_signature_desc.Flags = 0; + root_signature_desc.NumStaticSamplers = 0; + root_signature_desc.pStaticSamplers = NULL; + root_signature_desc.pParameters = root_parameters; + + root_parameters[0].ParameterType = D3D12_ROOT_PARAMETER_TYPE_DESCRIPTOR_TABLE; + root_parameters[0].ShaderVisibility = D3D12_SHADER_VISIBILITY_ALL; + root_parameters[0].DescriptorTable.NumDescriptorRanges = 1; + 
root_parameters[0].DescriptorTable.pDescriptorRanges = &descriptor_range; + + root_parameters[1].ParameterType = D3D12_ROOT_PARAMETER_TYPE_CBV; + root_parameters[1].ShaderVisibility = D3D12_SHADER_VISIBILITY_ALL; + root_parameters[1].Descriptor.RegisterSpace = 2; + root_parameters[1].Descriptor.ShaderRegister = 3; + + descriptor_range.RegisterSpace = 1; + descriptor_range.BaseShaderRegister = 2; + descriptor_range.OffsetInDescriptorsFromTableStart = 0; + descriptor_range.NumDescriptors = 1; + descriptor_range.RangeType = D3D12_DESCRIPTOR_RANGE_TYPE_CBV; + + root_parameters[2].ParameterType = D3D12_ROOT_PARAMETER_TYPE_32BIT_CONSTANTS; + root_parameters[2].Constants.RegisterSpace = 3; + root_parameters[2].Constants.ShaderRegister = 4; + root_parameters[2].Constants.Num32BitValues = 8; + root_parameters[2].ShaderVisibility = D3D12_SHADER_VISIBILITY_ALL; + + root_parameters[3].ParameterType = D3D12_ROOT_PARAMETER_TYPE_32BIT_CONSTANTS; + root_parameters[3].Constants.RegisterSpace = 4; + root_parameters[3].Constants.ShaderRegister = 5; + root_parameters[3].Constants.Num32BitValues = 8; + root_parameters[3].ShaderVisibility = D3D12_SHADER_VISIBILITY_ALL; + + root_parameters[4].ParameterType = D3D12_ROOT_PARAMETER_TYPE_UAV; + root_parameters[4].Descriptor.RegisterSpace = 5; + root_parameters[4].Descriptor.ShaderRegister = 6; + root_parameters[4].ShaderVisibility = D3D12_SHADER_VISIBILITY_ALL; + + hr = create_root_signature(context.device, &root_signature_desc, &context.root_signature); + ok(SUCCEEDED(hr), "Failed to create root signature, hr %#x.\n", hr); + + context.pipeline_state = create_compute_pipeline_state(context.device, + context.root_signature, + shader_bytecode(cs_code, sizeof(cs_code))); + + heap = create_gpu_descriptor_heap(context.device, D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV, 1); + + input_buffer = create_default_buffer(context.device, sizeof(buffer_data), + D3D12_RESOURCE_FLAG_NONE, D3D12_RESOURCE_STATE_COPY_DEST); + output_buffer = 
create_default_buffer(context.device, sizeof(buffer_data), + D3D12_RESOURCE_FLAG_ALLOW_UNORDERED_ACCESS, D3D12_RESOURCE_STATE_UNORDERED_ACCESS); + + cpu_handle = ID3D12DescriptorHeap_GetCPUDescriptorHandleForHeapStart(heap); + cbv_desc.BufferLocation = ID3D12Resource_GetGPUVirtualAddress(input_buffer); + cbv_desc.SizeInBytes = sizeof(buffer_data); + ID3D12Device_CreateConstantBufferView(context.device, &cbv_desc, cpu_handle); + + upload_buffer_data(input_buffer, 0, sizeof(buffer_data), buffer_data, queue, command_list); + reset_command_list(command_list, context.allocator); + transition_resource_state(command_list, input_buffer, D3D12_RESOURCE_STATE_COPY_DEST, D3D12_RESOURCE_STATE_VERTEX_AND_CONSTANT_BUFFER); + + ID3D12GraphicsCommandList_SetComputeRootSignature(command_list, context.root_signature); + ID3D12GraphicsCommandList_SetPipelineState(command_list, context.pipeline_state); + ID3D12GraphicsCommandList_SetDescriptorHeaps(command_list, 1, &heap); + ID3D12GraphicsCommandList_SetComputeRootDescriptorTable(command_list, 0, + ID3D12DescriptorHeap_GetGPUDescriptorHandleForHeapStart(heap)); + ID3D12GraphicsCommandList_SetComputeRootConstantBufferView(command_list, 1, ID3D12Resource_GetGPUVirtualAddress(input_buffer)); + ID3D12GraphicsCommandList_SetComputeRoot32BitConstants(command_list, 2, 8, &buffer_data[0], 0); + ID3D12GraphicsCommandList_SetComputeRoot32BitConstants(command_list, 3, 8, &buffer_data[8], 0); + ID3D12GraphicsCommandList_SetComputeRootUnorderedAccessView(command_list, 4, ID3D12Resource_GetGPUVirtualAddress(output_buffer)); + ID3D12GraphicsCommandList_Dispatch(command_list, 1, 1, 1); + + transition_resource_state(command_list, output_buffer, D3D12_RESOURCE_STATE_UNORDERED_ACCESS, + D3D12_RESOURCE_STATE_COPY_SOURCE); + get_buffer_readback_with_command_list(output_buffer, DXGI_FORMAT_UNKNOWN, &rb, queue, command_list); + for (i = 0; i < 4; i++) + { + static const float reference[] = { 77, 88, 104, 120 }; + ok(get_readback_float(&rb, i, 0) == 
reference[i], "Readback value is: %f\n", get_readback_float(&rb, i, 0)); + } + release_resource_readback(&rb); + reset_command_list(command_list, context.allocator); + + ID3D12Resource_Release(input_buffer); + ID3D12Resource_Release(output_buffer); + ID3D12DescriptorHeap_Release(heap); + destroy_test_context(&context); +} + START_TEST(d3d12) { pfn_D3D12CreateDevice = get_d3d12_pfn(D3D12CreateDevice); @@ -33372,4 +33558,5 @@ START_TEST(d3d12) run_test(test_bufinfo_instruction); run_test(test_write_buffer_immediate); run_test(test_register_space_sm51); + run_test(test_constant_buffer_sm51); }
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
---
 libs/vkd3d-shader/vkd3d_shader_main.c | 35 +++++++++++++++++++++------
 1 file changed, 28 insertions(+), 7 deletions(-)

diff --git a/libs/vkd3d-shader/vkd3d_shader_main.c b/libs/vkd3d-shader/vkd3d_shader_main.c
index aa486cc..fb89de4 100644
--- a/libs/vkd3d-shader/vkd3d_shader_main.c
+++ b/libs/vkd3d-shader/vkd3d_shader_main.c
@@ -25,16 +25,13 @@ VKD3D_DEBUG_ENV_NAME("VKD3D_SHADER_DEBUG");
 STATIC_ASSERT(MEMBER_SIZE(struct vkd3d_shader_scan_info, uav_counter_mask) * CHAR_BIT >= VKD3D_SHADER_MAX_UNORDERED_ACCESS_VIEWS);
 STATIC_ASSERT(MEMBER_SIZE(struct vkd3d_shader_scan_info, uav_read_mask) * CHAR_BIT >= VKD3D_SHADER_MAX_UNORDERED_ACCESS_VIEWS);

-static void vkd3d_shader_dump_blob(const char *path, const char *prefix, const void *data, size_t size)
+static void vkd3d_shader_dump_blob(const char *path, const char *prefix, const void *data, size_t size,
+        unsigned int id, const char *ext)
 {
-    static int shader_id = 0;
     char filename[1024];
-    unsigned int id;
     FILE *f;

-    id = InterlockedIncrement(&shader_id) - 1;
-
-    snprintf(filename, ARRAY_SIZE(filename), "%s/vkd3d-shader-%s-%u.dxbc", path, prefix, id);
+    snprintf(filename, ARRAY_SIZE(filename), "%s/vkd3d-shader-%s-%u.%s", path, prefix, id, ext);
     if ((f = fopen(filename, "wb")))
     {
         if (fwrite(data, 1, size, f) != size)
@@ -50,6 +47,26 @@ static void vkd3d_shader_dump_blob(const char *path, const char *prefix, const v

 static void vkd3d_shader_dump_shader(enum vkd3d_shader_type type, const struct vkd3d_shader_code *shader)
 {
+    static int shader_id = 0;
+    static bool enabled = true;
+    const char *path;
+
+    if (!enabled)
+        return;
+
+    if (!(path = getenv("VKD3D_SHADER_DUMP_PATH")))
+    {
+        enabled = false;
+        return;
+    }
+
+    vkd3d_shader_dump_blob(path, shader_get_type_prefix(type), shader->code, shader->size,
+            InterlockedIncrement(&shader_id) - 1, "dxbc");
+}
+
+static void vkd3d_shader_dump_spirv_shader(enum vkd3d_shader_type type, const struct vkd3d_shader_code *shader)
+{
+    static int shader_id = 0;
     static bool enabled = true;
     const char *path;

@@ -62,7 +79,8 @@ static void vkd3d_shader_dump_shader(enum vkd3d_shader_type type, const struct v
         return;
     }

-    vkd3d_shader_dump_blob(path, shader_get_type_prefix(type), shader->code, shader->size);
+    vkd3d_shader_dump_blob(path, shader_get_type_prefix(type), shader->code, shader->size,
+            InterlockedIncrement(&shader_id) - 1, "spv");
 }

 struct vkd3d_shader_parser
@@ -190,6 +208,9 @@ int vkd3d_shader_compile_dxbc(const struct vkd3d_shader_code *dxbc,
     if (ret >= 0)
         ret = vkd3d_dxbc_compiler_generate_spirv(spirv_compiler, spirv);

+    if (ret == 0)
+        vkd3d_shader_dump_spirv_shader(parser.shader_version.type, spirv);
+
     vkd3d_dxbc_compiler_destroy(spirv_compiler);
     vkd3d_shader_parser_destroy(&parser);
     return ret;
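For readers skimming the diff, the file naming it introduces can be sketched in isolation. This is only an illustration: format_dump_filename is a hypothetical helper, not a function in vkd3d, and it reuses just the format string from the snprintf call in the patch.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not part of vkd3d). Builds
 * <path>/vkd3d-shader-<prefix>-<id>.<ext> with the same format string the
 * patch passes to snprintf. Returns 1 on success, 0 if the name did not fit. */
static int format_dump_filename(char *buf, size_t size, const char *path,
        const char *prefix, unsigned int id, const char *ext)
{
    int len = snprintf(buf, size, "%s/vkd3d-shader-%s-%u.%s", path, prefix, id, ext);

    /* snprintf returns the length the full string would have had, so a value
     * of size or more means the output was truncated. */
    return len >= 0 && (size_t)len < size;
}
```

Since the patch keeps one counter for the "dxbc" dumps and another for the "spv" dumps, matching numbers in the two sets of files are assigned independently of each other.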
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
---
 Makefile.am  |  4 ++--
 configure.ac | 11 +++++++++++
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/Makefile.am b/Makefile.am
index 00a5f58..1e2959f 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -83,9 +83,9 @@ libvkd3d_shader_la_SOURCES = \
 	libs/vkd3d-shader/vkd3d_shader.map \
 	libs/vkd3d-shader/vkd3d_shader_main.c \
 	libs/vkd3d-shader/vkd3d_shader_private.h
-libvkd3d_shader_la_CFLAGS = $(AM_CFLAGS) @SPIRV_TOOLS_CFLAGS@
+libvkd3d_shader_la_CFLAGS = $(AM_CFLAGS) @SPIRV_TOOLS_CFLAGS@ @dxil_spirv_c_shared_CFLAGS@
 libvkd3d_shader_la_LDFLAGS = $(AM_LDFLAGS) -version-info 1:0:0
-libvkd3d_shader_la_LIBADD = libvkd3d-common.la @SPIRV_TOOLS_LIBS@
+libvkd3d_shader_la_LIBADD = libvkd3d-common.la @SPIRV_TOOLS_LIBS@ @dxil_spirv_c_shared_LIBS@
 if HAVE_LD_VERSION_SCRIPT
 libvkd3d_shader_la_LDFLAGS += -Wl,--version-script=$(srcdir)/libs/vkd3d-shader/vkd3d_shader.map
 EXTRA_libvkd3d_shader_la_DEPENDENCIES = $(srcdir)/libs/vkd3d-shader/vkd3d_shader.map
diff --git a/configure.ac b/configure.ac
index 355aaab..a7e973e 100644
--- a/configure.ac
+++ b/configure.ac
@@ -11,6 +11,8 @@ AC_ARG_VAR([CROSSCC64], [64-bit Windows cross compiler])
 AC_ARG_WITH([xcb], AS_HELP_STRING([--with-xcb], [Build with XCB library (default: test)]))
 AC_ARG_WITH([spirv-tools], AS_HELP_STRING([--with-spirv-tools],
             [Build with SPIRV-Tools library (default: disabled)]))
+AC_ARG_WITH([dxil-spirv], AS_HELP_STRING([--with-dxil-spirv],
+            [Build with dxil-spirv library for DXIL support (default: enabled)]))
 AC_ARG_ENABLE([demos],
               AS_HELP_STRING([--enable-demos], [Build demo programs (default: disabled)]),,
               [enable_demos=no])
@@ -113,6 +115,13 @@ AS_IF([test "x$with_xcb" != "xno"],
       HAVE_XCB=yes],
       [HAVE_XCB=no])])

+HAVE_DXIL_SPV=no
+AS_IF([test "x$with_dxil_spirv" != "xno"],
+      [PKG_CHECK_MODULES([dxil_spirv_c_shared], [dxil-spirv-c-shared],
+                         [AC_DEFINE([HAVE_DXIL_SPV], [1], [Define to 1 if you have dxil-spirv.])
+                         HAVE_DXIL_SPV=yes], [HAVE_DXIL_SPV=no])],
+      [HAVE_DXIL_SPV=no])
+
 dnl Check for functions
 VKD3D_CHECK_FUNC([HAVE_BUILTIN_CLZ], [__builtin_clz], [__builtin_clz(0)])
 VKD3D_CHECK_FUNC([HAVE_BUILTIN_POPCOUNT], [__builtin_popcount], [__builtin_popcount(0)])
@@ -129,6 +138,7 @@ AM_CONDITIONAL([BUILD_TESTS], [test "x$enable_tests" != "xno"])
 AM_CONDITIONAL([HAVE_WIDL], [test "x$WIDL" != "xno"])
 AM_CONDITIONAL([HAVE_CROSSTARGET32], [test "x$CROSSTARGET32" != "xno"])
 AM_CONDITIONAL([HAVE_CROSSTARGET64], [test "x$CROSSTARGET64" != "xno"])
+AM_CONDITIONAL([HAVE_DXIL_SPV], [test "x$HAVE_DXIL_SPV" = "xyes"])

 AC_CONFIG_FILES([Makefile])
 AC_OUTPUT
@@ -144,6 +154,7 @@ AS_ECHO(["

     Have XCB: ${HAVE_XCB}
     Have SPIRV-Tools: ${with_spirv_tools}
+    Have dxil-spirv: ${HAVE_DXIL_SPV}

     Building demos: ${enable_demos}
     Building tests: ${enable_tests}
Hi,
While running your changed tests, I think I found new failures. Being a bot and all I'm not very good at pattern recognition, so I might be wrong, but could you please double-check?
Full results can be found at: https://testbot.winehq.org/JobDetails.pl?Key=64048
Your paranoid android.
=== debian10 (build log) ===
error: patch failed: configure.ac:11 Task: Patch failed to apply
DXIL blobs use ISG1.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
---
 libs/vkd3d-shader/dxbc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libs/vkd3d-shader/dxbc.c b/libs/vkd3d-shader/dxbc.c
index b3f53ab..d1987c4 100644
--- a/libs/vkd3d-shader/dxbc.c
+++ b/libs/vkd3d-shader/dxbc.c
@@ -2096,7 +2096,7 @@ static int isgn_handler(const char *data, DWORD data_size, DWORD tag, void *ctx)
 {
     struct vkd3d_shader_signature *is = ctx;

-    if (tag != TAG_ISGN)
+    if (tag != TAG_ISGN && tag != TAG_ISG1)
         return VKD3D_OK;

     if (is->elements)
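The tag comparison in the hunk above can be illustrated on its own. The following is a minimal sketch, assuming the chunk tags are four ASCII characters packed little-endian into 32 bits, which is consistent with the 0x4e475349 ("ISGN") word visible in the test shader blob earlier in this series. MAKE_TAG and is_input_signature_tag are written here for illustration and are not copied from dxbc.c.

```c
#include <stdint.h>

/* Pack four ASCII characters into a little-endian 32-bit chunk tag.
 * Illustrative definition; assumed to match how the blob stores tags. */
#define MAKE_TAG(a, b, c, d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

#define TAG_ISGN MAKE_TAG('I', 'S', 'G', 'N')
#define TAG_ISG1 MAKE_TAG('I', 'S', 'G', '1')

/* Hypothetical predicate: returns 1 for either input signature tag, which is
 * the condition under which the patched handler goes on to parse the chunk. */
static int is_input_signature_tag(uint32_t tag)
{
    return tag == TAG_ISGN || tag == TAG_ISG1;
}
```

Note that this only covers recognising the tag; whether the ISG1 payload can be parsed exactly like ISGN is not something the sketch claims.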
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
---
 include/vkd3d_shader.h                | 2 ++
 libs/vkd3d-shader/vkd3d_shader.map    | 1 +
 libs/vkd3d-shader/vkd3d_shader_main.c | 9 +++++++++
 3 files changed, 12 insertions(+)

diff --git a/include/vkd3d_shader.h b/include/vkd3d_shader.h
index ec52e26..9e81d81 100644
--- a/include/vkd3d_shader.h
+++ b/include/vkd3d_shader.h
@@ -672,6 +672,8 @@ struct vkd3d_shader_signature_element *vkd3d_shader_find_signature_element(
         unsigned int semantic_index, unsigned int stream_index);
 void vkd3d_shader_free_shader_signature(struct vkd3d_shader_signature *signature);

+int vkd3d_shader_supports_dxil(void);
+
 #endif /* VKD3D_SHADER_NO_PROTOTYPES */

 /*
diff --git a/libs/vkd3d-shader/vkd3d_shader.map b/libs/vkd3d-shader/vkd3d_shader.map
index 74c38e1..bd3ec18 100644
--- a/libs/vkd3d-shader/vkd3d_shader.map
+++ b/libs/vkd3d-shader/vkd3d_shader.map
@@ -11,6 +11,7 @@ global:
     vkd3d_shader_parse_root_signature;
     vkd3d_shader_scan_dxbc;
     vkd3d_shader_serialize_root_signature;
+    vkd3d_shader_supports_dxil;

 local: *;
 };
diff --git a/libs/vkd3d-shader/vkd3d_shader_main.c b/libs/vkd3d-shader/vkd3d_shader_main.c
index fb89de4..234b80a 100644
--- a/libs/vkd3d-shader/vkd3d_shader_main.c
+++ b/libs/vkd3d-shader/vkd3d_shader_main.c
@@ -438,3 +438,12 @@ void vkd3d_shader_free_shader_signature(struct vkd3d_shader_signature *signature
     vkd3d_free(signature->elements);
     signature->elements = NULL;
 }
+
+int vkd3d_shader_supports_dxil(void)
+{
+#ifdef HAVE_DXIL_SPV
+    return 1;
+#else
+    return 0;
+#endif
+}
Need this to support subgroup operations for SM 6.0.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
---
 libs/vkd3d/device.c        | 22 +++++++++++++++++++++-
 libs/vkd3d/utils.c         |  3 +++
 libs/vkd3d/vkd3d_private.h |  3 +++
 3 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/libs/vkd3d/device.c b/libs/vkd3d/device.c
index 4484e80..28430fb 100644
--- a/libs/vkd3d/device.c
+++ b/libs/vkd3d/device.c
@@ -481,6 +481,7 @@ static HRESULT vkd3d_instance_init(struct vkd3d_instance *instance,
     VkInstance vk_instance;
     VkResult vr;
     HRESULT hr;
+    uint32_t loader_version = VK_API_VERSION_1_0;

     TRACE("Build: %s.\n", vkd3d_build);

@@ -521,13 +522,20 @@ static HRESULT vkd3d_instance_init(struct vkd3d_instance *instance,
         return hr;
     }

+    if (vk_global_procs->vkEnumerateInstanceVersion)
+        vk_global_procs->vkEnumerateInstanceVersion(&loader_version);
+
+    /* Do not opt-in to versions we don't need yet. */
+    if (loader_version > VK_API_VERSION_1_1)
+        loader_version = VK_API_VERSION_1_1;
+
     application_info.sType = VK_STRUCTURE_TYPE_APPLICATION_INFO;
     application_info.pNext = NULL;
     application_info.pApplicationName = NULL;
     application_info.applicationVersion = 0;
     application_info.pEngineName = PACKAGE_NAME;
     application_info.engineVersion = vkd3d_get_vk_version();
-    application_info.apiVersion = VK_API_VERSION_1_0;
+    application_info.apiVersion = loader_version;

     if ((vkd3d_application_info = vkd3d_find_struct(create_info->next, APPLICATION_INFO)))
     {
@@ -592,8 +600,13 @@ static HRESULT vkd3d_instance_init(struct vkd3d_instance *instance,
     }

     instance->vk_instance = vk_instance;
+    instance->instance_version = loader_version;

     TRACE("Created Vulkan instance %p.\n", vk_instance);
+    if (loader_version == VK_API_VERSION_1_1)
+        TRACE("Created Vulkan 1.1 instance.\n");
+    else
+        TRACE("Created Vulkan 1.0 instance.\n");

     instance->refcount = 1;

@@ -1720,6 +1733,8 @@ static HRESULT vkd3d_create_vk_device(struct d3d12_device *device,
     VkDevice vk_device;
     VkResult vr;
     HRESULT hr;
+    VkPhysicalDeviceProperties device_properties;
+    bool use_vulkan_11;

     TRACE("device %p, create_info %p.\n", device, create_info);

@@ -1731,6 +1746,11 @@ static HRESULT vkd3d_create_vk_device(struct d3d12_device *device,

     device->vk_physical_device = physical_device;

+    VK_CALL(vkGetPhysicalDeviceProperties(device->vk_physical_device, &device_properties));
+    use_vulkan_11 = device_properties.apiVersion >= VK_API_VERSION_1_1 &&
+            device->vkd3d_instance->instance_version >= VK_API_VERSION_1_1;
+    device->api_version = use_vulkan_11 ? VK_API_VERSION_1_1 : VK_API_VERSION_1_0;
+
     if (FAILED(hr = vkd3d_select_queues(device->vkd3d_instance, physical_device, &device_queue_info)))
         return hr;

diff --git a/libs/vkd3d/utils.c b/libs/vkd3d/utils.c
index ad42900..cc5697b 100644
--- a/libs/vkd3d/utils.c
+++ b/libs/vkd3d/utils.c
@@ -798,6 +798,8 @@ HRESULT hresult_from_vkd3d_result(int vkd3d_result)
         ERR("Could not get global proc addr for '" #name "'.\n"); \
         return E_FAIL; \
     }
+#define MAYBE_LOAD_GLOBAL_PFN(name) \
+    procs->name = (void *)vkGetInstanceProcAddr(NULL, #name);

 HRESULT vkd3d_load_vk_global_procs(struct vkd3d_vk_global_procs *procs,
         PFN_vkGetInstanceProcAddr vkGetInstanceProcAddr)
@@ -808,6 +810,7 @@ HRESULT vkd3d_load_vk_global_procs(struct vkd3d_vk_global_procs *procs,

     LOAD_GLOBAL_PFN(vkCreateInstance)
     LOAD_GLOBAL_PFN(vkEnumerateInstanceExtensionProperties)
+    MAYBE_LOAD_GLOBAL_PFN(vkEnumerateInstanceVersion)

     TRACE("Loaded global Vulkan procs.\n");
     return S_OK;
diff --git a/libs/vkd3d/vkd3d_private.h b/libs/vkd3d/vkd3d_private.h
index 86bef18..29386c5 100644
--- a/libs/vkd3d/vkd3d_private.h
+++ b/libs/vkd3d/vkd3d_private.h
@@ -61,6 +61,7 @@ struct d3d12_resource;
 struct vkd3d_vk_global_procs
 {
     PFN_vkCreateInstance vkCreateInstance;
+    PFN_vkEnumerateInstanceVersion vkEnumerateInstanceVersion;
     PFN_vkEnumerateInstanceExtensionProperties vkEnumerateInstanceExtensionProperties;
     PFN_vkGetInstanceProcAddr vkGetInstanceProcAddr;
 };
@@ -135,6 +136,7 @@ enum vkd3d_config_flags
 struct vkd3d_instance
 {
     VkInstance vk_instance;
+    uint32_t instance_version;
     struct vkd3d_vk_instance_procs vk_procs;

     PFN_vkd3d_signal_event signal_event;
@@ -1116,6 +1118,7 @@ struct d3d12_device
     LONG refcount;

     VkDevice vk_device;
+    uint32_t api_version;
     VkPhysicalDevice vk_physical_device;
     struct vkd3d_vk_device_procs vk_procs;
     PFN_vkd3d_signal_event signal_event;
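The version negotiation in this patch boils down to two small decisions, sketched here without the Vulkan headers. MAKE_VERSION follows the bit layout of Vulkan's VK_MAKE_VERSION; clamp_instance_version and select_device_version are hypothetical names for logic that the patch writes inline.

```c
#include <stdint.h>

/* Same bit layout as VK_MAKE_VERSION: major from bit 22, minor in bits 12-21,
 * patch in the low 12 bits. Redefined so the sketch builds standalone. */
#define MAKE_VERSION(major, minor, patch) \
    (((uint32_t)(major) << 22) | ((uint32_t)(minor) << 12) | (uint32_t)(patch))

#define API_VERSION_1_0 MAKE_VERSION(1, 0, 0)
#define API_VERSION_1_1 MAKE_VERSION(1, 1, 0)

/* Hypothetical helper: never request more than 1.1 from the loader, even if
 * it reports something newer. */
static uint32_t clamp_instance_version(uint32_t loader_version)
{
    return loader_version > API_VERSION_1_1 ? API_VERSION_1_1 : loader_version;
}

/* Hypothetical helper: the device is treated as 1.1 only when both the
 * instance and the physical device report at least 1.1. */
static uint32_t select_device_version(uint32_t instance_version, uint32_t device_version)
{
    if (instance_version >= API_VERSION_1_1 && device_version >= API_VERSION_1_1)
        return API_VERSION_1_1;
    return API_VERSION_1_0;
}
```

In the patch, a loader without vkEnumerateInstanceVersion leaves loader_version at 1.0, which corresponds to passing API_VERSION_1_0 as the instance version here.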
Does not cover every case for SM 6.0, but it's a useful start.
Signed-off-by: Hans-Kristian Arntzen <post@arntzen-software.no>
---
 libs/vkd3d/device.c        | 47 +++++++++++++++++++++++++++++++++++---
 libs/vkd3d/vkd3d_private.h |  1 +
 2 files changed, 45 insertions(+), 3 deletions(-)

diff --git a/libs/vkd3d/device.c b/libs/vkd3d/device.c
index 28430fb..82355f1 100644
--- a/libs/vkd3d/device.c
+++ b/libs/vkd3d/device.c
@@ -699,6 +699,7 @@ struct vkd3d_physical_device_info
     VkPhysicalDeviceTexelBufferAlignmentPropertiesEXT texel_buffer_alignment_properties;
     VkPhysicalDeviceTransformFeedbackPropertiesEXT xfb_properties;
     VkPhysicalDeviceVertexAttributeDivisorPropertiesEXT vertex_divisor_properties;
+    VkPhysicalDeviceSubgroupProperties subgroup_properties;

     VkPhysicalDeviceProperties2KHR properties2;

@@ -730,6 +731,7 @@ static void vkd3d_physical_device_info_init(struct vkd3d_physical_device_info *i
     VkPhysicalDeviceTransformFeedbackPropertiesEXT *xfb_properties;
     VkPhysicalDevice physical_device = device->vk_physical_device;
     VkPhysicalDeviceTransformFeedbackFeaturesEXT *xfb_features;
+    VkPhysicalDeviceSubgroupProperties *subgroup_properties;
     struct vkd3d_vulkan_info *vulkan_info = &device->vk_info;

     memset(info, 0, sizeof(*info));
@@ -745,6 +747,7 @@ static void vkd3d_physical_device_info_init(struct vkd3d_physical_device_info *i
     vertex_divisor_properties = &info->vertex_divisor_properties;
     xfb_features = &info->xfb_features;
     xfb_properties = &info->xfb_properties;
+    subgroup_properties = &info->subgroup_properties;

     info->features2.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FEATURES_2;

@@ -780,6 +783,8 @@ static void vkd3d_physical_device_info_init(struct vkd3d_physical_device_info *i
     vk_prepend_struct(&info->properties2, xfb_properties);
     vertex_divisor_properties->sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VERTEX_ATTRIBUTE_DIVISOR_PROPERTIES_EXT;
     vk_prepend_struct(&info->properties2, vertex_divisor_properties);
+    subgroup_properties->sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_SUBGROUP_PROPERTIES;
+    vk_prepend_struct(&info->properties2, subgroup_properties);

     if (vulkan_info->KHR_get_physical_device_properties2)
         VK_CALL(vkGetPhysicalDeviceProperties2KHR(physical_device, &info->properties2));
@@ -1293,6 +1298,42 @@ static void vkd3d_init_feature_level(struct vkd3d_vulkan_info *vk_info,
     TRACE("Max feature level: %#x.\n", vk_info->max_feature_level);
 }

+static void vkd3d_init_shader_model(uint32_t api_version,
+        struct vkd3d_vulkan_info *vulkan_info,
+        struct vkd3d_physical_device_info *physical_device_info)
+{
+    /* SHUFFLE is required to implement WaveReadLaneAt with dynamically uniform index before SPIR-V 1.5 / Vulkan 1.2. */
+    static const VkSubgroupFeatureFlags required =
+            VK_SUBGROUP_FEATURE_ARITHMETIC_BIT |
+            VK_SUBGROUP_FEATURE_BASIC_BIT |
+            VK_SUBGROUP_FEATURE_BALLOT_BIT |
+            VK_SUBGROUP_FEATURE_SHUFFLE_BIT |
+            VK_SUBGROUP_FEATURE_QUAD_BIT |
+            VK_SUBGROUP_FEATURE_VOTE_BIT;
+
+    static const VkSubgroupFeatureFlags required_stages =
+            VK_SHADER_STAGE_COMPUTE_BIT |
+            VK_SHADER_STAGE_FRAGMENT_BIT;
+
+    if (api_version >= VK_API_VERSION_1_1 &&
+            vkd3d_shader_supports_dxil() &&
+            physical_device_info->subgroup_properties.subgroupSize >= 4 &&
+            (physical_device_info->subgroup_properties.supportedOperations & required) == required &&
+            (physical_device_info->subgroup_properties.supportedStages & required_stages) == required_stages)
+    {
+        /* TODO: Add checks for all the other features which are required to implement SM 6.0.
+         * - 16-bit arithmetic / storage.
+         */
+        vulkan_info->max_shader_model = D3D_SHADER_MODEL_6_0;
+        TRACE("Enabling support for SM 6.0.\n");
+    }
+    else
+    {
+        vulkan_info->max_shader_model = D3D_SHADER_MODEL_5_1;
+        TRACE("Enabling support for SM 5.1.\n");
+    }
+}
+
 static HRESULT vkd3d_init_device_caps(struct d3d12_device *device,
         const struct vkd3d_device_create_info *create_info,
         struct vkd3d_physical_device_info *physical_device_info,
@@ -1481,6 +1522,8 @@ static HRESULT vkd3d_init_device_caps(struct d3d12_device *device,
         features->robustBufferAccess = VK_FALSE;
     }

+    vkd3d_init_shader_model(device->api_version, vulkan_info, physical_device_info);
+
     return S_OK;
 }

@@ -2703,9 +2746,7 @@ static HRESULT STDMETHODCALLTYPE d3d12_device_CheckFeatureSupport(ID3D12Device *
             }

             TRACE("Request shader model %#x.\n", data->HighestShaderModel);
-
-            data->HighestShaderModel = D3D_SHADER_MODEL_5_1;
-
+            data->HighestShaderModel = min(data->HighestShaderModel, device->vk_info.max_shader_model);
             TRACE("Shader model %#x.\n", data->HighestShaderModel);
             return S_OK;
         }
diff --git a/libs/vkd3d/vkd3d_private.h b/libs/vkd3d/vkd3d_private.h
index 29386c5..c032e05 100644
--- a/libs/vkd3d/vkd3d_private.h
+++ b/libs/vkd3d/vkd3d_private.h
@@ -126,6 +126,7 @@ struct vkd3d_vulkan_info
     enum vkd3d_shader_target_extension shader_extensions[VKD3D_MAX_SHADER_EXTENSIONS];

     D3D_FEATURE_LEVEL max_feature_level;
+    D3D_SHADER_MODEL max_shader_model;
 };

 enum vkd3d_config_flags
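The SM 6.0 gate in this patch is a "all required bits present" test followed by a clamp of the model the application asks for. Below is a standalone sketch of that logic. The flag values are copied by hand to mirror VkSubgroupFeatureFlagBits, VkShaderStageFlagBits and D3D_SHADER_MODEL so that no Vulkan or D3D12 headers are needed; struct subgroup_caps, pick_max_shader_model and clamp_shader_model are hypothetical names, not vkd3d code.

```c
#include <stdint.h>

/* Hand-copied values, assumed to mirror the Vulkan flag bits of the same
 * meaning. */
enum
{
    SUBGROUP_BASIC      = 0x01,
    SUBGROUP_VOTE       = 0x02,
    SUBGROUP_ARITHMETIC = 0x04,
    SUBGROUP_BALLOT     = 0x08,
    SUBGROUP_SHUFFLE    = 0x10,
    SUBGROUP_QUAD       = 0x80,
    STAGE_FRAGMENT      = 0x10,
    STAGE_COMPUTE       = 0x20,
};

/* Same numeric encoding as D3D_SHADER_MODEL_5_1 / D3D_SHADER_MODEL_6_0. */
enum shader_model { SM_5_1 = 0x51, SM_6_0 = 0x60 };

/* Hypothetical stand-in for the fields the patch reads from
 * VkPhysicalDeviceSubgroupProperties. */
struct subgroup_caps
{
    uint32_t size;
    uint32_t operations;
    uint32_t stages;
};

static enum shader_model pick_max_shader_model(int have_vulkan_11, int have_dxil,
        const struct subgroup_caps *caps)
{
    const uint32_t required_ops = SUBGROUP_ARITHMETIC | SUBGROUP_BASIC | SUBGROUP_BALLOT
            | SUBGROUP_SHUFFLE | SUBGROUP_QUAD | SUBGROUP_VOTE;
    const uint32_t required_stages = STAGE_COMPUTE | STAGE_FRAGMENT;

    /* (x & mask) == mask is true only when every bit of mask is set in x. */
    if (have_vulkan_11 && have_dxil && caps->size >= 4
            && (caps->operations & required_ops) == required_ops
            && (caps->stages & required_stages) == required_stages)
        return SM_6_0;
    return SM_5_1;
}

/* Mirrors the min() in CheckFeatureSupport: report the lower of what the
 * application asked about and what the device can do. */
static enum shader_model clamp_shader_model(enum shader_model requested,
        enum shader_model max_supported)
{
    return requested < max_supported ? requested : max_supported;
}
```

With this shape, a device missing any single required operation (for example SHUFFLE) falls back to SM 5.1 rather than advertising partial SM 6.0 support.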
On Wed, Jan 29, 2020 at 12:51:25PM +0100, Hans-Kristian Arntzen wrote:
> Hans-Kristian Arntzen (41):
> 16 files changed, 6544 insertions(+), 724 deletions(-)
I can't speak to the content of the series, but this is a huuuuuge amount of code to dump all at once. I'd suggest sending series of about 4-5 patches at a time.
Andrew
On 1/29/20 2:54 PM, Andrew Eikum wrote:
> On Wed, Jan 29, 2020 at 12:51:25PM +0100, Hans-Kristian Arntzen wrote:
>> Hans-Kristian Arntzen (41):
>> 16 files changed, 6544 insertions(+), 724 deletions(-)
> I can't speak to the content of the series, but this is a huuuuuge amount of code to dump all at once. I'd suggest sending series of about 4-5 patches at a time.
> Andrew
Hi,
I wanted to have a look and the patches don't apply cleanly either (it fails at #02, but later as well).
Also I tried to build dxil-spirv on Debian, and it is missing some LLVM include paths in the CMakeLists.txt. Fixing that is easy, but then it also requires another include fix to build with llvm-10. Then I tried several other versions: LLVM 8, 9 and 10 seem to build fine, but it starts to fail compiling with earlier versions of LLVM.
In my experience, building a project against the LLVM C++ API makes it very brittle, as they tend to change their API quite often. Projects fall behind very quickly because they cannot keep up with the speed of LLVM development, and they become very hard to build.
I'm not sure what the target is, but since the LLVM IR format has been advertised as stable since 3.7 (and the DXIL format probably even more so), depending on an unstable API to parse it goes a bit against that.
A better solution, IMHO, would be to have a custom parser for DXIL. Although it's a bit more work, the project would be more self-contained and much more portable.
Cheers,
On 1/29/20 3:09 PM, Rémi Bernon wrote:
> On 1/29/20 2:54 PM, Andrew Eikum wrote:
>> On Wed, Jan 29, 2020 at 12:51:25PM +0100, Hans-Kristian Arntzen wrote:
>>> Hans-Kristian Arntzen (41):
>>> 16 files changed, 6544 insertions(+), 724 deletions(-)
>> I can't speak to the content of the series, but this is a huuuuuge amount of code to dump all at once. I'd suggest sending series of about 4-5 patches at a time.
>> Andrew
> Hi,
> I wanted to have a look and the patches don't apply cleanly either (it fails at #02, but later as well).
There seem to have been some commits in between master and the patchset, which caused issues when I tried to apply the patches to a clean master. There doesn't seem to be much of a mood for getting another 41 commits on the ML, so I'm just uploading a tarball instead, if that's okay. Alternatively, a branch is here: https://github.com/HansKristian-Work/vkd3d/tree/dxil-review.
> Also I tried to build dxil-spirv on Debian, and it is missing some LLVM include paths in the CMakeLists.txt. Fixing that is easy, but then it also requires another include fix to build with llvm-10. Then I tried several other versions: LLVM 8, 9 and 10 seem to build fine, but it starts to fail compiling with earlier versions of LLVM.
> In my experience, building a project against the LLVM C++ API makes it very brittle, as they tend to change their API quite often. Projects fall behind very quickly because they cannot keep up with the speed of LLVM development, and they become very hard to build.
> I'm not sure what the target is, but since the LLVM IR format has been advertised as stable since 3.7 (and the DXIL format probably even more so), depending on an unstable API to parse it goes a bit against that.
> A better solution, IMHO, would be to have a custom parser for DXIL. Although it's a bit more work, the project would be more self-contained and much more portable.
Yes, that's the ideal solution. I am aware of this problem, but I had hoped to defer this. Writing a good BC parser from scratch is likely another several man-months of work on top, unless someone has already made useful progress on that.
Cheers, Hans-Kristian
> Cheers,
On 1/29/20 4:31 PM, Hans-Kristian Arntzen wrote:
> On 1/29/20 3:09 PM, Rémi Bernon wrote:
>> On 1/29/20 2:54 PM, Andrew Eikum wrote:
>>> On Wed, Jan 29, 2020 at 12:51:25PM +0100, Hans-Kristian Arntzen wrote:
>>>> Hans-Kristian Arntzen (41):
>>>> 16 files changed, 6544 insertions(+), 724 deletions(-)
>>> I can't speak to the content of the series, but this is a huuuuuge amount of code to dump all at once. I'd suggest sending series of about 4-5 patches at a time.
>>> Andrew
>> Hi,
>> I wanted to have a look and the patches don't apply cleanly either (it fails at #02, but later as well).
> There seem to have been some commits in between master and the patchset, which caused issues when I tried to apply the patches to a clean master. There doesn't seem to be much of a mood for getting another 41 commits on the ML, so I'm just uploading a tarball instead, if that's okay. Alternatively, a branch is here: https://github.com/HansKristian-Work/vkd3d/tree/dxil-review.
>> Also I tried to build dxil-spirv on Debian, and it is missing some LLVM include paths in the CMakeLists.txt. Fixing that is easy, but then it also requires another include fix to build with llvm-10. Then I tried several other versions: LLVM 8, 9 and 10 seem to build fine, but it starts to fail compiling with earlier versions of LLVM.
>> In my experience, building a project against the LLVM C++ API makes it very brittle, as they tend to change their API quite often. Projects fall behind very quickly because they cannot keep up with the speed of LLVM development, and they become very hard to build.
>> I'm not sure what the target is, but since the LLVM IR format has been advertised as stable since 3.7 (and the DXIL format probably even more so), depending on an unstable API to parse it goes a bit against that.
>> A better solution, IMHO, would be to have a custom parser for DXIL. Although it's a bit more work, the project would be more self-contained and much more portable.
> Yes, that's the ideal solution. I am aware of this problem, but I had hoped to defer this. Writing a good BC parser from scratch is likely another several man-months of work on top, unless someone has already made useful progress on that.
FWIW, on dxil-spirv master now, LLVM is used as a submodule instead and everything is linked statically, so any incompatibility should be gone.
Cheers, Hans-Kristian
> Cheers, Hans-Kristian
>> Cheers,
Hello Hans-Kristian,
On 1/29/20 9:31 AM, Hans-Kristian Arntzen wrote:
> On 1/29/20 3:09 PM, Rémi Bernon wrote:
>> On 1/29/20 2:54 PM, Andrew Eikum wrote:
>>> On Wed, Jan 29, 2020 at 12:51:25PM +0100, Hans-Kristian Arntzen wrote:
>>>> Hans-Kristian Arntzen (41):
>>>> 16 files changed, 6544 insertions(+), 724 deletions(-)
>>> I can't speak to the content of the series, but this is a huuuuuge amount of code to dump all at once. I'd suggest sending series of about 4-5 patches at a time.
>>> Andrew
>> Hi,
>> I wanted to have a look and the patches don't apply cleanly either (it fails at #02, but later as well).
> There seem to have been some commits in between master and the patchset, which caused issues when I tried to apply the patches to a clean master. There doesn't seem to be much of a mood for getting another 41 commits on the ML, so I'm just uploading a tarball instead, if that's okay. Alternatively, a branch is here: https://github.com/HansKristian-Work/vkd3d/tree/dxil-review.
Any update on this?
I can't speak for Henri, but I expect that it would be preferred to resend the rebased patches to the mailing list, just in smaller batches (5 at a time, generally).
In response to your reply to Andrew Eikum, I suspect that submitting tests for DXIL before trying to submit an implementation would certainly be welcome.
>> Also I tried to build dxil-spirv on Debian, and it is missing some LLVM include paths in the CMakeLists.txt. Fixing that is easy, but then it also requires another include fix to build with llvm-10. Then I tried several other versions: LLVM 8, 9 and 10 seem to build fine, but it starts to fail compiling with earlier versions of LLVM.
>> In my experience, building a project against the LLVM C++ API makes it very brittle, as they tend to change their API quite often. Projects fall behind very quickly because they cannot keep up with the speed of LLVM development, and they become very hard to build.
>> I'm not sure what the target is, but since the LLVM IR format has been advertised as stable since 3.7 (and the DXIL format probably even more so), depending on an unstable API to parse it goes a bit against that.
>> A better solution, IMHO, would be to have a custom parser for DXIL. Although it's a bit more work, the project would be more self-contained and much more portable.
> Yes, that's the ideal solution. I am aware of this problem, but I had hoped to defer this. Writing a good BC parser from scratch is likely another several man-months of work on top, unless someone has already made useful progress on that.
Again, I can't speak for the vkd3d maintainer, but I suspect that they are not going to be particularly happy about separating the DXIL parser into a separate library, having to review C++ code, or the method of dependency management that you have chosen (namely, building third-party libraries not as dynamic, external dependencies; see [1] for further rationale).
[1] https://wiki.debian.org/UpstreamGuide
> Cheers, Hans-Kristian
>> Cheers,
On 1/29/20 2:54 PM, Andrew Eikum wrote:
> On Wed, Jan 29, 2020 at 12:51:25PM +0100, Hans-Kristian Arntzen wrote:
>> Hans-Kristian Arntzen (41):
>> 16 files changed, 6544 insertions(+), 724 deletions(-)
> I can't speak to the content of the series, but this is a huuuuuge amount of code to dump all at once. I'd suggest sending series of about 4-5 patches at a time.
FWIW, 90% of that diff size is just adding test DXIL blobs alongside DXBC. Would it help if we focus on getting the DXIL tests in first?
Cheers, Hans-Kristian
> Andrew