Does this series strictly depend on the first two patches by Conor?
I see now that it is not as dependent as !269 was. I removed the first two patches and just replaced it with a patch that renames `VKD3D_SM4_SWIZZLE_NONE` to `VKD3D_SM4_SWIZZLE_MASK4`.
So it actually doesn't depend on those patches at all. :) This series might have some minor conflicts with that series, but we should simply deal with those once they actually happen; it's entirely possible for this series to go in first.
bool hlsl_sm4_register_from_semantic(struct hlsl_ctx *ctx, const struct hlsl_semantic *semantic, - bool output, unsigned int *type, enum vkd3d_sm4_swizzle_type *swizzle_type, bool *has_idx) + bool output, enum vkd3d_shader_register_type *type, enum vkd3d_sm4_swizzle_type *swizzle_type, bool *has_idx)
You should update the prototype in hlsl.h as well.
@@ -2555,7 +2555,8 @@ bool hlsl_sm4_register_from_semantic(struct hlsl_ctx *ctx, const struct hlsl_sem && output == register_table[i].output && ctx->profile->type == register_table[i].shader_type) { - *type = register_table[i].type; + if (type) + *type = register_table[i].type; if (swizzle_type) *swizzle_type = register_table[i].swizzle_type; *has_idx = register_table[i].has_idx; @@ -2652,7 +2653,6 @@ static void write_sm4_signature(struct hlsl_ctx *ctx, struct dxbc_writer *dxbc, LIST_FOR_EACH_ENTRY(var, &ctx->extern_vars, struct hlsl_ir_var, extern_entry) { unsigned int width = (1u << var->data_type->dimx) - 1, use_mask; - enum vkd3d_sm4_register_type type; uint32_t usage_idx, reg_idx; D3D_NAME usage; bool has_idx; @@ -2666,14 +2666,13 @@ static void write_sm4_signature(struct hlsl_ctx *ctx, struct dxbc_writer *dxbc, continue; usage_idx = var->semantic.index; - if (hlsl_sm4_register_from_semantic(ctx, &var->semantic, output, &type, NULL, &has_idx)) + if (hlsl_sm4_register_from_semantic(ctx, &var->semantic, output, NULL, NULL, &has_idx)) { reg_idx = has_idx ? var->semantic.index : ~0u; } else { assert(var->regs[HLSL_REGSET_NUMERIC].allocated); - type = VKD3D_SM4_RT_INPUT; reg_idx = var->regs[HLSL_REGSET_NUMERIC].id; }
I don't dislike this, but it does seem like somewhat of a separate change.
+ VKD3D_SM4_REGISTER_TYPE_LAST = VKD3D_SM5_RT_OUTPUT_STENCIL_REF,
We'd typically add something like VKD3D_SM4_REGISTER_TYPE_COUNT. E.g., VKD3D_SHADER_TYPE_COUNT.
- Do the same thing, but use proper locking and/or atomic operations to ensure concurrent threads do the right thing. It can be tricky and it forces us to deal with threading API in a library that probably shouldn't. I'd avoid that.
- Do the same thing, but store the lookup tables in the parser context instead of in a static variable. There's a bit of CPU cycle wasting, but it's a small thing, and concurrent threads work on independent tables and do not step on each other.
- Don't do anything fancy, just write both lookup tables in the code. It's not DRY, so in principle I hate that, but given that the two tables are still rather small and C really tries to get in your way here, it still feels one of the best alternatives.
- Turn to the dark side and embrace the C way: the preprocessor! Create a file full of `X(VKD3D_SM4_RT_TEMP, VKD3DSPR_TEMP)` lines and include each time you need it defining `X()` appropriately each time. Once you're done with that you start doing things like `#define E } else {` and you're halfway submitting vkd3d-shader to the IOCCC. It's a great career path, though, I think you can get consultancy work from Satan himself.
Notice that with the third and fourth alternatives you can also write a `switch` statement instead of a lookup table, in theory giving your compiler even more optimization opportunities. Not sure the compiler is going to make good use of them, though.
I'd go with the second or third alternative.
Yeah. I'd go with the second option.
It's probably not that hard to make the first option work with either something like "__attribute__((constructor))" or something like pthread_once(), although there may be a few pitfalls. I doubt it's worth the effort at this point though.
+ VKD3D_SM4_DIMENSION_INVALID = ~0u,
Do we really need this? We use it in register_type_table[] for VKD3DSPR_IMMCONST and VKD3DSPR_IMMCONST64, but then overwrite it for VKD3DSPR_IMMCONST. I'm not sure it adds that much over using e.g. VKD3D_SM4_DIMENSION_NONE or VKD3D_SM4_DIMENSION_VEC4 for these.