It seems that the allocation order of the texture part and the sampler part is not tied together in the native compiler:
That doesn't seem like a blocker, though. We already do allocation in multiple passes; we could do a separate pass for combined samplers where necessary.
Yep, that is a reasonable solution.
The var->objects_usage[] array is created in hlsl_new_var() from type->regsize[]. So we would either have to allocate memory for it once we discover that the var is used as a combined sampler, or we would have to preemptively set type->regsize[HLSL_REGSET_TEXTURES] to the same value as type->regsize[HLSL_REGSET_SAMPLERS] for all samplers in SM4.
Or we mark internally that the variable is used as a combined sampler before RA (i.e. the same time when we're running lower_combined_samples, probably) and then set objects_usage based on that.
A problem is that we need to use var->objects_usage in track_object_components_sampler_dim() first, so that we know the sampler_dim of the new texture to be generated.
But I have an alternate solution: we could add a "requires_separate_texture" field to the anonymous struct in objects_usage[][], storing the requirement for the texture resource allocation in objects_usage[HLSL_REGSET_SAMPLERS][·].
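To make the "requires_separate_texture" idea concrete, here is a minimal sketch. All types and helpers below are simplified stand-ins that I made up for illustration; they are not the actual vkd3d-shader definitions, and the fixed-size array replaces the dynamically allocated objects_usage[] only to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins, NOT the real vkd3d-shader types. */
enum regset { REGSET_SAMPLERS, REGSET_TEXTURES, REGSET_COUNT };

#define MAX_OBJECTS 4

struct object_usage
{
    bool used;
    /* Hypothetical field: set on the sampler entry when that sampler is
     * used as a combined sampler and therefore also needs a texture. */
    bool requires_separate_texture;
};

struct var
{
    unsigned int regsize[REGSET_COUNT];
    struct object_usage objects_usage[REGSET_COUNT][MAX_OBJECTS];
};

/* Record that sampler component 'index' is sampled as a combined sampler.
 * Note that regsize[REGSET_TEXTURES] does not need to change. */
static void mark_combined_sampler(struct var *var, unsigned int index)
{
    assert(index < var->regsize[REGSET_SAMPLERS]);
    var->objects_usage[REGSET_SAMPLERS][index].used = true;
    var->objects_usage[REGSET_SAMPLERS][index].requires_separate_texture = true;
}

/* Count the texture registers a later allocation pass would have to reserve. */
static unsigned int count_required_textures(const struct var *var)
{
    unsigned int i, count = 0;

    for (i = 0; i < var->regsize[REGSET_SAMPLERS]; ++i)
    {
        if (var->objects_usage[REGSET_SAMPLERS][i].requires_separate_texture)
            ++count;
    }
    return count;
}
```

The point of the sketch is that the texture requirement lives entirely in the sampler regset's entries, so nothing has to be allocated for HLSL_REGSET_TEXTURES when the variable is created.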
See, and this is where I have to once again state that hlsl_type_get_regset() has always seemed like a fundamentally broken function to me. It never had a clear boolean answer for structs, and I'm not convinced it can have a clear boolean answer for individual variables either.
I agree in part now. IMO, we should get rid of hlsl_type_get_regset() except for when it is used on a particular deref (maybe turning it into something like hlsl_deref_get_regset(), or extending deref->offset_regset to the whole lifetime of the deref).
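Roughly, the deref-based variant could look like the sketch below. It uses toy types that I invented for the example (the real hlsl_type and hlsl_deref are more involved), and it assumes the path is a list of constant field or element indices, which is a simplification.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins, NOT the real hlsl_type / hlsl_deref structures. */
enum regset { REGSET_SAMPLERS, REGSET_TEXTURES, REGSET_UAVS, REGSET_NUMERIC };
enum type_kind { KIND_NUMERIC, KIND_SAMPLER, KIND_TEXTURE, KIND_UAV, KIND_STRUCT, KIND_ARRAY };

struct type
{
    enum type_kind kind;
    const struct type *element;        /* KIND_ARRAY: element type. */
    const struct type *const *fields;  /* KIND_STRUCT: field types. */
    size_t field_count;
};

struct deref
{
    const struct type *var_type;
    const unsigned int *path;  /* One field or element index per step. */
    size_t path_len;
};

/* Walk the path to find the type the deref actually points to. */
static const struct type *deref_get_type(const struct deref *deref)
{
    const struct type *type = deref->var_type;
    size_t i;

    for (i = 0; i < deref->path_len; ++i)
    {
        if (type->kind == KIND_ARRAY)
        {
            type = type->element;
        }
        else
        {
            assert(type->kind == KIND_STRUCT && deref->path[i] < type->field_count);
            type = type->fields[deref->path[i]];
        }
    }
    return type;
}

/* Hypothetical deref-based replacement for a type-based regset query. */
static enum regset deref_get_regset(const struct deref *deref)
{
    const struct type *type = deref_get_type(deref);

    /* An array of objects lives in the regset of its element type. */
    while (type->kind == KIND_ARRAY)
        type = type->element;

    switch (type->kind)
    {
        case KIND_SAMPLER: return REGSET_SAMPLERS;
        case KIND_TEXTURE: return REGSET_TEXTURES;
        case KIND_UAV:     return REGSET_UAVS;
        case KIND_NUMERIC: return REGSET_NUMERIC;
        default:
            /* Reaching a struct here would mean copies were not split. */
            assert(0);
            return REGSET_NUMERIC;
    }
}
```

Because the query is made on the type reached by the path, the struct case becomes an assertion failure instead of an ambiguous answer.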
Currently we are using it for two things:
1. As "the regset of the type of a value used by an instruction", which I think is the correct use. In theory, a deref should never point to a struct after we have split copies.
All these uses after the deref is lowered into a single offset can be replaced by deref->offset_regset. The uses before the lowering would still need the implementation of hlsl_type_get_regset(), but applied to the type reached by the deref's path. We would have the guarantee that it is not a struct (unless we are doing something wrong).
2. When we iterate over extern variables to either allocate registers or write the CTAB and RDEF sections. We sometimes use hlsl_type_is_resource() to ensure that they belong to a single regset.
In all these cases we should assume that each variable doesn't necessarily belong to a single regset and not call the function. Instead, iterate over all regsets and check individually if the variable is allocated, or needs to be.
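For the second use, the "iterate over all regsets" approach could be sketched like this. The structures and the allocator are hypothetical and heavily simplified (a plain per-regset counter stands in for the real register allocation), so treat it as an illustration of the loop shape only.

```c
#include <stdbool.h>

/* Simplified stand-ins, NOT the real vkd3d-shader types. */
enum regset { REGSET_SAMPLERS, REGSET_TEXTURES, REGSET_UAVS, REGSET_NUMERIC, REGSET_COUNT };

struct reg
{
    bool allocated;
    unsigned int id;
};

struct extern_var
{
    unsigned int regsize[REGSET_COUNT];  /* Registers needed per regset. */
    struct reg regs[REGSET_COUNT];       /* Allocation result per regset. */
};

/* Allocate a register range in every regset the variable occupies, instead
 * of asking which single regset the variable "belongs" to. Returns the
 * number of regsets that received a new allocation. */
static unsigned int allocate_var_registers(struct extern_var *var,
        unsigned int next_free[REGSET_COUNT])
{
    unsigned int r, count = 0;

    for (r = 0; r < REGSET_COUNT; ++r)
    {
        /* Skip regsets the variable does not use or already has a slot in. */
        if (!var->regsize[r] || var->regs[r].allocated)
            continue;

        var->regs[r].id = next_free[r];
        var->regs[r].allocated = true;
        next_free[r] += var->regsize[r];
        ++count;
    }
    return count;
}
```

Code that writes the CTAB or RDEF sections could use the same loop shape and check regs[r].allocated for each regset, rather than calling a function that assumes one regset per variable.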
---
The last change in particular is big, so I would prefer to leave these things for part 4 of this series and upstream this MR as it is. If you think we should introduce them right away, that's fine; I just hope these new patches don't end up too controversial.