On 10-10-22 02:09, Zebediah Figura wrote:
I haven't done a detailed review of these patches, but looking over them at a high level, the general approach seems right for all of these, so I think it would make sense to start submitting them (a few at a time).
Sure!
I am thinking on prepending some patches with field-level comments for the data structures too. To put code where my mouth is (regarding my suggestion of field-level comments), and so that the fields usage is clear as the patch series progresses.
I do have a few high-level comments that might be good to address sooner than later, though.
From "vkd3d-shader/hlsl: Span variables across multiple register spaces.":
- "The 'reg' field is still used for numeric register spaces. The
HLSL_SPACE_DEFAULT value can be used to indicate this." — My inclination is that we don't want to do this. We should just use one array, probably an array of hlsl_reg structures (even though writemask is meaningless for anything non-numeric). Same for reg_size vs reg_space_size.
I agree, I will try to go in that direction.
The only thing to keep in mind is that it implies a very intrusive patch (or couple of patches) that change all uses of the 'reg' and 'reg_size' fields, so that should be considered when rebasing.
- HLSL_SPACE_* naming could be improved, I think. It's written to match
the bytecode, but I'd advocate for SAMPLER and TEXTURE instead of S and T, and probably NUMERIC instead of DEFAULT. [Although... hmm. While writing this I realized that this does make it a bit easier to mix up HLSL_TYPE_TEXTURE and HLSL_SPACE_TEXTURE. Still...]
When I was first implementing this patch I indeed went for HLSL_SPACE_SAMPLER and HLSL_SPACE_TEXTURE, but because textures in SM1 use HLSL_SPACE_SAMPLER too, I thought that may lead to confusion. Also, these enum values get used in many places and I found the abbreviated names to be far less verbose, and harder to confuse with the HLSL_TYPE_* as you mentioned. But if you still want, I will go back to use those names.
I decided to name this "register space" as "DEFAULT" because I am also using it as a return value for hlsl_type_get_reg_space() for struct types, that may or may not contain objects within their fields.
Maybe I should separate HLSL_SPACE_DEFAULT into HLSL_SPACE_NUMERIC, that encompasses the numeric register spaces according to the situation (constant, temps, or semantics) and HLSL_SPACE_MULTI for struct types; but I am not yet sure about it.
- Also, "reg_space" is... a bit unfortunate, since "(register) space"
also means something else (specifically, sm 5.1 register space). Unfortunately what we're dealing with is namespaces on namespaces. Some other names that occur to me are "set" (although that one means something else for Vulkan), "namespace", "reg_type". "Set" is probably my favourite, especially since that matches the d3dx9 terminology. I'm open to bikeshedding this one though.
I like register "set". Another alternative may be register "group", if we don't want them to collide with other definitions.
- We probably want to move away from storing reg_size (and hence also
reg_space_size) in the variable. I don't mind deferring that one to later, though.
I think you mean "in the type" instead of "in the variable" (?). Hmm, I don't see the problem on keeping that information in the hlsl_type, since it is constant.
Still, yeah, if we want to switch to computing them on the fly as needed, we better do that later.
From "vkd3d-shader/hlsl: Skip object components when creating input/output copies.":
- Why this patch? Is this broken now?
Yes, currently a shader that contains an input variable with an object field either reaches unreachable code or throws a failed assertion.
For instance: --- struct apple { Texture2D tex; float4 a : sem2; };
float4 main(struct apple var) : sv_target { return var.tex.Load(float3(1, 2, 3)); } ---
I think it also resulted on some segfaults with similar shaders. The reason being that prepend_input_var_copy() and append_var_output_copy() are not prepared to handle object fields.
Still, this made me realize that I should do
if (var->data_type->reg_space_size[k] > 0)
instead of
if (var->reg_space[k].count > 0)
!
Because "var->reg_space[k].count" is not set yet.
But probably, that is redundant with the checks added on that patch to prepend_input_copy() and append_output_copy() in the same patch. I have to try to remember why I didn't just keep those.
From "vkd3d-shader/hlsl: Track object components usage and allocate registers accordingly.":
- This one looks fundamentally sensible. It may be an open question
whether we want to merge the first_write/last_read fields into object_usage (and make the whole thing completely per-component rather than just per-object-component) but probably makes more sense just to leave it alone for the moment.
hmm... I may be wrong, but...
I think that... with the combined effect of:
* splitting struct copies. * splitting array copies. * splitting matrix copies. * component-wise copy propagation.
all used temp variables should be a single register now (?).
And, on the other hand, the non-temps lifetime is the same as the whole program.
So, having a first_write, last_read for every component won't be very useful considering these things.
Still, yes, probably better to analyze this in the future.
I also like having an object_usage array for each "register space" so that the offset can be used immediately to get the information about that object, instead of having to go through hlsl_reg_space_range_from_deref(), which we probably want to do after path indexes get into the sm1 and sm4 functions but we can't do easily now.
- hlsl_reg_space_range_from_deref() is odd here, because, why the count
parameter? As far as I can tell it's not used in this patch.
Well, yes, so far it is only used for doing assert(count == 1); after calling hlsl_reg_space_range_from_deref(). Which I think is a good assertion to have, because we are checking that resource arrays cannot be passed when single resources are expected.
Also, I added it for consistency with hlsl_component_index_range_from_deref(), and I thought it could be useful in the future, if we want to handle SM 5.1 resource arrays and their non-constant indexes.
But yes, if you consider it is dead code I can remove it.
- request_object_registers_for_allocation is one of those things that we
can probably do as part of track_object_components_usage(). I.e. count = max(count, index).
Yes, but I have the feeling that it is better to have this as a separate step, so that all uses of "reg_space" are in the same place, when doing register allocation, after compute_liveness(). Just to keep things separated.
The last few patches related to sorting—can these be put into sm{1,4}_sort_externs() instead of being separate functions?
Just to clarify, there are two different orderings for the externs we have to care about:
A) The register indexes they have assigned. B) The order in which they appear in the CTAB/RDEF tables.
I am afraid it is necessary to call these sorting functions before doing the allocate_objects() calls, in order to ensure that A is done imitating the native compiler.
The sm{1,4}_sort_externs() then should ensure that B is done imitating the native compiler.
I haven't yet took much time one the latter, as my understanding is that imitating A is more important for applications (if some of them make assumptions about on which register to place the resources).
If necessary I can work later on some patches that modify sm{1,4}_sort_externs() to try to achieve B.
Best regards, Francisco