On Fri Apr 21 19:46:43 2023 +0000, Zebediah Figura wrote:
Okay, it is not as bad as I imagined. Perhaps I should do the cost
assessments when I am fresh in the morning and not when I am tired.
Still, there are 5 places where it would be better to iterate over a
regset: track_object_components_usage(), calculate_resource_register_counts(), write_sm1_sampler_dcls(), write_sm4_dcl_samplers(), and write_sm4_dcl_textures(); but per-component iteration is useful for promoting resource components into separate variables for SM 5.1.

Hrm, I didn't think this through completely; I forgot that register allocation is still only done per-variable. Yeah, so it would require that every time we declare a texture. If we need to track anything per-component, it still may be best to do it that way, though.
I probably didn't understand this the first time; I thought you were
referring to the register range that's allocated, which goes from 0 to the maximum used component (which can be stored in var->regs[HLSL_REGSET_NUMERIC].count).
If I understand correctly, you are saying that in SM1 there are cases
where a uniform variable (or field) can require the size of a bool or an int according to how it is used. I am not familiar with this behavior. Can you provide an example?

Most of sm1 uses float uniforms, even where they're declared with non-float type. SM 3.0 introduces the first flow control, and its flow control instructions, and *only* those instructions, take non-float types. Specifically "if" takes a bool type, and "loop" takes an integer type. Consider the following shader:

    uniform struct { float f; bool b; int i; } a;

    float4 main() : sv_target
    {
        float x = a.f + a.i + a.b;
        if (a.b)
            x += 2;
        return x;
    }

Ultimately "a.b" is allocated to *both* the float and bool register sets. (You can get a similar effect for ints by declaring a loop that executes for i iterations, but I didn't bother with that.) This is true without the struct too, of course, but the struct shows that there are interesting consequences wrt allocation. Granted, I suppose we don't really need to do this tracking per-component; we can just do it the same way as we do textures...
That example alone doesn't seem to be triggering the allocation of both bool and float registers for `a.b` because of compiler optimizations[1], but I changed the shader a little to prevent them:
```hlsl
sampler sam;

uniform struct { float f; bool b; int i; } a;

float4 main() : sv_target
{
    float x = a.f + a.i + a.b;
    if (a.b)
        x += tex2D(sam, float2(0, 0));
    return x;
}
```
I get this register table:
```
// Name         Reg   Size
// ------------ ----- ----
// a            b0       2
// a            c0       3
// sam          s0       1
```
Note that the offset of the bool `a.b` is the same as the offset of the float `a.b`.
So, we will need a function similar to `track_object_components_usage()` but checking for the usage of bool components, to determine the size of the variable in the REGSET_BOOL (?). Still, I don't think tracking per-component would be necessary.
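
To make the sizes in the register table concrete, here is a minimal C sketch of how a per-regset size could be derived from usage. The `regset` enum, the `field_usage` struct and `regset_size()` are hypothetical names for illustration, not vkd3d code. The rule "size is one past the highest used offset" is an assumption I am inferring from the `b0 2` and `c0 3` rows, not something confirmed from the native compiler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical register sets; these names are not taken from vkd3d. */
enum regset { REGSET_FLOAT, REGSET_BOOL, REGSET_INT, REGSET_COUNT };

/* One scalar field of a uniform struct. `offset` is the field's register
 * offset within the variable and is assumed to be the same in every set. */
struct field_usage
{
    unsigned int offset;
    bool used[REGSET_COUNT];
};

/* Size of the variable in `set`: one past the highest used offset,
 * or 0 if no field is used through that register set. */
unsigned int regset_size(const struct field_usage *fields, size_t count, enum regset set)
{
    unsigned int size = 0;
    size_t i;

    assert(set < REGSET_COUNT);
    for (i = 0; i < count; ++i)
    {
        if (fields[i].used[set] && fields[i].offset + 1 > size)
            size = fields[i].offset + 1;
    }
    return size;
}

/* Usage of "a" in the shader above: f, b and i are all read as floats,
 * and only b is additionally read as a bool by the "if". */
const struct field_usage a_fields[] =
{
    {0, {true, false, false}}, /* a.f */
    {1, {true, true,  false}}, /* a.b */
    {2, {true, false, false}}, /* a.i */
};
```

Under these assumptions `regset_size(a_fields, 3, REGSET_BOOL)` gives 2 and `regset_size(a_fields, 3, REGSET_FLOAT)` gives 3, matching the `b0` and `c0` rows. Only a per-variable maximum is kept, so nothing has to be tracked per component.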
---
[1] This implies that whether to expect bools in the input signature or not is totally a decision of the compiler.
Perhaps we can get away with just putting floats in the input signature and casting to float within the shader. That may not work if some applications hardcode the input signature, but it may also be really impossible to mimic compiler optimizations 1:1. For instance, this slightly modified shader **doesn't** require bool registers:
```hlsl
sampler sam;

uniform struct { float f; bool b; int i; } a;

bool k;

float4 main() : sv_target
{
    float x = a.f + a.i + a.b;
    if (a.b || k)
        x += tex2D(sam, float2(0, 0));
    return x;
}
```