Hello,
I am writing this update on what I am working on because I think it
introduces some architectural changes that are worth discussing, or at
least mentioning.
Sorry if this text is a bit dense. I hope to send all this as a merge
request in a couple of days, although we may want to avoid merging
before the code freeze ends.
More than a week ago, I started trying to implement SM1 resource loads.
For this, in particular, it is necessary to infer each sampler's
dimensions from usage (a use in a tex2D implies that the sampler is 2D,
if not previously declared as such).
This gets complicated when there are arrays of samplers, because each
sampler component can have a different dimension and yet be part of
the same variable.
To solve this, I tried introducing a field in hlsl_ir_var that would let
us store data for each component.
But this turned out to be a complicated solution, since we currently
lose the component information in the derefs once we start working with
register offsets.
I also wanted to have per-component data because I realized that, in
general, we are not handling object components properly when they are
part of a larger array/struct.
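A minimal sketch of the per-component inference described above, assuming hypothetical names (sampler_array, sampler_array_record_use) that are not the actual vkd3d structures. Each array element starts with an unknown dimension and takes the dimension implied by its first use. Treating a later conflicting use as an error is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types for illustration; not vkd3d's real ones. */
enum sampler_dim
{
    SAMPLER_DIM_UNKNOWN = 0,
    SAMPLER_DIM_2D,
    SAMPLER_DIM_3D,
    SAMPLER_DIM_CUBE,
};

#define MAX_SAMPLER_ELEMENTS 16

struct sampler_array
{
    size_t element_count;
    /* One inferred dimension per array element. */
    enum sampler_dim dims[MAX_SAMPLER_ELEMENTS];
};

/* Record that element "idx" was used by an intrinsic implying "dim"
 * (e.g. tex2D implies SAMPLER_DIM_2D). Returns false if the element
 * was already inferred to have a different dimension. */
static bool sampler_array_record_use(struct sampler_array *array,
        size_t idx, enum sampler_dim dim)
{
    assert(idx < array->element_count);
    assert(dim != SAMPLER_DIM_UNKNOWN);

    if (array->dims[idx] == SAMPLER_DIM_UNKNOWN)
    {
        array->dims[idx] = dim;
        return true;
    }
    return array->dims[idx] == dim;
}
```

Note that this only works while the element index is still known at the point of use, which is exactly the information that is lost once derefs are expressed as register offsets.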
But then I thought of an alternative solution: adding a compiler pass to
promote each object component within a large variable to a new
standalone variable, and prepend a store from this standalone variable
to the original path of the component within the large variable,
relying on copy-prop and DCE so that all references to these
components end up pointing to the new standalone variable.
This allows us to use the fields of the hlsl_ir_var (in particular, the
register allocation) for each object component, which solves the problem.
While that strategy works, I then realized that we cannot use it when
targeting SM1 profiles, because we wouldn't be replicating the CTAB
(which doesn't introduce a new variable for each component in
multi-dimensional object arrays, just increases its register size).
SM4 on the other hand seems to do this, separating each sampler and
texture as its own variable in the RDEF block.
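To illustrate the SM1 case, where a multi-dimensional object array stays a single variable with a larger register size, here is a small sketch. The function names are hypothetical, and the row-major ordering of elements is an assumption of this sketch rather than something verified against fxc output.

```c
#include <assert.h>
#include <stddef.h>

/* Number of registers covered by an object array: the product of its
 * dimensions, e.g. 6 for "sampler sam[2][3]". */
static unsigned int object_array_register_count(const unsigned int *dims,
        size_t dim_count)
{
    unsigned int count = 1;
    size_t i;

    for (i = 0; i < dim_count; ++i)
        count *= dims[i];
    return count;
}

/* Offset of one element from the first register of the array,
 * assuming row-major order (the last index varies fastest). */
static unsigned int object_array_register_offset(const unsigned int *dims,
        const unsigned int *indices, size_t dim_count)
{
    unsigned int offset = 0;
    size_t i;

    for (i = 0; i < dim_count; ++i)
    {
        assert(indices[i] < dims[i]);
        offset = offset * dims[i] + indices[i];
    }
    return offset;
}
```

Under that assumption, for `sampler sam[2][3]` the element `sam[1][0]` is at offset 3 and `sam[0][2]` is at offset 2 from the variable's base register.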
My conclusion, then, is that it is a good idea to separate each object
component into its own variable in SM4 (as long as they are named with
the correct subscripts, like "foo.tex[3]").
And while we can't do that in SM1, the good news is that SM1 doesn't
allow objects as components of larger types except within (possibly
multi-dimensional) arrays.
Samplers and textures cannot be components of structs within SM1
profiles, so arrays are the only case we have to support in hlsl_sm1.c.
This approach also has the benefit that now, all the component register
allocation information should be representable using the fields in the
hlsl_ir_var struct and the register size data within hlsl_type [1],
since each variable should only care about a single register type now.
Storing register allocation data component-wise should not be necessary.
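For the SM4 side, the subscripted names mentioned above (like "foo.tex[3]") could be built along these lines. This is only a sketch: format_component_name is a hypothetical helper, and it only handles trailing array subscripts on an already-formatted base path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Write "base" followed by one "[n]" subscript per index into "buf",
 * e.g. base "foo.tex" and indices {3} give "foo.tex[3]".
 * Returns false if the result does not fit in "size" bytes. */
static bool format_component_name(char *buf, size_t size, const char *base,
        const unsigned int *indices, size_t index_count)
{
    size_t len, i;
    int ret;

    ret = snprintf(buf, size, "%s", base);
    if (ret < 0 || (size_t)ret >= size)
        return false;
    len = (size_t)ret;

    for (i = 0; i < index_count; ++i)
    {
        ret = snprintf(buf + len, size - len, "[%u]", indices[i]);
        if (ret < 0 || (size_t)ret >= size - len)
            return false;
        len += (size_t)ret;
    }
    return true;
}
```

Each standalone variable created by the separation pass would then carry such a name, so that the RDEF entries can match what the native compiler emits.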
So I am implementing a series of patches to ensure that:
* When targeting SM4 profiles, each object component is separated into
its own variable.
* When targeting SM1 profiles, it is not allowed to declare objects as
components of structs, so each object will either be a standalone
variable or belong to a (possibly multi-dimensional) array of elements
of the same type, as declared within the shader.
* SM1 resource loads work.
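The SM1 restriction in the second point could be checked with a recursive walk over the type, roughly as below. The type representation here is a deliberately simplified, hypothetical stand-in for hlsl_type, so this is a sketch of the rule rather than the actual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical type tree; not the real hlsl_type. */
enum type_kind
{
    TYPE_NUMERIC,
    TYPE_OBJECT, /* sampler or texture */
    TYPE_ARRAY,
    TYPE_STRUCT,
};

struct type
{
    enum type_kind kind;
    const struct type *element_type;  /* used by TYPE_ARRAY */
    const struct type *const *fields; /* used by TYPE_STRUCT */
    size_t field_count;
};

/* True if an object appears anywhere inside "type". */
static bool type_contains_object(const struct type *type)
{
    size_t i;

    switch (type->kind)
    {
        case TYPE_OBJECT:
            return true;
        case TYPE_ARRAY:
            return type_contains_object(type->element_type);
        case TYPE_STRUCT:
            for (i = 0; i < type->field_count; ++i)
            {
                if (type_contains_object(type->fields[i]))
                    return true;
            }
            return false;
        default:
            return false;
    }
}

/* True if objects only appear standalone or inside (possibly
 * multi-dimensional) arrays, never inside a struct. */
static bool type_valid_for_sm1(const struct type *type)
{
    switch (type->kind)
    {
        case TYPE_ARRAY:
            return type_valid_for_sm1(type->element_type);
        case TYPE_STRUCT:
            return !type_contains_object(type);
        default:
            return true;
    }
}
```

With this rule, `sampler sam[2][3]` is accepted, while a struct with a sampler field, or an array of such structs, is rejected.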
My plan for this series, in terms of patches, is:
* Parse the tex3D() intrinsic. (already made by zf)
* Parse the tex2D() intrinsic. (already made by zf)
* Validate that objects are not components of structs in SM1.
* Properly allocate registers for object arrays for SM1.
* Infer sampler register dimensions and write declarations in SM1.
* Write resource loads in SM1.
* Separate objects as standalone variables in SM4.
* Lower combined samplers to separate sampler and texture objects for
SM4. (already made by zf)
* Lower separate sampler and texture objects to combined samplers for
SM1. (already made by zf)
It remains to check whether object arrays are handled properly in the
last two [2], and to modify these patches accordingly if they are not.
Best regards,
Francisco.
---
[1] I know we intend to remove this field later, when we move register
allocation to each sm*_write.c, but I think we probably want to replace
those precomputed register offset fields with equivalent SM-specific
functions that retrieve the register offset for each component.
[2] In SM1 profiles, when there are texture arrays and they are used
together with samplers, only one variable is created (not one for each
pair of components).
This is allowed:
```
Texture2D tex[3][2];
sampler sam;
float4 main() : SV_TARGET
{
    return tex[0][1].Sample(sam, float2(1, 2))
            + tex[1][0].Sample(sam, float2(1, 2));
}
```
This is allowed (but reported as "not yet implemented" in fxc 9 and 10):
```
Texture2D tex;
sampler sam[2][3];
float4 main() : SV_TARGET
{
    return tex.Sample(sam[0][2], float2(1, 2))
            + tex.Sample(sam[1][0], float2(1, 2));
}
```
This is not allowed:
```
Texture2D tex[3];
sampler sam[4];
float4 PSMain() : SV_TARGET
{
    return tex[1].Sample(sam[1], float2(1, 2))
            + tex[2].Sample(sam[0], float2(1, 2));
}
```
gives:
```
error X4581: Cannot use texture arrays on DX9 targets with multiple samplers.
```