```diff
+/**
+ * A structure describing the mapping of a source varying (output) register in
+ * a shader stage, to a destination varying (input) register in the following
+ * shader stage.
+ *
+ * This structure is provided as an array, and therefore the source register is
+ * named implicitly by its index in the array.
+ *
+ * This structure is used in struct vkd3d_shader_next_stage_info.
+ */
+struct vkd3d_shader_varying_map
+{
+    /** The index of the destination varying to map this register to. */
+    unsigned int dst_index;
+    /** The mask consumed by the destination register. */
+    unsigned int dst_mask;
+};
```
How do we represent registers not read by the next stage? vkd3d_shader_build_varying_map() seems to suggest we should map them to a free register, but I'm not sure that's ideal. It also sets "dst_mask" to 0, which is perhaps easier to work with. (So far that and the FIXME also seem like the only usage of "dst_mask" though...) In any case, the API documentation here leaves it unspecified.
Can we split registers? E.g., map o0.xyz in the vertex shader to v0.x and v1.zw in the pixel shader. Is that something we need to be able to do?
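To make the two questions above concrete, here is a minimal sketch. It assumes a convention the patch does not document, namely that a "dst_mask" of 0 marks an output the next stage does not read. The structure is copied from the patch; the MASK_* constants, varying_is_consumed() and example_map are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Copied from the patch under review. */
struct vkd3d_shader_varying_map
{
    unsigned int dst_index;
    unsigned int dst_mask;
};

/* Hypothetical component masks, for illustration only. */
enum { MASK_X = 0x1, MASK_Y = 0x2, MASK_Z = 0x4, MASK_W = 0x8 };

/* Hypothetical helper. Assumes (this is not specified by the patch) that
 * dst_mask == 0 means "not read by the next stage". */
static bool varying_is_consumed(const struct vkd3d_shader_varying_map *map,
        size_t count, size_t src_index)
{
    return src_index < count && map[src_index].dst_mask != 0;
}

/* Hypothetical map: o0 -> v1.xyz, o1 not read, o2 -> v0.xy.
 * The source register is the array index, as in the patch. */
static const struct vkd3d_shader_varying_map example_map[] =
{
    {1, MASK_X | MASK_Y | MASK_Z},
    {0, 0},
    {0, MASK_X | MASK_Y},
};
```

With one entry per source register, a split such as o0.xyz to v0.x and v1.zw has nowhere to go, since each entry names a single "dst_index". Supporting it would need either several entries per source register or an explicit source index and mask in each entry.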
```diff
+struct vkd3d_shader_next_stage_info
+{
+    /** Must be set to VKD3D_SHADER_STRUCTURE_TYPE_NEXT_STAGE_INFO. */
+    enum vkd3d_shader_structure_type type;
+    /** Optional pointer to a structure containing further parameters. */
+    const void *next;
+
+    /**
+     * A mapping of output varying registers in this shader stage to input
+     * varying registers in the next shader stage.
+     *
+     * If absent, vkd3d-shader will map registers directly based on their
+     * register index.
+     *
+     * This field should be provied when compiling from legacy Direct3D
+     * bytecode.
+     */
+    const struct vkd3d_shader_varying_map *varying_map;
```
"provided"
Is this strictly necessary for d3dbc shaders? I suppose "should" leaves the possibility of not providing it open, but remap_output_signature() seems to care less about it being absent than "should" may imply. Note also that shader model 1 and 2 d3dbc shaders have a natural mapping between vertex shader outputs and pixel shader inputs.
```diff
+    /**
+     * The number of registers provided in \ref varying_map.
+     */
+    unsigned int varying_map_size;
+};
```
So, "register_count"? :)
```diff
+static inline int vkd3d_bit_scan(unsigned int *x)
+{
+    int bit_offset;
+#ifdef HAVE_BUILTIN_FFS
+    bit_offset = __builtin_ffs(*x) - 1;
+#else
+    for (bit_offset = 0; bit_offset < 32; bit_offset++)
+        if (*x & (1u << bit_offset)) break;
+#endif
+    *x ^= 1u << bit_offset;
+    return bit_offset;
+}
```
"x" should probably be a uint32_t, although I think there's also an argument for making it uintptr_t sized.
I don't think we want the #else implementation; it seems preferable to #error out. On Windows we could potentially use _BitScanForward(), and POSIX provides ffs(), but even without either of those we could probably do better in plain C.
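As a rough illustration of what "better in plain C" could look like, here is a sketch using a de Bruijn multiplication for the fallback path. The names ctz32_portable() and bit_scan_sketch() are hypothetical, and the compiler checks stand in for whatever configure test the project actually uses. The sketch takes a uint32_t, uses __builtin_ctz() rather than __builtin_ffs() - 1 (equivalent for non-zero input), and asserts on a zero mask instead of shifting by an out-of-range amount.

```c
#include <assert.h>
#include <stdint.h>

#if defined(_MSC_VER)
#include <intrin.h>
#endif

/* Plain C count-trailing-zeros for a non-zero 32-bit value.
 * Isolates the lowest set bit, then uses a de Bruijn sequence as a
 * perfect hash from that bit to its position. */
static inline unsigned int ctz32_portable(uint32_t v)
{
    static const unsigned char positions[32] =
    {
         0,  1, 28,  2, 29, 14, 24,  3, 30, 22, 20, 15, 25, 17,  4,  8,
        31, 27, 13, 23, 21, 19, 16,  7, 26, 12, 18,  6, 11,  5, 10,  9,
    };
    uint32_t lowest = v & (~v + 1u);

    return positions[(uint32_t)(lowest * UINT32_C(0x077cb531)) >> 27];
}

/* Hypothetical replacement for vkd3d_bit_scan(): returns the index of the
 * lowest set bit in *x and clears that bit. *x must be non-zero. */
static inline unsigned int bit_scan_sketch(uint32_t *x)
{
    unsigned int bit_offset;
#if defined(_MSC_VER)
    unsigned long index;
#endif

    assert(*x);
#if defined(__GNUC__) || defined(__clang__)
    bit_offset = (unsigned int)__builtin_ctz(*x);
#elif defined(_MSC_VER)
    _BitScanForward(&index, *x);
    bit_offset = (unsigned int)index;
#else
    bit_offset = ctz32_portable(*x);
#endif
    *x &= *x - 1; /* Clear the lowest set bit. */
    return bit_offset;
}
```

A caller would loop with something like "while (mask) { i = bit_scan_sketch(&mask); ... }", which visits set bits from lowest to highest.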
+ * This mapping should be constructed by vkd3d_shader_build_varying_map().
Should it? If we're going to prescribe that, we might as well just pass the next stage input signature in struct vkd3d_shader_next_stage_info, and only provide a way to retrieve the output map used, whether that's vkd3d_shader_build_varying_map() or simply vkd3d_shader_scan().
```diff
+ * This function should be called twice: once, with an input count of zero, in
+ * order to retrieve the necessary size of the varyings map, and again, with an
+ * input count containing the actual number of varyings to retrieve.
+ *
+ * \remark In practice, since register indices are assigned (from scanning)
+ * according to the actual Direct3D register index, it is safe to assume that
+ * the highest register index for a legacy Direct3D shader is no higher than 11,
+ * and hence one may simply call this function with a pre-allocated array of 12
+ * varyings.
```
It might be nice to guarantee an upper bound of output_signature->element_count. That might require representing "varyings" in a way that omits unused registers, but that might not be a bad thing.
Of course, an alternative way to avoid calling vkd3d_shader_build_varying_map() twice would be to have vkd3d_shader_build_varying_map() allocate the output map itself.
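For what it's worth, the last two suggestions combined might look roughly like the sketch below. It is not the patch's API: struct sketch_element is a simplified, hypothetical stand-in for a signature element, sketch_build_varying_map() is a hypothetical name, and matching by semantic name and index is an assumption about how the map would be built. Entry i describes output signature element i, so the size is bounded by the output element count, and a "dst_mask" of 0 is again assumed to mean "not read".

```c
#include <stdlib.h>
#include <string.h>

/* Copied from the patch under review. */
struct vkd3d_shader_varying_map
{
    unsigned int dst_index;
    unsigned int dst_mask;
};

/* Hypothetical, simplified stand-in for a signature element. */
struct sketch_element
{
    const char *semantic_name;
    unsigned int semantic_index;
    unsigned int register_index;
    unsigned int mask;
};

/* Hypothetical single-call variant: allocates and returns a map with one
 * entry per output element. Returns NULL on allocation failure; the caller
 * releases the result with free(). */
static struct vkd3d_shader_varying_map *sketch_build_varying_map(
        const struct sketch_element *outputs, unsigned int output_count,
        const struct sketch_element *inputs, unsigned int input_count)
{
    struct vkd3d_shader_varying_map *map;
    unsigned int i, j;

    /* calloc() zeroes the array, so unmatched outputs keep dst_mask == 0. */
    if (!(map = calloc(output_count ? output_count : 1, sizeof(*map))))
        return NULL;

    for (i = 0; i < output_count; ++i)
    {
        for (j = 0; j < input_count; ++j)
        {
            if (outputs[i].semantic_index == inputs[j].semantic_index
                    && !strcmp(outputs[i].semantic_name, inputs[j].semantic_name))
            {
                map[i].dst_index = inputs[j].register_index;
                map[i].dst_mask = inputs[j].mask;
                break;
            }
        }
    }

    return map;
}
```

The trade-off is that the caller then needs a matching way to free the map, whereas the two-call pattern keeps allocation entirely on the caller's side.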