On 8/31/21 4:32 PM, Matteo Bruni wrote:
On Mon, Aug 30, 2021 at 7:07 AM Nikolay Sivov nsivov@codeweavers.com wrote:
Signed-off-by: Nikolay Sivov nsivov@codeweavers.com
dlls/d3d10/effect.c | 172 ++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 166 insertions(+), 6 deletions(-)
Patch is okay, a few notes below.
diff --git a/dlls/d3d10/effect.c b/dlls/d3d10/effect.c index 113125e9cbd..c3c98a0da31 100644 --- a/dlls/d3d10/effect.c +++ b/dlls/d3d10/effect.c @@ -566,6 +592,127 @@ static HRESULT get_fx10_shader_resources(struct d3d10_effect_variable *v, const return S_OK; }
+struct d3d10_effect_so_decl +{
- D3D10_SO_DECLARATION_ENTRY *entries;
- SIZE_T capacity;
- SIZE_T count;
- unsigned int stride;
- char *decl;
+};
+static void d3d10_effect_cleanup_so_decl(struct d3d10_effect_so_decl *so_decl) +{
- heap_free(so_decl->entries);
- heap_free(so_decl->decl);
- memset(so_decl, 0, sizeof(*so_decl));
+}
+static HRESULT d3d10_effect_parse_stream_output_declaration(const char *decl,
struct d3d10_effect_so_decl *so_decl)
+{
- static const char * allmask = "xyzw";
Usually in HLSL you can use "rgba" in place of "xyzw" for swizzles / writemasks, it might be worth checking if those also work (and are passed through to the effect bytecode unchanged).
Yes, "rgba" also works apparently. Compiler preserves string as is, does not convert to coordinate mask names. I'll add a test for that.
- char *p, *ptr, *end, *next, *mask, *m, *slot;
- unsigned int len = strlen(decl);
- D3D10_SO_DECLARATION_ENTRY e;
- memset(so_decl, 0, sizeof(*so_decl));
- if (!(so_decl->decl = heap_alloc(len + 1)))
return E_OUTOFMEMORY;
- memcpy(so_decl->decl, decl, len + 1);
- p = so_decl->decl;
I think it would be nice to have a small comment showing the expected syntax of a stream out declaration (entry) somewhere around here. Just something to quickly glance at while reading the following code.
Okay.
- while (p && *p)
- {
memset(&e, 0, sizeof(e));
end = strchr(p, ';');
next = end ? end + 1 : p + strlen(p);
len = next - p;
if (end) len--;
/* Remove leading and trailing spaces. */
while (len && isspace(*p)) { len--; p++; }
while (len && isspace(p[len - 1])) len--;
p[len] = 0;
/* Output slot */
if ((slot = strchr(p, ':')))
{
*slot = 0;
ptr = p;
while (*ptr)
{
if (!isdigit(*ptr))
{
WARN("Invalid output slot %s.\n", debugstr_a(p));
goto failed;
}
ptr++;
}
e.OutputSlot = atoi(p);
p = slot + 1;
}
/* Mask */
if ((mask = strchr(p, '.')))
{
*mask = 0; mask++;
if (!(m = strstr(allmask, mask)))
{
WARN("Invalid component mask %s.\n", debugstr_a(mask));
goto failed;
}
e.StartComponent = m - allmask;
e.ComponentCount = strlen(mask);
Did you verify that it's illegal to have a wrongly ordered mask (e.g. something like .wzyx)? I expect that to be invalid but better to be sure. Likewise for other assumptions involved here, like repeating the same component twice.
Yes, it is disallowed to break the order. API manifestation is that you specify start and count, which assumes it's always forward looking. Regarding error, it won't pass compiler check first, complaining about "invalid mask declaration". As far as I can tell that's the only thing that compiler checks in this string, you can have non-existent semantics as much as you like. That will break effect creation later though, presumably because shader object can't be created with invalid declaration.
In general, some more tests with interesting stream out declarations would be nice. I guess you have more planned after the test in patch 5/5.
I can think of adding more variations with spaces around, and maybe with multiple entries. Do you have other ideas?
}
else
{
e.StartComponent = 0;
e.ComponentCount = 4;
}
/* Semantic index and name */
len = strlen(p);
while (isdigit(p[len - 1]))
len--;
if (p[len])
{
e.SemanticIndex = atoi(&p[len]);
p[len] = 0;
}
e.SemanticName = p;
if (!d3d_array_reserve((void **)&so_decl->entries, &so_decl->capacity, so_decl->count + 1,
sizeof(*so_decl->entries)))
goto failed;
so_decl->entries[so_decl->count++] = e;
if (e.OutputSlot == 0)
so_decl->stride += e.ComponentCount * sizeof(float);
This looks weird, but that's probably because the API for ID3D10Device_CreateGeometryShaderWithStreamOutput() is confusing (just one stride for up to 4 output buffers? What does it even mean?) and neither the documentation nor our tests seem to clarify that. Any clue what's supposed to happen with multiple output buffers?
FWIW that function's signature makes a lot more sense in d3d11.
My understanding is that in d3d10 you can specify stride only when you use one SO buffer. And if you use multiple buffers runtime will use default strides for all. For d3d11 you can set each explicitly.