Signed-off-by: Matteo Bruni mbruni@codeweavers.com --- This patch originally included the same typo fixes as those caught by François some time ago, so now it's just a slight clarification of the comment. I guess it may still be of some value.
dlls/wined3d/query.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/dlls/wined3d/query.c b/dlls/wined3d/query.c index 864a3b82f0a..6c1ed1d6b5a 100644 --- a/dlls/wined3d/query.c +++ b/dlls/wined3d/query.c @@ -1681,10 +1681,11 @@ static BOOL wined3d_query_vk_issue(struct wined3d_query *query, uint32_t flags) return false; }
- /* A query needs to either begin and end inside a single render pass, - * or begin and end outside of a render pass. Occlusion queries, if issued - * outside of a render pass, are queued up and only begun when a render - * pass is started, to avoid interrupting it when the query ends. */ + /* A query needs to either begin and end inside a single render pass + * or begin and end outside of a render pass. Occlusion queries, if + * issued outside of a render pass, are queued up and only begun when + * a render pass is started, to avoid interrupting the render pass + * when the query ends. */ if (context_vk->vk_render_pass) { wined3d_query_vk_begin(query_vk, context_vk, vk_command_buffer);
Hellblade: Senua's Sacrifice is known to have one such shader and the invalid entry isn't actually used by the shader.
Signed-off-by: Matteo Bruni mbruni@codeweavers.com --- dlls/wined3d/shader.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/dlls/wined3d/shader.c b/dlls/wined3d/shader.c index 2731639d554..967060bf0a9 100644 --- a/dlls/wined3d/shader.c +++ b/dlls/wined3d/shader.c @@ -1802,7 +1802,7 @@ static HRESULT shader_get_registers_used(struct wined3d_shader *shader, DWORD co if (input_signature->elements[i].register_idx >= ARRAY_SIZE(shader->u.vs.attributes)) { WARN("Invalid input signature register index %u.\n", input_signature->elements[i].register_idx); - return WINED3DERR_INVALIDCALL; + continue; } } else if (shader_version.type == WINED3D_SHADER_TYPE_PIXEL)
On Wed, 9 Feb 2022 at 16:50, Matteo Bruni mbruni@codeweavers.com wrote:
@@ -1802,7 +1802,7 @@ static HRESULT shader_get_registers_used(struct wined3d_shader *shader, DWORD co if (input_signature->elements[i].register_idx >= ARRAY_SIZE(shader->u.vs.attributes)) { WARN("Invalid input signature register index %u.\n", input_signature->elements[i].register_idx);
return WINED3DERR_INVALIDCALL;
continue; }
I don't know what specifically happens with the application mentioned in the commit message, but note that the WARN above can be somewhat misleading. vs_4_1 and vs_5_0 support 32 input registers, but wined3d's MAX_ATTRIBS (used to size the "attributes" array above) is defined as 16.
On Wed, Feb 9, 2022 at 7:43 PM Henri Verbeet hverbeet@gmail.com wrote:
On Wed, 9 Feb 2022 at 16:50, Matteo Bruni mbruni@codeweavers.com wrote:
@@ -1802,7 +1802,7 @@ static HRESULT shader_get_registers_used(struct wined3d_shader *shader, DWORD co if (input_signature->elements[i].register_idx >= ARRAY_SIZE(shader->u.vs.attributes)) { WARN("Invalid input signature register index %u.\n", input_signature->elements[i].register_idx);
return WINED3DERR_INVALIDCALL;
continue; }
I don't know what specifically happens with the application mentioned in the commit message, but note that the WARN above can be somewhat misleading. vs_4_1 and vs_5_0 support 32 input registers, but wined3d's MAX_ATTRIBS (used to size the "attributes" array above) is defined as 16.
Oh, right. Not sure what we want to do here e.g. just raise MAX_ATTRIBS or check against a per-shader profile limit. At any rate I will look into that.
The shader referred by the commit message is a bit weird in that it has in its input signature:
05ec:trace:d3d_shader:shader_parse_signature Stream: 0, semantic: "SV_InstanceID", semantic idx: 0, sysval_semantic 0x8, type 1, register idx: 16, use_mask 0, input_mask 0x1.
but then there is no corresponding dcl_input_sv for it. I thought that was because of the 16 inputs limit but, as you point out, that's not it (and indeed it's a vs_5_0 shader). Maybe that's what happens when the HLSL shader declares an input variable that's never used, potentially because of #ifdefs and such.
Signed-off-by: Matteo Bruni mbruni@codeweavers.com --- Seen in an Hellblade: Senua's Sacrifice log and elsewhere.
dlls/wined3d/shader_sm4.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/dlls/wined3d/shader_sm4.c b/dlls/wined3d/shader_sm4.c index 5afe612da21..061ab51d024 100644 --- a/dlls/wined3d/shader_sm4.c +++ b/dlls/wined3d/shader_sm4.c @@ -1320,6 +1320,9 @@ static void *shader_sm4_init(const DWORD *byte_code, size_t byte_code_size, { struct wined3d_shader_signature_element *e = &output_signature->elements[i];
+ if (priv->shader_version.type == WINED3D_SHADER_TYPE_PIXEL + && _strnicmp(e->semantic_name, "SV_TARGET", -1)) + continue; if (e->register_idx >= ARRAY_SIZE(priv->output_map)) { WARN("Invalid output index %u.\n", e->register_idx);
Signed-off-by: Henri Verbeet hverbeet@codeweavers.com
On Wed, 9 Feb 2022 at 16:50, Matteo Bruni mbruni@codeweavers.com wrote:
@@ -1320,6 +1320,9 @@ static void *shader_sm4_init(const DWORD *byte_code, size_t byte_code_size, { struct wined3d_shader_signature_element *e = &output_signature->elements[i];
if (priv->shader_version.type == WINED3D_SHADER_TYPE_PIXEL
&& _strnicmp(e->semantic_name, "SV_TARGET", -1))
continue; if (e->register_idx >= ARRAY_SIZE(priv->output_map)) { WARN("Invalid output index %u.\n", e->register_idx);
--
Note that similar code exists in vkd3d-shader.
On Wed, Feb 9, 2022 at 7:44 PM Henri Verbeet hverbeet@gmail.com wrote:
On Wed, 9 Feb 2022 at 16:50, Matteo Bruni mbruni@codeweavers.com wrote:
@@ -1320,6 +1320,9 @@ static void *shader_sm4_init(const DWORD *byte_code, size_t byte_code_size, { struct wined3d_shader_signature_element *e = &output_signature->elements[i];
if (priv->shader_version.type == WINED3D_SHADER_TYPE_PIXEL
&& _strnicmp(e->semantic_name, "SV_TARGET", -1))
continue; if (e->register_idx >= ARRAY_SIZE(priv->output_map)) { WARN("Invalid output index %u.\n", e->register_idx);
--
Note that similar code exists in vkd3d-shader.
Right, I wrote basically the same patch for vkd3d-shader (and then forgot about it). I'll send it soon.
Signed-off-by: Matteo Bruni mbruni@codeweavers.com --- Fences might not be supported at all.
dlls/wined3d/context_gl.c | 19 +++++++------------ 1 file changed, 7 insertions(+), 12 deletions(-)
diff --git a/dlls/wined3d/context_gl.c b/dlls/wined3d/context_gl.c index 2d37e29a655..d40772cbaa0 100644 --- a/dlls/wined3d/context_gl.c +++ b/dlls/wined3d/context_gl.c @@ -1377,18 +1377,13 @@ static void wined3d_context_gl_cleanup(struct wined3d_context_gl *context_gl) { /* If we're here because we're switching away from a previously * destroyed context, acquiring a context in order to submit a fence - * is problematic. (In particular, we'd end up back here again in the - * process of switching to the newly acquired context.) */ - if (context_gl->c.destroyed) - { - gl_info->gl_ops.gl.p_glFinish(); - } - else - { - wined3d_context_gl_submit_command_fence(context_gl); - wined3d_context_gl_wait_command_fence(context_gl, - wined3d_device_gl(context_gl->c.device)->current_fence_id - 1); - } + * is problematic. In particular, we'd end up back here again in the + * process of switching to the newly acquired context. + * + * We have to wait for all pending commands to complete and we might + * not support fences, so we might as well unconditionally + * glFinish(). */ + gl_info->gl_ops.gl.p_glFinish();
if (context_gl->dummy_arbfp_prog) GL_EXTCALL(glDeleteProgramsARB(1, &context_gl->dummy_arbfp_prog));
On Wed, 9 Feb 2022 at 16:50, Matteo Bruni mbruni@codeweavers.com wrote:
@@ -1377,18 +1377,13 @@ static void wined3d_context_gl_cleanup(struct wined3d_context_gl *context_gl) { /* If we're here because we're switching away from a previously * destroyed context, acquiring a context in order to submit a fence
* is problematic. (In particular, we'd end up back here again in the
* process of switching to the newly acquired context.) */
if (context_gl->c.destroyed)
{
gl_info->gl_ops.gl.p_glFinish();
}
else
{
wined3d_context_gl_submit_command_fence(context_gl);
wined3d_context_gl_wait_command_fence(context_gl,
wined3d_device_gl(context_gl->c.device)->current_fence_id - 1);
}
* is problematic. In particular, we'd end up back here again in the
* process of switching to the newly acquired context.
*
* We have to wait for all pending commands to complete and we might
* not support fences, so we might as well unconditionally
* glFinish(). */
gl_info->gl_ops.gl.p_glFinish();
I suppose this is probably fine in practice, but glFinish() always makes me a bit uncomfortable; I'd prefer just checking for the availability of fences, like in 5/5. Taking that one step further, if fences aren't supported, I don't think there's anything we would need to wait for in the first place; we can probably just avoid the glFinish() in that case as well.
On Wed, Feb 9, 2022 at 7:44 PM Henri Verbeet hverbeet@gmail.com wrote:
On Wed, 9 Feb 2022 at 16:50, Matteo Bruni mbruni@codeweavers.com wrote:
@@ -1377,18 +1377,13 @@ static void wined3d_context_gl_cleanup(struct wined3d_context_gl *context_gl) { /* If we're here because we're switching away from a previously * destroyed context, acquiring a context in order to submit a fence
* is problematic. (In particular, we'd end up back here again in the
* process of switching to the newly acquired context.) */
if (context_gl->c.destroyed)
{
gl_info->gl_ops.gl.p_glFinish();
}
else
{
wined3d_context_gl_submit_command_fence(context_gl);
wined3d_context_gl_wait_command_fence(context_gl,
wined3d_device_gl(context_gl->c.device)->current_fence_id - 1);
}
* is problematic. In particular, we'd end up back here again in the
* process of switching to the newly acquired context.
*
* We have to wait for all pending commands to complete and we might
* not support fences, so we might as well unconditionally
* glFinish(). */
gl_info->gl_ops.gl.p_glFinish();
I suppose this is probably fine in practice, but glFinish() always makes me a bit uncomfortable; I'd prefer just checking for the availability of fences, like in 5/5. Taking that one step further, if fences aren't supported, I don't think there's anything we would need to wait for in the first place; we can probably just avoid the glFinish() in that case as well.
Good point, I'll give that a try.
Signed-off-by: Matteo Bruni mbruni@codeweavers.com --- dlls/wined3d/adapter_gl.c | 1 + dlls/wined3d/adapter_vk.c | 1 + dlls/wined3d/context_gl.c | 9 ++++++--- dlls/wined3d/query.c | 5 ----- dlls/wined3d/swapchain.c | 3 ++- dlls/wined3d/wined3d_private.h | 6 ++++++ 6 files changed, 16 insertions(+), 9 deletions(-)
diff --git a/dlls/wined3d/adapter_gl.c b/dlls/wined3d/adapter_gl.c index e903c67b56a..4c55177f7be 100644 --- a/dlls/wined3d/adapter_gl.c +++ b/dlls/wined3d/adapter_gl.c @@ -5159,6 +5159,7 @@ static void wined3d_adapter_gl_init_d3d_info(struct wined3d_adapter_gl *adapter_ d3d_info->scaled_resolve = !!gl_info->supported[EXT_FRAMEBUFFER_MULTISAMPLE_BLIT_SCALED]; d3d_info->pbo = !!gl_info->supported[ARB_PIXEL_BUFFER_OBJECT]; d3d_info->subpixel_viewport = gl_info->limits.viewport_subpixel_bits >= 8; + d3d_info->fences = wined3d_fence_supported(gl_info); d3d_info->feature_level = feature_level_from_caps(gl_info, &shader_caps, &fragment_caps); d3d_info->filling_convention_offset = gl_info->filling_convention_offset;
diff --git a/dlls/wined3d/adapter_vk.c b/dlls/wined3d/adapter_vk.c index 343abac1004..c894d9ebe79 100644 --- a/dlls/wined3d/adapter_vk.c +++ b/dlls/wined3d/adapter_vk.c @@ -2306,6 +2306,7 @@ static void wined3d_adapter_vk_init_d3d_info(struct wined3d_adapter_vk *adapter_ d3d_info->pbo = true; d3d_info->feature_level = feature_level_from_caps(&shader_caps); d3d_info->subpixel_viewport = true; + d3d_info->fences = true;
/* Like GL, Vulkan doesn't explicitly specify a filling convention and only mandates that a * shared edge of two adjacent triangles generate a fragment for exactly one of the triangles. diff --git a/dlls/wined3d/context_gl.c b/dlls/wined3d/context_gl.c index d40772cbaa0..d42711ac93d 100644 --- a/dlls/wined3d/context_gl.c +++ b/dlls/wined3d/context_gl.c @@ -2852,9 +2852,12 @@ static void *wined3d_bo_gl_map(struct wined3d_bo_gl *bo, struct wined3d_context_ ERR("Failed to create new buffer object.\n"); }
- if (bo->command_fence_id == device_gl->current_fence_id) - wined3d_context_gl_submit_command_fence(context_gl); - wined3d_context_gl_wait_command_fence(context_gl, bo->command_fence_id); + if (context_gl->c.d3d_info->fences) + { + if (bo->command_fence_id == device_gl->current_fence_id) + wined3d_context_gl_submit_command_fence(context_gl); + wined3d_context_gl_wait_command_fence(context_gl, bo->command_fence_id); + }
map: if (bo->b.map_ptr) diff --git a/dlls/wined3d/query.c b/dlls/wined3d/query.c index 6c1ed1d6b5a..432f9bd5516 100644 --- a/dlls/wined3d/query.c +++ b/dlls/wined3d/query.c @@ -173,11 +173,6 @@ static struct wined3d_pipeline_statistics_query *wined3d_pipeline_statistics_que return CONTAINING_RECORD(query, struct wined3d_pipeline_statistics_query, query); }
-static BOOL wined3d_fence_supported(const struct wined3d_gl_info *gl_info) -{ - return gl_info->supported[ARB_SYNC] || gl_info->supported[NV_FENCE] || gl_info->supported[APPLE_FENCE]; -} - enum wined3d_fence_result wined3d_fence_test(const struct wined3d_fence *fence, struct wined3d_device *device, DWORD flags) { diff --git a/dlls/wined3d/swapchain.c b/dlls/wined3d/swapchain.c index 49a40f3e1d6..eb973c5191a 100644 --- a/dlls/wined3d/swapchain.c +++ b/dlls/wined3d/swapchain.c @@ -633,7 +633,8 @@ static void swapchain_gl_present(struct wined3d_swapchain *swapchain, gl_info->gl_ops.wgl.p_wglSwapBuffers(context_gl->dc); }
- wined3d_context_gl_submit_command_fence(context_gl); + if (context->d3d_info->fences) + wined3d_context_gl_submit_command_fence(context_gl);
wined3d_swapchain_gl_rotate(swapchain, context);
diff --git a/dlls/wined3d/wined3d_private.h b/dlls/wined3d/wined3d_private.h index ef1062ec9b4..3aa0e740d1b 100644 --- a/dlls/wined3d/wined3d_private.h +++ b/dlls/wined3d/wined3d_private.h @@ -241,6 +241,7 @@ struct wined3d_d3d_info uint32_t scaled_resolve : 1; uint32_t pbo : 1; uint32_t subpixel_viewport : 1; + uint32_t fences : 1; enum wined3d_feature_level feature_level;
DWORD multisample_draw_location; @@ -3306,6 +3307,11 @@ struct wined3d_gl_info void (WINE_GLAPI *p_glEnableWINE)(GLenum cap); };
+static inline BOOL wined3d_fence_supported(const struct wined3d_gl_info *gl_info) +{ + return gl_info->supported[ARB_SYNC] || gl_info->supported[NV_FENCE] || gl_info->supported[APPLE_FENCE]; +} + /* The driver names reflect the lowest GPU supported * by a certain driver, so DRIVER_AMD_R300 supports * R3xx, R4xx and R5xx GPUs. */
On Wed, 9 Feb 2022 at 16:50, Matteo Bruni mbruni@codeweavers.com wrote:
@@ -241,6 +241,7 @@ struct wined3d_d3d_info uint32_t scaled_resolve : 1; uint32_t pbo : 1; uint32_t subpixel_viewport : 1;
- uint32_t fences : 1; enum wined3d_feature_level feature_level;
Is the expectation that we'll need this in common code? Currently this only appears to be used inside the GL backend.
On Wed, Feb 9, 2022 at 7:44 PM Henri Verbeet hverbeet@gmail.com wrote:
On Wed, 9 Feb 2022 at 16:50, Matteo Bruni mbruni@codeweavers.com wrote:
@@ -241,6 +241,7 @@ struct wined3d_d3d_info uint32_t scaled_resolve : 1; uint32_t pbo : 1; uint32_t subpixel_viewport : 1;
- uint32_t fences : 1; enum wined3d_feature_level feature_level;
Is the expectation that we'll need this in common code? Currently this only appears to be used inside the GL backend.
Yes, I have a patch lined up to disable CSMT when fences are not supported.