When using a video sample allocator, the session will allocate samples from it, then try to get some output from a transform, and if there's no output to be pulled, release the sample immediately. This will trigger the sample allocator notify callback, which will then trigger another try at pulling samples, ending up in a callback loop that can starve other working threads.
Another option here would be to keep the last returned status, and only re-try to pull samples from the allocator notify callback if we previously failed to process output because of a failed allocation.
From: Rémi Bernon rbernon@codeweavers.com
--- dlls/mf/session.c | 30 ++++++++++++++++++++++++++---- 1 file changed, 26 insertions(+), 4 deletions(-)
diff --git a/dlls/mf/session.c b/dlls/mf/session.c index 9074ad2de25..39ce277e9af 100644 --- a/dlls/mf/session.c +++ b/dlls/mf/session.c @@ -156,6 +156,7 @@ struct transform_stream struct list samples; unsigned int requests; unsigned int min_buffer_size; + IMFSample *allocated_sample; BOOL draining; };
@@ -708,6 +709,12 @@ static void transform_stream_drop_samples(struct transform_stream *stream) { IMFSample *sample;
+ if (stream->allocated_sample) + { + IMFSample_Release(stream->allocated_sample); + stream->allocated_sample = NULL; + } + while (SUCCEEDED(transform_stream_pop_sample(stream, &sample))) IMFSample_Release(sample); } @@ -3131,14 +3138,26 @@ static void session_set_sink_stream_state(struct media_session *session, IMFStre static HRESULT transform_get_external_output_sample(const struct media_session *session, struct topo_node *transform, unsigned int output_index, const MFT_OUTPUT_STREAM_INFO *stream_info, IMFSample **sample) { + struct transform_stream *stream = &transform->u.transform.outputs[output_index]; IMFTopologyNode *downstream_node; IMFMediaBuffer *buffer = NULL; struct topo_node *topo_node; unsigned int buffer_size; - DWORD downstream_input; + DWORD downstream_input, sample_size; TOPOID node_id; HRESULT hr;
+ buffer_size = max(stream_info->cbSize, stream->min_buffer_size); + if ((*sample = stream->allocated_sample)) + { + stream->allocated_sample = NULL; + if (SUCCEEDED(IMFSample_GetTotalLength(*sample, &sample_size)) + && sample_size >= buffer_size) + return S_OK; + IMFSample_Release(*sample); + *sample = NULL; + } + if (FAILED(IMFTopologyNode_GetOutput(transform->node, output_index, &downstream_node, &downstream_input))) { WARN("Failed to get connected node for output %u.\n", output_index); @@ -3156,8 +3175,6 @@ static HRESULT transform_get_external_output_sample(const struct media_session * } else { - buffer_size = max(stream_info->cbSize, transform->u.transform.outputs[output_index].min_buffer_size); - hr = MFCreateAlignedMemoryBuffer(buffer_size, stream_info->cbAlignment, &buffer); if (SUCCEEDED(hr)) hr = MFCreateSample(sample); @@ -3221,7 +3238,12 @@ static HRESULT transform_node_pull_samples(const struct media_session *session, }
if (buffers[i].pSample) - IMFSample_Release(buffers[i].pSample); + { + if (hr == MF_E_TRANSFORM_NEED_MORE_INPUT && !stream->allocated_sample) + stream->allocated_sample = buffers[i].pSample; + else + IMFSample_Release(buffers[i].pSample); + } }
free(buffers);
Hi,
It looks like your patch introduced the new failures shown below. Please investigate and fix them before resubmitting your patch. If they are not new, fixing them anyway would help a lot. Otherwise please ask for the known failures list to be updated.
The tests also ran into some preexisting test failures. If you know how to fix them that would be helpful. See the TestBot job for the details:
The full results can be found at: https://testbot.winehq.org/JobDetails.pl?Key=144024
Your paranoid android.
=== debian11b (64 bit WoW report) ===
d3dx10_34: d3dx10.c:4380: Test succeeded inside todo block: Got unexpected effect 00000000011BB9A0. d3dx10.c:4470: Test succeeded inside todo block: Got unexpected effect 00000000011A4920. d3dx10.c:4480: Test succeeded inside todo block: Got unexpected effect 00000000011A4920. d3dx10.c:4589: Test succeeded inside todo block: Got unexpected effect 00000000011E3A70. d3dx10.c:4599: Test succeeded inside todo block: Got unexpected effect 00000000011E3C40.
d3dx10_35: d3dx10.c:4380: Test succeeded inside todo block: Got unexpected effect 00000000011B24A0. d3dx10.c:4470: Test succeeded inside todo block: Got unexpected effect 0000000001195800. d3dx10.c:4480: Test succeeded inside todo block: Got unexpected effect 00000000011959D0. d3dx10.c:4589: Test succeeded inside todo block: Got unexpected effect 00000000011C35B0. d3dx10.c:4599: Test succeeded inside todo block: Got unexpected effect 00000000011C3780.
d3dx10_36: d3dx10.c:4380: Test succeeded inside todo block: Got unexpected effect 0000000001239F10. d3dx10.c:4470: Test succeeded inside todo block: Got unexpected effect 00000000011D49C0. d3dx10.c:4480: Test succeeded inside todo block: Got unexpected effect 0000000001239F10. d3dx10.c:4589: Test succeeded inside todo block: Got unexpected effect 0000000001195BA0. d3dx10.c:4599: Test succeeded inside todo block: Got unexpected effect 00000000011A4920.
d3dx10_37: d3dx10.c:4380: Test succeeded inside todo block: Got unexpected effect 00000000011E3EF0. d3dx10.c:4470: Test succeeded inside todo block: Got unexpected effect 0000000001195A10. d3dx10.c:4480: Test succeeded inside todo block: Got unexpected effect 0000000001195BE0. d3dx10.c:4589: Test succeeded inside todo block: Got unexpected effect 00000000011E3EF0. d3dx10.c:4599: Test succeeded inside todo block: Got unexpected effect 00000000011D5CC0.
d3dx10_38: d3dx10.c:4380: Test succeeded inside todo block: Got unexpected effect 00000000011A4690. d3dx10.c:4470: Test succeeded inside todo block: Got unexpected effect 00000000011A4690. d3dx10.c:4480: Test succeeded inside todo block: Got unexpected effect 00000000011A4B40. d3dx10.c:4589: Test succeeded inside todo block: Got unexpected effect 00000000011BFEB0. d3dx10.c:4599: Test succeeded inside todo block: Got unexpected effect 00000000011D56B0.
d3dx10_39: d3dx10.c:4380: Test succeeded inside todo block: Got unexpected effect 0000000001195C50. d3dx10.c:4470: Test succeeded inside todo block: Got unexpected effect 00000000011C3380. d3dx10.c:4480: Test succeeded inside todo block: Got unexpected effect 0000000001195B50. d3dx10.c:4589: Test succeeded inside todo block: Got unexpected effect 00000000011C3550. d3dx10.c:4599: Test succeeded inside todo block: Got unexpected effect 00000000011C3380.
d3dx10_40: d3dx10.c:4380: Test succeeded inside todo block: Got unexpected effect 00000000011BBA60. d3dx10.c:4470: Test succeeded inside todo block: Got unexpected effect 00000000011BB950. d3dx10.c:4480: Test succeeded inside todo block: Got unexpected effect 00000000011E3990. d3dx10.c:4589: Test succeeded inside todo block: Got unexpected effect 00000000011D5600. d3dx10.c:4599: Test succeeded inside todo block: Got unexpected effect 00000000011D57D0.
d3dx10_41: d3dx10.c:4380: Test succeeded inside todo block: Got unexpected effect 00000000011E3AB0. d3dx10.c:4470: Test succeeded inside todo block: Got unexpected effect 00000000011E3AB0. d3dx10.c:4480: Test succeeded inside todo block: Got unexpected effect 00000000011E3AB0. d3dx10.c:4589: Test succeeded inside todo block: Got unexpected effect 00000000011BBB50. d3dx10.c:4599: Test succeeded inside todo block: Got unexpected effect 00000000011E3C80.
d3dx10_42: d3dx10.c:4380: Test succeeded inside todo block: Got unexpected effect 0000000001239C80. d3dx10.c:4470: Test succeeded inside todo block: Got unexpected effect 00000000011BCAB0. d3dx10.c:4480: Test succeeded inside todo block: Got unexpected effect 0000000001239E50. d3dx10.c:4589: Test succeeded inside todo block: Got unexpected effect 00000000011E3400. d3dx10.c:4599: Test succeeded inside todo block: Got unexpected effect 00000000011B5D90.
d3dx10_43: d3dx10.c:4380: Test succeeded inside todo block: Got unexpected effect 00000000011D5530. d3dx10.c:4470: Test succeeded inside todo block: Got unexpected effect 00000000011D5530. d3dx10.c:4480: Test succeeded inside todo block: Got unexpected effect 00000000011BCAB0. d3dx10.c:4589: Test succeeded inside todo block: Got unexpected effect 00000000011A4860. d3dx10.c:4599: Test succeeded inside todo block: Got unexpected effect 00000000011BC970.
user32: input.c:3864: Test succeeded inside todo block: button_down_hwnd_todo 1: got MSG_TEST_WIN hwnd 0000000000B400D2, msg WM_LBUTTONDOWN, wparam 0x1, lparam 0x320032
So, do you think it would be better to track the last returned result here? Or maybe adding and waiting on a condition variable instead (though that doesn't sound very appealing)?
Well, after more investigation it seems likely after all that the session doesn't use an allocator itself but relies on the individual components being D3D aware. The same thing seems to happen with the source reader.
Although it reports MF_SA_D3D11_AWARE (and MF_SA_D3D_AWARE on Win7 only), the video processor *is* D3D aware, and I intend to implement it. I also suspect that it is somehow involved in the D3D9 case even though it doesn't have the MF_SA_D3D_AWARE attribute on more recent Windows. Maybe MF_SA_D3D11_AWARE implies MF_SA_D3D_AWARE.
As long as either a decoder or a video processor is present in the pipeline, there is no need for an external allocator or sample copier.
Still, I'm not completely sure how video sample notifications are supposed to be used, or if they even are. It looks to me that if we can starve the allocator somehow, and without notifications, it could stall the pipeline.
On Wed Apr 3 20:25:54 2024 +0000, Rémi Bernon wrote:
Well, after more investigation it seems likely after all that the session doesn't use an allocator itself but relies on the individual components being D3D aware. The same thing seems to happen with the source reader. Although it reports MF_SA_D3D11_AWARE (and MF_SA_D3D_AWARE on Win7 only), the video processor *is* D3D aware, and I intend to implement it. I also suspect that it is somehow involved in the D3D9 case even though it doesn't have the MF_SA_D3D_AWARE attribute on more recent Windows. Maybe MF_SA_D3D11_AWARE implies MF_SA_D3D_AWARE. As long as either a decoder or a video processor is present in the pipeline, there is no need for an external allocator or sample copier. Still, I'm not completely sure how video sample notifications are supposed to be used, or if they even are. It looks to me that if we can starve the allocator somehow, and without notifications, it could stall the pipeline.
Allocators are associated with stream sinks, and yes, they are provided by sink components normally I think. The situation with d3d11_aware could be different because there is no builtin d3d11 renderer, for some reason.
What happens if you do source -> decoder-with-optional-d3d-decoding -> sample grabber? Does that resolve to a topology with d3d(11)_aware attributes set? If it doesn't that still could mean that session enables it itself. That should be visible in final topology, or actually you can see that yourself by checking decoder attributes before are after.
On Wed Apr 3 20:25:54 2024 +0000, Nikolay Sivov wrote:
Allocators are associated with stream sinks, and yes, they are provided by sink components normally I think. The situation with d3d11_aware could be different because there is no builtin d3d11 renderer, for some reason. What happens if you do source -> decoder-with-optional-d3d-decoding -> sample grabber? Does that resolve to a topology with d3d(11)_aware attributes set? If it doesn't that still could mean that session enables it itself. That should be visible in final topology, or actually you can see that yourself by checking decoder attributes before are after.
I will try that, but fwiw I'm pretty sure that every MFT has its own allocator, initialized from the device manager that is passed on initialization. I'm not sure that sinks even need or have one.
This has visible side effect that after receiving a device manager, the MFT (h264 decoder, video processor, etc...) changes its stream properties to indicate that it will allocate its output samples itself (from the internal allocator it has initialized).
It also uses the MFT attributes and output stream attributes (that application can, and do modify through IMFTransform_GetAttributes and IMFTransform_GetOutputStreamAttributes) to control the allocator initialization parameters.
On Wed Apr 3 20:31:55 2024 +0000, Rémi Bernon wrote:
I will try that, but fwiw I'm pretty sure that every MFT has its own allocator, initialized from the device manager that is passed on initialization. I'm not sure that sinks even need or have one. This has visible side effect that after receiving a device manager, the MFT (h264 decoder, video processor, etc...) changes its stream properties to indicate that it will allocate its output samples itself (from the internal allocator it has initialized). It also uses the MFT attributes and output stream attributes (that application can, and do modify through IMFTransform_GetAttributes and IMFTransform_GetOutputStreamAttributes) to control the allocator initialization parameters.
Opened https://gitlab.winehq.org/wine/wine/-/merge_requests/5464 with a test case.
In theory it is possible that some components advertise being D3D-aware, but then still being unable allocate samples themselves. The test I added in https://gitlab.winehq.org/wine/wine/-/merge_requests/5459 with the video processor transform alone shows that it behaves like that on Windows 8.
In such case and because we can still see D3D9/DXGI buffers on the output end, I assume that the session/source reader will create an external sample allocator on behalf of the transform, but I don't think we need to support that scenario as long as we ensure that all our transforms (decoder and processor) support internal allocations.
On Thu Apr 4 17:36:30 2024 +0000, Rémi Bernon wrote:
Opened https://gitlab.winehq.org/wine/wine/-/merge_requests/5464 with a test case. In theory it is possible that some components advertise being D3D-aware, but then still being unable allocate samples themselves. The test I added in https://gitlab.winehq.org/wine/wine/-/merge_requests/5459 with the video processor transform alone shows that it behaves like that on Windows 8. In such case and because we can still see D3D9/DXGI buffers on the output end, I assume that the session/source reader will create an external sample allocator on behalf of the transform, but I don't think we need to support that scenario as long as we ensure that all our transforms (decoder and processor) support internal allocations.
Actually the remark about Windows 8 test might be wrong, it may just be a side effect of running in a VM with a D3D device that doesn't actually support video processing. Maybe the transform simply incorrectly returns success from ProcessMessage.
Closing this. I think the proper fix is to drop the sample allocator from the session entirely as well as the inserted copiers, once the video processor d3d awareness is implemented. Still not completely sure how allocator starvation should be handled but maybe we can assume that it has enough sample for it to never happen.
This merge request was closed by Rémi Bernon.