Previously, the paused state was not handled.
-- v3: mf: Reset audio client on flush. mf: Update state and start clock for both paused and stopped.
From: Brendan McGrath bmcgrath@codeweavers.com
Previously, the paused state was not handled. --- dlls/mf/sar.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/dlls/mf/sar.c b/dlls/mf/sar.c index 84824f954dd..d908e2bb085 100644 --- a/dlls/mf/sar.c +++ b/dlls/mf/sar.c @@ -636,7 +636,7 @@ static HRESULT WINAPI audio_renderer_clock_sink_OnClockStart(IMFClockStateSink * EnterCriticalSection(&renderer->cs); if (renderer->audio_client) { - if (renderer->state == STREAM_STATE_STOPPED) + if (renderer->state != STREAM_STATE_RUNNING) { if (FAILED(hr = IAudioClient_Start(renderer->audio_client))) WARN("Failed to start audio client, hr %#lx.\n", hr);
From: Brendan McGrath bmcgrath@codeweavers.com
--- dlls/mf/sar.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/dlls/mf/sar.c b/dlls/mf/sar.c index d908e2bb085..7b7e2c592ad 100644 --- a/dlls/mf/sar.c +++ b/dlls/mf/sar.c @@ -1458,6 +1458,7 @@ static HRESULT WINAPI audio_renderer_stream_Flush(IMFStreamSink *iface) } } renderer->queued_frames = 0; + IAudioClient_Reset(renderer->audio_client); LeaveCriticalSection(&renderer->cs);
return hr;
On Mon Apr 21 01:21:55 2025 +0000, Nikolay Sivov wrote:
I think what's missing is an audio client Reset(). We currently do Stop+Reset when the clock is stopped, and just Stop when the clock is paused. We don't Reset on pause so we don't lose pending data. But when we restart from at a different timestamp I think we should.
I agree, we do need that. When testing on Windows, it looks like that might be done in `Flush`. And it looks like `Flush` needs to be called explicitly (i.e. it is not done implicitly with a call to `IMFClockStateSink::OnClockStart`). So I've added a commit which does exactly that.
On Mon Apr 21 01:21:55 2025 +0000, Brendan McGrath wrote:
I agree, we do need that. When testing on Windows, it looks like that might be done in `Flush`. And it looks like `Flush` needs to be called explicitly (i.e. it is not done implicitly with a call to `IMFClockStateSink::OnClockStart`). So I've added a commit which does exactly that.
How can I verify that on Windows?
On Mon Apr 21 08:30:53 2025 +0000, Nikolay Sivov wrote:
How can I verify that on Windows?
There might be a better way, but I tested by sending data directly to an instance of SAR (created with `MFCreateAudioRenderer`) and after an arbitrary number of samples (15 in my example) I would simulate a seek forward by calling `IMFPresentationClock_Pause` and then `IMFPresentationClock_Start` for 3 seconds ahead of now. I would also add 3 seconds to the PTS of all input samples from that point (the first of these samples would also have the Discontinuity flag). I could hear no audible distortion doing this, which to me suggested no buffers were ever dropped.
By contrast, if I added an explicit call to `IMFStreamSink::Flush` between the `IMFPresentationClock_Pause` and `IMFPresentationClock_Start` calls; I could hear an audible skip.
On Mon Apr 21 21:10:54 2025 +0000, Brendan McGrath wrote:
There might be a better way, but I tested by sending data directly to an instance of SAR (created with `MFCreateAudioRenderer`) and after an arbitrary number of samples (15 in my example) I would simulate a seek forward by calling `IMFPresentationClock_Pause` and then `IMFPresentationClock_Start` for 3 seconds ahead of now. I would also add 3 seconds to the PTS of all input samples from that point (the first of these samples would also have the Discontinuity flag). I could hear no audible distortion doing this, which to me suggested no buffers were ever dropped. By contrast, if I added an explicit call to `IMFStreamSink::Flush` between the `IMFPresentationClock_Pause` and `IMFPresentationClock_Start` calls; I could hear an audible skip.
Dropping buffers with PTS being too far behind would be SAR functionality I think, right away in ProcessSample() or maybe it makes more sense to do later in audio_renderer_render(). But Reset() is to drop buffered data that was already pushed to the audio device. And that last part would be difficult to test cleanly.
Do we know for sure that Flush() is used during seeking? If it is, I think it's fine.
Nikolay Sivov (@nsivov) commented about dlls/mf/sar.c:
} } renderer->queued_frames = 0;
- IAudioClient_Reset(renderer->audio_client);
Please add a similar warning if Reset() failed, and return this error from Flush(). That will help to see if we are in incorrect state - you can't Reset() from running state.
Do we know for sure that Flush() is used during seeking?
I just confirmed. I created an IMFMediaSession with a mock source, mock MFT and mock sink, if I call `IMFMediaSession::Pause` and then `IMFMediaSession::Start` with specific time I see the following: ``` 10410.469939|09800|transform_mock_ProcessMessage: Flush 10410.469975|03592|source_mock_Start: MEUpdatedStream 10410.470066|03592|transform_mock_ProcessOutput: need more input 10410.470078|03592|stream_mock_RequestSample: Work item scheduled 0000000000000000 10410.470171|09800|sink_mock_Flush: . 10410.470195|09800|clock_sink_OnClockStart: . ```
So it: 1. Flushes the MFT; 2. Starts the source; 3. Requests output from the MFT (which is probably related to an outstanding sink request. This is something I think we might still need to implement); 4. Requests a sample from the source stream (likely due to `MF_E_TRANSFORM_NEED_MORE_INPUT`) 5. Flushes the Sink (where I believe `IAudioClient_Reset` is called) 6. Starts the clock on the sink
In contrast, if I just restart without a seek (i.e. use current time) I get: ``` 9198.091103|03596|source_mock_Start: MEUpdatedStream 9198.091297|15820|clock_sink_OnClockStart: . ```
So no flushing or requesting of new samples.
Dropping buffers with PTS being too far behind would be SAR functionality I think
I haven't tested SAR when using an external clock. But I found that when it uses its own clock (it implements the `IMFPresentationTimeSource` interface), it never drops samples. It just updates its clock to reflect the sample PTS. I captured that in a test here: https://gitlab.winehq.org/redmcg/wine/-/commit/f940f12b3c7d562fb2fb00a68a3e7...
In that test, I confirm the clock is `5510000`, then I get SAR to process a sample with PTS `20000000` and duration `100000`, sleep for 150ms (to allow time for SAR to process the sample) and then confirm the clock has magically jumped to `20100000` (which is a 1.5 second jump in a tenth of that time).
I've found this matches the behavior of `IMFMediaEngine` too; thus it must use the `IMFPresentationTimeSource` provided by SAR. So audio is never out of sync, as it determines the presentation time. It's up to the video to pause or speed-up as necessary to keep audio and video in sync.
I found I needed to implement this behavior to fix seek in VRChat (which uses `IMFMediaEngine`). I plan to raise an MR upstream soon (once I'm happy with it).