[PATCH 0/2] MR10864: winegstreamer: Normalize PTS values so the stream starts at zero.
mpegpsdemux uses raw PTS values from the stream, which often start at a non-zero timestamp. To avoid an initial delay, we normalize the PTS by subtracting the first raw value so that the buffer timeline begins at zero. Without this adjustment, the first sample would not be presented until its original (non-zero) timestamp. Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=59565 -- https://gitlab.winehq.org/wine/wine/-/merge_requests/10864
From: Akihiro Sagawa <sagawa.aki@gmail.com> --- dlls/quartz/tests/mpegsplit.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/dlls/quartz/tests/mpegsplit.c b/dlls/quartz/tests/mpegsplit.c index 1d62f0000ee..9fd94f84b90 100644 --- a/dlls/quartz/tests/mpegsplit.c +++ b/dlls/quartz/tests/mpegsplit.c @@ -1084,6 +1084,7 @@ struct testfilter HANDLE eos_event; unsigned int sample_count, eos_count, new_segment_count, byte_count; REFERENCE_TIME segment_start, segment_end_min, segment_end_max, seek_start, seek_end; + REFERENCE_TIME sample_start; LONGLONG read_position; }; @@ -1212,6 +1213,9 @@ static HRESULT WINAPI testsink_Receive(struct strmbase_sink *iface, IMediaSample if (winetest_debug > 1) trace("%04lx: Got sample with timestamps %I64d-%I64d.\n", GetCurrentThreadId(), start, end); + if (filter->sample_count == 0 && filter->byte_count == 0) + filter->sample_start = start; + ok(filter->new_segment_count, "Expected NewSegment() before Receive().\n"); IPin_QueryInterface(iface->pin.peer, &IID_IMediaSeeking, (void **)&seeking); @@ -2105,6 +2109,11 @@ static void test_video_file(void) ok(testsink_video.byte_count == 1214, "Video sink got %u bytes.\n", testsink_video.byte_count); ok(testsink_audio.byte_count == 8777, "Audio sink got %u bytes.\n", testsink_audio.byte_count); + /* Native uses 55 for the first sample timestamp, but there's no need to copy the exact value. */ + todo_wine ok(min(testsink_video.sample_start, testsink_audio.sample_start) < 100, + "The first sample timestamp is off by more 100, video %I64d and audio %I64d.\n", + testsink_video.sample_start, testsink_audio.sample_start); + IAMStreamSelect_Release(sel); IPin_Release(source_video); IPin_Release(source_audio); -- GitLab https://gitlab.winehq.org/wine/wine/-/merge_requests/10864
From: Akihiro Sagawa <sagawa.aki@gmail.com> mpegpsdemux uses raw PTS values from the stream, which often start at a non-zero timestamp. To avoid an initial delay, we normalize the PTS by subtracting the first raw value so that the buffer timeline begins at zero. Without this adjustment, the first sample would not be presented until its original (non-zero) timestamp. Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=59565 --- dlls/quartz/tests/mpegsplit.c | 2 +- dlls/winegstreamer/wg_parser.c | 23 ++++++++++++++++++++++- 2 files changed, 23 insertions(+), 2 deletions(-) diff --git a/dlls/quartz/tests/mpegsplit.c b/dlls/quartz/tests/mpegsplit.c index 9fd94f84b90..913c19a4f63 100644 --- a/dlls/quartz/tests/mpegsplit.c +++ b/dlls/quartz/tests/mpegsplit.c @@ -2110,7 +2110,7 @@ static void test_video_file(void) ok(testsink_audio.byte_count == 8777, "Audio sink got %u bytes.\n", testsink_audio.byte_count); /* Native uses 55 for the first sample timestamp, but there's no need to copy the exact value. */ - todo_wine ok(min(testsink_video.sample_start, testsink_audio.sample_start) < 100, + ok(min(testsink_video.sample_start, testsink_audio.sample_start) < 100, "The first sample timestamp is off by more 100, video %I64d and audio %I64d.\n", testsink_video.sample_start, testsink_audio.sample_start); diff --git a/dlls/winegstreamer/wg_parser.c b/dlls/winegstreamer/wg_parser.c index da414af6134..6f13047a4a6 100644 --- a/dlls/winegstreamer/wg_parser.c +++ b/dlls/winegstreamer/wg_parser.c @@ -68,6 +68,7 @@ struct wg_parser GstPad *my_src; guint64 file_size, start_offset, next_offset, stop_offset; + GstClockTime base_pts; guint64 next_pull_offset; gchar *uri; @@ -366,8 +367,17 @@ static NTSTATUS wg_parser_stream_get_buffer(void *args) * that this will need modification to wg_parser_stream_notify_qos() as * well. */ + /* Because mpegpsdemux reports a non-zero PTS for the earliest buffer among + * all streams, we normalize the PTS by subtracting base_pts so that our + * stream starts at zero. */ + if ((wg_buffer->has_pts = GST_BUFFER_PTS_IS_VALID(buffer))) - wg_buffer->pts = GST_BUFFER_PTS(buffer) / 100; + { + if (GST_CLOCK_TIME_IS_VALID(parser->base_pts)) + wg_buffer->pts = (GST_BUFFER_PTS(buffer) - parser->base_pts) / 100; + else + wg_buffer->pts = GST_BUFFER_PTS(buffer) / 100; + } if ((wg_buffer->has_duration = GST_BUFFER_DURATION_IS_VALID(buffer))) wg_buffer->duration = GST_BUFFER_DURATION(buffer) / 100; wg_buffer->discontinuity = GST_BUFFER_FLAG_IS_SET(buffer, GST_BUFFER_FLAG_DISCONT); @@ -725,6 +735,16 @@ static GstFlowReturn sink_chain_cb(GstPad *pad, GstObject *parent, GstBuffer *bu if (!stream->has_buffer) { stream->has_buffer = true; + + /* Keep the earliest PTS for adjusting. */ + if (GST_BUFFER_PTS_IS_VALID(buffer) && + (GST_BUFFER_PTS(buffer) < parser->base_pts)) + { + parser->base_pts = GST_BUFFER_PTS(buffer); + GST_LOG("Updated base PTS to %" GST_TIME_FORMAT ".", + GST_TIME_ARGS(parser->base_pts)); + } + pthread_cond_signal(&parser->init_cond); } @@ -1883,6 +1903,7 @@ static NTSTATUS wg_parser_create(void *args) parser->output_compressed = params->output_compressed; parser->err_on = params->err_on; parser->warn_on = params->warn_on; + parser->base_pts = GST_CLOCK_TIME_NONE; GST_DEBUG("Created winegstreamer parser %p.", parser); params->parser = (wg_parser_t)(ULONG_PTR)parser; return S_OK; -- GitLab https://gitlab.winehq.org/wine/wine/-/merge_requests/10864
The following is a set of anticipated Q&A prepared in advance. 1. Why the code doesn't use Segment event? There are two reasons. First, the Segment event is emitted not only at the beginning of playback but also during seek operations, which makes it unsuitable for determining the initial playback position. Second, mpegpsdemux uses the first PTS it encounters rather than the minimum PTS within the first few megabytes of the stream, so relying on the Segment event would not provide a consistent or correct baseline timestamp. 2. Does this change the playback duration? No. The value returned by the GetDuration method remains unchanged. This is because mpegpsdemux returns the difference between the maximum PTS and the minimum PTS as the duration, so it is not affected even if the PTS is shifted. 3. Does this change the seeking behavior? No. mpegpsdemux performs seeking based on the SCR (System Clock Reference), so specifying a PTS does not allow accurate seeking. Since the SCR is generally smaller than the PTS, specifying 0 will correctly seek to the beginning of the stream. -- https://gitlab.winehq.org/wine/wine/-/merge_requests/10864#note_139427
There's probably a way you're supposed to deal with those segment problems, but I never found segments to be an intuitive abstraction, so I'm in favor of not using them. However, is QC going to work right? Don't we need to do the opposite bias to account for that? I mostly just would like to understand where these offsets are coming from in the first place, and why they differ between GStreamer and Windows. But I don't think it's unreasonable to take this even without that understanding... -- https://gitlab.winehq.org/wine/wine/-/merge_requests/10864#note_139910
On Fri May 15 07:49:04 2026 +0000, Elizabeth Figura wrote:
There's probably a way you're supposed to deal with those segment problems, but I never found segments to be an intuitive abstraction, so I'm in favor of not using them. However, is QC going to work right? Don't we need to do the opposite bias to account for that? I mostly just would like to understand where these offsets are coming from in the first place, and why they differ between GStreamer and Windows. But I don't think it's unreasonable to take this even without that understanding... I actually has the game that have such this issue, and its video. did you need its ffprobe -show_packets result ?
-- https://gitlab.winehq.org/wine/wine/-/merge_requests/10864#note_139943
On Fri May 15 07:49:04 2026 +0000, Chunhao Hung wrote:
I actually has the game that have such this issue, and its video. did you need its ffprobe -show_packets result ? Even the test video we already use in Wine has a similar situation, so I don't think that's necessary, but thank you for the offer.
-- https://gitlab.winehq.org/wine/wine/-/merge_requests/10864#note_140054
On Fri May 15 16:51:48 2026 +0000, Elizabeth Figura wrote:
Even the test video we already use in Wine has a similar situation, so I don't think that's necessary, but thank you for the offer. Good point. Actually, quality control is something I already have in mind.
As I understand it, GStreamer expects the running time for a QoS event (ref. gst_event_new_qos). So as long as the running time is used, it works better with my changes, because the running time (on the GStreamer side) is based on the beginning of the segment. However, according to the comment added in commit 5e222064ab27266c0f44ec3986872f611772ff10, wg_parser_stream_notify_qos() is expected to receive stream time. On the other hand, the newly added gst_segment_to_running_time() in that commit converts a position within the segment into running time. I find this discrepancy confusing. I would appreciate it if you could clarify the intention behind this design. Once the intention behind these behaviors becomes clear, I plan to make the necessary changes for QC. -- https://gitlab.winehq.org/wine/wine/-/merge_requests/10864#note_140272
On Sun May 17 11:49:44 2026 +0000, Akihiro Sagawa wrote:
Good point. Actually, quality control is something I already have in mind. As I understand it, GStreamer expects the running time for a QoS event (ref. gst_event_new_qos). So as long as the running time is used, it works better with my changes, because the running time (on the GStreamer side) is based on the beginning of the segment. However, according to the comment added in commit 5e222064ab27266c0f44ec3986872f611772ff10, wg_parser_stream_notify_qos() is expected to receive stream time. On the other hand, the newly added gst_segment_to_running_time() in that commit converts a position within the segment into running time. I find this discrepancy confusing. I would appreciate it if you could clarify the intention behind this design. Once the intention behind these behaviors becomes clear, I plan to make the necessary changes for QC. Regarding the offset, you can see the following output from `ffprobe -show_packets dlls/quartz/tests/test.mpg`:
[PACKET]
codec_type=video
stream_index=0
pts=48600
pts_time=0.540000 <-- (a)
dts=45000
dts_time=0.500000
duration=3600
duration_time=0.040000
size=859
pos=30
flags=K__
[/PACKET]
(...)
[PACKET]
codec_type=audio
stream_index=1
pts=47618
pts_time=0.529089 <-- (b)
dts=47618
dts_time=0.529089
duration=2351
duration_time=0.026122
size=1253
pos=2048
flags=K__
[/PACKET]
(...)
PTS values in MPEG-1 streams typically do not start at zero because encoders may apply an initial offset to keep PTS/DTS values in a convenient positive range. GStreamer (specifically mpegpsdemux) returns these PTS values as they are --- for example, 0:00:00.540000000 for the first video buffer (ref. a) and 0:00:00.529088888 for the first audio buffer (ref. b). On native, it returns 55 as the REFERENCE_TIME for the first audio sample and 109,166 for the first video sample. I suspect the audio timestamp is derived from (b) with an additional processing delay applied, whereas the video timestamp reflects (a) adjusted by the offset between (a) and (b). -- https://gitlab.winehq.org/wine/wine/-/merge_requests/10864#note_140277
Sorry about the delay, again >_> That explanation makes a lot of sense, thanks. The one thing I notice that worries me is that we're setting the base pts as the minimum, but we don't actually necessarily wait for a buffer from every stream (basically, we only actually wait for caps and tags, and rely on a buffer to tell us we won't get any more), so we could end up returning a sample and then base_pts gets updated later. I think the right solution here is to go ahead and make sure every stream has a buffer before we consider initialization done. For regression testing purposes I would highly recommend putting this in its own commit. -- https://gitlab.winehq.org/wine/wine/-/merge_requests/10864#note_141447
Thanks for the feedback. While implementing, I realized that I need to check whether the start timestamp is set to 0 when using Media Foundation, so the next patch will be delayed a little longer. -- https://gitlab.winehq.org/wine/wine/-/merge_requests/10864#note_142422
participants (4)
-
Akihiro Sagawa -
Akihiro Sagawa (@sgwaki) -
Chunhao Hung (@otakuxtom) -
Elizabeth Figura (@zfigura)