[PATCH v4 0/2] MR7623: winegstreamer: Use provided PTS and duration in video_decoder.

List overview All Threads

newer

older

[PATCH 0/2] MR7771: win32u: Make...

Re: [PATCH v2 0/2] MR7764: win32u:...

Brendan McGrath (＠redmcg)

4 Apr 2025 4 Apr '25

6:07 a.m.

This MR modifies winegstreamer to match Windows behaviour in that: 1. the video decoders will output the PTS and duration of the input sample (if provided); and 2. the WMV decoder will set any value not provided to zero

It also adds support for supplying a DTS value to the MFTs.

I've marked this as draft as it fixes the tests in MR !7563 (in addition to fixing some existing `test_wmv_decoder` tests). Also, as demonstrated in MR !7569, our demuxers output different timestamps to Windows. This change will result in those different timestamps being forwarded from the decoder. So we may also want to address that difference prior to accepting this MR.

-- v4: winegstreamer: Use provided PTS and duration in video_decoder. mf/tests: Add negative timestamp tests for h264.

https://gitlab.winehq.org/wine/wine/-/merge_requests/7623

Show replies by date

Brendan McGrath

4 Apr 4 Apr

6:07 a.m.

New subject: [PATCH v4 1/2] mf/tests: Add negative timestamp tests for h264.

From: Brendan McGrath bmcgrath@codeweavers.com

--- dlls/mf/tests/transform.c | 118 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 118 insertions(+)

diff --git a/dlls/mf/tests/transform.c b/dlls/mf/tests/transform.c index dfe3bddd454..2ac7906eda6 100644 --- a/dlls/mf/tests/transform.c +++ b/dlls/mf/tests/transform.c @@ -5195,6 +5195,58 @@ static void test_h264_decoder_timestamps(void) { 4003999, 333667 }, { 4337666, 333666 }, }; + static const struct timestamps input_sample_neg_ts[] = + { + { -333667, 333667, -1001000 }, + { -333667, 333667, -1001000 }, + { -333667, 333667, -1001000 }, + { 1000999, 333667, -667334 }, + { 333666, 333666, -333667 }, + { 0, 333666 }, + { 667332, 333667, 333666 }, + { 2335666, 333667, 667333 }, + { 1668333, 333666, 1001000 }, + { 1334666, 333667 }, + { 2001999, 333667, 1668333 }, + { 3670333, 333667, 2002000 }, + { 3002999, 333667, 2335666 }, + { 2669333, 333666 }, + { 3336666, 333667, 3003000 }, + { 5004999, 333667, 3336666 }, + { 4337666, 333666, 3670333 }, + { 4004000, 333666 }, + { 4671332, 333667, 4337666 }, + { 6339666, 333667, 4671333 }, + { 5672333, 333666, 5005000 }, + { 5338666, 333667 }, + { 6005999, 333667, 5672333 }, + { 7674333, 333667, 6006000 }, + { 7006999, 333667, 6339666 }, + { 6673333, 333666 }, + { 7340666, 333667, 7007000 }, + { 9008999, 333667, 7340666 }, + { 8341666, 333666, 7674333 }, + { 8008000, 333666 }, + { 8675332, 333667, 8341666 }, + { 10343666, 333667, 8675333 }, + { 9676333, 333666, 9009000 }, + { 9342666, 333667 }, + { 10009999, 333667, 9676333 }, + { 11678333, 333667, 10010000 }, + }; + static const struct timestamps exp_sample_neg_ts[] = + { + { -333667, 333667 }, + { 0, 333666 }, + { 333666, 333666 }, + { 667332, 333667 }, + { 1000999, 333667 }, + { 1334666, 333667 }, + { 1668333, 333666 }, + { 2001999, 333667 }, + { 2335666, 333667 }, + { 2669333, 333666 }, + };

IMFSample *input_sample, *output_sample; const BYTE *h264_encoded_data; @@ -5376,6 +5428,72 @@ static void test_h264_decoder_timestamps(void) ret = IMFTransform_Release(transform); ok(ret == 0, "Release returned %lu\n", ret);

+ /* Test when supplied sample timestamps include negative values */ + hr = CoCreateInstance(&CLSID_MSH264DecoderMFT, NULL, CLSCTX_INPROC_SERVER, + &IID_IMFTransform, (void **)&transform); + ok(hr == S_OK, "Got %#lx\n", hr); + + load_resource(L"h264data.bin", &h264_encoded_data, &h264_encoded_data_len); + + check_mft_set_input_type(transform, input_type_24fps_desc, S_OK); + check_mft_set_output_type(transform, output_type_24fps_desc, S_OK); + + hr = IMFTransform_ProcessMessage(transform, MFT_MESSAGE_NOTIFY_START_OF_STREAM, 0); + ok(hr == S_OK, "Got %#lx\n", hr); + + j = 0; + output_sample = create_sample(NULL, actual_width * actual_height * 2); + for (i = 0; i < ARRAYSIZE(exp_sample_neg_ts);) + { + winetest_push_context("output %ld", i); + hr = check_mft_process_output(transform, output_sample, &output_status); + ok(hr == S_OK || hr == MF_E_TRANSFORM_STREAM_CHANGE || hr == MF_E_TRANSFORM_NEED_MORE_INPUT, "ProcessOutput returned %#lx\n", hr); + if (hr == S_OK) + { + hr = IMFSample_GetSampleTime(output_sample, &time); + ok(hr == S_OK, "Got %#lx\n", hr); + todo_wine_if(i) + ok(time == exp_sample_neg_ts[i].time, "got time %I64d, expected %I64d\n", time, exp_sample_neg_ts[i].time); + hr = IMFSample_GetSampleDuration(output_sample, &duration); + ok(hr == S_OK, "Got %#lx\n", hr); + todo_wine + ok(duration == exp_sample_neg_ts[i].duration, "got duration %I64d, expected %I64d\n", duration, exp_sample_neg_ts[i].duration); + ret = IMFSample_Release(output_sample); + ok(ret == 0, "Release returned %lu\n", ret); + output_sample = create_sample(NULL, actual_width * actual_height * 2); + i++; + } + else if (hr == MF_E_TRANSFORM_NEED_MORE_INPUT && j < ARRAYSIZE(input_sample_neg_ts)) + { + input_sample = next_h264_sample(&h264_encoded_data, &h264_encoded_data_len); + hr = IMFSample_SetSampleTime(input_sample, input_sample_neg_ts[j].time); + ok(hr == S_OK, "Got %#lx\n", hr); + hr = IMFSample_SetSampleDuration(input_sample, input_sample_neg_ts[j].duration); + ok(hr == S_OK, "Got %#lx\n", hr); + if (input_sample_neg_ts[j].dts) + { + hr = IMFSample_SetUINT64(input_sample, &MFSampleExtension_DecodeTimestamp, input_sample_neg_ts[j].dts); + ok(hr == S_OK, "Got %#lx\n", hr); + } + j++; + hr = IMFTransform_ProcessInput(transform, 0, input_sample, 0); + ok(hr == S_OK, "ProcessInput returned %#lx\n", hr); + ret = IMFSample_Release(input_sample); + ok(ret <= 1, "Release returned %lu\n", ret); + } + else if (hr == MF_E_TRANSFORM_NEED_MORE_INPUT && j == ARRAYSIZE(input_sample_neg_ts)) + { + ok(0, "no more input to provide\n"); + i = ARRAYSIZE(exp_sample_neg_ts); + } + winetest_pop_context(); + } + + ret = IMFSample_Release(output_sample); + ok(ret == 0, "Release returned %lu\n", ret); + ret = IMFTransform_Release(transform); + ok(ret == 0, "Release returned %lu\n", ret); + failed: winetest_pop_context(); CoUninitialize();

-- GitLab https://gitlab.winehq.org/wine/wine/-/merge_requests/7623

Brendan McGrath

6:07 a.m.

New subject: [PATCH v4 2/2] winegstreamer: Use provided PTS and duration in video_decoder.

From: Brendan McGrath bmcgrath@codeweavers.com

--- dlls/mf/tests/transform.c | 40 ++------------- dlls/winegstreamer/gst_guids.h | 1 + dlls/winegstreamer/unixlib.h | 10 ++-- dlls/winegstreamer/video_decoder.c | 43 +++++++++++++--- dlls/winegstreamer/wg_sample.c | 9 ++++ dlls/winegstreamer/wg_transform.c | 79 +++++++++++++++++++++++++----- 6 files changed, 124 insertions(+), 58 deletions(-)

diff --git a/dlls/mf/tests/transform.c b/dlls/mf/tests/transform.c index 2ac7906eda6..4578dbac591 100644 --- a/dlls/mf/tests/transform.c +++ b/dlls/mf/tests/transform.c @@ -5285,11 +5285,9 @@ static void test_h264_decoder_timestamps(void) { hr = IMFSample_GetSampleTime(output_sample, &time); ok(hr == S_OK, "Got %#lx\n", hr); - todo_wine_if(i) ok(time == exp_nofps[i].time, "got time %I64d, expected %I64d\n", time, exp_nofps[i].time); hr = IMFSample_GetSampleDuration(output_sample, &duration); ok(hr == S_OK, "Got %#lx\n", hr); - todo_wine ok(duration == exp_nofps[i].duration, "got duration %I64d, expected %I64d\n", duration, exp_nofps[i].duration); ret = IMFSample_Release(output_sample); ok(ret == 0, "Release returned %lu\n", ret); @@ -5335,11 +5333,9 @@ static void test_h264_decoder_timestamps(void) { hr = IMFSample_GetSampleTime(output_sample, &time); ok(hr == S_OK, "Got %#lx\n", hr); - todo_wine_if(i) ok(time == exp_24fps[i].time, "got time %I64d, expected %I64d\n", time, exp_24fps[i].time); hr = IMFSample_GetSampleDuration(output_sample, &duration); ok(hr == S_OK, "Got %#lx\n", hr); - todo_wine ok(duration == exp_24fps[i].duration, "got duration %I64d, expected %I64d\n", duration, exp_24fps[i].duration); ret = IMFSample_Release(output_sample); ok(ret == 0, "Release returned %lu\n", ret); @@ -5386,11 +5382,9 @@ static void test_h264_decoder_timestamps(void) { hr = IMFSample_GetSampleTime(output_sample, &time); ok(hr == S_OK, "Got %#lx\n", hr); - todo_wine_if(i) ok(time == exp_sample_ts[i].time, "got time %I64d, expected %I64d\n", time, exp_sample_ts[i].time); hr = IMFSample_GetSampleDuration(output_sample, &duration); ok(hr == S_OK, "Got %#lx\n", hr); - todo_wine ok(duration == exp_sample_ts[i].duration, "got duration %I64d, expected %I64d\n", duration, exp_sample_ts[i].duration); ret = IMFSample_Release(output_sample); ok(ret == 0, "Release returned %lu\n", ret); @@ -5452,11 +5446,9 @@ static void test_h264_decoder_timestamps(void) { hr = IMFSample_GetSampleTime(output_sample, &time); ok(hr == S_OK, "Got %#lx\n", hr); - todo_wine_if(i) ok(time == exp_sample_neg_ts[i].time, "got time %I64d, expected %I64d\n", time, exp_sample_neg_ts[i].time); hr = IMFSample_GetSampleDuration(output_sample, &duration); ok(hr == S_OK, "Got %#lx\n", hr); - todo_wine ok(duration == exp_sample_neg_ts[i].duration, "got duration %I64d, expected %I64d\n", duration, exp_sample_neg_ts[i].duration); ret = IMFSample_Release(output_sample); ok(ret == 0, "Release returned %lu\n", ret); @@ -6835,28 +6827,12 @@ static void test_wmv_decoder(void) .attributes = output_sample_attributes, .sample_time = 0, .sample_duration = 333333, .buffer_count = 1, .buffers = &output_buffer_desc_nv12, - .todo_duration = TRUE, - }; - const struct sample_desc output_sample_desc_nv12_todo_time = - { - .attributes = output_sample_attributes, - .sample_time = 0, .sample_duration = 333333, - .buffer_count = 1, .buffers = &output_buffer_desc_nv12, - .todo_time = TRUE, .todo_duration = TRUE, }; const struct sample_desc output_sample_desc_rgb = { .attributes = output_sample_attributes, .sample_time = 0, .sample_duration = 333333, .buffer_count = 1, .buffers = &output_buffer_desc_rgb, - .todo_duration = TRUE, - }; - const struct sample_desc output_sample_desc_rgb_todo_time = - { - .attributes = output_sample_attributes, - .sample_time = 0, .sample_duration = 333333, - .buffer_count = 1, .buffers = &output_buffer_desc_rgb, - .todo_time = TRUE, .todo_duration = TRUE, };

const struct transform_desc @@ -6890,7 +6866,7 @@ static void test_wmv_decoder(void) .expect_output_type_desc = expect_output_type_desc, .expect_input_info = &expect_input_info, .expect_output_info = &expect_output_info, - .output_sample_desc = &output_sample_desc_nv12_todo_time, + .output_sample_desc = &output_sample_desc_nv12, .result_bitmap = L"nv12frame.bmp", .delta = 0, }, @@ -6901,7 +6877,7 @@ static void test_wmv_decoder(void) .expect_output_type_desc = expect_output_type_desc_rgb, .expect_input_info = &expect_input_info_rgb, .expect_output_info = &expect_output_info_rgb, - .output_sample_desc = &output_sample_desc_rgb_todo_time, + .output_sample_desc = &output_sample_desc_rgb, .result_bitmap = L"rgb32frame-flip.bmp", .delta = 5, }, @@ -6912,7 +6888,7 @@ static void test_wmv_decoder(void) .expect_output_type_desc = expect_output_type_desc_rgb_negative_stride, .expect_input_info = &expect_input_info_rgb, .expect_output_info = &expect_output_info_rgb, - .output_sample_desc = &output_sample_desc_rgb_todo_time, + .output_sample_desc = &output_sample_desc_rgb, .result_bitmap = L"rgb32frame-flip.bmp", .delta = 5, }, @@ -6923,7 +6899,7 @@ static void test_wmv_decoder(void) .expect_output_type_desc = expect_output_type_desc_rgb, .expect_input_info = &expect_input_info_rgb, .expect_output_info = &expect_output_info_rgb, - .output_sample_desc = &output_sample_desc_rgb_todo_time, + .output_sample_desc = &output_sample_desc_rgb, .result_bitmap = L"rgb32frame-flip.bmp", .delta = 5, }, @@ -6946,7 +6922,7 @@ static void test_wmv_decoder(void) .expect_output_type_desc = expect_output_type_desc_rgb_negative_stride, .expect_input_info = &expect_input_info_rgb, .expect_output_info = &expect_output_info_rgb, - .output_sample_desc = &output_sample_desc_rgb_todo_time, + .output_sample_desc = &output_sample_desc_rgb, .result_bitmap = L"rgb32frame.bmp", .delta = 5, }, @@ -7306,11 +7282,9 @@ static void test_wmv_decoder_timestamps(void) { hr = IMFSample_GetSampleTime(output_sample, &time); ok(hr == S_OK, "Got %#lx\n", hr); - todo_wine_if(i) ok(time == exp_nofps[i].time, "got time %I64d, expected %I64d\n", time, exp_nofps[i].time); hr = IMFSample_GetSampleDuration(output_sample, &duration); ok(hr == S_OK, "Got %#lx\n", hr); - todo_wine ok(duration == exp_nofps[i].duration, "got duration %I64d, expected %I64d\n", duration, exp_nofps[i].duration); ret = IMFSample_Release(output_sample); ok(ret == 0, "Release returned %lu\n", ret); @@ -7364,11 +7338,9 @@ static void test_wmv_decoder_timestamps(void) { hr = IMFSample_GetSampleTime(output_sample, &time); ok(hr == S_OK, "Got %#lx\n", hr); - todo_wine_if(i) ok(time == exp_24fps[i].time, "got time %I64d, expected %I64d\n", time, exp_24fps[i].time); hr = IMFSample_GetSampleDuration(output_sample, &duration); ok(hr == S_OK, "Got %#lx\n", hr); - todo_wine ok(duration == exp_24fps[i].duration, "got duration %I64d, expected %I64d\n", duration, exp_24fps[i].duration); ret = IMFSample_Release(output_sample); ok(ret == 0, "Release returned %lu\n", ret); @@ -7423,11 +7395,9 @@ static void test_wmv_decoder_timestamps(void) { hr = IMFSample_GetSampleTime(output_sample, &time); ok(hr == S_OK, "Got %#lx\n", hr); - todo_wine_if(i) ok(time == exp_sample_ts[i].time, "got time %I64d, expected %I64d\n", time, exp_sample_ts[i].time); hr = IMFSample_GetSampleDuration(output_sample, &duration); ok(hr == S_OK, "Got %#lx\n", hr); - todo_wine ok(duration == exp_sample_ts[i].duration, "got duration %I64d, expected %I64d\n", duration, exp_sample_ts[i].duration); ret = IMFSample_Release(output_sample); ok(ret == 0, "Release returned %lu\n", ret); diff --git a/dlls/winegstreamer/gst_guids.h b/dlls/winegstreamer/gst_guids.h index 313066fa4d8..ac38d48dfb9 100644 --- a/dlls/winegstreamer/gst_guids.h +++ b/dlls/winegstreamer/gst_guids.h @@ -22,3 +22,4 @@ DEFINE_GUID(CLSID_decodebin_parser, 0xf9d8d64e, 0xa144, 0x47dc, 0x8e, 0xe0, 0xf5, 0x34, 0x98, 0x37, 0x2c, 0x29); DEFINE_GUID(CLSID_mpeg_layer3_decoder, 0x38be3000, 0xdbf4, 0x11d0, 0x86, 0x0e, 0x00, 0xa0, 0x24, 0xcf, 0xef, 0x6d); DEFINE_GUID(WINESUBTYPE_Gstreamer, 0xffffffff, 0x128f, 0x4dd1, 0xad, 0x22, 0xbe, 0xcf, 0xa6, 0x6c, 0xe7, 0xaa); +DEFINE_GUID(MFSampleExtension_WG_Timestamp, 0x42554ce2, 0xed8c, 0x43ee, 0x8a, 0x04, 0xd6, 0xa1, 0x68, 0xf1, 0x8f, 0xee); diff --git a/dlls/winegstreamer/unixlib.h b/dlls/winegstreamer/unixlib.h index 19270bd731b..0ce7d9ede50 100644 --- a/dlls/winegstreamer/unixlib.h +++ b/dlls/winegstreamer/unixlib.h @@ -179,13 +179,16 @@ enum wg_sample_flag WG_SAMPLE_FLAG_HAS_DURATION = 4, WG_SAMPLE_FLAG_SYNC_POINT = 8, WG_SAMPLE_FLAG_DISCONTINUITY = 0x10, + WG_SAMPLE_FLAG_HAS_DTS = 0x20, + WG_SAMPLE_FLAG_WG_TIMESTAMP = 0x40, };

struct wg_sample { /* timestamp and duration are in 100-nanosecond units. */ - UINT64 pts; - UINT64 duration; + INT64 pts; + INT64 duration; + INT64 dts; LONG refcount; /* unix refcount */ UINT32 flags; UINT32 max_size; @@ -196,7 +199,7 @@ struct wg_sample struct wg_parser_buffer { /* pts and duration are in 100-nanosecond units. */ - UINT64 pts, duration; + INT64 pts, duration; UINT32 size; UINT32 stream; UINT8 discontinuity, preroll, delta, has_pts, has_duration; @@ -335,6 +338,7 @@ struct wg_transform_attrs UINT32 input_queue_length; BOOL allow_format_change; BOOL low_latency; + BOOL preserve_timestamps; };

struct wg_transform_create_params diff --git a/dlls/winegstreamer/video_decoder.c b/dlls/winegstreamer/video_decoder.c index 4086b586d76..8be00558ddf 100644 --- a/dlls/winegstreamer/video_decoder.c +++ b/dlls/winegstreamer/video_decoder.c @@ -19,6 +19,7 @@ */

#include "gst_private.h" +#include "gst_guids.h"

#include "mfapi.h" #include "mferror.h" @@ -112,6 +113,7 @@ struct video_decoder DMO_MEDIA_TYPE dmo_output_type;

INT32 previous_stride; + BOOL provide_timestamps; };

static inline struct video_decoder *impl_from_IUnknown(IUnknown *iface) @@ -941,7 +943,7 @@ static HRESULT WINAPI transform_ProcessOutput(IMFTransform *iface, DWORD flags, { struct video_decoder *decoder = impl_from_IMFTransform(iface); UINT32 sample_size; - LONGLONG duration; + LONGLONG duration = 0; IMFSample *sample; UINT64 frame_size, frame_rate; GUID subtype; @@ -991,16 +993,38 @@ static HRESULT WINAPI transform_ProcessOutput(IMFTransform *iface, DWORD flags, if (SUCCEEDED(hr = wg_transform_read_mf(decoder->wg_transform, sample, sample_size, &samples->dwStatus))) { + UINT32 value; + wg_sample_queue_flush(decoder->wg_sample_queue, false);

- if (FAILED(IMFMediaType_GetUINT64(decoder->input_type, &MF_MT_FRAME_RATE, &frame_rate))) - frame_rate = (UINT64)30000 << 32 | 1001; + if (decoder->provide_timestamps) + { + if (FAILED(IMFMediaType_GetUINT64(decoder->input_type, &MF_MT_FRAME_RATE, &frame_rate))) + frame_rate = (UINT64)30000 << 32 | 1001;

- duration = (UINT64)10000000 * (UINT32)frame_rate / (frame_rate >> 32); - if (FAILED(IMFSample_SetSampleTime(sample, decoder->sample_time))) - WARN("Failed to set sample time\n"); - if (FAILED(IMFSample_SetSampleDuration(sample, duration))) - WARN("Failed to set sample duration\n"); + duration = ((UINT64)10000000 * (UINT32)frame_rate + (frame_rate >> 32)/2) / (frame_rate >> 32); + } + + if (SUCCEEDED(IMFSample_GetUINT32(sample, &MFSampleExtension_WG_Timestamp, &value)) && value) + { + if (FAILED(IMFSample_GetSampleTime(sample, (LONGLONG*)&decoder->sample_time))) + WARN("Failed to get sample time\n"); + if (FAILED(IMFSample_GetSampleDuration(sample, &duration)) && FAILED(IMFSample_SetSampleDuration(sample, duration))) + WARN("Failed to get and set sample duration\n"); + } + else + { + LONGLONG pts; + if (decoder->provide_timestamps) + pts = decoder->sample_time; + else + pts = 0; + + if (FAILED(IMFSample_SetSampleTime(sample, pts))) + WARN("Failed to set sample time\n"); + if (FAILED(IMFSample_SetSampleDuration(sample, duration))) + WARN("Failed to set sample duration\n"); + } decoder->sample_time += duration; }

@@ -1633,6 +1657,7 @@ static HRESULT video_decoder_create_with_types(const GUID *const *input_types, U goto failed;

decoder->wg_transform_attrs.input_queue_length = 15; + decoder->wg_transform_attrs.preserve_timestamps = TRUE;

*out = decoder; TRACE("Created decoder %p\n", decoder); @@ -1703,6 +1728,8 @@ HRESULT h264_decoder_create(REFIID riid, void **out) decoder->wg_transform_attrs.output_plane_align = 15; decoder->wg_transform_attrs.allow_format_change = TRUE;

+ decoder->provide_timestamps = TRUE; + TRACE("Created h264 transform %p.\n", &decoder->IMFTransform_iface);

hr = IMFTransform_QueryInterface(&decoder->IMFTransform_iface, riid, out); diff --git a/dlls/winegstreamer/wg_sample.c b/dlls/winegstreamer/wg_sample.c index 116dbb1f3ec..c98416435a1 100644 --- a/dlls/winegstreamer/wg_sample.c +++ b/dlls/winegstreamer/wg_sample.c @@ -17,6 +17,7 @@ */

#include "gst_private.h" +#include "gst_guids.h"

#include "wmcodecdsp.h" #include "mfapi.h" @@ -308,6 +309,7 @@ HRESULT wg_transform_push_mf(wg_transform_t transform, IMFSample *sample, { struct wg_sample *wg_sample; LONGLONG time, duration; + UINT64 dts; UINT32 value; HRESULT hr;

@@ -326,6 +328,11 @@ HRESULT wg_transform_push_mf(wg_transform_t transform, IMFSample *sample, wg_sample->flags |= WG_SAMPLE_FLAG_HAS_DURATION; wg_sample->duration = duration; } + if (SUCCEEDED(IMFSample_GetUINT64(sample, &MFSampleExtension_DecodeTimestamp, &dts))) + { + wg_sample->flags |= WG_SAMPLE_FLAG_HAS_DTS; + wg_sample->dts = dts; + } if (SUCCEEDED(IMFSample_GetUINT32(sample, &MFSampleExtension_CleanPoint, &value)) && value) wg_sample->flags |= WG_SAMPLE_FLAG_SYNC_POINT; if (SUCCEEDED(IMFSample_GetUINT32(sample, &MFSampleExtension_Discontinuity, &value)) && value) @@ -368,6 +375,8 @@ HRESULT wg_transform_read_mf(wg_transform_t transform, IMFSample *sample, IMFSample_SetUINT32(sample, &MFSampleExtension_CleanPoint, 1); if (wg_sample->flags & WG_SAMPLE_FLAG_DISCONTINUITY) IMFSample_SetUINT32(sample, &MFSampleExtension_Discontinuity, 1); + if (wg_sample->flags & WG_SAMPLE_FLAG_WG_TIMESTAMP) + IMFSample_SetUINT32(sample, &MFSampleExtension_WG_Timestamp, 1);

if (SUCCEEDED(hr = IMFSample_ConvertToContiguousBuffer(sample, &buffer))) { diff --git a/dlls/winegstreamer/wg_transform.c b/dlls/winegstreamer/wg_transform.c index 1d59f806a31..4e9aa9c5b61 100644 --- a/dlls/winegstreamer/wg_transform.c +++ b/dlls/winegstreamer/wg_transform.c @@ -96,6 +96,7 @@ struct wg_transform GstCaps *input_caps;

bool draining; + UINT64 ts_offset; };

static struct wg_transform *get_transform(wg_transform_t trans) @@ -823,9 +824,11 @@ NTSTATUS wg_transform_push_data(void *args) struct wg_transform_push_data_params *params = args; struct wg_transform *transform = get_transform(params->transform); struct wg_sample *sample = params->sample; + GstCaps *transform_timestamp; const gchar *input_mime; GstVideoInfo video_info; GstBuffer *buffer; + INT64 pts, dts; guint length;

if (transform->draining) @@ -863,8 +866,35 @@ NTSTATUS wg_transform_push_data(void *args) buffer_add_video_meta(buffer, &video_info); }

+ /* we perform DTS first as it should never be later than PTS, so any required timestamp offset should be the maximum here */ + if (sample->flags & WG_SAMPLE_FLAG_HAS_DTS) + { + dts = sample->dts += transform->ts_offset; + if (dts < 0) + { + transform->ts_offset -= dts; + dts = 0; + } + GST_BUFFER_DTS(buffer) = dts * 100; + } if (sample->flags & WG_SAMPLE_FLAG_HAS_PTS) - GST_BUFFER_PTS(buffer) = sample->pts * 100; + { + pts = sample->pts += transform->ts_offset; + if (pts < 0) + { + transform->ts_offset -= pts; + pts = 0; + } + GST_BUFFER_PTS(buffer) = pts * 100; + + if (transform->attrs.preserve_timestamps) + { + transform_timestamp = gst_caps_new_empty_simple("timestamp/x-wg-transform"); + gst_buffer_add_reference_timestamp_meta(buffer, transform_timestamp, pts * 100, + sample->flags & WG_SAMPLE_FLAG_HAS_DURATION ? sample->duration * 100 : GST_CLOCK_TIME_NONE); + gst_caps_unref(transform_timestamp); + } + } if (sample->flags & WG_SAMPLE_FLAG_HAS_DURATION) GST_BUFFER_DURATION(buffer) = sample->duration * 100; if (!(sample->flags & WG_SAMPLE_FLAG_SYNC_POINT)) @@ -947,22 +977,43 @@ static NTSTATUS copy_buffer(GstBuffer *buffer, struct wg_sample *sample, gsize *

static void set_sample_flags_from_buffer(struct wg_sample *sample, GstBuffer *buffer, gsize total_size) { - if (GST_BUFFER_PTS_IS_VALID(buffer)) + GstReferenceTimestampMeta *timestamps; + GstCaps *transform_timestamp; + + transform_timestamp = gst_caps_new_empty_simple("timestamp/x-wg-transform"); + timestamps = gst_buffer_get_reference_timestamp_meta(buffer, transform_timestamp); + gst_caps_unref(transform_timestamp); + + if (timestamps) { - sample->flags |= WG_SAMPLE_FLAG_HAS_PTS; - sample->pts = GST_BUFFER_PTS(buffer) / 100; + /* GStreamer can overwrite our timestamps, so we use the wg-transform timestamps instead */ + sample->flags |= WG_SAMPLE_FLAG_HAS_PTS | WG_SAMPLE_FLAG_WG_TIMESTAMP; + sample->pts = timestamps->timestamp / 100; + if (timestamps->duration != GST_CLOCK_TIME_NONE) + { + sample->flags |= WG_SAMPLE_FLAG_HAS_DURATION; + sample->duration = timestamps->duration / 100; + } } - if (GST_BUFFER_DURATION_IS_VALID(buffer)) + else { - GstClockTime duration = GST_BUFFER_DURATION(buffer) / 100; - - duration = (duration * sample->size) / total_size; - GST_BUFFER_DURATION(buffer) -= duration * 100; if (GST_BUFFER_PTS_IS_VALID(buffer)) - GST_BUFFER_PTS(buffer) += duration * 100; + { + sample->flags |= WG_SAMPLE_FLAG_HAS_PTS; + sample->pts = GST_BUFFER_PTS(buffer) / 100; + } + if (GST_BUFFER_DURATION_IS_VALID(buffer)) + { + GstClockTime duration = GST_BUFFER_DURATION(buffer) / 100; + + duration = (duration * sample->size) / total_size; + GST_BUFFER_DURATION(buffer) -= duration * 100; + if (GST_BUFFER_PTS_IS_VALID(buffer)) + GST_BUFFER_PTS(buffer) += duration * 100;

- sample->flags |= WG_SAMPLE_FLAG_HAS_DURATION; - sample->duration = duration; + sample->flags |= WG_SAMPLE_FLAG_HAS_DURATION; + sample->duration = duration; + } } if (!GST_BUFFER_FLAG_IS_SET(buffer, GST_BUFFER_FLAG_DELTA_UNIT)) sample->flags |= WG_SAMPLE_FLAG_SYNC_POINT; @@ -1169,6 +1220,10 @@ NTSTATUS wg_transform_read_data(void *args) else status = read_transform_output(sample, output_buffer);

+ if ((sample->flags & (WG_SAMPLE_FLAG_WG_TIMESTAMP | WG_SAMPLE_FLAG_HAS_PTS)) == + (WG_SAMPLE_FLAG_WG_TIMESTAMP | WG_SAMPLE_FLAG_HAS_PTS)) + sample->pts -= transform->ts_offset; + if (status) { wg_allocator_release_sample(transform->allocator, sample, false);

-- GitLab https://gitlab.winehq.org/wine/wine/-/merge_requests/7623

Brendan McGrath (＠redmcg)

6:38 a.m.

When testing these changes with FFmpeg as the demuxer, I found FFmpeg can produce negative PTS and DTS. And then on testing Windows, I discovered that Windows supports negative PTS and DTS, however, gstreamer does not.

So I've made some adjustments to this MR: 1. To add tests for negative PTS and DTS; and 2. To convert PTS and DTS to be zero or above on input to gstreamer and then convert back the output so that our MFTs can support negative values

-- https://gitlab.winehq.org/wine/wine/-/merge_requests/7623#note_99975

Nikolay Sivov (＠nsivov)

8 Apr 8 Apr

9:58 a.m.

On Tue Apr 8 09:58:08 2025 +0000, Brendan McGrath wrote:

...

When testing these changes with FFmpeg as the demuxer, I found FFmpeg can produce negative PTS and DTS. And then on testing Windows, I discovered that Windows supports negative PTS and DTS, however, gstreamer does not. So I've made some adjustments to this MR:

To add tests for negative PTS and DTS; and

To convert PTS and DTS to be zero or above on input to gstreamer and

then convert back the output so that our MFTs can support negative values

Are negative timestamps encoded in media data directly?

P.S. we'll need someone to review winegstreamer parts.

-- https://gitlab.winehq.org/wine/wine/-/merge_requests/7623#note_100250

Brendan McGrath (＠redmcg)

10:35 p.m.

On Tue Apr 8 09:58:08 2025 +0000, Nikolay Sivov wrote:

...

Are negative timestamps encoded in media data directly? P.S. we'll need someone to review winegstreamer parts.

This is in the MFT, which can be used by the application directly. In theory, they can assign any value they want (which is how I test the negative values on Windows).

But in terms of media data, I'll use the MPEG4 container as an example (as that's the one where I saw FFmpeg produce negative values). You get given three values/tables: 1. [Media Time](https://developer.apple.com/documentation/quicktime-file-format/edit_list_at...); 2. [Time to Sample](https://developer.apple.com/documentation/quicktime-file-format/time-to-samp...); and 3. [Composition Offset](https://developer.apple.com/documentation/quicktime-file-format/composition_...)

The composition offset is only used when you have a codec that needs decoding in a different order than presentation. For example, H.264 which uses B-Frames.

And it seems Windows, GStreamer and FFmpeg all use these values differently and produce different results.

As an example, say we have: 1. Media Time = 0.666666s 2. Time to Sample 1 = 0s 3. Composition Offset for Sample 1 = 0.333333s

DTS is calculated as PTS minus composition offset.

Windows asserts, due to a Media Time of 0.666666s, then that will be the first PTS: ``` PTS = 0.666666s, DTS = 0.333333 ```

GStreamer asserts that the first DTS will be 0s: ``` PTS = 0.333333s, DTS = 0s ```

FFmeg assets that the first PTS will be 0s: ``` PTS = 0, DTS = -0.333333s ```

In addition, I've seen FFmpeg produce a negative PTS on audio, but I'll have to double check the code to see how it got there. But I did note a couple of possibilities: 1. The value was the negative value of Media Time; 2. The value was the negative value of the duration, and the packet was flagged with `AV_PKT_FLAG_DISCARD`

-- https://gitlab.winehq.org/wine/wine/-/merge_requests/7623#note_100338

Age (days ago)

Last active (days ago)

wine-gitlab@winehq.org

5 comments

3 participants

tags (0)

participants (3)

Brendan McGrath
Brendan McGrath (＠redmcg)
Nikolay Sivov (＠nsivov)