On 6/21/21 9:35 AM, Anton Baskanov wrote:
Seems like GST_FORMAT_BYTES duration queries are forwarded all the way up to my_src, which returns the source file size. On the other hand, the convert query on stream->their_src expects the size of the decoded raw data. This patch uses parser->their_sink to convert the source file size to get a correct duration estimate.
Sorry for the complete silence. This patch set, and this patch in particular, has just been sitting there bothering me. I suspect that even if this patch set is correct it needs an audit of GStreamer elements to make sure it's always doing the right thing.
The basic problem is twofold: firstly, that GStreamer duration and convert queries are underspecified; and secondly, that the apparent actual behaviour is counterintuitive.
For example, I'd expect that a GST_FORMAT_TIME duration query should retrieve the actual running time of a stream (which might be different from the running time of a muxed file!), and that a GST_FORMAT_BYTES duration query should retrieve the total *output* size, in bytes, of a stream. But the default behaviour for any pad is to cheerfully ask the upstream filter, and GstBaseParse even explicitly does this *before* returning its own answers, so it'll end up returning the encoded size or even the whole (possibly muxed) file size. A similar problem occurs with GST_QUERY_CONVERT.
The ultimate effect is that the application can't predict who will respond to a duration or convert query, and thus they have no idea how to interpret the data they get back. In the case of DURATION + GST_FORMAT_TIME I suspect most applications get away with it, because usually the duration of a stream is (or is close enough to) the duration of a file. But GST_FORMAT_BYTES and any variation of CONVERT just seems completely broken...
Unfortunately, I'm not sure that downstream CONVERT queries are any better. The default behaviour is still to pass the query to a random src, and the first two demuxers I checked (avidemux and mpegpsdemux) both do that. I think both actually give us a correct duration in response to GST_QUERY_DURATION + GST_FORMAT_TIME, but it's not a good sign.
It may end up being best just to revert this path entirely, and rely on polling for GST_QUERY_DURATION + GST_FORMAT_TIME. The path was added by 613446d018. Looking back at mail history, I think I wrote it in response to [1], so it was actually just for dlls/quartz/tests/test.mp3.
It turns out I only got away with this because, in the case of mp3, there's no container, so the byte length of the encoded stream is the byte length of the file, mpg123 just happened to handle neither of these queries, and mpegaudioparse handled GST_QUERY_CONVERT in the intuitive way.
Unfortunately, unless I misread, I think there's currently no reliable way to get the duration of a CBR file from mpegaudioparse, except by relying on GST_QUERY_CONVERT.
So, what do we do going forward? Unfortunately I think this is the point where we need to ask GStreamer for help, ideally to clarify their documentation, and perhaps fix elements accordingly. And if we can't reliably use GST_QUERY_CONVERT, we may want to revert 613446d018 entirely, and try to fix mpegaudioparse to give us a sane duration. And the latter seems like probably a good idea anyway.
The rest of this series looks mostly correct, but patch 2/5 is tricky, because it's also not clear to me from the GStreamer documentation whether duration-changed will always be sent if the duration isn't immediately available. And if we need to ask for clarification anyway, it would probably be good to get that clarified at the same time.
Anyway, I'll start writing an email to gstreamer-devel soon.
ἔρρωσθε, Zebediah
[1] https://www.winehq.org/pipermail/wine-devel/2020-June/167776.html