On 9/28/21 03:08, Derek Lesho wrote:
On 9/28/21 07:23, Zebediah Figura (she/her) wrote:
On 9/24/21 14:23, Derek Lesho wrote:
On 9/24/21 12:38, Zebediah Figura wrote:
On 9/24/21 3:05 AM, Derek Lesho wrote:
On 9/21/21 13:02, Zebediah Figura wrote:
- Rely on the fact that seek-data is sent during a seek, and use it to effectively invalidate any buffers sent before the seek. Unlike the aforementioned problem with validation, this will actually work: at the time that seek-data is sent, appsrc should be flushing, so any buffers we send before it will be discarded by appsrc, and any buffers we send afterward can be discarded by us if they aren't valid. This is very fragile, though. There's really no reason for appsrc to send seek-data for random-access streams in the first place, and this kind of synchronization is easy to get wrong, if I haven't already.
This sounds like the best way forward in my opinion; how is it fragile?
It's fragile in general because we're making a lot of undocumented assumptions about when and from which thread gstappsrc will send these signals, and about what synchronization guarantees it applies when doing so.
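(For reference, a minimal sketch of how these callbacks get attached to appsrc; the callback names and the surrounding function are hypothetical, and nothing in the GstAppSrcCallbacks documentation specifies which thread they are invoked from or what synchronization applies:)

#include <gst/gst.h>
#include <gst/app/gstappsrc.h>

/* Hypothetical stubs; a real client would wake its read thread here. */
static void need_data_cb(GstAppSrc *src, guint length, gpointer user) {}
static void enough_data_cb(GstAppSrc *src, gpointer user) {}
static gboolean seek_data_cb(GstAppSrc *src, guint64 offset, gpointer user) { return TRUE; }

static void connect_appsrc(GstAppSrc *appsrc, gpointer user)
{
    static GstAppSrcCallbacks callbacks =
    {
        .need_data = need_data_cb,
        .enough_data = enough_data_cb,
        .seek_data = seek_data_cb,
    };

    /* The API makes no promises about the calling thread, or about what
     * appsrc state is held when any of these fire. */
    gst_app_src_set_callbacks(appsrc, &callbacks, user, NULL);
}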
My perspective is that if we can get it working and document the two required quirks to the GStreamer project, eventually the problem will be cleared up and/or fixed, and we can remove the quirks then.
The problem is that we're still going to have to keep those workarounds for a long time. Not to mention that filing a bug is clearly not enough to get the attention of the GStreamer developers.
As the maintainer of this code, I don't feel very comfortable relying this much on the internals of appsrc, workarounds or not.
It seems that if we just ensure that a pushed buffer takes into account the latest seek-data/need-data pair, not much can go wrong.
That's not enough by itself. We can, generally speaking, send a buffer after a seek-data/need-data is triggered but before it is actually sent.
Before what is sent?
Before the seek-data or need-data signal is sent. There is a window there, however short.
I'm not sure exactly what the difference is between triggering a signal and sending one, but assuming you mean that if seek-data doesn't acquire the mutex, it would be possible for push-buffer to not see the seek and for the buffer to be sent after seek-data, then yeah, we do need to put seek-data and push-buffer under the same mutex to make sure that push-buffer is responding to the latest seek-data/need-data callbacks.
This happens in practice with flushes.
In order to get around that, you need the hack I mentioned: assume that a flush will be accompanied by a seek-data signal. This is especially fragile because there's no reason for appsrc to even be sending this signal in the first place (it only really makes sense in "seekable" mode), and they could easily decide to not do that.
I'm not sure I understand; whenever the next buffer app source wants is not consecutive with the last, seek-data is required (and sent). What does "assume the flush will be accompanied by a seek-data signal" mean? The app source client doesn't have any conception of a flush, just seek-data and need-data, and to fix this problem all we need to do is add a quite rational check that push_data is responding to the latest seek-data/need-data pair.
Let me try to explain more clearly. At least one race, the one I've been trying to describe, looks like this:
read thread                             main thread

                                        push seek event
retrieve data
validate offset
                                        send flush-start
                                        emit seek-data
                                        send flush-stop
push-buffer
At which point appsrc will get the wrong buffer.
Solution 1 has us rely on the seek-data signal to effectively wait for the read thread (most likely using the parser mutex) and, in a sense, put a barrier between reads occurring before and after the flush.
I mean, yeah, but from the perspective of the interface, all we are doing is making sure that buffers responding to requests from before a seek-data don't get sent, which I really don't think should be too controversial. That the seek happens to be caused by a flush shouldn't matter; we're putting a barrier between buffers sent for a different offset and the current offset we've gotten from seek-data. Honestly, I'm not even sure this is a GStreamer bug; should they really have to handle a case where the client responds with an incorrect buffer after a seek?
I don't think this is obvious at all. It's certainly not called out in the documentation. And, moreover, it's a lot of work, and it doesn't even look like a remotely idiomatic solution. Not to mention that, as I've said, appsrc really has no reason to send seek-data at all here.
I'm pretty sure that you're correct that putting the whole of wg_parser_push_data inside a mutex, and the seek-data callback inside the same mutex, is actually sufficient to fix this race. But that stops being true if appsrc stops sending seek-data here.
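(As a concrete illustration, a minimal sketch of that arrangement, assuming a hypothetical parser struct; this is not the actual wg_parser code, and the signatures are invented for the example:)

#include <pthread.h>
#include <gst/gst.h>
#include <gst/app/gstappsrc.h>

struct parser
{
    pthread_mutex_t mutex;
    guint64 next_offset; /* what appsrc expects next, per the last seek-data */
};

/* Takes the same mutex as the push path, so it can't run in the middle
 * of a validate-then-push sequence. */
static gboolean seek_data_cb(GstAppSrc *appsrc, guint64 offset, gpointer user)
{
    struct parser *parser = user;

    pthread_mutex_lock(&parser->mutex);
    parser->next_offset = offset;
    pthread_mutex_unlock(&parser->mutex);
    return TRUE;
}

static void wg_parser_push_data(struct parser *parser, GstAppSrc *appsrc,
        GstBuffer *buffer, guint64 offset)
{
    pthread_mutex_lock(&parser->mutex);
    if (offset != parser->next_offset)
    {
        /* seek-data ran after this read started; drop the stale buffer. */
        pthread_mutex_unlock(&parser->mutex);
        gst_buffer_unref(buffer);
        return;
    }
    parser->next_offset += gst_buffer_get_size(buffer);
    gst_app_src_push_buffer(appsrc, buffer); /* takes ownership */
    pthread_mutex_unlock(&parser->mutex);
}

If the seek in the diagram above happens entirely after the offset check, seek-data blocks on the mutex until the stale buffer has already been handed to appsrc, which is still flushing at that point and discards it; either way the wrong buffer never survives.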
Fundamentally the API contract here is that you don't send old buffers after a flush stops. DirectShow and GStreamer are both built on that idea. I believe that Media Foundation does away with it, which is one of the good things about Media Foundation. appsrc doesn't expose flushes to us, and it doesn't seem to do a good enough job of taking care of them itself. Trying to infer when they happen is too fragile for my taste.