On Tue Nov 8 18:45:05 2022 +0000, Zebediah Figura wrote:
Actually, that implies a deeper synchronization problem in mfplat. If we can't clean up the condition variable, then we can't free the filter either. Do we know that Shutdown() should unblock RequestSample()? It seems quite plausible that it shouldn't (after all, I don't think RequestSample() should ever deadlock—it just might take a few ms to complete), and in that case it's possible that we should be doing some sort of locking on the mfplat side. Alternatively, mfplat could move the whole wg_parser destruction to Release, on the idea that any concurrent thread calling RequestSample() should have a reference anyway. (I know that in theory mfplat should clean up everything in Shutdown, but as I understand that's just to break circular refcounts within mfplat, so it wouldn't apply here.)
I don't know if `Shutdown()` should unblock `RequestSample()` or not, and it doesn't appear to be easy to check. I can theoretically think of a test for that (by creating a media source over some byte stream specifically crafted so that it start blocking at some point), but it seems a lot of work for a detail that is probably not particularly important to reproduce correctly.
But yeah, the idea is that one of the three things you mention should happen: 1. `Shutdown()` unblocks `RequestSample()`, which presumably generates an error event; 2. `Shutdown()` waits for `RequestSample()` to return, hoping that it does quickly; (though this is not really obvious: the reason why this bug became apparent to me was that in some cases GStreamer is not able to produce a sample for a stream unless all the other streams are actively pumping out samples, I guess because of buffering reasons); 3. `Shutdown()` only does the cleanup part needed to break refcount cycles, and defer the rest to the destructor; implicitly, this still means that destruction has to wait for `RequestSample()` to finish.
Solutions 2 and 3 have a problem when `RequestSample()` is never going to finish, which probably should never happen, but in practice it does (and that's precisely the reason why I became aware of this bug). Specifically, if a file contains a lot of audio samples and no video samples for a while, and the application requests a video sample without receiving the audio samples, it is possible that GStreamer never produces the video sample (for reasons that I presume related to not having enough look ahead in the file), so everything is deadlocked.
I guess this should be a bug in itself to solve, but it might require some more architectural work. Solution 1 doesn't depend on that being solved, which is why I ended up coding solution 1.