Fwiw like I said elsewhere, I don't think that hooking the decoderbin decoders input and stealing compressed buffers from the pipeline is the right way to do this, and I think it adds a lot of complexity.
I would like to be wary of situations where a contributor, having written code, to find it simple and straightforward (at least in the moment) while a reviewer finds it anything but. I did not write this code, but I did (IIRC) propose the general design, and so if someone else finds it complex or unintuitive, I'd like to take that into account.
Hence, while I know you said you've gone over the problems with this design in the past, but I'm afraid that so much has been said that I cannot recall those conversations, or at least, I cannot recall any specific problems that were raised, so I'd appreciate if you'd repeat here. What is it about this approach is it that seems wrong to you?
For what it's worth, here's an attempt at detailing the thought process that leads me to take stock in this design:
Some application depends on creating a demuxer, one that is currently backed by wg_parser. Instead of receiving fully uncompressed samples, it needs to recieve compressed samples, but in all other respects, the process of demuxing remains the same. Hence all we need to do is modify the current code so that we don't decode all the way. decodebin makes it very easy to do this, by simply stopping the process of decoding before it gets to the end.
One part of this problem statement is that "compressed samples" is vague. What do we count as a compressed format? In practice, I think this needs to be a manually written whitelist of formats. There are two reasons:
- If we suddenly *actually* output compressed samples for all types, an application which does ultimately want to autoplug to raw might be stymied by the lack of a decoder component in Wine. This way makes it very easy to stop on new formats one at a time, when we are relatively more sure that we can handle them. This point is less salient if the application has to currently manually enable the feature via a wmvcore API. On the other hand, I'm looking forward to using this feature in other places; e.g. I know that some mfplat applications need this.
- Not all formats exist on Windows. We want to support things like media players with arbitrary formats—we want users to be able to play their ogg vorbis files, and so on. We could add these formats, but that would be a lot of work, and some of them may be nontrivial. If we just keep decodebin instead, that way we get everything, right away, for free.
With that said, some questions that occurred to me while thinking about this:
* **Do we really want to use decodebin for this? Should we write our own autoplugging code?** I think using decodebin is the right choice. We need pretty much all of the decodebin features we're currently using, and rewriting them would be a lot of work for no benefit I can see.
Using decodebin currently comprises two steps: start the autoplugging, and provide a callback that determines whether we should stop at any given point. The approach that Ziqing takes is a bit more complex, because the autoplugging needs to be conditional: start the autoplugging, stop (always), then append another decodebin and autoplug from there. *Not* using decodebin means that we have to rewrite not just all of the autoplugging logic, but other decodebin features we need, such as detecting parsers and multiqueuing.
Note that we can't just look for a demuxer element and have done, for one, and possibly two reasons. The first reason is described above: we don't want to stop at *every* compressed format. The second reason is that we might need to append an extra element for some formats anyway, although I'm not sure of this. I'm thinking of elements like h264parse, which do some preprocessing on the compressed formats but don't actually decompress them. I don't know if that element is necessary, but even if not, something like it might be.
* **Should we use parsebin instead? It seems made for this purpose.** Problem is, parsebin is a relatively recent feature. More to the point, we can't "just" use parsebin everywhere, for the same reasons.
* **Should we actually use wg_parser for this, though? Adding another feature to wg_parser increases the complexity, and this does touch a lot of core code.** On the other hand, we do need the rest of wg_parser, so duplicating all that to another winegstreamer backend would not really help anything.
Some of the parts of Ziqing's patches that you mentioned also left a distaste in my mouth in I think the same way. In particular:
- the use of a probe never seems like the cleanest approach. On the other hand, probes are pretty clearly documented, so we know this works, and I don't have a better objection than "it feels ugly". That said... while recreating the pipeline *within* wg_parser gets complicated, as Ziqing says, I still wouldn't object to rewriting this so that we just destroy and recreate the wg_parser?
- the extra code to try to handle toggling mid-stream is added complexity that will probably never actually be required. I'd prefer to just get rid of it, honestly, but I find it difficult to say as much in a review, when it is a feature that exists, and I have a hard time saying I hate it.