I don't think a preallocated pool can work. Not with allocations callbacks
like in wmvcore and quartz. Especially as these may block and as they are supposed to be called at specific times and in specific threads.
Sorry, "pool" was the wrong word to use. I mean rather that we should be able
to provide a specific buffer to the winegstreamer backend, just like with the transform. We don't exactly know the frame size beforehand, but we can make a good guess in pretty much every case (after all, quartz basically has to), and if we fail then we can fall back to a blit.
I know there was discussion about this in merge request 197 already, but I'd
like to bring this up again. (Partly because that discussion operated under some false premises on my part, e.g. we don't actually have to establish a fixed size pool beforehand, at least not for mfplat and wmvcore.)
Ultimately it'd be nice if this can act exactly like wg_transform, i.e. pass
in a pre-allocated sample, let the backend fill that and zero-copy if it can, with the frontend agnostic as to whether zero-copy actually happened. That way we only have the one really complex pattern.
I don't think this can work. To the contrary to the wg_transform, which pretty much works in a serialized way, because we buffer the input samples and only provide them to GStreamer when we're ready to receive at least one output, the wg_parser is asynchronous in nature. Input buffers are provided on GStreamer requests, and we should do the same for buffer allocations I don't think we can reliably assume that buffer allocations are quickly followed by buffer output and that it's safe to wait for the output when we allocated a buffer.
And yes I have considered and tried it, and some alternate approaches, and I
believe the one implemented here is good, and that it is also not so complex. It brings wg_transform and wg_parser interfaces closer to each other and use the same zero copy primitives for both.
I understand it's frustrating to have design decisions questioned, or have
approaches suggested that you've already tried and found inviable, but from my perspective, it's also frustrating to have no explanation of that. If this patch series is a worthwhile tradeoff, I'd like to understand why. I currently don't understand why. I also feel like this (and other design decisions made in winegstreamer, including those I've made) are not as simple as they're made out to be, and I'm not sure if that's an effect of the submitter's then-familiarity with the code, or a failing on my part to understand and remember the structure. To be sure, sometimes it's necessary to accept complex code, because of things like PE conversion or zero-copy not working well with the winegstreamer threading model, but it's not as clear to me that this is one of those times.
Having taken some more time to look at this, and examine the context around
it: on its own, this change seems arguably like an improvement, on the grounds that we already need to buffer one sample on the wg_parser side anyway, and my counter-proposal would involve adding extra buffering. My concern, however, is that this buffering will make zero-copy support in the parser harder, rather than easier. Partly because it's more code to work around, but partly because I think that it would interact badly with a world where we don't call back to the PE side to allocate new samples, and it's really not clear to me that calling back to the PE side is what we want. But again, these are only the interactions that I can foresee; if there's others, it's not really clear to me what they are and why.
There's no PE callback at all in my implementation. I also don't think it's so complex as you seem to put it. I think it's actually simpler, and more flexible to support some future changes than the current design:
The design is based on a black-box wg_parser, which interfaces resumes in a request / reply model.
There's a wait_request function (currently also a wait_stream_request for some implementation detail but both could be merged), and reply functions for each request. Waiting for a request can be done with a specific stream, and specific request type mask.
Requests are basically what you can expect: 'Input' / 'Allocation' / 'Output'.
* Input requests are received when the wg_parser needs data, the request carries the offset and size of the data that needs to be fed.
* Allocation requests are received when GStreamer needs to allocate a sample for a future output. They can be replied with a sample, or no sample which will let GStreamer allocate its own memory.
* Output requests are received when a stream wants to output some data, with a token that can be used by the client to find a sample it previously allocated. If the sample was previously allocated, zero-copy happens when the lient replies to it. If not, then a copy happens.
This currently works with a 1-n stream model, like the wg_parser currently uses but it will very easily scale to a n-n stream model if, in the future, we want to support muxers for instance.
For that, the Input requests would come for input streams, and wait_request / wait_stream_request functions could then be merged.
This design complements the simpler wg_transform which is meant to stay 1-1 and synchronous, but which otherwise works with the same wg_sample primitive.