On 5/17/22 13:47, RĂ©mi Bernon (@rbernon) wrote:
Well it's actually pretty clear that it's not going to be possible to implement it unless you move most of wg_transform_read_data logic out of the unix side, to the PE-side, exposing additional entry points for each of the steps it goes through (which is what you are doing there).
I don't think it's a good idea, because:
- The entry points can actually be reduced to push_data / read_data operations, which intuitively maps to the MF transforms or the DMO ProcessInput / ProcessOutput methods.
But it doesn't map nicely to DirectShow, and it only works because ProcessOutput() is kind of badly designed.
To be clear, I don't object in principle to having an internal API that closely matches a badly designed external one, but that's not quite what we have here. Some aspects of the Media Foundation API are implemented on the PE side, and the way in which they're split up feels awkward and less than fully intuitive to me.
- It complicates the unixlib API by multiplying entry points, although they are only going to work when called in a specific sequence.
My proposal as given (and wg_parser as it exists, for that matter) is not great because get_buffer() kind of implies that it consumes a buffer, which it doesn't. But if you rename the first API to something like "wg_transform_get_output_format()" it becomes clearer. The point is that it doesn't actually affect the state and hence is well-defined at any point.
To get zero-copy you'll have to add even more entry points, to provide the output buffer beforehand, and release it after the processing.
I'm confused about this, because this doesn't match my (somewhat dim) awareness of what would be necessary for zero-copy. So maybe it would help to hash that out now completely.
My understanding is:
* in some cases we can and should transform data in-place (perhaps because the PE side expects it, not that I'm aware of any such cases). In that case we should feed a writable GstBuffer into the pipeline, and if the output buffer was not the same as the input buffer, blit the latter into the former.
* if we cannot transform data in-place (either because GStreamer doesn't support it or because it'd confuse the PE side, not that I'm aware of any such cases there either), we should try to let GStreamer elements fill buffers allocated by the PE side. In order to achieve this we should set up a GstBufferPool ahead of time containing a fixed number of buffers. (I think we can guarantee we won't run out?) If the output buffer comes from this pool, we can retrieve the corresponding input buffer (the original intent of struct wg_sample) and return that pointer to user space. If not, we should call gst_buffer_pool_acquire_buffer() ourselves and blit the output buffer into it.
In both cases, though, it doesn't make sense to force a specific output buffer unless it's the same as the input buffer, which means there's no need for an extra API. Is there some extra consideration I'm missing here?
The other consequence that falls out from this is that the act of pushing data will inherently cause data to be written; i.e. there is no concept of "peek" quite like pipes. On the other hand, we can still "peek" at the last output buffer (and, more importantly, its associated metadata) without actually dequeuing it (and hence without changing the wg_transform state at all), which is kind of what I was trying to propose in the first place. It's not obvious to me why this is problematic; am I missing something here as well?
- This specific call sequence (although some steps such as format change checks are not always useful, but will probably still be needed every time) will have to be duplicated in each transform when it can be factored in a single place.
Sure. On the other hand, we already have (somewhat misleadingly named) helpers that do most of the sample post-processing already. Maybe I'm missing a reason that they couldn't be extended to cover the entirety of ProcessOutput()...?