Ambiguity? What does that mean?
It means that we have no control over what `videoconvert | videoflip | videoconvert` will actually do, and we can only hope that it will do something sensible and, for instance, not decide to color convert unnecessarily or fail to pass our pool through when it could.
If videoflip is copying when it should be in passthrough, then that should be fixed upstream. I don't want to introduce a bunch of extra code just to work around a GStreamer bug, especially if it's already fixed upstream (which is not clear from this commit message).
Sure, but we cannot fix older GStreamer versions, so let's also fix it on our side the right way, which is to provide the correct stride information on our buffers and reduce the complexity of our pipelines. That will help reduce the risk and ease debugging.
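For context, "correct stride information" here means describing where each plane actually lives inside the (possibly padded) buffer; in GStreamer this kind of per-buffer layout is typically attached via GstVideoMeta rather than caps. Here is a rough Python sketch of such a layout computation for NV12; the function name and layout rules are assumptions for illustration, not the actual winegstreamer code:

```python
# Hypothetical illustration: compute per-plane strides and offsets for an
# NV12 frame stored in a padded buffer.  Real code would attach these to
# the buffer (e.g. via GstVideoMeta); this only shows the arithmetic.
def nv12_layout(padded_width, padded_height):
    """Return (strides, offsets, total_size) for NV12 in a padded buffer."""
    y_stride = padded_width                      # one byte per luma sample
    y_size = y_stride * padded_height
    uv_stride = padded_width                     # interleaved UV plane
    uv_size = uv_stride * (padded_height // 2)   # chroma is half height
    return [y_stride, uv_stride], [0, y_size], y_size + uv_size

# An 82x84 visible frame stored in a 96x96 allocation: the strides and
# offsets describe the 96x96 buffer, while the caps advertise 82x84.
print(nv12_layout(96, 96))
```

With the strides and offsets carried on the buffer itself, downstream elements can read the visible 82x84 region correctly even though the allocation is larger.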
I still am failing to understand this at all, sorry. Why is the format that's set on the transform not the format we store?
As the test shows, the input and output media types don't have to match in frame size exactly, and they can include or omit frame padding freely. Buffers then end up being passed through with their padding intact, or with padding added or cropped accordingly.
However, GStreamer cannot convey buffer padding information in its caps; it is a per-buffer property only. When caps are matched, the input and output frame sizes have to match exactly.
If the client called the video processor SetInputType with a frame size of 96x96 and an aperture of 82x84, and SetOutputType with a frame size of 96x96 but without any aperture, we would try to create input/output formats with two different frame sizes, which would fail the caps negotiation.
We need to consider in this case that the client wanted an 82x84 output with 14x12 padding. Other combinations need to be handled as well, so this takes the smaller frame size between input and output, and considers any extra to be buffer padding.
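The reconciliation described above can be sketched roughly like this; the function and parameter names are invented for illustration and do not correspond to the actual implementation:

```python
# Hypothetical sketch of the frame-size reconciliation described above.
# Sizes are (width, height) pairs; an aperture may be None when the media
# type carries no aperture attribute.
def reconcile_sizes(in_frame, in_aperture, out_frame, out_aperture):
    in_size = in_aperture or in_frame
    out_size = out_aperture or out_frame
    # Take the smaller frame size between input and output...
    width = min(in_size[0], out_size[0])
    height = min(in_size[1], out_size[1])
    # ...and consider any extra on either side to be buffer padding.
    in_padding = (in_frame[0] - width, in_frame[1] - height)
    out_padding = (out_frame[0] - width, out_frame[1] - height)
    return (width, height), in_padding, out_padding

# The example from above: 96x96 input with an 82x84 aperture, 96x96 output
# with no aperture -> an 82x84 frame with 14x12 padding on both sides.
print(reconcile_sizes((96, 96), (82, 84), (96, 96), None))
```

Both sides then negotiate caps with the same 82x84 frame size, and the 14x12 extra is expressed purely as buffer padding.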