On the client side, frame aperture may be included or omitted. This can be used to include or drop the frame padding, and the VideoProcessor MFT supports it.
On GStreamer side, we can only use the unpadded frame size in the caps, because this is how decoders work, and padding needs to be added as a property of the input/output video buffers. Then, frame size needs to be adjusted to be consistent between input and output, and any difference considered as extra padding to be added on one side or the other.