I've been staring at this for a while, and I'm still not convinced that this abstraction is an improvement. I think I get the idea, which is to share code between the input and output sample paths, but as far as I can tell the only code that's actually *shared* is the code to retrieve the buffer pointer itself and the maximum length. Everything else is either input-only or output-only. As a result we end up introducing a lot of code into the common path that kind of obscures the otherwise simpler functionality of push/read. (It's also kind of confusing that we're reading the previous values of an output sample, or writing values into an input sample.)
I don't intend to let that block progress on the transform objects, but if possible I'd rather defer converting the parser to this abstraction, at least until the transforms have reached their "final" (zero-copy) form [not that it's immediately clear why this abstraction makes more sense in that case]. Would it be feasible to hold off on these patches for now?