In an attempt to simplify the code, this removes the ability to output uncompressed samples—which spans about 70 self-contained lines of code in pad_added()—and replaces it with a whole separate transform per frontend. Why is this an improvement?
It is an improvement because it is how native behaves. The feature is not going to be used by any application and is therefore unnecessary.
This removes pull-mode support, which breaks seeking or playback in many demuxers, including demuxers for formats that Windows natively supports.
You've mentioned sfdec and musepack before, but these aren't supported by Media Foundation.
Setting aside the above: removing pull mode is done with the express goal of removing the PE-side read thread.
However, removing the read thread can be done without removing support for pull mode, simply by calling wg_parser_get_next_read_offset() from the same thread before pushing.
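A rough sketch of what I mean, with a mock standing in for the parser. The mock_* functions, the struct layout, and the scripted read requests are all made up for illustration; this is not the actual wg_parser interface, only the shape of the loop I have in mind:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a pull-mode demuxer; not the winegstreamer API. */
struct read_request
{
    uint64_t offset;
    uint32_t size;
};

struct mock_parser
{
    const struct read_request *script; /* reads the demuxer will ask for, in order */
    size_t count, next;
    uint64_t bytes_received;
};

/* Returns true and fills offset/size if the demuxer is waiting for data. */
static bool mock_get_next_read_offset(struct mock_parser *p, uint64_t *offset, uint32_t *size)
{
    if (p->next >= p->count)
        return false;
    *offset = p->script[p->next].offset;
    *size = p->script[p->next].size;
    return true;
}

/* Completes the outstanding request; the demuxer then moves on to its next read. */
static void mock_push_data(struct mock_parser *p, const uint8_t *data, uint32_t size)
{
    assert(p->next < p->count); /* pushing with no outstanding request is a caller bug */
    (void)data;                 /* the mock only counts bytes */
    p->bytes_received += size;
    p->next++;
}

/* Serve read requests on the calling thread until the demuxer stops asking. */
static unsigned int service_read_requests(struct mock_parser *p, const uint8_t *file, uint64_t file_size)
{
    unsigned int served = 0;
    uint64_t offset;
    uint32_t size;

    while (mock_get_next_read_offset(p, &offset, &size))
    {
        /* Clamp to the end of the file so a read past EOF becomes a short read. */
        if (offset > file_size)
            offset = file_size;
        if (size > file_size - offset)
            size = (uint32_t)(file_size - offset);
        mock_push_data(p, file + offset, size);
        served++;
    }
    return served;
}
```

The loop is driven by whatever the demuxer asks for, including backwards seeks, so the caller never has to guess how much to push. How the real parser signals that it has stopped asking is glossed over here.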
Decoupling MF from the other frontends makes other things easier to change. Removing the PE-side thread isn't really the goal, making the demuxer calls synchronous is.
There are several other reasons to do this, which I've written about at length elsewhere and which I'm not going over again.
Since the plan is to both push and read data from the same thread, how do we make sure we've pushed enough data (to get a sample) without pushing too much (such that pushing blocks due to overflow)?
The answer I've been given is "we always assume that the GStreamer demuxer acts synchronously". This is not guaranteed and may easily be broken by future library changes. The same assumption has already been broken in the transform.
Simplicity cannot come at the cost of removing vital code.
The assumption made for the transform still holds. Requiring multiple threads to parse a file is even more silly, and it makes our work toward Windows compatibility harder.
If it ever breaks, we should probably take that opportunity to reconsider the choice of GStreamer as a backend. Maybe I should even start doing that instead of arguing in vain.