You were right: the AAC encoder doesn't actually expose the generated `MF_MT_USER_DATA` in its output type. That makes it all the more important that we either generate it in the IMFMediaSink or parse it from the input stream.
Interesting, the documentation claims otherwise. It may be worth adding a test that explicitly shows this.
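
For the "generate it in the sink" option, here is a minimal sketch of what building the blob could look like. It assumes AAC-LC (audio object type 2) with no SBR/PS extension, and it assumes the layout the Media Foundation docs describe for `MF_MT_USER_DATA` on AAC types: the `HEAACWAVEINFO` fields that follow `wfx` (12 bytes), then the `AudioSpecificConfig`. `BuildAacUserData` and `kAacSampleRates` are hypothetical names, not existing code in this change, and the `0x29` profile-level default is an assumption that should come from the negotiated type instead.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Sampling-frequency index table from ISO/IEC 14496-3 (index = position).
constexpr std::array<uint32_t, 13> kAacSampleRates = {
    96000, 88200, 64000, 48000, 44100, 32000, 24000,
    22050, 16000, 12000, 11025, 8000,  7350};

// Hypothetical helper: builds the bytes we would set as MF_MT_USER_DATA.
// Assumes AAC-LC; returns nullopt for rates/channel counts it cannot encode.
std::optional<std::vector<uint8_t>> BuildAacUserData(
    uint32_t sampleRate, uint8_t channels,
    uint16_t payloadType = 0,        // 0 = raw AAC, 1 = ADTS
    uint16_t profileLevel = 0x29) {  // assumed default, see note above
  std::optional<uint8_t> freqIndex;
  for (std::size_t i = 0; i < kAacSampleRates.size(); ++i) {
    if (kAacSampleRates[i] == sampleRate) {
      freqIndex = static_cast<uint8_t>(i);
    }
  }
  // channelConfiguration 1..6 maps directly to the channel count.
  if (!freqIndex || channels < 1 || channels > 6) {
    return std::nullopt;
  }

  std::vector<uint8_t> blob;
  // HEAACWAVEINFO fields are little-endian WORDs/DWORDs.
  auto putWord = [&blob](uint16_t v) {
    blob.push_back(static_cast<uint8_t>(v & 0xFF));
    blob.push_back(static_cast<uint8_t>(v >> 8));
  };
  putWord(payloadType);   // wPayloadType
  putWord(profileLevel);  // wAudioProfileLevelIndication
  putWord(0);             // wStructType (0 = AudioSpecificConfig follows)
  putWord(0);             // wReserved1
  putWord(0);             // dwReserved2, low word
  putWord(0);             // dwReserved2, high word

  // AudioSpecificConfig is a big-endian bitfield:
  // 5 bits object type, 4 bits frequency index, 4 bits channel config,
  // 3 zero bits (frameLengthFlag, dependsOnCoreCoder, extensionFlag).
  const uint8_t objectType = 2;  // AAC-LC
  const uint16_t asc = static_cast<uint16_t>(
      (objectType << 11) | (*freqIndex << 7) | (channels << 3));
  blob.push_back(static_cast<uint8_t>(asc >> 8));
  blob.push_back(static_cast<uint8_t>(asc & 0xFF));
  return blob;
}
```

For 44.1 kHz stereo this yields a 14-byte blob ending in `0x12 0x10`. The same bytes could serve as the expected value in the proposed test if it turns out the encoder does populate the attribute in some configuration.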