If `stream_index` is not used, the first sample in the response queue is popped and delivered, even though that sample may not belong to the stream that made the sample request.
This fixes audio distortion in River City Girls. The video stream was hitting EOS before the audio stream, so the video MFTs were drained and all of the resulting samples were placed in the response queue. When an audio sample was then requested and delivered, the `SOURCE_READER_ASYNC_SAMPLE_READY` operation delivered the first sample in that queue, which was a video sample rather than the requested audio sample. As a result, the video data was rendered as audio, causing the distortion.
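
The difference between the two behaviours can be illustrated with a minimal sketch. This is not the actual source reader code: `struct response`, `struct response_queue` and every function below are hypothetical names chosen for illustration, and the queue is a plain fixed-size array rather than whatever list type the real implementation uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued response; not the real structure. */
struct response
{
    unsigned int stream_index; /* stream that produced the sample */
    int sample;                /* placeholder for the sample payload */
};

#define QUEUE_CAPACITY 8

/* Hypothetical response queue, kept in arrival order. */
struct response_queue
{
    struct response entries[QUEUE_CAPACITY];
    size_t count;
};

/* Append a response; returns 0 on success, -1 if the queue is full. */
static int queue_push(struct response_queue *queue, unsigned int stream_index, int sample)
{
    if (queue->count == QUEUE_CAPACITY)
        return -1;
    queue->entries[queue->count].stream_index = stream_index;
    queue->entries[queue->count].sample = sample;
    queue->count++;
    return 0;
}

/* Remove the entry at 'index', preserving the order of the rest. */
static void queue_remove(struct response_queue *queue, size_t index)
{
    size_t i;

    assert(index < queue->count);
    for (i = index + 1; i < queue->count; ++i)
        queue->entries[i - 1] = queue->entries[i];
    queue->count--;
}

/* Behaviour without stream_index: pop the head, whichever stream it is from. */
static int queue_pop_first(struct response_queue *queue, struct response *out)
{
    if (!queue->count)
        return -1;
    *out = queue->entries[0];
    queue_remove(queue, 0);
    return 0;
}

/* Behaviour with stream_index: pop the oldest response for that stream only. */
static int queue_pop_for_stream(struct response_queue *queue,
        unsigned int stream_index, struct response *out)
{
    size_t i;

    for (i = 0; i < queue->count; ++i)
    {
        if (queue->entries[i].stream_index == stream_index)
        {
            *out = queue->entries[i];
            queue_remove(queue, i);
            return 0;
        }
    }
    return -1; /* nothing queued for this stream */
}
```

With drained video responses queued ahead of an audio response, `queue_pop_first()` hands back a video entry to an audio request, while `queue_pop_for_stream()` skips past the video entries and leaves them queued for the video stream.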