This is a summary of what I have been doing with winemp3 over the weekend.
First, the idea of a GStreamer wrapper for wine is a no-go because 1) it would not solve the jittering problem (more on this later), 2) I have not yet found a streaming source for raw memory buffers (as opposed to file/network streaming), and 3) although GStreamer is LGPL, some of the codecs might be proprietary, which might cause legal problems later on.
I fired up the ACMAPP.EXE demo program from MSDN (the one I used to improve the msacm implementation) and noticed that MP3-encoded WAV files play almost normally with this app. This means that the codec itself is not the cause of the jittering; the most likely cause is the (odd?) use of msacm by dsound, which is in turn invoked by native quartz.dll. However, although ACMAPP.EXE played the files without jittering, it stopped playing some time before the end of the song. How early it stops depends on the length of the song, but this was annoying enough to be worth investigating. For example, one sample song stopped playing 30 seconds before its actual end, even though, according to the app, playback had already reached the end of the song.
It turns out that mpglib keeps a queue of undecoded data in its decoding structure. On every call to the decodeMP3() function of mpglib, the caller specifies either a buffer of new data, or NULL to use the remaining data from a previous invocation. The queue grows and shrinks dynamically with the amount of undecoded data passed by the caller. What was going on was the following:
- Caller (winmm) computes the required output buffer size for the encoded input buffer. This uses acmStreamSize, which in turn invokes MPEG3_StreamSize.
- MPEG3_StreamSize returns a factor of 4X for the size of the output buffer (in my tests, a 2333-byte encoded buffer expands into (at most) 9332 bytes of output, or so reports winemp3).
- winmm happily calls acmStreamConvert() with a 2333-byte source buffer and a 9332-byte destination buffer.
- mpglib decodes as much as it can into the 9332-byte buffer. However, in almost all cases, 9332 bytes is not enough to hold the decoded output of 2333 bytes of MP3 data. This is important to note, because it means a non-trivial amount of the MP3 data remains in the mpglib queue.
- acmStreamConvert() is called over and over, with a 2333-byte source buffer and a 9332-byte destination buffer. The output buffer is always too small, so the undecoded data in the mpglib queue accumulates.
- winmm sends the last encoded buffer. Supposedly this means that decoding is complete, but it isn't, because mpglib has a backlog of undecoded data which has not been played. However, winmm stops there, and the user notices the cut in the middle of the song.
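The accumulation described in the steps above can be illustrated with a small model. This is only a sketch, not mpglib or winemp3 code: the function name backlog_after is made up, and the frame sizes assume a 128 kbit/s, 44.1 kHz, 16-bit stereo stream (roughly 417 encoded bytes per frame, 4608 decoded bytes per frame), which is an assumption and not a figure from my tests.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed stream parameters (illustrative only): 128 kbit/s,
 * 44.1 kHz, 16-bit stereo MPEG-1 Layer III. */
enum {
    FRAME_IN  = 417,   /* approximate encoded bytes per frame */
    FRAME_OUT = 4608   /* 1152 samples * 2 channels * 2 bytes */
};

/* Hypothetical model of the decoder queue: returns the number of
 * encoded bytes still queued after `calls` conversions, each of which
 * feeds src_len encoded bytes and offers dst_len bytes of output. */
static size_t backlog_after(size_t calls, size_t src_len, size_t dst_len)
{
    size_t queued = 0;

    assert(src_len > 0 && dst_len > 0);
    for (size_t i = 0; i < calls; i++) {
        size_t room = dst_len;

        queued += src_len;              /* new input joins the queue */
        /* Decode whole frames while both input and output allow it. */
        while (queued >= FRAME_IN && room >= FRAME_OUT) {
            queued -= FRAME_IN;
            room   -= FRAME_OUT;
        }
    }
    return queued;
}
```

Under these assumptions, a 9332-byte destination has room for only two decoded frames per call, so the queue grows by about 1.5 KB on every call, while a 12X destination (27996 bytes) leaves less than one frame queued after each call.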
In the same situation, L3CODECA.ACM reports a non-integer factor for the output buffer, a little under 12X. Therefore L3CODECA.ACM always has enough space to consume the full source buffer and therefore does not cut the song in the middle.
The ideal solution would be to dynamically compute the expansion factor based on the input and output bitrates. The attached patch raises the expansion factor instead to 12X to ensure that mpglib can always decode all the input data without running out of output space.
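As a rough sketch of what computing the factor dynamically might look like (this is not part of the attached patch): the helper below scales the source length by the ratio of the output and input byte rates and rounds up to whole decoded frames. The function name and parameters are hypothetical; in the codec the byte rates would presumably come from the nAvgBytesPerSec fields of the source and destination formats.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not the actual winemp3 code.
 *   src_len            encoded bytes handed to the codec
 *   mp3_bytes_per_sec  average byte rate of the MP3 stream
 *   pcm_bytes_per_sec  byte rate of the decoded PCM stream
 *   pcm_frame_bytes    decoded size of one MP3 frame */
static uint32_t pcm_size_for_mp3(uint32_t src_len,
                                 uint32_t mp3_bytes_per_sec,
                                 uint32_t pcm_bytes_per_sec,
                                 uint32_t pcm_frame_bytes)
{
    uint64_t scaled, frames;

    assert(mp3_bytes_per_sec > 0 && pcm_frame_bytes > 0);

    /* Scale by the byte-rate ratio, rounding up. */
    scaled = ((uint64_t)src_len * pcm_bytes_per_sec
              + mp3_bytes_per_sec - 1) / mp3_bytes_per_sec;

    /* The decoder emits whole frames, so round up to a frame boundary. */
    frames = (scaled + pcm_frame_bytes - 1) / pcm_frame_bytes;

    return (uint32_t)(frames * pcm_frame_bytes);
}
```

For an assumed 128 kbit/s stream (16000 bytes/s) decoded to 44.1 kHz 16-bit stereo (176400 bytes/s, 4608 bytes per frame), a 2333-byte source gives 27648 bytes. A real implementation might want some extra margin for data already queued in mpglib.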
Changelog:
* Increase size factor from 4 to 12 in MPEG3_StreamSize, otherwise mpglib buffer queue grows
* Add TRACE of mpglib buffer queue for conversion.
Alex Villacís Lasso
Alex Villacís Lasso wrote:
- Caller (winmm) computes the required output buffer size for the encoded input buffer. This uses acmStreamSize, which in turn invokes MPEG3_StreamSize.
- MPEG3_StreamSize returns a factor of 4X for the size of the output buffer (in my tests, a 2333-byte encoded buffer expands into (at most) 9332 bytes of output, or so reports winemp3).
- winmm happily calls acmStreamConvert() with a 2333-byte source buffer and a 9332-byte destination buffer.
- mpglib decodes as much as it can into the 9332-byte buffer. However, in almost all cases, 9332 bytes is not enough to hold the decoded output of 2333 bytes of MP3 data. This is important to note, because it means a non-trivial amount of the MP3 data remains in the mpglib queue.
- acmStreamConvert() is called over and over, with a 2333-byte source buffer and a 9332-byte destination buffer. The output buffer is always too small, so the undecoded data in the mpglib queue accumulates.
- winmm sends the last encoded buffer. Supposedly this means that decoding is complete, but it isn't, because mpglib has a backlog of undecoded data which has not been played. However, winmm stops there, and the user notices the cut in the middle of the song.
This explains a lot about why the sound quality is bad when playing MP3.
In the same situation, L3CODECA.ACM reports a non-integer factor for the output buffer, a little under 12X. Therefore L3CODECA.ACM always has enough space to consume the full source buffer and therefore does not cut the song in the middle.
The ideal solution would be to dynamically compute the expansion factor based on the input and output bitrates. The attached patch raises the expansion factor instead to 12X to ensure that mpglib can always decode all the input data without running out of output space.
As you said, the factor should be computed from the input and output formats, and should not be a fixed factor. We need to change the factor, but replacing it with another fixed value is not the right way to go.
- buffered_before = get_num_buffered_bytes(&amd->mp);
Only compute it when you need it (TRACE_ON...)
A+