http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #24 from Alexey Loukianov mooroon2@mail.ru 2011-11-03 17:00:44 CDT --- (In reply to comment #23)
What I'm going to test tonight is to try to report padding based on the amount of data that has been uploaded into ALSA rather than on the amount of free buffer space reported by ALSA. While obviously wrong (it would tell the app that some samples have already been played when they might not have been), it might be yet another working workaround for this bug.
As expected, using something like this:
    /* FIXME by LeXa2: Lie to the app about current padding */
    *out = This->held_frames;
in GetCurrentPad() works around this bug to some extent. No more underruns, but it introduces a huge latency because the entire ALSA buffer is used (by default around 0.3s, or 16368 frames, on my rig) instead of just its "duration"-frames subchunk.
Using a more artificial (and even more incorrect) construct like this:
*out = (UINT32)max(((1.0 * This->alsa_bufsize_frames - avail_frames) / This->mmdev_period_frames - 1), 0) * This->mmdev_period_frames + This->held_frames;
which effectively enlarges the used buffer duration to ~4x the period size (the 2x enlargement introduced by the max(Y-1, 0)/X*X construct above plus the original 2x duration), fixes the problem while keeping the latency at an adequate level (40ms is barely noticeable for general gaming use).
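To make the arithmetic of that construct easier to follow, here is a minimal standalone sketch of the same expression. The function name reported_padding is hypothetical, the parameters simply mirror the fields used in the snippet above, and the 441-frame period in the usage note assumes a 10ms period at 44.1kHz, which this comment does not state explicitly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical standalone version of the padding expression quoted above.
 * It counts how many whole mmdev periods are queued in ALSA, drops one of
 * them (clamping at zero), converts back to frames and adds the frames
 * still held in the driver's local buffer. */
static uint32_t reported_padding(uint32_t alsa_bufsize_frames,
                                 uint32_t avail_frames,
                                 uint32_t mmdev_period_frames,
                                 uint32_t held_frames)
{
    double periods;

    assert(mmdev_period_frames > 0);

    /* Frames queued in ALSA, expressed in periods, minus one period. */
    periods = (1.0 * alsa_bufsize_frames - avail_frames) / mmdev_period_frames - 1;
    if (periods < 0)
        periods = 0;

    /* The cast truncates to whole periods, as (UINT32)max(...) does above. */
    return (uint32_t)periods * mmdev_period_frames + held_frames;
}
```

With these assumptions, 1764 frames (four periods) queued in ALSA and nothing held locally is reported as 1323 frames, so the app is told there is less queued than there really is and keeps roughly one to two extra periods in flight.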
Making the duration ~3x the period size with the same method isn't enough to get rid of xruns. It seems that the real limiting factor in that case is that fresh audio data submitted by the app is held in the local buffer until the next timer event and only then pumped out to ALSA. Couple that with the general granularity jitter, which causes the timer callback to be invoked at pretty unstable intervals. With a 4x period size duration, the observed buffer drain between subsequent calls to GCP ranges from 528 down to 3 audio frames, most frequently between 466 and 490 (take a look at the PDF I will attach shortly after posting this comment).
Basically it means that we get ~10.84ms between timer events on average instead of the requested 10.0ms. What is worse, sometimes we get less than 10ms between subsequent timer events (because the previous timer callback was delayed by, say, 1.1ms while the next one happens to be delayed by only 0.2ms, so the effective time between calls is 20.2-11.1=9.1ms), and so random jitter can combine in such a way that XA2 doesn't pump the required amount of data in time unless the duration is 4x the period size (or more).
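The interval arithmetic in the parentheses can be written down as a tiny helper. This is only an illustration; interval_ms is a hypothetical name, and it assumes each timer event is nominally scheduled one period after the previous one and then fires some amount late.

```c
#include <assert.h>

/* Effective time between two consecutive timer callbacks, given the nominal
 * period and how late each of the two callbacks actually fired (all in ms). */
static double interval_ms(double period_ms, double late_prev_ms, double late_next_ms)
{
    assert(period_ms > 0);
    return period_ms + late_next_ms - late_prev_ms;
}
```

For the example above, interval_ms(10.0, 1.1, 0.2) gives roughly 9.1ms, i.e. less than the requested period even though both callbacks were late.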
Consider the 30ms duration/10ms period case. Let's assume that at the moment we start the simulation the ALSA buffer holds 30ms of data. Here is the sequence that leads to an underrun (for simplicity I use ms units instead of mixing frame counts and ms):
1. Timer event #0 is late by 1.1ms. At the moment the callback is called 1.1ms of data has actually been played, 28.9ms is left. The ALSA period equals 1/10th of the duration, i.e. 3ms, so the reported padding would still be 30ms. XA2 sees that the buffer seems to be full and does nothing.
2. Timer event #1 is late by 0.5ms. At the moment the callback is called 10.5ms of data has actually been played back, 19.5ms is left. Due to the ALSA period the padding value reported by ALSA is ~21ms. XA2 sees that the buffer doesn't have enough space to fit another 10ms of data and does nothing.
3. Timer event #2 is late by 0.1ms. At the moment the callback is called 20.1ms of data has actually been played back, 9.9ms is left. The reported padding value is ~12ms. XA2 pumps 10ms of data into the winealsa.drv local buffer. Data from this buffer will only be delivered to the hardware at the next timer callback (!!!).
4. Timer event #3 is late by 0.7ms. At the moment the callback is called ALSA should have played back 30.7ms of data, thus it hit an underrun 0.7ms ago.
5. The process repeats with similar mechanics in slight variations.
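The sequence above can be replayed with a small simulation. This is a simplified model and not winealsa.drv code: all names are hypothetical, times are in units of 0.1ms to keep the arithmetic exact, the padding seen by the app is assumed to be the real amount rounded up to a whole ALSA period, and the app is assumed to submit exactly one period whenever it fits into the duration. The push_immediately flag is an extra knob for experimenting with handing data to ALSA right away instead of at the next timer event.

```c
#include <assert.h>
#include <stddef.h>

/* All times are in units of 0.1 ms. */
struct sim_result {
    int underrun_event;  /* index of the timer event that found an underrun, or -1 */
    int underrun_age;    /* how long before that event the buffer ran dry */
};

static int round_up(int value, int step)
{
    return (value + step - 1) / step * step;
}

static struct sim_result simulate(int duration, int period, int alsa_period,
                                  const int *late, size_t n_events,
                                  int push_immediately)
{
    struct sim_result res = { -1, 0 };
    int queued = duration;  /* data inside the ALSA buffer */
    int held = 0;           /* data waiting in the driver's local buffer */
    int now = 0;
    size_t i;

    assert(period > 0 && alsa_period > 0);

    for (i = 0; i < n_events; i++) {
        int fire = (int)i * period + late[i];
        int reported;

        queued -= fire - now;   /* the hardware kept playing since the last event */
        now = fire;
        if (queued < 0) {
            res.underrun_event = (int)i;
            res.underrun_age = -queued;
            return res;
        }

        /* Timer callback: flush the local buffer to ALSA first. */
        queued += held;
        held = 0;

        /* The app then sees padding that lags by up to one ALSA period. */
        reported = round_up(queued, alsa_period);
        if (reported + period <= duration) {
            if (push_immediately)
                queued += period;  /* data reaches ALSA right away */
            else
                held = period;     /* data waits for the next timer event */
        }
    }
    return res;
}
```

Called as simulate(300, 100, 30, late, 4, 0) with late = {11, 5, 1, 7}, this model reports an underrun found at event #3 that happened 0.7ms earlier, matching step 4. With push_immediately set to 1, the same four events pass without an underrun in this model.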
Bottom line:
A) To fix the "XA2 + Duration = 3xPeriod, Period = 10ms" case we need to pass audio data from the local buffer to ALSA as soon as we receive it. Waiting for the next timer event to pump out the data leads to a 100% underrun. Preliminary tests show that it "fixes" the bug, but I want to test it more thoroughly before posting the experimental patch and final testing results.
B) To fix the "XA2 + Duration = 2xPeriod, Period = 10ms" case we would also need to compensate, by at least the duration of the alsa_period, for the reported ALSA "avail" value lagging behind the real playback pointer. Chances are that this wouldn't be enough, as we would still hit underruns from time to time due to timer scheduling jitter and the consequent drift in sync between ALSA buffer draining and mmdevdrv. The fix from (A) would also be required.
Preliminary testing with a hack-n-dirty patch for case (B) showed that this guess seems to be correct: I still get underruns when the padding reported to the app has alsa_period_frames subtracted from it and is then clamped to zero, and alsa_push_buffer_data((void*)This, FALSE); is called at the end of ReleaseBuffer, but their number drops to about one underrun per several seconds. Additionally doubling the timer rate makes underruns almost unnoticeable - grepping the logs showed 68 XRuns after 5 minutes of active gameplay.
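The "subtract alsa_period_frames and clamp to zero" part of that hack might look roughly like the sketch below. It is an illustration only and not the actual patch; compensated_padding is a hypothetical name. The comparison is done before the subtraction because the values are unsigned and a plain subtraction would wrap around instead of going negative.

```c
#include <assert.h>
#include <stdint.h>

/* Padding to report to the app: the padding derived from ALSA minus one
 * ALSA period, never going below zero. */
static uint32_t compensated_padding(uint32_t pad_frames, uint32_t alsa_period_frames)
{
    assert(alsa_period_frames > 0);

    if (pad_frames <= alsa_period_frames)
        return 0;
    return pad_frames - alsa_period_frames;
}
```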