http://bugs.winehq.org/show_bug.cgi?id=28723
Bug #: 28723 Summary: Sound stutter in Rage when emulated windows version is set to "Windows 7" (XAudio2 -> mmdevapi sound out path) Product: Wine Version: unspecified Platform: x86 OS/Version: Linux Status: UNCONFIRMED Severity: normal Priority: P2 Component: winealsa.drv AssignedTo: wine-bugs@winehq.org ReportedBy: mooroon2@mail.ru CC: aeikum@codeweavers.com Classification: Unclassified Regression SHA1: b0652dd8bdb144645be4a6bf77cbb68a8ade49d9
This bug is twin-brother of fixed bug #28679. That one was when the app were using dsound API, this one seems to be when app uses mmdevapi directly.
Same "first bad commit", reverting it fixes sound stutter in RAGE.
I don't know any other Windows app I've got installed on my workstation which uses mmdevapi either directly or through XAudio2 (except for Starcraft II which seems to be unaffected by either this bug and bug #28679).
Andrew, do you need usual sound-related debug logs for tracking this down? I can capture them from RAGE but I'm afraid they would be a bit polluted by unrelated messages from Steam client.
http://bugs.winehq.org/show_bug.cgi?id=28723
Alexey Loukianov mooroon2@mail.ru changed:
What |Removed |Added ---------------------------------------------------------------------------- Summary|Sound stutter in Rage when |Sound stutter in Rage when |emulated windows version is |emulated windows version is |set to "Windows 7" (XAudio2 |set to "Windows 7" (XAudio2 |-> mmdevapi sound out path) |-> mmdevapi sound output | |path)
http://bugs.winehq.org/show_bug.cgi?id=28723
Dmitry Timoshkov dmitry@baikal.ru changed:
What |Removed |Added ---------------------------------------------------------------------------- Version|unspecified |1.3.30 Severity|normal |minor
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #1 from Austin English austinenglish@gmail.com 2011-10-14 17:48:31 CDT --- (In reply to comment #0)
Andrew, do you need usual sound-related debug logs for tracking this down? I can capture them from RAGE but I'm afraid they would be a bit polluted by unrelated messages from Steam client.
Yes. I doubt Steam will have that much regarding sound..
http://bugs.winehq.org/show_bug.cgi?id=28723
elhana@sigil-guild.org changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |elhana@sigil-guild.org
--- Comment #2 from elhana@sigil-guild.org 2011-10-15 14:08:18 CDT --- For me sound just disappears in win7 mode after some time, which seems to be affected by pulse daemon.conf value - I've increased default-fragment-size-msec = 500 and could play for a few hours sometimes before sound stops again. Loading save helps for a bit, but eventually sound just won't work. At that point even winecfg sound test might get distorted sound. I have to exit game, pulseaudio -k and start game again - fixes it for a few hours more. Only affects wine, music player is fine at the same time.
It seems to me that raising default-fragment-size-msec makes problem less frequent with higher numbers.
With winxp mode sound just lags right away.
http://bugs.winehq.org/show_bug.cgi?id=28723
Luke Bratch l_bratch@yahoo.co.uk changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |l_bratch@yahoo.co.uk
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #3 from Alexey Loukianov mooroon2@mail.ru 2011-10-18 16:34:18 CDT --- Created attachment 36992 --> http://bugs.winehq.org/attachment.cgi?id=36992 WINEDEBUG="-all,+tid,+loaddll,err+all,warn+debugstr,+mmdevapi,+winmm,+midi,+dsound,+dmusic,+oss,+alsa,+coreaudio" log @wine-1.3.30-10-gb0652dd
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #4 from Alexey Loukianov mooroon2@mail.ru 2011-10-18 16:38:06 CDT --- Created attachment 36993 --> http://bugs.winehq.org/attachment.cgi?id=36993 WINEDEBUG="-all,+tid,+loaddll,err+all,warn+debugstr,+mmdevapi,+winmm,+midi,+dsound,+dmusic,+oss,+alsa,+coreaudio" log @wine-1.3.30-10-gb0652dd with reverted b0652dd
Here are requested debugs logs. It's pretty obvious that logs from Wine compiled with b0652dd reverted don't contain following messages
007c:trace:alsa:alsa_write_data XRun state, recovering
Hadn't inspected the logs in great detail though so there might be another extra error messages introduced after commit b0652dd.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #5 from Alexey Loukianov mooroon2@mail.ru 2011-10-18 16:46:49 CDT --- (In reply to comment #2)
... disappears in win7 mode after some time, which seems to be affected by pulse daemon.conf value ...
Had you tried to run the game without PA active on your system? I know it's a bit painful to get rid of this cr*ppy piece of software nowdays if you use some distro like Fedora or Ubuntu, but it's still possible and it fixes tonns of problems users experience with sound under linux.
I'm afraid the bugs you had run into are not related to this one: it is about the regression introduced after Wine 1.3.30. There were no 1.3.x release since the commit in question had landed into Wine git repo. Thus I would recommend you to test if removing PA helps and to file another bugreport in either cases: Wine should be working fine on modern linux installation no matter one uses PA+ALSA, pure ALSA, old-good OSS v3 or even brand-new OSS v4.
P.S. Forgot to mention another thing: try to compile alsa-plugins version from their current git HEAD. There were plenty of bug fixes inside "pulse" ALSA plugin that hadn't landed into major linux distros due to there were no releases by ALSA team for almost a year.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #6 from Alexey Loukianov mooroon2@mail.ru 2011-10-18 16:50:06 CDT --- Forgot to mention that I had tried applying patch from http://www.winehq.org/pipermail/wine-patches/2011-September/107108.html just "in case" and it didn't help.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #7 from Andrew Eikum aeikum@codeweavers.com 2011-10-19 08:55:29 CDT --- (In reply to comment #5)
Wine should be working fine on modern linux installation no matter one uses PA+ALSA, pure ALSA, old-good OSS v3 or even brand-new OSS v4.
Nitpick: Not quite, OSSv3 is not supported any longer. You must be using OSSv4 or later to build and use wineoss.drv.
Thanks for attaching logs. I'll take a peek soon. For future reference, it is not difficult to separate the Steam stuff from the application stuff in the logs, so you don't have to worry about that too much :)
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #8 from Alexey Loukianov mooroon2@mail.ru 2011-10-19 09:09:51 CDT --- (In reply to comment #7)
Thanks for attaching logs. I'll take a peek soon. For future reference, it is not difficult to separate the Steam stuff from the application stuff in the logs, so you don't have to worry about that too much :)
Yeah, it's pretty easy using tids especially if the logs you have contain crash dump with backtrace. I've been filtering logs from the Steam stuff previously but at one moment had been told by someone from Wine's devteam (it might have been Austin but I can't bet on it) that it's better to post unfiltered logs and let devs to filter themselves.
http://bugs.winehq.org/show_bug.cgi?id=28723
Jörg Höhle hoehle@users.sourceforge.net changed:
What |Removed |Added ---------------------------------------------------------------------------- Keywords| |regression
http://bugs.winehq.org/show_bug.cgi?id=28723
Alexey Loukianov mooroon2@mail.ru changed:
What |Removed |Added ---------------------------------------------------------------------------- Version|1.3.30 |1.3.31
--- Comment #9 from Alexey Loukianov mooroon2@mail.ru 2011-11-02 06:51:07 CDT --- Bug present as of 1.3.31 release. Would test with current git in a couple of hours and report back.
P.S. Fixing reported bug version to actually point to the first Wine release affected by this bug (it was introduced during 1.3.30->1.3.31 devcycle, thus 1.3.30 was unaffected by this bug while 1.3.31 is affected).
http://bugs.winehq.org/show_bug.cgi?id=28723
Dmitry Timoshkov dmitry@baikal.ru changed:
What |Removed |Added ---------------------------------------------------------------------------- Version|1.3.31 |1.3.30
--- Comment #10 from Dmitry Timoshkov dmitry@baikal.ru 2011-11-02 07:10:09 CDT --- Please don't change an originally reported Wine version.
http://bugs.winehq.org/show_bug.cgi?id=28723
Alexey Loukianov mooroon2@mail.ru changed:
What |Removed |Added ---------------------------------------------------------------------------- Version|1.3.30 |1.3.31
--- Comment #11 from Alexey Loukianov mooroon2@mail.ru 2011-11-02 07:18:32 CDT --- (In reply to comment #10)
Please don't change an originally reported Wine version.
I was anticipating this comment :-). Read it once again: originally reported Wine version WAS INCORRECT due to bug was introduced and reported against Wine built from post 1.3.30 git. First release to "feature" this bug is Wine 1.3.31. IMO Wine Bugzilla should be extended to support such "bug report prior release" cases, for example by adding version tags like "wine-1.3.30-git" or "wine-git", with ability to later on change version tag to the first release version that suffers from reported bug.
http://bugs.winehq.org/show_bug.cgi?id=28723
Dmitry Timoshkov dmitry@baikal.ru changed:
What |Removed |Added ---------------------------------------------------------------------------- Version|1.3.31 |1.3.30
--- Comment #12 from Dmitry Timoshkov dmitry@baikal.ru 2011-11-02 07:33:52 CDT --- See bug 12728.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #13 from Alexey Loukianov mooroon2@mail.ru 2011-11-02 07:51:32 CDT --- (In reply to comment #12)
See bug 12728.
Dmitry, thanks for providing the pointer. I have no objections against having reported Wine version mean "a bug that had been introduced somewhere between version X and X+1". It would be nice to provide this info in some kind of "bug reporters" FAQ, which might reduce the amount of cases per day when you have to correct reported version back. Or, maybe, there's a FAQ with the above info I had missed somehow?
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #14 from Alexey Loukianov mooroon2@mail.ru 2011-11-02 08:15:04 CDT --- Tested with wine-1.3.31-261-g3f74c58, bug is still there.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #15 from Dmitry Timoshkov dmitry@baikal.ru 2011-11-02 08:23:14 CDT --- (In reply to comment #13)
It would be nice to provide this info in some kind of "bug reporters" FAQ, which might reduce the amount of cases per day when you have to correct reported version back.
It's not that bad.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #16 from Andrew Eikum aeikum@codeweavers.com 2011-11-02 08:47:38 CDT --- Created attachment 37257 --> http://bugs.winehq.org/attachment.cgi?id=37257 winealsa.drv: Don't set ALSA period time
Thanks for the nice logs and updates, Alexey. Sorry it's taken me so long to get to this bug.
I'll try to describe what I think is happening based on the log in comment 3. RAGE's audio thread runs in a loop polling IAudioClient::GetCurrentPadding(). When the current padding value is <= 441, it writes another packet (441 frames) of data. But, I think ALSA's period size is being set to something about the same size, so when GCP() is called, its value changes in chunks of approximately 441 frames. So you see patterns like this (trimmed for clarity):
007d:trace:alsa:AudioClient_GetCurrentPadding PCM state: 3, avail: 15927, pad: 441 007d:trace:alsa:AudioRenderClient_GetBuffer (0x48a9e70)->(441, 0x352de924) 007d:trace:alsa:AudioClient_GetCurrentPadding PCM state: 3, avail: 15927, pad: 441 007d:trace:alsa:AudioRenderClient_ReleaseBuffer (0x48a9e70)->(441, 0) 007d:trace:alsa:AudioClient_GetCurrentPadding PCM state: 3, avail: 15927, pad: 882 007d:trace:alsa:AudioClient_GetCurrentPadding PCM state: 3, avail: 15927, pad: 882 007d:trace:alsa:AudioClient_GetCurrentPadding PCM state: 3, avail: 15905, pad: 463 007d:trace:alsa:AudioClient_GetCurrentPadding PCM state: 3, avail: 16281, pad: 87 007d:trace:alsa:AudioRenderClient_GetBuffer (0x48a9e70)->(441, 0x352de924) 007d:trace:alsa:AudioClient_GetCurrentPadding PCM state: 4, avail: -32, pad: 0 007d:trace:alsa:AudioRenderClient_ReleaseBuffer (0x48a9e70)->(441, 0)
Notice the large skips in GCP() return values. When RAGE finally sees the <=441 value, ALSA will have run out of data by the time the GetBuffer() call completes.
I'm attaching a patch here which I hope fixes this. It just doesn't set the ALSA period time at all, letting ALSA figure out what it wants to use. I'm not sure about this in combination with fixes for Bug 28282, but it would be useful to know if this patch works for you.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #17 from Alexey Loukianov mooroon2@mail.ru 2011-11-02 13:40:13 CDT --- (In reply to comment #16)
Thanks for the nice logs and updates, Alexey. Sorry it's taken me so long to get to this bug.
You're welcome. IMO you're doing very well working on dsound/mmdevapi Wine subsystems and you are the person we all should thank for your hard work. No need to excuse for not being able to fight all the bugs at once due to time constrains :-).
I'll try to describe what I think is happening based on the log in comment 3. RAGE's audio thread runs in a loop polling IAudioClient::GetCurrentPadding(). When the current padding value is <= 441, it writes another packet (441 frames) of data. But, I think ALSA's period size is being set to something about the same size, so when GCP() is called, its value changes in chunks of approximately 441 frames. So you see patterns like this (trimmed for clarity): ... Notice the large skips in GCP() return values. When RAGE finally sees the <=441
Hmm, correct me if I'm wrong but ALSA reports periods per frame being equal to 88 (trace:alsa:AudioClient_Initialize alsa_period_frames: 88). It seems to be noticeably lower than 441 frames per chunk RAGE seems to be feeding mmdevapi with.
I'm attaching a patch here which I hope fixes this. It just doesn't set the ALSA period time at all, letting ALSA figure out what it wants to use. I'm not sure about this in combination with fixes for Bug 28282, but it would be useful to know if this patch works for you.
Tested with this patch applied, with patch from bug #28282, and with both patches applied at the same time. Also tried to experiment a bit with divider used to determine requested alsa_period_us (it's 100 by default, I tried to change it to 10, 200, 500 and 1000 and see if something changes). All the attempts failed to produce underrun-free result.
My next move was to add extra debug output to the wine-1.3.30-10-gb0652dd with reverted b0652dd so it would be as detailed as it the output from wine-1.3.30-10-gb0652dd. Examining resulting logs from stutter-free run I had found the following: ... 0053:trace:alsa:AudioClient_Initialize Setting ALSA period size to 10000 us. 0053:trace:alsa:AudioClient_Initialize This->bufsize_alsa: 16317 0053:trace:alsa:AudioClient_Initialize This->period_alsa: 441 0053:trace:alsa:AudioClient_Initialize This->period_is: 10000 ... strip ... 0053:trace:alsa:AudioRenderClient_GetBuffer (0x491a470)->(2646, 0x2acd880) 0053:trace:alsa:AudioClient_GetCurrentPadding (0x491a470)->(0x2acd84c) 0053:trace:alsa:AudioClient_GetCurrentPadding PCM state: 2, avail: 16317, pad: 0 0053:trace:alsa:AudioRenderClient_ReleaseBuffer (0x491a470)->(2646, 2) 0053:trace:alsa:AudioClient_Start (0x491a470) ... strip ... 007c:trace:alsa:AudioClient_GetCurrentPadding (0x491a470)->(0x352de97c) 007c:trace:alsa:AudioClient_GetCurrentPadding PCM state: 3, avail: 14998, pad: 1319 007c:trace:alsa:AudioRenderClient_GetBuffer (0x491a470)->(441, 0x352de924) 007c:trace:alsa:AudioClient_GetCurrentPadding (0x491a470)->(0x352de8f0) 007c:trace:alsa:AudioClient_GetCurrentPadding PCM state: 3, avail: 14998, pad: 1319 007c:trace:alsa:AudioRenderClient_ReleaseBuffer (0x491a470)->(441, 0) ...
Notice that the first call of GetBuffer that is being done prior calling Start is done using relatively big buffer size, namely 2646 frames (it is 6*441). Then the game proceeds with doing some other init tasks it needs and when it returns back and calls GetCurrentPadding some part of original 2464 buffer had already been played back (logs show 1319 frames left). Then the game starts pumping out data to the buffer in 441 bytes chunks until GetCurrentPadding returns something around 3200-3300 and tries its best to keep the buffer filled up at approx this level. I hadn't a part of the logs showing buffer fill process at it's pretty strait-forward: check the padding, add another 441 frames it's too low.
If you would compare this behavior with the one we get after the b0652d commit, it looks like that the game is suddenly pretty well content with the padding level of around 882 frames - it only tries to pump-in another 441 frames of data in case reported padding is around ~450 frames. Question is what had been changed by the patch that had caused such a dramatic change in the game behavior. Now I'm trying to dig a bit deeper in patch diffset and try to figure out the cause.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #18 from Austin English austinenglish@gmail.com 2011-11-02 13:42:40 CDT --- (In reply to comment #13)
(In reply to comment #12)
See bug 12728.
Dmitry, thanks for providing the pointer. I have no objections against having reported Wine version mean "a bug that had been introduced somewhere between version X and X+1". It would be nice to provide this info in some kind of "bug reporters" FAQ, which might reduce the amount of cases per day when you have to correct reported version back. Or, maybe, there's a FAQ with the above info I had missed somehow?
"Set Product to Wine and specify Wine version you are using, if in doubt run 'wine --version'."
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #19 from Alexey Loukianov mooroon2@mail.ru 2011-11-02 20:59:24 CDT --- [offtopic]Austin, IMHO adding a remark about "the correct" version to set when reporting bugs for Wine that had been built from git won't hurt.[/offtopic]
OK, I had tracked this bug down. Looks like this bug is caused by the XAudio2 usage scheme of the mmdevapi devices under certain circumstances. The "direct" bug-trigger is the "DefaultPeriod" value that is reported to XA2 by winealsa.drv mmdevdrv, but the actual bug cause it a bit more complicated. Lets get a bit deeper into details.
Looks like the code execution flow is as following: 1. XA2 calls GetDevicePeriod() to get default and minimum period size for selected Core Audio device. 2. Code from winealsa.drv/mmdevdrv.c handles the above call and returns hard-coded values, namely 100000 and 50000 for wine-1.3.30-10-gb0652dd+. 3. XA2 calls Initialize() with "duration" set based on the value of "DefaultPeriod" it had just retrieved and "period" set to 0 (request to use the "default device period"). 4. XA2 calls SetEventHandle to set an event it would use as "interrupt" determining when to check for buffer padding value. 5. XA2 calls GetBuffer to fill in the buffer with the initial portion of audio frames. 6. XA2 waits for event to be signaled and calls GetCurrentPadding as soon as event fires. In case retrieved padding indicates that there's enough free buffer space to hold next 10ms of audioframes (441 for 44.1kHz samplerate) - it pumps in 10ms audio data chunk.
First step on the road towards bug lies in p.p. 2 and 3 from the list above. As long as DefaultPeriod exposed by AudioClient COM object instance is equal or larger than 105000hns, XA2 requests shared buffer duration which is ~4x of the DefaultPeriod duration. First call to GetBuffer (p.5 in the above list) fills in 3/4 of the requested buffer duration.
When the advertised default period is less than 105000hns (104999hns and lower) XA2 switches into using duration buffer that is ~2x DefaultPeriod in length and it fills in only half of the requested buffer on the first call to GetBuffer.
I would attach PDF document illustrating experimentally determined dependency between the value of "DefaultPeriod" and the "duration" size that is requested by XA2 shortly after posting this comment.
What happens next is winealsa.drv schedule a timer that fires once per requested period time. Timer handler does nothing more than pumps as much data as it can to ALSA and then fires the app-provided event (if available). It has one important consequence: if an app (A) calls GetCurrentPadding/GetBuffer only as a reaction to the fired event, (B) wants to pumps the data into Core Audio device in 10ms chunks and (C) had requested pretty small buffer duration - underruns are unavoidable. That's exactly the case with XA2 + winealsa mmdevdrv with DefaultPeriod < 10.5ms. What we've got are roughly two event fires per requested buffer duration and amount of frames that are played back by ALSA between consequent event fires is about 10ms (actually it is roughly the amount of frames that fit into DefaultPeriod duration).
What we got as a result is the following sequence: 1. Let's assume that DefaultPeriod = 10ms. Buffer duration requested by XA2 would be 20ms, and XA2 would pump-in 10ms of data prior to starting up the sound output. 2. winealsa.drv pumps available 10ms of data to ALSA and schedules timer to fire once in every 10ms (+ scheduling delays). 3. 10ms+ passes by and timer callback is invoked. It founds (what a surprise!) that ALSA had hit underrun and recovers it. Then it checks if any data is available to pump into ALSA and founds that there's none. OK then, time to fire up event. 4. XA2 handles fired event and determines that current padding is 0. Instead of filling in the entire buffer (i.e. 20ms of data) it pumps in 10ms of audio data. 5. Rinse, wash, repeat. Scheduling delays and the fact that underrun recovery and data transfers require some random time brings in enough variability to the process. Logs show that sometimes stars and moon disposition allow XA2 to fill in the whole buffer, but the granularity of the fired events leaves no chance for this condition to remain stable.
Now to the hard part: actually I have no idea about how to "properly" fix this bug. What we want is to have finer granularity of the event signaling. One of the ways to achieve it is to use snd_async_add_pcm_handler(), but I don't know what are the possible consequences to receiving SIGIO callback in a Wine-driven process (if that's possible at all).
Another possible way to workaround the problem is to use higher rate for timer. I had tried to use 4x timer rate (used "This->mmdev_period_rt / 40000" it the CreateTimerQueueTimer() call) and it "fixed" the bug. Obvious downside of such approach is that we are relying on the timer scheduling granularity which might be not fine enough to handle extra-small buffer case, and also we hog CPU at least 4x times more when we use 4x timer fire rate.
A "quick" workaround would be to set DefaultPeriod to anything larger than 10.5ms. It would "automagically fix XA2+mmdevapi" making users happy but OTOH wouldn't really fix the bug.
Andrew, I hope you would come up with a brilliant idea about what to do to fix this pretty fun race condition in a "Right Way".
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #20 from Alexey Loukianov mooroon2@mail.ru 2011-11-02 21:00:53 CDT --- Created attachment 37265 --> http://bugs.winehq.org/attachment.cgi?id=37265 PDM illustrating the dependency between mmdevapi DefaultPeriod and buffer sizes requested by XAudio2.
http://bugs.winehq.org/show_bug.cgi?id=28723
Jörg Höhle hoehle@users.sourceforge.net changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |hoehle@users.sourceforge.ne | |t
--- Comment #21 from Jörg Höhle hoehle@users.sourceforge.net 2011-11-03 08:50:06 CDT --- Interesting issue. I've never seen native tests report anything else beside 10.0000 and 10.1587ms (alignment for Intel HDA) shared mode default periods. So reporting anything else is out of question.
However, what native is said to do and Wine doesn't is to clamp period and duration. From what I've read (but not yet tested, which is why I've not yet written that patch), - max duration is 2s in shared mode, 500ms in exclusive mode - min. duration should probably be 3*period (at least for Wine) - min period is 3ms (from GetDefaultPeriod) - max period is 100ms (I think I read that somewhere)
Note that Wine still has a related bug: it must ignore period in shared mode.
Alexey, how does that app react if you internally set duration to 30ms even when it asks for 20ms? From your description, it would initially fill half of it, i.e. 15ms, then receive an event and fill 10ms more, which would not be that bad. Does GetStreamLatency influence XA2?
Of course, what would really be nice is to signal events at the earliest possible time when timing is tight, i.e. ALSA -> mmdevapi -> app. The cascade of periodic timers obviously causes latency.
I believe native can get away with a buffer size slightly > 11ms. Presumably, when the periodic 10ms event fires, the mixer has mixed 10ms of data that was immediately fed to the HW and the app has ~9ms to provide the next chunk. Wine needs more latency because mixer and event and HW are not synchronised, e.g. winmm has a periodic feeder thread, mmdevapi adds another one etc.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #22 from Andrew Eikum aeikum@codeweavers.com 2011-11-03 10:18:45 CDT --- (In reply to comment #21)
However, what native is said to do and Wine doesn't is to clamp period and duration. From what I've read (but not yet tested, which is why I've not yet written that patch),
- min. duration should probably be 3*period (at least for Wine)
I tested this and found the minimum duration to be the length of two default periods on Windows 7.
That said, I'm still curious to see the results of this:
Alexey, how does that app react if you internally set duration to 30ms even when it asks for 20ms?
I believe native can get away with a buffer size slightly > 11ms. Presumably, when the periodic 10ms event fires, the mixer has mixed 10ms of data that was immediately fed to the HW and the app has ~9ms to provide the next chunk. Wine needs more latency because mixer and event and HW are not synchronised, e.g. winmm has a periodic feeder thread, mmdevapi adds another one etc.
I agree, this might be our best solution.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #23 from Alexey Loukianov mooroon2@mail.ru 2011-11-03 12:23:37 CDT --- (In reply to comment #21)
I've never seen native tests report anything else beside 10.0000 and 10.1587ms (alignment for Intel HDA) shared mode default periods. So reporting anything else is out of question.
IMHO it would be interesting to test for reported values with hardware other than Intel HDA. What I've got here and would be able to test is a laptop with installed Win7Starter. This laptop is equipped with Conexant HDA codec, it would be interesting to take a look at the GetDefaultPeriod values Core Audio would report on it. Actually it might be reasonable to ask users on Wine forums to do some tests on Seven and Vista with different types of audio hardware so we would get the entire picture and would be able to determine the rage which is sane to use for default and minimum periods and also for duration.
Alexey, how does that app react if you internally set duration to 30ms even when it asks for 20ms? From your description, it would initially fill half of it, i.e. 15ms, then receive an event and fill 10ms more, which would not be that bad.
Hadn't tested yet, going to do it tonight.
Does GetStreamLatency influence XA2?
It looks like XA2 doesn't calls it at all. At least I wouldn't able to grep its invocation in all the logs I've collected during experiments.
Of course, what would really be nice is to signal events at the earliest possible time when timing is tight, i.e. ALSA -> mmdevapi -> app. The cascade of periodic timers obviously causes latency.
Agreed. I've tried to quickly hack-in usage of async ALSA PCM callbacks into winealsa.drv but had failed to do it properly. Looks like I have to read more and increase my knowledge on the ALSA side to be able to done it right (or come to a conclusion that async ALSA IO can't be used as a solution for this case).
I believe native can get away with a buffer size slightly > 11ms. Presumably, when the periodic 10ms event fires, the mixer has mixed 10ms of data that was immediately fed to the HW and the app has ~9ms to provide the next chunk.
Emm, I'm a bit confused and can't understand what are you writing about here. Buffer size of 11ms - do you mean the "duration" that had been requested by app? Actually I can easily write an app that would be able to use 10ms (and maybe even less) buffers with current winealsa mmdevdrv implementation without hitting underruns. What is needed is to poll GetCurrentPadding frequently enough and feed in the data as soon as buffer would have some free space available. XA2 hits the bug because it only pumps out data in 10ms chunks and checks if there's enough free buffer space available are only done at event fire times. Pair it with small buffer size (2xDefaultPeriod) and events firing only roughly once per period and we've got perfect way to produce underruns.
Wine needs more latency because mixer and event and HW are not synchronised, e.g. winmm has a periodic feeder thread, mmdevapi adds another one etc.
winmm in unrelated to this bug but you're generally right - various mixing threads and buffers in different APIs downstream from app and to the OS sound driver add up a lot to latency. But increased output latency issue isn't a show-stopper for the most use cases (one would use something like wineasio->jack in cases where latency really matters), while underruns caused by race conditions introduced as a result of desync are real pain.
IMO, what winealsa.drv mmdevdrv is actually doing is emulating "DMA ring buffer with period boundary crossing interrupts" behavior in software, but reported "padding" value is being fetched from the OS driver (ALSA) and which is not closely in sync with emulated ring buffer "padding". As emulated period size and real ALSA period sizes are not the same (and it is not ensured/guaranteed that emulated HW period size is a multiple of ALSA hw period size) and timer events are not synchronized with real hardware events - padding size that is being reported by winealsa.drv mmdevdrv would always lag behind a bit from the real HW padding value (lag would be around ALSA hw period size at most). What I'm going to test tonight is to try to report padding based on the amount of data that had been uploaded into ALSA and not on the amount of free buffer space as reported by ALSA. While being obviously wrong (it would report to an app that some samples had been already played while actually they might be not) it might be yet another working workaround for this bug.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #24 from Alexey Loukianov mooroon2@mail.ru 2011-11-03 17:00:44 CDT --- (In reply to comment #23)
What I'm going to test tonight is to try to report padding based on the amount of data that had been uploaded into ALSA and not on the amount of free buffer space as reported by ALSA. While being obviously wrong (it would report to an app that some samples had been already played while actually they might be not) it might be yet another working workaround for this bug.
As expected, using something like this:
/* FIXME by LeXa2: Lie app about current padding */ *out = This->held_frames;
in GetCurrentPad() workarounds this bug to a some extent. No more underruns, but a huge latency introduced due to using entire size of the ALSA buffer (which is by default is around 0.3s or 16368 frames on my rig) instead of its "duration" frames subchunk.
Using more artificial (and even more incorrect) construct like this:
*out = (UINT32)max(((1.0 * This->alsa_bufsize_frames - avail_frames) / This->mmdev_period_frames - 1), 0) * This->mmdev_period_frames + This->held_frames;
which effectively enlarges used buffer duration to be ~4x period size (that is 2x enlarge introduced by max(Y-1, 0)/X*X construct above + original 2x duration size) fixes the problem keeping the latency at adequate level (40ms is barely noticeable for general gaming use).
Having duration to be ~3x period size using the same method isn't enough to get rid of xruns. It seems that the real limiting factor for such case is that the fresh audio data submitted by app is held in local buffer until next timer event and only then is pumped out to ALSA. Couple it with the general granularity jitter causing timer event callback to be invoked at pretty unstable intervals. With 4x period size duration observed amount of buffer drain between subsequent calls to GCP ranges from 528 to 3 audio frames, with most frequent amount between 466 and 490 (take a look into PDF I would attach shortly after posting this comment).
Basically it means that most of the time we get ~10.84ms between timer events on average instead of requested 10.0ms. What is worse that sometimes we've got less than 10ms between subsequent timer events (due to previous timer callback invocation was delayed for, say, 1.1ms, while next invocation happen to be delayed by only 0.2ms - effective time between calls would be 20.2-11.1=9.1ms), and so it's possible for random jitter to combine in a way that XA2 wouldn't pump required amount data in time unless duration is 4x period size (or more).
Consider having 30ms duration/10ms period case. Let's assume that at the moment we start simulation ALSA buffer holds 30ms of data. Here is the sequence that would lead to underrun (for simplicity I use ms units instead of mixing frames count and ms): 1. Timer event #0 is late by 1.1ms. At the moment callback is called 1.1ms of data had been actually played, 28.9ms left. Alsa period equals to 1/10th of duration, i.e. 3ms, so reported padding would still be 30ms. XA2 sees that the buffer seems to be full and does nothing. 2. Timer event #1 is late by 0.5ms. At the moment callback is called 10.5ms of data had been actually played back, 19.5ms left. Due to alsa period padding value reported by ALSA is ~21ms. XA2 sees that the buffer have not enough space to fit another 10ms of data, does nothing. 3. Timer event #2 is late by 0.1ms. At the moment callback is called 20.1ms of data had been actually played back, 9.9ms left. Reported padding value is ~12ms. XA2 pumps in 10ms of data to the winealsa.drv local buffer. Data from this buffer would only be delivered to hardware at the next timer callback (!!!). 4. Timer event #3 is late by 0.7ms. At the moment callback is called ALSA should have been played back 30.7ms of data thus it had hit underrun 0.7ms ago. 5. Process repeats with similar mechanics in slight variations.
To the bottom line: A) To fix "XA2 + Duration = 3xPeriod, Pediod = 10ms" case we need to pass audio data from local buffer to the ALSA as soon as we receive it. Waiting for next timer event to pump out data leads to 100% underrun. Preliminary tests show that it "fixes" the bug, but I want to test it more throughly before posting the experimental patch and final testing results.
B) To fix "XA2 + Duration = 2xPeriod, Pediod = 10ms" case we also would need to compensate for reported alsa "avail" value lagging behind real playback pointer at least the duration of the alsa_period. Chances are that it wouldn't be enough as we would still be hitting underruns from time to time due to timer scheduling jitter issues and consequent drift in sync between alsa buffer exemption and mmdevdrv. It would be also required to use the fix from (A).
Preliminary testing with hack-n-dirty patch for case (B) showed that this guess seems to be correct: I still get underruns in case I have padding reported to app being subtracted with alsa_period_frames and then clamped to zero, and call alsa_push_buffer_data((void*)This, FALSE); at the end of ReleaseBuffer, but their amount is reduced to be about one underrun per several seconds. Additionally doubling timer rate makes underrunds almost unnoticeable - grepping logs resulted in 68 XRuns after 5 minutes of active gameplay.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #25 from Alexey Loukianov mooroon2@mail.ru 2011-11-03 17:02:15 CDT --- Created attachment 37281 --> http://bugs.winehq.org/attachment.cgi?id=37281 PDF: Wine winealsa.drv mmdevapi buffer drain speed between consequent calls to GetCurrentPadding by XAudio2
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #26 from Alexey Loukianov mooroon2@mail.ru 2011-11-03 20:15:06 CDT --- OK, tested with following inside AudioClient_Initialize():
/*This->bufsize_frames = ceil((duration / 10000000.) * fmt->nSamplesPerSec); */ /* FIXME by LeXa2: Temporary hack to force using N*PeriodSize buffer size */ This->bufsize_frames = mmdev_period_frames * 3;
Despite XA2 calling GetBufferSize() behavior remains just as if XA2 ignores the GetBufferSize() output and sticks with originally requested buffer size. Using "mmdev_period_frames * 4" also has no effect of XA2 behavior.
Just to be sure I added TRACE() debug prints for "This->bufsize_frames" in Initialize() and for "*out" in GetBufferSize(). Both report expected buffer size being either 1323 or 1764 so it is XA2 who refuse to use full buffer size.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #27 from Jörg Höhle hoehle@users.sourceforge.net 2011-11-04 08:03:11 CDT --- BTW, does the app use exclusive or shared mode? GetCurrentPadding jumps by 10ms deltas in shared mode instead of decreasing linearly over time (the latter is what *most* exclusive drivers do). Your *out = This->held_frames; is not unlike that.
If you play with padding, I encourage you to run my GetPosition tests, see bug #27937 (you'll also find updated versions in testbot from time to time).
I've been running tests from time to time to investigate whether GetCurrentPadding in exclusive mode is conservative or not, i.e. whether it returns positions ahead or not of the "true" play position. That's not easy to evaluate when doing remote tests only.
Basically, Wine ought to be able to play a 10ms period with buffer size = 2x period. 2x period is exactly what exclusive callback mode wants, and native can do that.
- DSound needs to use callback event mode and be synchronised with mmdevapi. MS is right when they say that event mode is better when time is critical. Looking at dsound, I can see how it gave MS ideas for the design of mmdevapi and its mixer. Wine's mixer in dsound should act as if it were mmdevapi's mixer. So when the event triggers, there's at least 10ms free, the dsound mixer fills that and sends it off to mmdevapi.
- Ideally (future enhancement), winealsa would use poll() to be woken up when the buffer drains instead of using a periodic timer. SIGIO is not the way to go.
- Some ALSA period control when timing is tight (how to determine when, base it on duration <= 50ms?). Alsa may not start when given less samples than period_size, see bug #27087, comment #10.
Buffer size of 11ms - do you mean the "duration"
Yes
we need to pass audio data from local buffer to the ALSA as soon as we receive it. Waiting for next timer event to pump out data leads to 100% underrun.
This should be a last resort as it has its own problems. Consider how native's mixer (presumably) works without that trick, it operates at a fixed rate, that's all (but as I've explained before, it would conceivably work even with 11ms duration).
BTW, I've trouble understanding both .pdf, It's not entirely clear what the X and Y directions are.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #28 from Andrew Eikum aeikum@codeweavers.com 2011-11-04 08:18:13 CDT --- (In reply to comment #27)
BTW, does the app use exclusive or shared mode?
It uses shared mode. I have yet to see a real-world application use exclusive mode.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #29 from Alexey Loukianov mooroon2@mail.ru 2011-11-04 10:12:50 CDT --- (In reply to comment #27)
BTW, I've trouble understanding both .pdf, It's not entirely clear what the X and Y directions are.
Both PDF expose data collected using shell-fu magic from Wine logs running RAGE/XA2+mmdevapi on my home PC. Here I've got no PulseAudio, emu10k-based sound card capable of doing HW wavetable synth (not relevant for our discussion) and HW multiple playback streams mixing (not relevant also) and alsalib/plugins 1.0.24 compiled from Fedora 14 sources.
First PDF aggregates experimentally collected data about the duration XA2 requests from mmdevapi depending on the DefaultPeriod size it retrieves from it. I.e. if mmdevapu AudioClient_GetDivicePeriod returns X default period to XA2, then XA2 would specify Y duration in the subsequent call to AudioClient_Initialize(). Graph visualizes most "interesting" part of dependency, where we've got discontinuity in curve when default period changes from 104999 to 105000. Supplementary Y axis (right side of graph) visualize the amount of samples XA2 pumps into mmdevapi before calling AudioClient_Start() - i.e. it is the initial buffer fill measured in samples. I hadn't took care to bring main and supplementary Y axis in sync because wasn't interested in it - my goal was to illustrate discontinuity/jump at DefaultPeriod = 105000 for both requested duration and size of initial buffer fill-in. All the values used to plot the graph are available on the first page of PDF, so one wishing to use them to plot the graph he wants would be able to do it :-).
Second PDF is an attempt to visualize average time gap between subsequent event fires (and thus between subsequent timer callbacks). It is pretty rough as time gap is measured in "audio frames" units returned by ALSA and not something precise like calls to clock_gettime(). Still it's precise enough as the obtained values are statistical: I took logs representing one minute of gameplay, greped/awked/sorted/uniqed values returned by ALSA at subsequent calls to snd_pcm_avail_update(), then calculated the difference between each N+1 and N records and visualized them. Each bar on the graph represents the amount of times (and thus it is approx. of frequency) each given difference had occurred throughout the entire logs. So it is possible to roughly estimate the timer scheduling jitter as if observed by ALSA. Peak at "-472" means that between subsequent calls to GetCurrentPadding 472 of samples had been consumed by ALSA which is roughly 10.7ms. The fact that "bar peaks" area is shifted above -441 shows that we have systematic error in timer resolution shifted towards longer times between subsequent callbacks (we request 10ms, in reality got up to 11.1ms) as observed by ALSA. The cause for this might either be the scheduling delays and the timer desync/drifts between HW soundcard timer (used by ALSA) and CPU/chipset timers (used by timer callbacks).
As for the rest of discussion - I would do tonight a couple of test on my linux PC and on Win7 laptop and then going to comment here on them.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #30 from Alexey Loukianov mooroon2@mail.ru 2011-11-04 10:35:59 CDT --- A quick question from "just curious" series: on a native system Core Audio API in shared mode implements a mixer. My tests show that on native system it operates in 10ms chunks of data (as observed by GetCurrentPadding). I don't see any signs that current implementation of mmdevapi in Wine has it's own mixer (either witch emulates behavior of native OS and drains data from buffer in 10ms chunks or really does some mixing duties). Does it means that design decision when implementing mmdevapi for Wine were to use mixing capabilities from downstream subsystems (ALSA dmix, PA, hw mixing in sound card)?
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #31 from Andrew Eikum aeikum@codeweavers.com 2011-11-04 14:56:18 CDT --- Yes, that's correct. Implementing a mixer is "hard" and not really in scope for Wine if we can avoid it. Unfortunately, we can't avoid it in dsound, and since we've already got a mixer implementation in dsound, there might be some call for making it usable by the mmdevapi drivers, too. But that's a decision to make only if we can show that using the backends for mixing isn't sufficient.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #32 from Jörg Höhle hoehle@users.sourceforge.net 2011-11-04 16:57:49 CDT --- *out = This->held_frames; Come to think of it, this is presumably what native GetCurrentPadding does in shared mode and what Wine should do. Native's shared mode GCP means the buffer is that much free, and the rest was sent to the mixer -- it does not mean that the rest was already "consumed" by the HW.
There are 2 issues, though: 1. Native mixes 10ms and sends them to the HW, whereas Wine has all of the ALSA buffer in front of it. As you experienced in comment #24, this adds latency as large as ALSA's buffer.
2. GetPosition in Wine is based on GetCurrentPadding. I wrote in http://www.winehq.org/pipermail/wine-devel/2011-August/091562.html that this is bogus, snd_pcm_delay must be used instead but I've not polished the patch since. With a fixed GetPosition, the modified GCP would pass my render.c tests.
Point 1. can be improved by calling snd_pcm_hw_params_set_buffer_time_max when duration is small (perhaps < 50ms?), revealing that the app wants tight timing. The final word has not been said yet on snd_pcm_set_period/buffer_min/max/near. I've not yet found an interesting formula and a good heuristic to distinguish 2 use cases: "size doesn't matter, just play" and "low-latency shooter reaction".
Something else to prevent underruns may be to pretend running at 10ms like native, but use a period of 5ms. This would hopefully account for the fact that ALSA's buffer filling need is not synchronized with mmdevapi's timer. Then again, try and find an heuristic not to always run at 5ms (e.g. PlaySound doesn't need that).
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #33 from Jörg Höhle hoehle@users.sourceforge.net 2011-11-06 14:57:25 CST ---
Using more artificial (and even more incorrect) construct like this
Actually I believe it's quite close to correct, except for the -1.
Here's what I believe would happen to your scenario from comment #24 in native (with >=10.500ms period, duration x4 but let's compute with 10ms periods): 0. Before start, the app prefills 30ms according to your comment #19. At time ~0, 10ms of data is fed to the mixer, then the HW, so 20 from the initial 30ms of data remains -- for the next 10ms, native GCP will report 20ms padding.
- Timer event #0 is late by 1.1ms. At the moment callback is called 1.1ms of
data had been actually played, 28.9ms left. ... so reported padding would still be 30ms
Native GCP returns 20ms.
2. Timer event #1 is late by 0.5ms. At the moment callback is called 10.5ms of
data had been actually played back, 19.5ms left. Due to alsa period padding value reported by ALSA is ~21ms. XA2 sees that the buffer have not enough space to fit another 10ms of data, does nothing.
Native GCP report 10ms (441 frames), not 21. Hence according to your comment #16, XA2 writes 10ms of data.
- Timer event #2 is late by 0.1ms. At the moment callback is called 20.1ms of
data had been actually played back, 9.9ms left. Reported padding value is ~12ms. XA2 pumps in 10ms of data to the winealsa.drv local buffer.
Native GCP reports 10ms, again causing XA2 to add another 10.
From there on, XA2 will add 10ms at every event.
Whether the app is called late doesn't matter. It has nearly 9.9ms to supply data. All that matters is that the native mixer calls the HW in time. MS lets it run at one of the highest system priorities to ensure that audio will not present hickups. Wine can't run anything at such a priority, but it can change it's padding to accommodate XA2 and look more like native.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #34 from Alexey Loukianov mooroon2@mail.ru 2011-11-06 19:48:58 CST --- My tests on native system with shared mode mmdevapi shows that your estimation looks like pretty correct, but OTOH it's far from the real situation XA2 deals with on native system. Main point is that XA2 ignores any extra buffer space it had been offered by mmdev device and behaves as it had been provided with the buffer size it had requested. So if the default period size is something like 10ms (and that's approx the case for the hardware I have in my laptop) XA2 requests 20ms buffer and would use no more than 20ms from this buffer even if GetBufferSize() shows that it had been provided with something bigger.
- Before start, the app prefills 30ms according to your comment #19.
Em, I can't find where in the comment #19 I had been talking about 100% prefill (you're considering 30ms duration / 10ms period case, right?). For period sizes less than 10.5ms XA2 prefills 1/2 of duration before calling Start(), you can check it using the values from the first PDF I had attached to this report.
Rest of your simulation seems to be pretty close to what one might see with native XA2 playing data through mmdevapi in shared mode.
IMO, what we have here comes down to a XA2 relaying on the native mmdevapi shared mode mixer behavior (which drains data in 10ms chunks every 10ms) and as Wine's mmdevapi behaves differently - a problem arises. It had been masked by the fact XA2 using 4xPeriodSize for Period sizes larger than 10.5ms, but popped out of shadow as soon as DefaultPeriod size for Wine's mmdevapi had been cut down to 10ms.
Considering the information you had provided about GetCurrentPadding() vs. GetPosition(), I think that the "True Way (TM)" to fix it would be to:
a) Fix GetPosition() to use snd_pcm_delay() instead of basing it on GetCurrentPadding() and thus on snd_pcm_avail_update().
b) Use snd_pcm_avail_update() purely for private purposes, don't expose it's output to the client application - it's simply bogus to do so. Fix GetCurrentPadding() to report actual space that's available in local buffer, do not take into account ALSA buffer state when reporting padding.
c) Drain GetCurrentPadding() exactly in mmdev_period steps (in case we've got in buffer more data than mmdev_period), and be sure that minimum period is non less than 10ms for shared mode. Also, accordingly to MSDN, requested periods size should be ignored for shared mode and default period should be always used. As for duration - it is unclear from MSDN how should it be set WRT shared/exclusive mode. At one place of the docs it is noted that an app should specify here the minimal size of the buffer it requires to provide glitch-free operation. Later on it is strictly stated that for shared mode with event-driven stream duration and period should be set to 0. General idea looks like that and an app shouldn't bother with buffer sizes in event-driven mode. Instead it should call GetCurrentPadding() each time the event fires and fill-in the buffer to its maximum. It's obvious that MS own XA2 violates this requirement at least two times: it requests non-zero duration and it populates buffer with non less than 10ms chunks when event fires.
d) Make sure we don't pump more frames to ALSA than the minimum required amount to get xrun-free behavior. Failing to do so would introduce extra lag which is undesirable. With current scheme when we pump-out data into ALSA once in mmdev_period interval we have to make sure alsa buffer holds 2*mmdev_period_frames. Actually 2x it is a bit overkill but we have to have something left in ALSA buffer so it wouldn't xrun at the very moment we are pumping new data into it. Actual minimum for this might be something like mmdev_period + alsa_period * (N + 1), where N is the time needed for alsa_write_data() to complete expressed in alsa_period duration units and rounded up. I.e. we have to be sure that ALSA has something to play until next timer event pop-up (mmdev_period duration) and we pump-out more data to ALSA (alsa_write_data() execution duration). IMO it's easier just to stick with 2*mmdev_period_frames and not to hunt for a bit reduced latency while having risk to hit the xrun in unfortunate case.
e) Data that had been received through GetBuffer()/ReleaseBuffer() calls might be immediately pumped out to ALSA in case the amount of data that had been pumped out during previous timer callback was less than mmdev_pediod size. It would save us from xruns in case frames amount left to be played in ALSA buffer would be exhausted before next timer callback would arrive. Jörg, you had mentioned that there are some constraints against acting in a such way, would you please provide more info on the topic?
z) Unrelated to this bug, but would need fixing anyways: looks like that current implementation of EXCLUSIVE mode in Wine alsa is total bogus. Docs state that for event-driven exclusive mode stream period should equal to duration and that audio engine should allocate two buffers period size each and do a process called "ping-ponging" alternating the bugger that is being exposed to the app each time the event fires. I can't see any sign of this functionality implemented in winealsa mmdevdrv. OTOH, it might "just work" the way it is now, but it is surely a thing that should be throughly tested.
I would attach a proposed patch that seems to fix issues in RAGE + XA2 reported in this bug. With it applied I have xrun-free gameplay. There are still some XRuns in log but they seem to be related to game itself being too busy or re-initializing sound subsystem rather then winealsa.drv doing something wrong. Grepping logs collected after about 10 minutes of gameplay reveals 5 xruns: two of them were at game startup during sound subsystem init, one at map load moment, one at sound subsystem re-init when quiting the campaign and going back to the main menu, and one at the game exit. As a matter of test I had tested with another patch modification that forces 3x mmdev_period smaples being sent to ALSA and the amount of XRuns remained the same. "Make test" for mmdevapi seems not to be affected in any way by proposed patch.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #35 from Alexey Loukianov mooroon2@mail.ru 2011-11-06 19:50:10 CST --- Created attachment 37346 --> http://bugs.winehq.org/attachment.cgi?id=37346 Proposed patch to more closely mimic native OS behavior in winealsa mmdevdrv.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #36 from Alexey Loukianov mooroon2@mail.ru 2011-11-09 02:55:59 CST --- Andrew, Jörg, any comments on proposed patch?
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #37 from Andrew Eikum aeikum@codeweavers.com 2011-11-09 08:04:13 CST --- Sorry, I've been busy with Codeweavers work lately and haven't had time to review. When reviewing your patch, we should also think about how it works with the fix for Bug 28282 and the patch from this bug's Comment 16. Do we want all three changes? Does your work here invalidate either of those two?
I don't know the answers, but stuff to think about before this gets in. I'll try to find time to review soon.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #38 from Alexey Loukianov mooroon2@mail.ru 2011-11-09 09:08:37 CST --- (In reply to comment #37)
When reviewing your patch, we should also think about how it works with the fix for Bug 28282 and the patch from this bug's Comment 16. Do we want all three changes? Does your work here invalidate either of those two?
Well, my patch requires ALSA period to be no longer than mmdev_period and ALSA buffer size to be at least 2x mmdev_period size. Failing to met any of these conditions wouldn't be a fatal fault but one should expect stutters in such case. I would try today to compile Wine with all three patches applied and report here if would hit any problems with it.
http://bugs.winehq.org/show_bug.cgi?id=28723
Alexey Loukianov mooroon2@mail.ru changed:
What |Removed |Added ---------------------------------------------------------------------------- Attachment #37346|0 |1 is obsolete| |
--- Comment #39 from Alexey Loukianov mooroon2@mail.ru 2011-11-10 21:01:36 CST --- Created attachment 37437 --> http://bugs.winehq.org/attachment.cgi?id=37437 Proposed patchset, try #2
Here is patchset that includes my changes to GetCurrentPadding and alsa_write_data, requesting big-enough ALSA buffer size (I changed it to be at least 3x of mmdev_period size, or mmdev_buffer size in case it's bigger), and commented-out request to set ALSA period size. With this pathset applied I've got almost xrun-free behavior playing Rage with XA2=>mmdevapi sound render patch active. I still got some xruns, either caused by the game not sending audio data in time (during level loads, sound subsystem (re)initializations, e.t.c.), or my workstation not being able to handle load in time (for example if it is spending cycles in kernel space swaping out memory pages to disk, e.t.c.).
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #40 from Alexey Loukianov mooroon2@mail.ru 2011-11-10 21:30:26 CST --- Note: something obviously wrong either with the patch or with the way DSound uses mmdevapi. With my patchset in place everything if OK when playing RAGE with XA2->mmdevapi sound render patch but I can hear some underruns when playing using XA2->DSound->mmdevapi render patch (i.e. applying patchset fixes problems for Win7 mode but leads to xruns with DSound in WinXP mode). I suspect this might be due to DSound->mmdevapi wrapper seems to be using IAudioClock_GetPosition to get current playing position, and currently returned value is bogus - it relies on GetCurrentPadding which is wrong. Would try to re-implement IAudioClock_GetPosition using snd_pcm_delay() and check if it would fix underruns with dsound renderpath.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #41 from Alexey Loukianov mooroon2@mail.ru 2011-11-10 21:57:04 CST --- Changing dsound's primary.c so it don't try to set period size to pre-defined value of 50000 seems to fix the problem - no more unexpected xruns, I've got only four at expected places after ~5 minutes of RAGE gameplay. It should be investigated more closely why does setting period to 50000 causes spontaneous xruns (and does it have anything in common with bogus GetPosition return value), but in any case it is "yet another known mmdevdrv bug to be fixed": specs require to ignore requested period size and use default instead if AudioClient is being opened in shared more, which is not the case with the current implementation.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #42 from Andrew Eikum aeikum@codeweavers.com 2011-11-11 10:51:30 CST --- (In reply to comment #39)
Created attachment 37437 [details] Proposed patchset, try #2
The first patch gave me tons of underruns (all of the following is without PulseAudio). I have the following statistics in IAC::Initialize():
trace:alsa:AudioClient_Initialize alsa_period_frames: 470 trace:alsa:AudioClient_Initialize ALSA buffer size: 2352 frames trace:alsa:AudioClient_Initialize MMDevice period: 0x186a0 * 100ns, 220 frames trace:alsa:AudioClient_Initialize MMDevice buffer size: 2560 frames
But then I see: err:alsa:alsa_write_data avail: 1930, held_frames: 2496 trace:alsa:alsa_write_data XRun state, recovering
So it looks like ALSA is reporting an underrun when there isn't enough data in the ALSA buffer to take a full alsa_period_frames chunk (2352 - 1930 = 422 < 470). So perhaps we need to tweak the write_limit calculation to ensure there's always at least three mmdev periods _and_ three ALSA periods (three in case we get the period pattern Wine,Alsa-->Alsa,Wine).
This fixes the problem for me (should clean up the frames-in-alsa-buffer calculation redundancy, and you'll need to add the alsa_period_frames member):
- write_limit = This->mmdev_period_frames; - /* Make sure ALSA has enough samples to play till next timer event, to get - xrun-free behavior we need ALSA to have 2x mmdev_period frames to play. */ - if((This->alsa_bufsize_frames - avail) <= This->mmdev_period_frames) write_limit += This->mmdev_period_frames; + write_limit = 0; + while(This->alsa_bufsize_frames - avail + write_limit < This->mmdev_period_frames * 3 || + This->alsa_bufsize_frames - avail + write_limit < This->alsa_period_frames * 3) + write_limit += This->mmdev_period_frames;
I tested this with a few games with PulseAudio as well, and it worked. Does this continue to work with RAGE and your other test games?
Some comments on the code in your patches:
- TRACE("PCM state: %u, avail: %ld, pad: %u\n", - snd_pcm_state(This->pcm_handle), avail_frames, *out); + *out = This->held_frames; + TRACE("pad: %u\n", *out);
I think the snd_pcm_state() value is useful here, if only for debugging the PulseAudio bug. Unless you have a reason to get rid of it, I'd rather keep it in.
+#if 0 ... +endif
When you are getting ready to submit the patches, you can just remove this code. This patch also creates an unused variable which you can remove, alsa_period_us.
+ n = max(duration / 10, This->mmdev_period_rt * 3 / 10) ; /* duration converted to us or 3x our period size */ + d = 0;
I know they're taken from my hackish test patch, but please choose a better variable name than "n" (alsa_buffer_us or something). And you can get rid of "d" entirely and pass NULL instead.
+ TRACE("n: %u, d: %d\n", n, d);
We trace these values more formally below in that patch, so you can just get rid of these debug TRACEs.
(In reply to comment #41)
Changing dsound's primary.c so it don't try to set period size to pre-defined value of 50000 seems to fix the problem - no more unexpected xruns, I've got only four at expected places after ~5 minutes of RAGE gameplay.
This change is obviously correct, so there should be no problem submitting it.
specs require to ignore requested period size and use default instead if AudioClient is being opened in shared more, which is not the case with the current implementation.
This is pretty easy to fix with tests. It's on my list, but feel free to beat me to it :)
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #43 from Alexey Loukianov mooroon2@mail.ru 2011-11-11 11:28:20 CST --- (In reply to comment #42)
trace:alsa:AudioClient_Initialize alsa_period_frames: 470 trace:alsa:AudioClient_Initialize ALSA buffer size: 2352 frames trace:alsa:AudioClient_Initialize MMDevice period: 0x186a0 * 100ns, 220 frames trace:alsa:AudioClient_Initialize MMDevice buffer size: 2560 frames
Main problem here is that your ALSA period is bigger than MMDev period. As I had already mentioned you should expect underruns in such case. Probably we should not try to get of ALSA period setup during init but try to set it to be equal to mmdev period instead.
So perhaps we need to tweak the write_limit calculation to ensure there's always at least three mmdev periods _and_ three ALSA periods (three in case we get the period pattern Wine,Alsa-->Alsa,Wine).
Having three mmdev periods in ALSA buffer won't help in case (3*mmdev_period_frames < alsa_period_frames) == TRUE. What can be done when calculating write_limit is to also take into account the alsa_period timeframe. Easiest way is to take period=max(alsa,mmdev), but it would add excess latency which might be avoided by doing more precise calculations. I would see what can be done on this topic and post a new patch with proposed changes when done.
This fixes the problem for me ... I tested this with a few games with PulseAudio as well, and it worked. Does this continue to work with RAGE and your other test games?
The code you had written is exactly what I've been writing about in previous paragraph. I'm not too content with using 3*period while it should be enough to have 2*period - extra latency is not a good thing and it should be avoided if possible. 3*period would be especially harmful if we don't set ALSA period and it would be auto-set by ALSA to something really big.
I think the snd_pcm_state() value is useful here, if only for debugging the PulseAudio bug. Unless you have a reason to get rid of it, I'd rather keep it in.
OK. I just had removed it from there as with the modifications I made GetCurrentPadding had been no longer relaying on any data from ALSA. Maybe it would be reasonable to move snd_pcm_state() trace into alsa_write_buffer and/or to timer callback handler - IMO there it would be more logical place to call ALSA stuff.
When you are getting ready to submit the patches, you can just remove this code. This patch also creates an unused variable which you can remove, alsa_period_us.
Sure, I just had used the patch you had submitted as-is, without any attempts to make it more pretty. Having comments by preprocessor ifs in final patchset is no-no :-). I would clean it up in case we would head on "don't set ALSA period road" or uncomment it back in case "set ALSA period to match mmdev period" road would be chosen instead.
I know they're taken from my hackish test patch...
Sure it would be cleaned up. Just had been using your quick hackish patch as-is while testing.
(In reply to comment #41)
Changing dsound's primary.c so it don't try to set period size to pre-defined value of 50000 seems to fix the problem - no more unexpected xruns, I've got only four at expected places after ~5 minutes of RAGE gameplay.
This change is obviously correct, so there should be no problem submitting it.
This change is correct, but mmdev hitting xruns with my patch in + 5ms period - is wrong. My tests show that most likely xruns I hit are yet again caused by ALSA period ending up being bigger than mmdev period. Going to test more thoroughly today.
Thank you for review, I would come back with a new patchset for more :-).
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #44 from Jörg Höhle hoehle@users.sourceforge.net 2011-11-11 12:15:37 CST --- Sorry for the short comment -- what I've been missing from the patches so far is a path for apps that don't care about short periods and buffers, e.g. winmm:playsound. I.e. when an app calls mmdevapi with a large duration, ALSA should not be used with a 5ms period, e.g. 200 timer interrupts per second.
BTW, I've been taking another look at several render.c test outputs sent to me. It appears that: - min. GetBufferSize (when duration=0) is 2x or 3x period (3x observed with a 5:1 card), so it's not always 20ms. - GetPosition seems 30-40ms away from padding position in shared mode, but extremely close to padding in exclusive mode. I'm unsure how to explain that. Simply say that the mixer adds 30ms? Or is that the buffer size? That may argue in favour of *out = This->held_frames in shared mode with a tiny 30ms ALSA buffer representing native's mixer latency.
BTW, about GetPosition, did you use my patch from comment #32 or something else?
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #45 from Jörg Höhle hoehle@users.sourceforge.net 2011-11-11 17:16:32 CST ---
I can't find where in the comment #19 I had been talking about 100% prefill (you're considering 30ms duration / 10ms period case, right?).
No. I was considering a 10ms period and a 40ms buffer filled 3/4 = 30ms you mentioned in comment #24 even though that requires a period >= 10.5ms as you explain in comment #19.
Data that had been received through GetBuffer()/ReleaseBuffer() calls might be immediately pumped out to ALSA. Jörg, you had mentioned that there are some constraints against acting in a such way
When you pump out data at fixed intervals, you can reason about the properties of your system: - if ALSA has less than mmdev_period samples after my write, there'll be an underrun. - if ALSA has not started yet and I write less than alsa_period samples, then it may not start -- bug #29056 -- so I must fill it up, which should be safe, as the app did not write more data in this (mmdev) period. - Get/ReleaseBuffer is presumably a fast operation in native, so the app never blocks. ReleaseBuffer used to wait while writing to OSS, see bug #28056, comment #35. - Several threads writing to the same ALSA stream, hmm... - You can even add code to detect that sound hung during the last events and reinitialize it.
If you write data anytime, you can't decide anything because you don't know whether the app may call Get/ReleaseBuffer again within the next millisecond. Yet I think that a heuristic like "if near underrun, feed ALSA a little" may be ok, nevertheless one must carefully assess how this affects GetCurrentPadding, as your analysis shows, and how to solve bug #29056, i.e. when to pad with silence.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #46 from Alexey Loukianov mooroon2@mail.ru 2011-11-11 21:53:28 CST --- (In reply to comment #45)
When you pump out data at fixed intervals...
Thanks for explaining, your points seems to make really big sense.
Yet I think that a heuristic like "if near underrun, feed ALSA a little" may be ok, nevertheless one must carefully assess how this affects..
Yeah, but then IMO it would be more important to get xrun-free behavior at first, regardless it would result in slightly bigger latency, and then proceed with adding heuristics trying to shrink latency carefully testing each and every optimizing change in code.
(In reply to comment #44)
Sorry for the short comment -- what I've been missing from the patches so far is a path for apps that don't care about short periods and buffers, e.g. winmm:playsound. I.e. when an app calls mmdevapi with a large duration, ALSA should not be used with a 5ms period, e.g. 200 timer interrupts per second.
Good point, but it would diverse winealsa.drv mmdev behavior from what is observed on native OS. My tests seems to show that from client POV Win7 mixer always "ticks" in 10ms, no matter the buffer size client requests. Having Wine behaving in some other way might be perfectly OK for some cases but would cause problems with poorly written apps that rely on this undocumented behavior. For this optimization to work we need some other criteria except for buffer size to proceed with long periods. Maybe best bet would be to make a judgment basing on presence of event: we may assume that in case app wants to receive events - it wants to minimize latency using event-driven shared mode buffer fills and react accordingly. Real apps tests would be needed to prove if this assumption is wrong or not.
Also, take a look into MSDN. There it is explicitly stated that for the shared mode non-event-driven render stream the latency is only dependent on the difference between the position where an app writes data to the buffer and the position an audio engine reads data from the buffer. It leaves us with a tiny period requirement no matter the buffer size :-(.
- GetPosition seems 30-40ms away from padding position in shared mode, but
extremely close to padding in exclusive mode. I'm unsure how to explain that.
Well, as (A) in the best case (from latency POV) the buffer for exclusive mode is a real buffer HW DMA engine is reading from, and (B) there's no mixer-in-the-middle - having padding to be extremely close to GetPosition is pretty reasonable IMO for exclusive mode. You simply don't have latency-introducing entities downstream from app to the sound hardware so there's no reasons for padding to differ heavily from real playback position. I assume that you're writing about non-event driven exclusive mode, as the value of GetCurrentPadding is meaningless for event-driven exclusive mode.
BTW, about GetPosition, did you use my patch from comment #32 or something else?
No, I hadn't tried to fixed Wine's GetPosition up till yesterday and had tried to quickly hack-in my own implementation afterwards. My implementation is pretty bogus ATM, thanks for reminding me about the existence of your WIP patch targeting this problem.
http://bugs.winehq.org/show_bug.cgi?id=28723
Raymond superquad.vortex2@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |superquad.vortex2@gmail.com
--- Comment #47 from Raymond superquad.vortex2@gmail.com 2011-11-11 22:57:25 CST --- (In reply to comment #21)
Interesting issue. I've never seen native tests report anything else beside 10.0000 and 10.1587ms (alignment for Intel HDA) shared mode default periods. So reporting anything else is out of question.
However, what native is said to do and Wine doesn't is to clamp period and duration. From what I've read (but not yet tested, which is why I've not yet written that patch),
- max duration is 2s in shared mode, 500ms in exclusive mode
- min. duration should probably be 3*period (at least for Wine)
- min period is 3ms (from GetDefaultPeriod)
- max period is 100ms (I think I read that somewhere)
Note that Wine still has a related bug: it must ignore period in shared mode.
http://git.alsa-project.org/?p=alsa-kernel.git;a=commit;h=2ae66c26550cd94b0e...
Unless you are using intel hda controller, you won't get exactly 10ms period time by default
Min period of 3 ms is not supported by all alsa sound drivers (e.g. ymf724 which update hw_ptr in 5.333 ms intervals)
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #48 from Alexey Loukianov mooroon2@mail.ru 2011-11-12 02:01:40 CST --- Oh my! Raymond, you would do me a big favor if you won't comment here on totally unrelated topics to the original discussion. Please, stay out of commenting in this bug report unless you want to post test results for some of proposed patches attached to this bug. Thank you in advance.
http://bugs.winehq.org/show_bug.cgi?id=28723
Jörg Höhle hoehle@users.sourceforge.net changed:
What |Removed |Added ---------------------------------------------------------------------------- Target Milestone|--- |1.4.0
--- Comment #49 from Jörg Höhle hoehle@users.sourceforge.net 2011-11-16 04:16:08 CST --- A closer inspection of test data submitted to me shows that GetPosition is behind the dynamic limit 'sum of ReleasedFrames - GetCurrentPadding' by 30-40ms in shared mode (24-34ms with one 5:1 card), i.e. it's outside the mmdevapi buffer. In exclusive mode, GetPosition alternates around that limit.
Thinking about a mixer, I've observed that it's easy to write a 3 period one (one period playing, one fully prepared which the HW already knows to play next and the third one where mixing occurs). Attempting to use 2 only is possible but requires the system to tell the HW which buffer to play next right before underrun would happen. This does not fit a periodic timer model.
Back to the 2 period 20ms buffer event-driven app. It's not the sustained 10ms rate that causes trouble, it's the assumption that Wine ought to feed ALSA from solely 20ms. That leads to seeking problematic speedups, like believing that ReleaseBuffer must call write. Add a 10ms lead, and you are in still water.
http://bugs.winehq.org/show_bug.cgi?id=28723
Jan Kalab pitlicek@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |pitlicek@gmail.com
http://bugs.winehq.org/show_bug.cgi?id=28723
Ventero wine@ventero.de changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |wine@ventero.de
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #50 from Andrew Eikum aeikum@codeweavers.com 2011-11-22 15:15:57 CST --- (In reply to comment #43)
Thank you for review, I would come back with a new patchset for more :-).
Had any time to improve this?
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #51 from Jörg Höhle hoehle@users.sourceforge.net 2011-11-24 19:51:15 CST --- Given what Alexey explained about 20ms bufer sizes, I'm convinced that -- which is easy to test -- after the first event callback, padding will have decreased by 10ms.
Second I'm convinced that, given Linux scheduler's properties (esp. CFQ), it would do no good to audio to insist on 10ms wake-up periods. Wine does not use soft real-time OS features. Instead the audio driver should really be written in such way that if it's fed more audio data, e.g. 333ms, it won't depend on a wake-up every 10ms. AFAIK, PA uses 50ms periods. A Linux user thead is somehow at the opposite of a real time thread: all root threads lie in between.
IOW, I'm opposed to a design that depends on 10ms feeder threads just because that would make Rage (or any 20ms XA2 client) work. XA2 ought to work, yet other apps with large buffer sizes must not stress the system ^H^H produce reliable audio output even if a 10ms event is missed. The general lesson is that I can't expect a Linux system to give repeated 10ms "interrupt" response. There will be random (=rare) cases with higher latency (upto 100 ms with user processes).
I also think that it would be very revealing to write the following test case: on every (10ms) event, feed 5ms of data. I conjecture (based on GetCurrentPadding's return value after underruns) that w7 will beep 100 times a second, while older systems (Vista) may well coalesce data until a full 10ms is full before sending it to the audio device, thus beeping 50 times a second. Obviously, such a test should be performed by somebody with access to both systems. I have none of them.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #52 from Alexey Loukianov mooroon2@mail.ru 2011-11-24 20:09:41 CST --- (In reply to comment #50)
Had any time to improve this?
Not anything polished-enough yet - "real life" (tm) is a bit time consuming thing :-). Hadn't had a chance to read new comment from Jörg too, going to spend a day testing and hacking on this weekend.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #53 from Alexey Loukianov mooroon2@mail.ru 2011-11-24 20:35:46 CST --- (In reply to comment #51)
Instead the audio driver should really be written in such way that if it's fed more audio data, e.g. 333ms, it won't depend on a wake-up every 10ms. ... IOW, I'm opposed to a design that depends on 10ms feeder threads just because that would make Rage (or any 20ms XA2 client) work. XA2 ought to work, yet other apps with large buffer sizes must not stress the system ^H^H produce reliable audio output even if a 10ms event is missed.
I had already posted it and want to point on it again: it seems that native system (w7 at least, I hadn't had a chance to test on Vista yet) acts in a way that latency is purely controlled by an amount of data an app feeds in to a buffer. This means that de-facto native infrastructure is designed in a way that allows for "dynamic latency control". In case I'm content with 0.5s latency - no problems, I just fed in 0.5s of audiodata to the audio subsystem and make sure to wake - say - every 0.25s and top the buffer to be contain 0.5s of data again. As soon as I need a lower latency - I simply may just wait for buffer to have desired amount of data left (roughly equal to the targeted latency) and start to feed in audio core more frequently. It is perfectly OK to do it from the same exact thread of the same exact app. It makes me believe that to satisfy this concerns it is required to use smallest possible period available on the system. OTOH I'm also against abusing timer API to produce hi-res events (10ms or even less) and I'm against stressing the system hard in order to achieve small latency for the cases where this isn't strictly required. Our goal is to find a code design solution for Wine mmdevapi audio driver which would be "less abusing but still doing the trick". Experimenting with patches vs. latency and testing on native systems is fun but to come up with a final patch we would eventually have to discuss and agree on the way audiodriver should be designed. By "we" here I mostly mean Jörg and Andrew as a main developers working with Wine audio subsystem codebase.
The general lesson is that I can't expect a Linux system to give repeated 10ms "interrupt" response. There will be random (=rare) cases with higher latency (upto 100 ms with user processes).
Sure, but it can be easily proven with experiments that high-latency cases are really rare on moderately loaded system and sticking to "worst latency case" to fully get rid of underruns doesn't seem reasonable to me. IMO it'd be much better to target at 30-40-50ms latency level and accept that there would be some cases here-and-there when userland scheduling latency spikes would cause some underruns.
I also think that it would be very revealing to write the following test case: on every (10ms) event, feed 5ms of data.... Obviously, such a test should be performed by somebody with access to both systems. I have none of them.
Hmm, I would try to hack-in such test on this weekend. I've got Win7 installed on my netbook and have a relative who owns laptop with Vista installed. Another way to test it on Vista might be to use IE-testing virtual appliances offered for download by MS - as far as I recall one of them was Vista-based and had been capable of running another Win32 binaries besides pre-installed IE stuff.
http://bugs.winehq.org/show_bug.cgi?id=28723
Brian Vuyk brian@brianvuyk.com changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |brian@brianvuyk.com
http://bugs.winehq.org/show_bug.cgi?id=28723
Andrew Eikum aeikum@codeweavers.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Attachment #37437|0 |1 is obsolete| |
--- Comment #54 from Andrew Eikum aeikum@codeweavers.com 2011-11-28 14:57:40 CST --- Created attachment 37683 --> http://bugs.winehq.org/attachment.cgi?id=37683 ALSA fixes patchset
Here's another version, cleaned up to submission quality. It contains Alexey's work, as well as Jörg's GetPosition() reimplementation and some other miscellaneous fixes.
This patchset does not address Bug 29056, but I don't think it makes the situation any worse. That bug can be dealt with separately.
These patches also make no changes or improvements for applications that aren't picky about period sizes (Comment 44). But again I think the solution for that can come later, if one is needed.
The issue of patch ownership will also need to be looked at.
Jörg: I removed the upper bound (5 seconds) on delay_frames in GetPosition(). How strongly do you feel we need that type of logic?
Alexey: Does this patchset fix RAGE for you?
Does anybody object to submitting this? Do we think this is a complete enough fix that we should ask people to retest ALSA-related bugs?
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #55 from Alexey Loukianov mooroon2@mail.ru 2011-11-29 03:31:29 CST --- (In reply to comment #54)
Here's another version, cleaned up to submission quality. It contains Alexey's work, as well as Jörg's GetPosition() reimplementation and some other miscellaneous fixes.
Alexey: Does this patchset fix RAGE for you?
Yeah, and reducing in_alsa to be 2x period instead of 3x also works flawlessly. Actually I had switched from testing using RAGE into testing using a hackish testcase I had coded in last Sunday. I would attach it with next comment.
This testcase can be safely compiled as winelib app, and also can be compiled on native OS using fresh-enough ming-w64 toolchain (one which have required mmdev headers). What it does is opens up default WASAPI audio render device in shared mode and playbacks 2500hz tone generated on-fly using event-driven buffer fill. It reports some collected stats during initialization and while playing. Requested buffer duration is the same as XA2 requests - 20ms.
At current form it tries to play 5s of generated sine tone four times each time changing a buffer fill-in method slightly. For the first try it mimics closely XA2 behavior: each time event fires it fill in buffer with 441 audio frames data chunks in loop until buffer can't hold another 441 frames. Second try is just the same, but using 220f chunks. Third try is the same as first one, but for each event no more than one 441f chunk is pumped out to the audio engine. Fourth try is the same as third but using 220f chunks.
On Win7 native system first two tries are played back without noticeable underruns, third and second tries produce a lot of underruns. Fourth try is basically what Jörg had been writing about in comments recently.
What should be noted is that Win7 seems to use 16 frames align for allocated buffer size. I.e. when I request 20ms buffer for 44100Hz stream I got 896 samples buffer in Win7 which is ((20ms * nSamplesPerSec)/16 + 1)*16. At first I thought that actual alignment scheme used is 64 byte for buffer but using 32bit IEEE_FLOAT samples format resulted in same 896f buffer and not 888f buffer which should be expected if 64 byte alignment had been used.
With Wine I get 882 samples buffer for same case due to obvious reasons.
As for observed underrun behavior for the testcase proposed by Jörg: I got 100Hz periodic behavior on Win7 which conforms to the Jörg's prediction. Hadn't had a chance to lay my hands on laptop with installed Vista yet so can't tell what is the behavior under Vista. Hope I would find a moment to test it in a next few days.
P.S. Hadn't had a chance to look at proposed GetPosition implementation yet but can tell for sure that the reported values under Wine differ with ones reported under Win7, especially for cases when there are underruns. On Win7 for underrun free cases reported position lags ~28ms behind amount of pumped out data at the moment of event fires. As the event fires at 10ms intervals when about half of the buffer had been played it gives around 18ms latency introduced by sound engine. It is pretty much in line with values reported by Wine with applied patchset from comment #54 - average lag behind ~33ms, which suspiciously low as we should have around 30ms of data (3xPeriod) sitting at ALSA buffer and another 10ms of data held in mmdevdev buffer resulting in around 40ms of lag.
For cases with underruns (mostly for the fourth try on the testcase) devposition reported by Win7 equals to the amount of data pumped out to the audio engine. Thinking of it makes me believe that it is just right the expected result: audio engine had played back all the data it had been fed with and thus devposition should be equal to samples_sent. With Wine I still got devposition lagging behind about 5ms for fouth try and 10ms for third try even for the "hit underrun and stop feeding data" case. Inspecting +mmdevapi logs makes me believe that this is caused by following sequence of events:
1. ALSA hits underrun. 2. Timer callback gets called, alsa_write_data recovers from underrun and pumps out data to the ALSA buffer (no more than 3xMAX[ALSA Period, MMDEV Period]). For the fourth try from testcase it is exactly 220f of data, for the third try is is 441f. 3. Callback proc fires up event. If app would call GetPosition at event handler - it would find ~220f/441f samples of data laying inside ALSA buffer resulting in reported lag about 5ms/10ms.
We should think about how to rewrite proposed GetPosition logic to more closely match native behavior for such cases.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #56 from Alexey Loukianov mooroon2@mail.ru 2011-11-29 03:32:50 CST --- Created attachment 37689 --> http://bugs.winehq.org/attachment.cgi?id=37689 Wine mmdevapi testcase.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #57 from Alexey Loukianov mooroon2@mail.ru 2011-11-29 03:50:21 CST --- (In reply to comment #55)
It is pretty much in line with values reported by Wine with applied patchset from comment #54 - average lag behind ~33ms, which suspiciously low as we should have around 30ms of data (3xPeriod) sitting at ALSA buffer and another 10ms of data held in mmdevdev buffer resulting in around 40ms of lag.
Thinking more about it makes me fill that's everything is correct here - we have in ALSA buffer not strictly 3xPeriod, but some amount of data which is: 2xPeriod < amount_of_data <= 3xPeriod. Inspecting logs shows that during normal playback amount_of_data drifts from 3xPeriod downwards to 2xPeriod in around ~16 frames per 10ms steps, thus delay_frames should be ~25ms of audio data in average. ~25ms in ALSA + ~10ms in mmdevdrv = ~35ms of lag which is close to observed ~33.3ms. Thus GetPosition seems to be implemented in mostly correct way, the only problem is the devpos != total_samples_sent case which should be investigated further.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #58 from Jörg Höhle hoehle@users.sourceforge.net 2011-11-30 04:07:41 CST ---
when I request 20ms buffer for 44100Hz stream I got 896 samples buffer
I conclude that you have an Intel HDA. That uses no 10ms period rather than a 10.1587ms one with 448 frames. IOW, you need to repeat your tests (441->448)!
((20ms * nSamplesPerSec)/16 + 1)*16
Some month ago, I tried to predict buffer sizes etc. in shared and excl. mode. I wasn't successful with the HDA's alignment requirements, yet the following code predicts non-HDA engines. Note the differences in rounding (round is like MulDiv, unlike floor & ceiling etc. -- see the Common Lisp HyperSpec). In particular, round is not C's / operator.
(defun mmdevapi (duration period freq) (let* ((shared (round (* (float duration) freq) 10000000)) (fragment (round (* (float period) freq) 10000000)) (parts (round (float duration) period)) (partu (ceiling (float duration) period))) (list shared fragment parts (* fragment parts) (* (round (* fragment partu) 128) 128)))) (mmdevapi 5000000 101587 44100)
It would be good if you could find the correct formula for exclusive mode with Intel HDA.
using 32bit IEEE_FLOAT
FP is convenient for mixing, but I'd expect the output of the mixer to be typical 16 bits. OTOH, some cards *do* support FP, so why would native perform an extra conversion then?
---- GetPosition AE>I removed the upper bound (5 seconds) on delay_frames in GetPosition(). AE>How strongly do you feel we need that type of logic? Well, Alexey even complains about 5ms :-) AL>With Wine I still got devposition lagging behind about 5ms
I introduced that logic to specifically acknowledge the idea that a true speaker position may be up to 5s behind the audio data we feed in the presence of *huge* latencies (audio over a network?). Latencies cause speakers to continue playing even after the front buffer hits an underrun.
GetPosition MUST ensure that it'll reach sum_written eventually.
Removing that logic also removed the underrun protection that worked in free-running mode. Something else is needed with the current XRUN=>stop mode.
Hopefully I can complete my GetPosition patch tonight. I'll also add the "IAC_Stop must not use snd_pcm_drop" one to pass even more of my render tests.
Alexey's comment is a bit troubling. If there was a 28ms lag while feeding data, I'd expect GetPosition to slowly reach sum_written within 28ms after feeding ends. He seems to say that GetPosition is bumped to sum_written as soon as there's no more data to feed. This would be important behaviour to mimic. Can you add please some logs to show this and report back?
---- Log File Analysis Analysis of my WINETEST_DEBUG=2 render.ok output shows that GetPosition is 41-83ms(!) behind frames_written-padding in Wine with the 4 patches from comment #54, and while GetPosition advances as regularly as the high performance counter, it's the padding that varies wildly at each iteration. The cause is probably the large ALSA period with dmix: AudioClient_Initialize ALSA buffer size: 8192 frames AudioClient_Initialize ALSA period size: 1024 frames AudioClient_Initialize MMDevice period: 480 frames AudioClient_Initialize MMDevice buffer size: 24000 frames or (period still not being ignored in shared mode): AudioClient_Initialize MMDevice period: 720 frames
---- Patches & Ownership I claim 0002. Date: Wed, 2 Nov 2011 09:27:46 +0100 Subject: [PATCH] winealsa: Don't set ALSA's period time. http://www.winehq.org/pipermail/wine-patches/2011-November/108479.html
Something is missing from patch 0003. IMHO, any patch that touches + if(period) + This->mmdev_period_rt = period; should clamp period and duration first, something like:
if(!duration) /* 3 times is compatible with native and our requirements*/ duration = 3 * DefaultPeriod; if(SHAREDMODE){ This->mmdev_period_rt = DefaultPeriod; if(duration > 2s) 2s if(duration < 3 * DefaultPeriod) duration = 3 * DefaultPeriod; }else{ if(period < MinPeriod) ... if(duration ...)
---- Alternatives It's good news that your patch set works well (the key seems to be the ALSA buffer size limitation?). Here's what I had sketched during the last days.
alsa_set_period_max 10ms /* guarantees start when writing 10ms */ alsa_set_buffer_min 30ms /* stream with at least 3 periods */ TODO set_buffer_max approx. duration, perhaps 30ms more when tiny.
if (to_write){ if (alsa_pad <= 0 /* just starting */ && to_write <= 10ms){ /* we want 3 periods for streaming */ write(10ms silence lead); } write(alsa_avail); /* as much as possible, not 10ms */ /* alsa_buffer >= 20ms => will write at least 10ms initially */ } if(alsa_avail>=10ms) if(gcp>10ms) # gcp = max(0,gcp-10ms); gcp -= 10ms; else gcp = 0;
It allows the callback to be late by up to 9ms in the case of Rage, but needs more thought about when the audio and system clocks differ or when e.g. due to system load or paging several events are missed. It should allow large ALSA buffers and a heuristic to write less often than every 10ms if ALSA already has a lot of data.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #59 from Alexey Loukianov mooroon2@mail.ru 2011-11-30 06:21:08 CST --- (In reply to comment #58)
when I request 20ms buffer for 44100Hz stream I got 896 samples buffer
I conclude that you have an Intel HDA. That uses no 10ms period rather than a 10.1587ms one with 448 frames. IOW, you need to repeat your tests (441->448)!
Actually it isn't Intel HSA, it's rather Conexant HDA codec sitting on HDA-bus. Nevertheless reported default period is really 448f. As for repeating tests - actually I've done testing with chunk_size = buffer_size >> 1, which is exactly 448f for 896f buffer.
.. yet the following code predicts non-HDA engines... (lisp black magic follows)
Oh my, there are some programming languages out there that I'm not comfortable with (IOW don't like them and/or find their syntax/concepts too confising). Lisp is one of those, other are prolog, perl, python :-). Would have to smoke in common list specs to decypher that cryptic construct to something that looks like more traditional-math like.
It would be good if you could find the correct formula for exclusive mode with Intel HDA.
I hadn't even touched the exclusive mode in my tests yet. And I'm a bit curious are there any real-world apps out in the wild which use exclusive mode?
Well, Alexey even complains about 5ms :-)
That's not complaints, it just the observation of the behavior that differs when running testcase on Win7 vs. Wine :-).
Alexey's comment is a bit troubling. If there was a 28ms lag while feeding data, I'd expect GetPosition to slowly reach sum_written within 28ms after feeding ends.
Yeah, that's what exactly happens.
He seems to say that GetPosition is bumped to sum_written as soon as there's no more data to feed. This would be important behaviour to mimic.
No-no-no, sorry for not being clear in my original comment. With both Wine and Win7 I've got devposition steadily increasing up to but no more than the amount of data I had pumped out to the audio engine. If I stop pumping the data out and start polling the value returned by IAudioClock_GetPosition - I get devpos steadily approaching the samples_written count and stay there constantly until I pump out some more data. Note, that this behavior is not the case for the devpost value returned by IAudioClock2_GetDevicePosition - it is totally different beast affected by devpos backwards jumps to 0 on hardware underruns and maybe some other circumstances I hadn't discovered yet.
---- Alternatives ...
Hmm, your pseudocode looks promising but I have a couple of questions concerning it: 1. Does it really matters would we write out data on each event or not if we still would be receiving timer callbacks every 10ms? The only real possible benefit I see is that we would kind-a be "protected" from "skips" in output stream when the system wouldn't be able to schedule out timer callback timely due to something like paging or executing some other high-prio thread. That's a good reason on its own but only in case actual implementation wouldn't be excessively complicated.
2. What is "gcp" here? GetCurrentPadding? held_frames? Something else?
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #60 from Alexey Loukianov mooroon2@mail.ru 2011-11-30 08:00:35 CST --- Found another gotcha for GetPosition implementation from latest patchset. Suppose an APP had finished pumping out data and waiting some time for audio engine to finish actual playback. Here is what happens with proposed Wine GetPosition implementation:
... Event loop finished, waiting 20.0ms for playback to finish... T: 2005.3ms S: 87318f P: 0f DP: 86881 dDP: 437 dDPNZ: 437 T: 2006.4ms S: 87318f P: 0f DP: 86930 dDP: 388 dDPNZ: 388 T: 2007.6ms S: 87318f P: 0f DP: 86983 dDP: 335 dDPNZ: 335 T: 2008.7ms S: 87318f P: 0f DP: 87033 dDP: 285 dDPNZ: 285 T: 2009.8ms S: 87318f P: 0f DP: 87081 dDP: 237 dDPNZ: 237 T: 2010.9ms S: 87318f P: 0f DP: 87130 dDP: 188 dDPNZ: 188 T: 2012.1ms S: 87318f P: 0f DP: 87180 dDP: 138 dDPNZ: 138 T: 2013.2ms S: 87318f P: 0f DP: 87228 dDP: 90 dDPNZ: 90 T: 2014.3ms S: 87318f P: 0f DP: 87278 dDP: 40 dDPNZ: 40 WARNING: devpos backwards jump, prev = 87278, cur = 85963 T: 2015.4ms S: 87318f P: 0f DP: 85963 dDP:1355 dDPNZ:1355 T: 2016.6ms S: 87318f P: 0f DP: 85963 dDP:1355 dDPNZ:1355 T: 2017.7ms S: 87318f P: 0f DP: 85963 dDP:1355 dDPNZ:1355 T: 2018.8ms S: 87318f P: 0f DP: 85963 dDP:1355 dDPNZ:1355 T: 2019.9ms S: 87318f P: 0f DP: 85963 dDP:1355 dDPNZ:1355 T: 2021.1ms S: 87318f P: 0f DP: 85963 dDP:1355 dDPNZ:1355 T: 2022.2ms S: 87318f P: 0f DP: 85963 dDP:1355 dDPNZ:1355 T: 2023.3ms S: 87318f P: 0f DP: 85963 dDP:1355 dDPNZ:1355 T: 2024.5ms S: 87318f P: 0f DP: 85963 dDP:1355 dDPNZ:1355 T: 2025.8ms S: 87318f P: 0f DP: 87318 dDP: 0 dDPNZ:1355
We can see device position (DP) returned by IAudioClock_GetPosition() to steadily increase towards the value of samples_sent (S). Then we hit expected xrun and devpos experiences immediate backwards jump. The cause can be easily guessed from WINEDEBUG logs:
... 0025:trace:alsa:AudioClock_GetPosition written_frames: 0x1535d, held_frames: 0, delay_frames: 88, avail_frames: 1186, pad_frames: 137 0025:trace:alsa:AudioClock_GetPosition written_frames: 0x1535d, held_frames: 0, delay_frames: 38, avail_frames: 1235, pad_frames: 88 0025:warn:alsa:AudioClock_GetPosition snd_pcm_delay failed: -32 (Обрыв канала) 0025:trace:alsa:AudioClock_GetPosition written_frames: 0x1535d, held_frames: 0, delay_frames: 0, avail_frames: -32, pad_frames: 1355 0028:warn:alsa:alsa_write_data snd_pcm_delay failed: -32 (Обрыв канала) 0025:trace:alsa:AudioClock_GetPosition written_frames: 0x1535d, held_frames: 0, delay_frames: 0, avail_frames: 1323, pad_frames: 0 0025:trace:alsa:AudioClock_GetPosition written_frames: 0x1535d, held_frames: 0, delay_frames: 0, avail_frames: 1323, pad_frames: 0 ...
It's obvious that when we get avail_frames == -32 it doesn't mean that pad_frames = 1355. To fix it we should add check for XRUN state to the GetPosition right after the "avail_frames = snd_pcm_avail_update(This->pcm_handle);" line:
+ if(snd_pcm_state(This->pcm_handle) == SND_PCM_STATE_XRUN || + avail_frames > This->alsa_bufsize_frames || + avail_frames < 0){ + TRACE("XRun state detected\n"); + avail_frames = This->alsa_bufsize_frames; + }
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #61 from Jörg Höhle hoehle@users.sourceforge.net 2011-11-30 08:31:47 CST ---
Found another gotcha for GetPosition
No surprise, XRUN is not handled correctly yet.
If I stop pumping the data out and start polling the value returned by IAudioClock_GetPosition - I get devpos steadily approaching the samples_ written count and stay there constantly until I pump out some more data.
Thank you. Winealsa is a little bit different. As soon as the local ALSA buffer catches an underrun, ALSA stops and both avail_update and delay freeze. Typically avail_update frozen < buffer_size; frozen delay > 0 I.e. the reported values are the last ones before underrun in XRUN=>stop mode occurs. So if there were a 2s audio network latency, GetPosition would freeze before reaching samples_written. Winealsa must account for that and IMHO bump position as needed.
- What is "gcp" here? GetCurrentPadding?
Indeed the discrete value that it must return, jumping in increments of 10ms.
more traditional-math like
Well, assuming MulDiv is like round: buffer_size = MulDiv(duration, rate, 10000000); period_size = MulDiv(period, rate, 10000000); for Wine (which shouldn't care about HDA alignments and use 10ms).
we would kind-a be "protected" from "skips" in output stream
I recently came back from a meeting where a quadcore Linux was reportedly unable to steadily handle 50ms hard deadlines until after a central part of the app was moved to (soft) real time priorities. When you know that these run above normal root priorities, regularly feeding 10ms from a user app makes me worry. Occasionaly, there would be hick-ups, up to 150ms. Often enough, they would not happen in the first hour after booting...
In general, processing data in large chunks boosts speed. Same for audio, e.g. why feed BGM ("background!") in 10ms chunks?
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #62 from Alexey Loukianov mooroon2@mail.ru 2011-11-30 15:19:38 CST --- (In reply to comment #61)
As soon as the local ALSA buffer catches an underrun, ALSA stops and both avail_update and delay freeze. Typically avail_update frozen < buffer_size; frozen delay > 0
Test on my linux workstation shows that typically a call to snd_pcm_delay fails with EPIPE upon hitting xrun, and snd_pcm_available_update also returns -32 instead of meaningful value. This is observed with emu10k1 accessed directly through hw:0,0 ALSA pcm.
Things get broken in a way you describe only if I reconfigure ALSA to use rate plugin which seems to be bugged - it ignores snd_pcm_sw_params_set_start_threshold() value set to 1 and starts slave pcm only in case it had received at least one full period of sound data. I had been able to reliably reproduce bugged behavior with pretty simple modification to my mmdev-test testcase and can attach it along with reproduction instructions here on request.
What is happening is that if the period size is big enough and we were unfortunate to hit XRun when amount of frames left to be played is less than period size - winealsa.drv would recover xrun and pump all available frames to ALSA but the slave pcm and would not be started. Thus we would be left in a state when observed delay_frames and avail_frames appear "frozen" just like you had described. In fact observed values are really correct at that moment: "rate" plugin simply had buffered delay_frames in its internal buffer and waits for more frames to form full period before starting slave device. Thus the "device position" is really delay_frames behind written_frames, the only problem that it stays there while winealsa expects it to be moving after it had finished with recovering from XRun state.
Real remedy for this problem would be to fix "rate" plugin (and all other plugins that are bugged in a same way) to respect sw_params.start_threshold value, but I'm not sure ALSA folks would agree that it should be done. Most likely it would be needed to implement a workaround in Wine for this case, for example using following logic:
if(frames_held == 0) { if( (avail_frames_now == avail_frames_@prev_timer_cb_invocation) && (delay_frames_now == delay_frames_@prev_timer_cb_invocation) && ((alsa_buffer_frames - avail_frames_now) < alsa_period_frames) ) { // We're stuck with non-runing pcm which pretends to be running // There's no really correct way to handle this situation :-(. // Options are: // a) supply ALSA with // alsa_period_frames - (alsa_buffer_frames - avail_frames_now) // of silence so pcm would start and play the samples it holds. // b) reset pcm discarding all the samples it holds; } }
In general, processing data in large chunks boosts speed. Same for audio, e.g. why feed BGM ("background!") in 10ms chunks?
Sure it's better to feed in largest chunks possible as long as it wouldn't result in broken API emulation. That's why I find the approach you had proposed very promising.
----- cut ------ Andrew, while I've been experimenting with dmix period settings I've got a following warning in Wine logs:
0009:fixme:alsa:AudioClient_Initialize ALSA buffer time is smaller than our period time. Expect underruns. (512 < 441) 0009:trace:alsa:AudioClient_Initialize ALSA buffer size: 512 frames 0009:trace:alsa:AudioClient_Initialize ALSA period size: 256 frames 0009:trace:alsa:AudioClient_Initialize MMDevice period: 441 frames 0009:trace:alsa:AudioClient_Initialize MMDevice buffer size: 882 frames
Rationale behind this warning is obvious but the actual text is misleading (512 is by no means less than 441 :-) ). IMO it'd be more correct to change it into something like: "Buffer size provided by ALSA is too small, expect underruns".
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #63 from Jörg Höhle hoehle@users.sourceforge.net 2011-12-01 04:52:30 CST --- What you describe is very interesting. I've rarely used hw:0 myself as I typically use a converter plugin to allow 8000x8x1.
I've hit a new variant yesterday with dmix: Given a period of 170 frames, play Nx170 silence (N>=1), then 169 noise and wait. Underrun is entered but the noise is never heard. Dmix (or the rate converter you described) is again waiting for a full period.
This reminds me of test results on vista, where padding frames are left unconsumed after underrun. Not so with w7.
It means that the last milliseconds of any sound are not played in Wine using dmix unless the app adds silence -- which is what MSDN recommends for mmdevapi. http://msdn.microsoft.com/en-us/library/windows/desktop/dd316756%28v=VS.85%2...
I'm sorry GetPosition is not ready. Yesterday I had a case with PulseAudio where the first call to snd_pcm_delay after starting to play returned -5. GetPosition should return 0+old_samples in that case, which would need a new variable old_samples to cumulate samples played until the last underrun (may be valuable for MacOS too). Second, I've had cases where PulseAudio avail_update returned ~1.8s worth of frames, much more than the 50ms buffer_size I set. So computing alsa_padding as buffer_size-avail_update does not work. I'm not amused.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #64 from Alexey Loukianov mooroon2@mail.ru 2011-12-01 05:58:30 CST --- (In reply to comment #63)
What you describe is very interesting. I've rarely used hw:0 myself as I typically use a converter plugin to allow 8000x8x1.
emu10k1 allows to use "almost arbitrary" rate/bps/chan combos doing required conversions in hw (or in kernel space - hadn't had a chance to look at emu10k1 sources) so all the test's I've been ever doing had been done like hw vs. dmix plug only vs. rate plug only vs plug+dmix combo. You'd be surprised how many gotchas and black areas are there when some perfectly correct access schemes work as expected with direct hw access and fails silently when using some alsa-lib plugins.
I've hit a new variant yesterday with dmix: Given a period of 170 frames, play Nx170 silence (N>=1), then 169 noise and wait. Underrun is entered but the noise is never heard. Dmix (or the rate converter you described) is again waiting for a full period.
I had spent some time yesterday looking at alsa-lib sources. I think that it's not dmix who causes the behavior you describe, most likely it is rate plugin again. Take a look at it's sources (http://git.alsa-project.org/?p=alsa-lib.git;a=blob_plain;f=src/pcm/pcm_rate....), grep for "start_pending" and the root cause of the problem would become obvious for you.
This reminds me of test results on vista, where padding frames are left unconsumed after underrun. Not so with w7.
Oh, thanks for reminding about this. I would try to repeat the same tests I've done for alsa-lib rate/dmix on laptop with Vista installed - today it had finally arrived at my disposition and going stay here for a next few days.
It means that the last milliseconds of any sound are not played in Wine using dmix unless the app adds silence -- which is what MSDN recommends for mmdevapi.
Well, they don't actually recommend it - they just describe a way that the provided example code is expected to work. What they actually do is append about half a buffer of silence to the end of real audio samples just to make sure that the audio engine wouldn't use stale data residing in a ring buffer and thus it "prevents unwanted sounds before the call to the IAudioClient::Stop method stops the audio stream." Actual tests using your N*period silent frames + (period-1) of noise (sine wave in my case) produce a beep on win7. Would try to test with Vista today and report back.
I'm sorry GetPosition is not ready. Yesterday I had a case with PulseAudio where the first call to snd_pcm_delay after starting to play returned -5.
Smells like -EIO. Was the call successful or it was just the delay_frames that been set to a negative value? Is it a repeatable result or not?
Second, I've had cases where PulseAudio avail_update returned ~1.8s worth of frames, much more than the 50ms buffer_size I set. So computing alsa_padding as buffer_size-avail_update does not work. I'm not amused.
Certainly it is a bug in pulse alsa-plugin. Yeah, ALSA codebase is not the finest thing ever by itself, but bugs in PA-related stuff sometimes make me rage (you know, spending several days debugging just to find out that it's yet another PA-related bug I have to invent a workaround for).
Back on topic, in any case we need some data source from the ALSA part of a world to get an idea about current padding and/or delay. The fact that reported avail_update and delay values can't be always trusted gives us a misery but there's not too much at all we can do about it except for filing-out bug reports to alsa and/or PA bug trackers.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #65 from Alexey Loukianov mooroon2@mail.ru 2011-12-02 14:55:09 CST --- Here are results for tests Jörg had asked to conduct in a off-bugzilla mail message:
02.12.2011 14:29, Joerg-Cyril.Hoehle@<domain-stripped> wrote:
- Native needs no prefill. Writing one period worth of samples at every event
produces continuous sound.
From now on when I write "native" I really mean "Windows 7 Started edition
32bit SP1 running on Acer Aspire One 522 laptop (AMD C-50 CPU/2Gb RAM/Conexant HDA)". Hadn't done any tests under Vista yet as it turned out that the laptop I've been given to carry tests on had an OS corrupted by viruses so ATM I'm "having fun" reinstalling Vista on it :-).
Back to your question: I can confirm that feeding one period of data per event produces continuous sound on native as long as system hadn't been busy doing something else and missed the required timeframe because of that. For example, heavy screen updates may cause stutters, e.t.c.
- That GetBuffer/ReleaseBuffer may even happen
half a period after the event (even 8 from 10ms). (don't use Sleep(5000ms) for that, it's not precise. Prefer a busy loop with a high performance counter)
Here's what I use for doing less than 10ms "sleeps":
void lx2Sleep(UINT32 dwMsec) { LARGE_INTEGER qpcPosCur, qpcPosOld, qpcFreq; QueryPerformanceCounter(&qpcPosCur); QueryPerformanceFrequency(&qpcFreq); qpcPosOld = qpcPosCur; while(*(INT64*)&qpcPosCur < (*(INT64*)&qpcPosOld + *(INT64*)&qpcFreq * dwMsec / 1000)) QueryPerformanceCounter(&qpcPosCur); }
Am I right that by "may even happen half a period after the event" you mean the following sequence:
1. WaitForSingleObject(event, timeout); 2. "Sleep" for ~5ms. 3. GetBuf/Fill/ReleaseBuff.
To carry this test most likely it would be needed to rewrite my tone generator not to call sin() for each output sample. I think it would be sufficient to pre-calculate, say, 50ms of sine-wave and use it as a source data ring buffer so system won't spend too much time doing float math in realtime. Would write a separate message with test results as soon as I would done this.
I believe the mixer pre-exists, running regularly system wide. Therefore, an app that calls Start should see the first event at seemingly random delays from 0 to 10 ms. (However IIRC, the first event only occurs after something has been written initially).
Yes, my observations on native confirms it: first event happens at seemingly random interval ranging from 0 up to period ms time after the stream start. In some rare circumstances I even been getting about ~20ms before the first event - for example when the system been stressed by some other threads.
- GetPosition freezes after Stop, even though there's some data still flowing through the mixer.
- Does GetPosition bump to 'sum written - padding' after Stop?
Important note: On native IAudioClock_GetPosition() seems to return devpos in a "bytes played" units instead of "frames played". I.e. if I read devpos to be 0 and then send 441 frames of S16_PCM data - devpos would eventually end up equal to 441 * bytes_per_sample * channels_qty = 1764. What I do in mmdev-test is recalculating devpos to frames units using following math:
devpos_frames = 1.f * IAudioClock_GetPosition() / IAudioClock_GetFrequency() * fmt.nSamplesPerSec;
Answering your question, yes it is: ... T: 1471.3ms S: 66304f P: 448f DP: 64384 dDP:1920 T: 1481.4ms S: 66752f P: 448f DP: 64829 dDP:1923 T: 1491.6ms S: 67200f P: 448f DP: 65280 dDP:1920 Stopped stream T: 1501.7ms S: 67648f P: 448f DP: 67200 dDP: 448 T: 1509.0ms S: 67648f P: 448f DP: 67200 dDP: 448 T: 1513.9ms S: 67648f P: 448f DP: 67200 dDP: 448 ...
Legend: S - samples_stored; P - GetCurrentPadding; DP: GetPosition; dDP: (S - DP);
Stats at 1501.7ms are collected right after the call to IAC_Stop(). Notice immediate bump of dDP from 1920f to 448f lag. IOW at the moment Stop() is called DP is immediately set to be (S - P).
- What's GetPosition immediately after Re-Starting? Does it jump to
count all data previously written?
No, as it had "jumped to count it" at the moment Stop() had been called. Here's what's happening on native: ... T: 2493.9ms S: 67648f P: 448f DP: 67200 dDP: 448 T: 2499.0ms S: 67648f P: 448f DP: 67200 dDP: 448 Started stream back T: 2504.0ms S: 67648f P: 448f DP: 67200 dDP: 448 T: 2507.5ms S: 68096f P: 448f DP: 67200 dDP: 896 T: 2517.6ms S: 68544f P: 448f DP: 67219 dDP:1325 ...
Stats are collected and displayed before data pump-out. At 2504.0ms a call to IAC_Start() had been made, then stats had been colected and printed. As padding allowed for another 448f - a chunk of audio data had been pumped out and execution proceeded to the next loop iteration to eventually hit WaitForSingleEvent(). At 2507.5ms an event fired (first one after stream re-start). Devpos remained the same is was ~3.5ms ago, but the audio engine had already consumed another 448f of samples from the output buffer. Next event arrived ~10ms later, at 2517.6ms. At that moment audio engine consumed another 448 frames and DP just had started to move forward. Results are repeatable.
... I believe that Start after Stop can only resume from the currently available padding.
Testing results prove that you're right.
There's another interesting case to look at: suppose you had stopped pumping the data out and monitor what happens: ... Event loop finished, waiting at least 20.3ms for playback to finish... T: 2011.6ms S: 88704f P: 448f DP: 87452 dDP:1252 T: 2012.9ms S: 88704f P: 448f DP: 87509 dDP:1195 T: 2014.1ms S: 88704f P: 448f DP: 87565 dDP:1139 T: 2015.5ms S: 88704f P: 448f DP: 87624 dDP:1080 T: 2016.9ms S: 88704f P: 448f DP: 87687 dDP:1017 T: 2018.9ms S: 88704f P: 448f DP: 87774 dDP: 930 T: 2020.6ms S: 88704f P: 0f DP: 87851 dDP: 853 T: 2021.6ms S: 88704f P: 0f DP: 87895 dDP: 809 T: 2022.6ms S: 88704f P: 0f DP: 87939 dDP: 765 T: 2023.6ms S: 88704f P: 0f DP: 87983 dDP: 721 T: 2024.6ms S: 88704f P: 0f DP: 88028 dDP: 676 T: 2025.8ms S: 88704f P: 0f DP: 88080 dDP: 624 T: 2027.1ms S: 88704f P: 0f DP: 88138 dDP: 566 T: 2028.7ms S: 88704f P: 0f DP: 88208 dDP: 496 T: 2030.5ms S: 88704f P: 0f DP: 88704 dDP: 0 T: 2031.5ms S: 88704f P: 0f DP: 88704 dDP: 0 T: 2032.5ms S: 88704f P: 0f DP: 88704 dDP: 0 T: 2033.5ms S: 88704f P: 0f DP: 88704 dDP: 0 T: 2035.4ms S: 88704f P: 0f DP: 88704 dDP: 0
Final stats: T: 2035.4ms S: 88704f P: 0f DP: 88704 dDP: 0 dAvgDP:1342.21 Average time between events: 10.20ms Total events count: 197 (197 with non-zero DP delta)
What it interesting to note here is a final "jump" for dDP from 496 down to 0 in ~1.8ms. Looks like as soon as mixer pumps out last period into hw it immediately updates the DP to be equal to samples_stored thus not reflecting the fact that the the last period is still being played by HW.
What I plan for Wine is to let GetPosition continue to grow after Stop, because the "sample that is currently playing through the speakers" will advance a little after stopping, due to HW or network latency.
That's incorrect - it contradicts both with MSDN docs and with observed behavior on native. Your planned behavior is more like what's observed for IAudioClock2_GetDevicePosition() - and that's expected behavior for _real_low-level_device_position_ as actual HW continues to iterate through its ring buffer. That is an important difference which is clearly documented in MSDN: IAudioClock_GetPosition() is supposed to represent client-side of view of "stream device", and a position returned by it is documented to freeze at Stop() and continue to move forward after next call to Start(). Also it should be possible to reset the position back to 0 by calling IAudioClient_Reset() while stream is as "stopped" state.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #66 from Jörg Höhle hoehle@users.sourceforge.net 2011-12-02 16:20:55 CST --- Alexey, thank you very much for this analysis. Now that we know almost all and everything about mmdevapi, let's write a driver that works with hw:0, plugw:0, dmix, plug:dmix and pulse, with ALSA from 2004 to present. Hopefully X-Mas vacation will be long enough :-)
Looks like as soon as mixer pumps out last period into hw it immediately updates the DP
My interpretation is different. The actual underrun condition is not when GetCurrentPadding (P) is 0 at event time. It's when it's been 0 for a whole period and the app supplied no frames during that period. T: 2028.7ms S: 88704f P: 0f DP: 88208 dDP: 496 T: 2030.5ms S: 88704f P: 0f DP: 88704 dDP: 0 The last period was pumped out at 2020.6ms, but it's at 2030.5 that mmdevapi appears to put the stream into XRUN mode and bump GetPosition to S - P, as if Stop had been called. OTOH, 496 is to close to 448 to be sure mmdevapi isn't waiting for the last frame.
Meanwhile, I had an idea how Wine could implement GetPosition despite adding silence: it can keep an array of ~32 entries about how much silence frames where added during the last 32 periods, assuming GetPosition lags by no more than 32x10=320ms. Perhaps it's even easier: every time it adds silence, it'll bump GetPosition at the next event (like in this XRUN case) and make sure to never report a position smaller than before. Analysing GetPosition logs in the "write 5ms every 10ms" test case would reveal what native does.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #67 from Alexey Loukianov mooroon2@mail.ru 2011-12-02 18:03:54 CST --- You're welcome. Meanwhile I've been just reported about a real-world bug caused by "plug" alsa-lib plugin which affects - surprise! - RAGE under Wine with prefix set to Win7.
Here is my comment in AppDB about it: http://appdb.winehq.org/objectManager.php?sClass=version&iId=24512#Comme...
What happens is rate plugin not draining samples from the ALSA buffer waiting for entire period to be pumped into it so the GetCurrentPadding() reported by winealsa to XA2 remains equal to 896 (i.e. entire "mmdev duration buffer") and thus XA2 is unable to pump-out more data (buffer is constantly full from it's point of view) and so it results in no sound produced at all :-). Pretty funny case IMO.
Concerning your interpretation - hmm, it might be as valid as mine is. How the things actually are can only be determined by reverse-engineering MS code, but it is in any case irrelevant for us - we just have to mimic the observed devpos change behavior in winealsa and nothing more :-).
As for remark about GP vs. adding silence - correct me if I'm wrong but isn't the only reason to add the silence is to workaround the "pcm is not starting unless at least one period of data had been pumped out"? I mean, as a part of XRun recovery process we add up some silence if required to be sure pcm really starts, but we are not obliged to reflect these artificially-inserted frames in the devpos reported to an app. As we had XRun - the stutter had already happed so adding some silence wouldn't be fatal corruption to and already corrupted sound output. Taking into account silent frames it the devpos reported to an app is pretty simple.
At stream Init(): This->inserted_silent_frames = 0; This->devpos = 0;
At stream Start() or on Xrun recovery, after inserting N silent frames in order to force pcm start: This->devpos = This->written_frames - This->held_frames; This->inserted_silent_frames = N;
On Stop(): This->devpos = This->written_frames - This->held_frames;
At GetPosition():
if(This->started) { if(delay_frames > 0) This->devpos = max(This->written_frames - This->held_frames - delay_frames - This->inserted_silent_frames, This->devpos); else if(pad_frames > 0) This->devpos = max(This->written_frames - This->held_frames - pad_frames - This->inserted_silent_frames, This->devpos); else This->devpos = max(This->written_frames - This->held_frames - This->inserted_silent_frames, This->devpos); } *pos = This->devpos;
IOW silent frames which we use to force pcm start - they are pretty equivalent to the artificially added latency and thus should contribute to it. The logic above it something I had just now quickly scratched and taking into account that I'm pretty sleepy ATM chances are it's partly or completely wrong. It's a deep night here so I'm going to take a rest for now and would try to come back and think more about it tomorrow.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #68 from Alexey Loukianov mooroon2@mail.ru 2011-12-02 18:09:27 CST --- Spelling corrections: "... to and already corrupted ..." should be "...to an already corrupted...", "... account silent frames it the devpos reported ..." should be "... account silent frames in the devpos value reported ...". I really shouldn't write comments being sleepy.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #69 from Jörg Höhle hoehle@users.sourceforge.net 2011-12-03 14:28:02 CST ---
isn't the only reason to add the silence is to workaround the "pcm is not starting unless at least one [ALSA] period of data had been pumped out"?
I now see three. 1 help start up; 2. trailing samples not heard, comment #63; 3. play 100Hz beeps when app fills 5ms at every 10ms event, comment #55.
Taking into account silent frames [...] is pretty simple.
With either lead-in or trailing only silence, not in the 100Hz scenario.
How the things actually are can only be determined by reverse-engineering
How can you conclude this? Look at how much we've found out simply by printing GetPosition & GetCurrentPadding and writing tests. Writing tests teaches a lot. Have a look at the blackbox game inside Emacs. Type Esc-x blackbox Return and ^H f blackbox Return. Or google the file "blackbox.el" and read the documentation string inside the Lisp stuff :-)
Back to bugs, one can derive the requirement set_hw_params_period_max 20ms based your appDB comment when one uses my pseudo code from comment #58 given ALSA's insistence on being fed a full period before doing anything. 10ms lead-in and 10ms prefill will then ensure ALSA starts. However, 20ms is too large for sustained sound as it means that ALSA is fed a full 20ms period only around the time of ALSA underrun. Rage's scenario really needs set_hw_params_period_time_max 10ms to work reliably -- unless latency is artificially increased (by maintaining GetCurrentPadding e.g. 50ms lower than required using a silence lead of same size). The device could even adapt dynamically and reduce the period from e.g. 100ms downto 10ms should an underrun occur. Daydreaming?
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #70 from Alexey Loukianov mooroon2@mail.ru 2011-12-04 06:01:40 CST --- (In reply to comment #69)
- trailing samples not heard, comment #63;
Ah, thx for reminding about it - I really should do this test on native today. It's a pity I forgot to do it while been spending time looking at Vista installer prompts.
Taking into account silent frames [...] is pretty simple.
With either lead-in or trailing only silence, not in the 100Hz scenario.
Correct me if I'm wrong but this 100Hz scenario is: a) An extreme usage variant having no real-world use. I can't imagine anyone would use it unless conducting tests like we do. b) It is essentially a constant xrun case, which may be perfectly handled by add silence at start case in case alsa period is less or equal to mmdev period. I.e. at the moment of timer callback we already have an underrun (or would have it in less than 1ms - due to at the previous timer callback we only pumped out one mmdev period of data - 5ms of silence + 5ms of real data - which is equal in duration to the timer callback period). Since last event we have 5ms of data held in mmdev buffer (event is fired after feeding alsa and handling underruns). Thus proposed xrun handling logic prepends - in this case - 5ms of silence (note - it should use max[alsa_period, mmdev_peiod] as "period length" when calculating the amount of silent frames it should add) to the 5ms available data and pumps out this to ALSA. Then we signal the event and at the time next callback will be invoked we would end up at just the same situation xrun + 5ms of data held in mmdevdrv buffer. Devpos readings would also look like they are on native for this case.
How the things actually are can only be determined by reverse-engineering
How can you conclude this? Look at how much we've found out simply by printing GetPosition & GetCurrentPadding and writing tests.
I'm also a big fan on "black box games", just don't like emacs (due to lisp) and prefer to toy with other puzzles like: "here is a binary which takes something at input and produces something as output; investigate it any way you want except for disassembling and try to produce the code that does the same thing; the less time it takes to accomplish the task and the less complicated a resulting code is - the better". My experience with solving such puzzles taught me that sometimes there are several different ways to achieve the same observed behavior - and what's pretty funny, in such cases the original algo used in the "black box" doe the thing in yet another third way most of the times :-). It seems for the that the "devpos jump at underrun" is just like one of those cases. We see it, we know it's there - but we don't know what is the underlying logic behind it and may only make guesses. I can't devise a test that would allow us to distinguish the "bump at software underrun case" which is your interpretation from the "bump after the last period had been written to hw". It doesn't mean that there's no such test, it just mean that I'm too dumb to invent it :-). Em, and a bit lazy too to head on and attach my laptop to the oscilloscope to check what is actually emitted from hardware at the end of stream playback :-).
Back to bugs, one can derive the requirement ... ... The device could even adapt dynamically and reduce the period from e.g. 100ms downto 10ms should an underrun occur. Daydreaming?
Yes as yes for maximum alsa period requirements vs. rage vs. artificially increased latency. Actually I had come up with just the same conclusions yesterday after spending some time thinking about this problem.
As for dynamically adapting period size - don't think it's a daydreaming as I've had just the same idea and was going to propose it in "implementation design details" text I've been going to compose at the beginning of the next week. The only difference is that I've been thinking about fixing ALSA period at some sane small value (like 10ms), requesting pretty big ALSA buffer (say, ask for 1s and use as much as we would be presented with) and dynamically changing duration between sequential timer callback basing on the padding value an app seems to maintain. I.e. if an app seems to keep padding as low as 10-20ms - we have to run timer callbacks at "full speed" (i.e. every 10ms). OTOH, if an app tends to store 300ms of audiodata in buffer - it most likely would be OK to have mmdevperiod be something like 150ms. Algo may be something like: evaluate the duration we have until xrun basing on the amount data we have buffered currently, schedule next callback to come in a time equal to the half of this duration, but non earlier than 10ms from now. That's a preliminary thought actually as I hadn't done any test yet that would prove if this approach is viable or not.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #71 from Alexey Loukianov mooroon2@mail.ru 2011-12-06 20:06:55 CST --- Some news from Vista side:
1. Same hardware and same driver (i.e. - notebook with Vista I have here ATM is exactly the same model I had been running my tests on Win7 Starter, namely Acer Aspire One 522 equipped with Conexant HDA codec) results in different default period size (441f on Vista vs. 448f on Win7) and different "20ms buffer size" (882f on Vista vs. 896f on Win7).
2. Test case "feed exactly 441f once per event": produces constant xrun-free sound if the system isn't under heavy load.
3. Test case "feed 5ms of data every event": produces buzz which - accordingly to the readings my oscilloscope shows - had somewhat different waveform compared to produced by Win7, but the period seems to be 100Hz and not 50Hz Jörg had been predicting.
Still have more tests to run/analyze but, as always, middle of the week take its toll of spending time on work projects rather than on "areas of personal interests" :-(. Would post some more as soon as would have results ready.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #72 from Jörg Höhle hoehle@users.sourceforge.net 2011-12-07 04:19:28 CST --- Created attachment 37844 --> http://bugs.winehq.org/attachment.cgi?id=37844 GetPosition with snd_pcm_delay
I spent days fighting an old PulseAudio trying to figure a repeatable way to have avail_update or snd_pcm_delay produce reasonable values after underrun recovery. No deal.
Here's a version of GetPosition that exhibits the following properties: - monotonically increasing (not documented, but I read in dlls/ Quartz' GetPosition has such a requirement); - protection against garbage delay data; - never using ALSA padding, because that can't be relied upon in PulseAudio after an underrun; - works perfectly with dmix, reasonably well with old Pulseaudio -- don't dump old Linux distributions! - after IAC_Stop, freeze to sum_written - mmdevapi_padding, comment #65.
The latter causes 2 of my tests to fail ("position X too far after only 100ms") for as long as Wine forwards a lot of audio down to ALSA, whereas native feeds 10ms at a time.
Before submitting, all I plan to change is to move TRACE after LeaveCS. I've not yet tested whether it passes tests on top of current git. So far, I used it with patches from comment #54. It would be nice if it could be independent, though ("with GP fixed, let's tackle GCP").
Regarding set_hw_params_period_*, I'm leaning towards this patch: if(duration <= 80ms) /* app shows it wants tight timing */ set_hw_params_period_time max(20ms); /* abort on error not shown */ set_hw_params_period_time_near(10ms); 2 calls because _near won't prevent using 100ms if the device doesn't support smaller periods. _max will ensure it fails.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #73 from Alexey Loukianov mooroon2@mail.ru 2011-12-07 05:10:37 CST --- Thanks for submitting new patch, would take a look on it this evening.
As for making short periods a hard requirement - wouldn't it be nicer to add-up latency and emulate required period timing in software (actually it is already as timer callbacks are independent from ALSA periods) rather than failing completely? I.e. if ALSA reports that it supports requested small period - excellent, use it. If not - pump-out 2*alsa_period audio data in advance to ALSA and then continue feeding it with alsa_period data every alsa_period elapses at the same time keeping firing up events every emulated mmdev_period.
P.S. Concerning monotonically increasing GetPosition(): by "monotonically" here you mean standard math definition for term "monotonically" (i.e. derivative for func is always greater or equal to zero)? To be more precise, it'd be more correctly define GP as monotonically non-strictly increasing sequence - we're in digital world here so there's no such thing as a real continuity here. If you meant that - it's sort of documented in MSDN (quote from GetPosition() docs, see remarks section): "... The device position is a stream-relative offset. ... Immediately following creation of a new stream, the device position is 0. Following a call to the IAudioClient::Start method, the device position increments at a uniform rate. ... Successive device readings are monotonically increasing. Although the device position might not change between two successive readings, the device position never decreases from one reading to the next. ...".
http://bugs.winehq.org/show_bug.cgi?id=28723
DL dredgingthelake@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |dredgingthelake@gmail.com
http://bugs.winehq.org/show_bug.cgi?id=28723
Jörg Höhle hoehle@users.sourceforge.net changed:
What |Removed |Added ---------------------------------------------------------------------------- Target Milestone|1.4.0 |---
--- Comment #74 from Jörg Höhle hoehle@users.sourceforge.net 2011-12-08 05:05:51 CST ---
"the device position never decreases from one reading to the next."
Exactly. When I was at school, mononotic by default implied not strictly.
After more testing with the rate plugins and failing underrun detection, I think I'll add: if(0 < delay_frames < alsa_period) This->last_pos = written-mmdev_padding; to accept the current value once but make sure GP bumps to maximum the next time it's called -- instead of immediately bumping to max. The effect is close to your observation about native in comment #65.
pump-out 2*alsa_period audio data in advance
How? Rage only gives you 10ms and you cannot predict the future. That's what lead me to the idea of extending time in the other direction and introduce the 10ms lead-in, comment #58. If ALSA period is say 50ms, then the lead-in must be of similar size. Alas, lead-ins increase underrun recovery time. I've read that PulseAudio uses snd_pcm_rewind to overwrite (when mixing) already written data with new samples. You could try out that route however my experience with seeking with snd_pcm_forward is bad, see bug #28517.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #75 from Alexey Loukianov mooroon2@mail.ru 2011-12-08 11:16:14 CST --- (In reply to comment #74)
pump-out 2*alsa_period audio data in advance
How? Rage only gives you 10ms and you cannot predict the future.
Well, we can "predict" with a pretty high probability that if an app acquire some pointers to WASAPI COM interfaces, calls IAC_Initialize() followed by IAC_Start() - it's going to playback some sound :-). Making small alsa_period a hard requirement would left behind an unfortunate user whose hardware is not capable of using small periods. IMO it'd be better to still have sound produced but lagging way behind (due to 2*alsa_period lead-in silence chunk) than not to have sound at all. Yes, we would need to pump-out 2*alsa_period lead-in silence again in case of xrun but as we're talking about case when ALSA isn't capable of using small periods - chances to hit xrun are pretty slim. And yes, if the minimal period that device supports is - say - 1s, then it would end up in sound lagging 2s at the worst case (and slightly more than 1s at best). It's bad, but it's still better than having no sound at all. Of course we should warn user loudly in such case.
I've read that PulseAudio uses snd_pcm_rewind to overwrite (when mixing) already written data with new samples. You could try out that route however my experience with seeking with snd_pcm_forward is bad, see bug #28517.
No-no-no-no-no, it is "not a way to go". I've seen vast amount of different failures caused by bugs in various parts of ALSA (kernel drivers, alsalib, plugins and so on) which arise when you try to use rewind/forward. Using these would introduce more problems (esp. on older systems like CentOS 4 or 5) than it's expected to solve.
PS. BTW, in bug #28517 you had mentioned the following test case: "...I've been thinking about an audible test with two streams with 180° phase delta not written at the same time, yet they should all enter the same 10ms period and might annihilate themselves...". I'm thinking about trying to implement it this weekend. It'd be pretty curious to take a look at the results :-).
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #76 from Alexey Loukianov mooroon2@mail.ru 2011-12-08 11:18:18 CST --- PPS. Forgot to wrote that I had took a look into your latest patch yesterday evening and it looks like a pretty correct one for my understanding.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #77 from Andrew Eikum aeikum@codeweavers.com 2011-12-08 15:03:49 CST ---
Jörg Höhle changed: What |Removed |Added
Target Milestone|1.4.0 |---
Did you do this intentionally, Jörg? I think it's critical that we get this fixed before the code freeze for 1.4.
I looked at your GetPosition patch. It makes sense to me, and passes all tests. I ran a few different applications against it without issue as well. A couple of tiny comments:
This seems strange to me: + if(!This->started || alsa_state > SND_PCM_STATE_RUNNING) I think it can be made more explicit by spelling out which states are wanted here.
Perhaps s/5/EIO/ here: + /* Pulse bug: err -5 shortly after starting: nothing played */
As you say, moving the final TRACE out of the critical section is a good idea.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #78 from Alexey Loukianov mooroon2@mail.ru 2011-12-09 02:02:49 CST --- Took another - now more thorough - look at latest path. Here are two minor remarks about this section of the code:
+ else if(err<0 || delay_frames > position - This->last_pos_frames) + /* Pulse bug: after an underrun, despite recovery, avail_frames + * may be larger than alsa_bufsize_frames, as if cumulating frames. */
a) Signess issue. position and last_pos_frames are both unsigned, wouldn't we hit an "undeflow" with flipping to a huge numbers (due to sign bit treated as a part of unsigned value) in case position is less than last_pos_frames?
b) IMO aside from possible signess issues it'd be better to rewrite comparison to this:
+ else if(err<0 || This->last_pos_frames > position - delay_frames)
In this form it'd be obvious from a first glance that we're clamping position here to be at least last_pos_frames.
c) In comment you write about avail_frames which is not used for calculations in surrounding code. Does the same bug affect delay_frames as well?
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #79 from Alexey Loukianov mooroon2@mail.ru 2011-12-09 03:05:31 CST --- Thought a bit more about (a) and (b) remarks I had posted and it looks like (a) is invalid (there's no chance for position to be less than last_pos_frames at the moment of comparison), and rewriting it as (b) would possibly cause signess issues I've been afraid of in (a) :-).
http://bugs.winehq.org/show_bug.cgi?id=28723
Jörg Höhle hoehle@users.sourceforge.net changed:
What |Removed |Added ---------------------------------------------------------------------------- Target Milestone|--- |1.4.0
--- Comment #80 from Jörg Höhle hoehle@users.sourceforge.net 2011-12-09 10:54:53 CST --- GetPosition patch submitted http://www.winehq.org/pipermail/wine-patches/2011-December/109646.html
Perhaps s/5/EIO/ here:
/* Pulse bug: err -5 shortly after starting: nothing played */
Done. -5 is what people see in the logs and I'm fed up with googling the possible ERR* name when reading Spanish, Russian, Polish or Ukrainian snd_strerr() output in logs submitted to the bugtracker. IMHO it should print the symbolic name too.
Removed Target Milestone
!?! No idea. Restored.
Does the same bug affect delay_frames as well?
Yes, I updated the comment. I really observed cases where avail would resume from former values despite intermittent snd_pcm_reset, and snd_pcm_delay would reflect that too. It looks like PA doesn't master its backend. I fear the comments are not enough to understand the PulseAudio and rate plugin mess when you'll read the code in 2 years or later.
Note that PulseAudio in the old Ubuntu Intrepid produces reasonable values with Andrew's "draining" patch from comment #54, whereas the more recent PulseAudio in Ubuntu Lucid (LTS) is broken beyond repair, so it seems. I wonder what kind of testsuite that project is using.
if(duration <= 80ms) /* app shows it wants tight timing */ set_hw_params_period_time max(20ms); /* abort on error not shown */
Well, plug:dmix' minimum appears to be 21.333ms in Ubuntu (@8000x1x8)... That code would fail to initialize! Something else is needed.
1s, then it would end up in sound lagging 2s
Unless we play native, implement a mixer running constantly. BTW, regarding comment #51, I've now received comments about Linux CFS causing not only audio delays but also video drop outs. If you want to play games, you better use a kernel < 2.6.23 and no backports or you manage to compile Linux without the Completely Fair Scheduler.
Now that we've seen (from a few tests) that the GetPosition patch stands on its own, what about the "change drain behaviour" patch and the 2 subsequent ones? Are they ready to enter Wine and be later replaced by the lead-in concept (once written and proven) and a "GetCurrentPadding modulo mmdev period size" patch? Or should we dump them and wait for the lead-in?
I propose: 1. "change drain behaviour" (which I need to review closely); 2. Modify "set ALSA period size" to set_period_near 10ms; Maybe that's enough for Rage in Alexey's case? 3. Modify "request big enough buffer" to match native's clamping of values and rounding (using MulDiv).
Later: stable 10ms Rage sound (if Linux can take it), 10ms padding increments, better underrun handling, ...
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #81 from Alexey Loukianov mooroon2@mail.ru 2011-12-09 12:08:23 CST --- (In reply to comment #80)
I wonder what kind of testsuite that project is using.
We all know that PA is a real pain in the ass :-). Actually the first advice I give at work when a customer calls and complains about misbehaving the sound-related software produced by the company I work is to get rid of the PA and never install it back. Helps to fix about 90% of failures.
Well, plug:dmix' minimum appears to be 21.333ms in Ubuntu (@8000x1x8)... That code would fail to initialize! Something else is needed.
I see at least two possible approaches: a) Run a loop starting from period size 10ms trying to set time_max and doubling it in case try fails. b) Use time_near with desired period time and hope that ALSA would be smart enough and provide us closest match that is supported.
1s, then it would end up in sound lagging 2s
Unless we play native, implement a mixer running constantly.
I'm not a fun of implementing yet another sound mixer in Wine. We already got one in DS and most users have yet another one already running on their systems (dmix or PA most of the times). And playing native won't safe us from lag that is equal to at least one alsa_period, as to play safe we have to upload at least this amount of data to ALSA at the stream start and on underrun recovery. IMO playing native won't be of a much benefit for this specific case.
I propose:
- "change drain behaviour" (which I need to review closely);
- Modify "set ALSA period size" to set_period_near 10ms;
Maybe that's enough for Rage in Alexey's case?
I bet that it most probably would be enough to fix this Rage-specific case.
http://bugs.winehq.org/show_bug.cgi?id=28723
Otto Rey otto_rey@yahoo.com.ar changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |otto_rey@yahoo.com.ar
http://bugs.winehq.org/show_bug.cgi?id=28723
Andrew Eikum aeikum@codeweavers.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Attachment #37683|0 |1 is obsolete| |
--- Comment #82 from Andrew Eikum aeikum@codeweavers.com 2011-12-12 11:19:04 CST --- Created attachment 37943 --> http://bugs.winehq.org/attachment.cgi?id=37943 ALSA fixes patchset #2
(In reply to comment #80)
I propose:
- "change drain behaviour" (which I need to review closely);
- Modify "set ALSA period size" to set_period_near 10ms;
Maybe that's enough for Rage in Alexey's case? 3. Modify "request big enough buffer" to match native's clamping of values and rounding (using MulDiv).
Okay, here's another patchset implementing this, rebased on top of your GetPosition() patch. It works fine here both with and without PulseAudio. The "change drain behavior" patch is unchanged from the last version, and the later patches have been updated a little bit based on your suggestions.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #83 from Jörg Höhle hoehle@users.sourceforge.net 2011-12-13 09:56:52 CST --- Here's my mini-review. 0003 alsa_period_us = 10000; If a constant, use DefaultPeriod. But that's wrong. It should be mmdev_period after due clamping of values. In exclusive mode, the app could ask for a period not 10ms.
0002/0004 -/+ TRACE("alsa_period_frames... The first patch should move it once and for all.
0004 Use MulDiv for both duration and period + This->mmdev_period_frames = (fmt->nSamplesPerSec * This->mmdev_period_rt) / 10000000.; + This->bufsize_frames = MulDiv(duration, rate, 10000000); Why rate here vs nSamplesPerPsec there? bufsize_frames=GetBufferSize should be computed with app-visible units, i.e. nSamplesPerSec, not ALSA's actual internal rate.
0002 In alsa_write_data I found the logic not clear enough. I suggest /* try to keep 3 ... */ max_period = 3 * max(); if(in_alsa >= max_period) return; and would get rid of the "while () write_limit+=;" loop which is bad with alsa_period=8.
0002 Continue using the same variable (who remembers to_write == written?): + if(This->held_frames && (to_write < write_limit)){ + min(This->held_frames, write_limit - written), This);
Overall, I don't like the complexity that 0002 adds to alsa_write_data. It would be preferable to use set_hw_params_buffer_size_max(3xboth_periods) because that's what the code in effect realizes! Often enough, set_period_near returns the period that ALSA chose, even before set_buffer_time|size or the final set_hw_params. (Another idea would be to iterate twice through set_hw_params. Such initialisation overhead is still preferable to overhead in the audio loop).
http://bugs.winehq.org/show_bug.cgi?id=28723
Andrew Eikum aeikum@codeweavers.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Attachment #37943|0 |1 is obsolete| |
--- Comment #84 from Andrew Eikum aeikum@codeweavers.com 2011-12-13 13:56:49 CST --- Created attachment 37960 --> http://bugs.winehq.org/attachment.cgi?id=37960 ALSA fixes patchset #3
(In reply to comment #83)
Here's my mini-review. 0003 alsa_period_us = 10000; If a constant, use DefaultPeriod. But that's wrong. It should be mmdev_period after due clamping of values. In exclusive mode, the app could ask for a period not 10ms.
0002/0004 -/+ TRACE("alsa_period_frames... The first patch should move it once and for all.
0004 Use MulDiv for both duration and period
- This->mmdev_period_frames = (fmt->nSamplesPerSec * This->mmdev_period_rt)
/ 10000000.;
- This->bufsize_frames = MulDiv(duration, rate, 10000000);
Why rate here vs nSamplesPerPsec there? bufsize_frames=GetBufferSize should be computed with app-visible units, i.e. nSamplesPerSec, not ALSA's actual internal rate.
0002 Continue using the same variable (who remembers to_write == written?):
- if(This->held_frames && (to_write < write_limit)){
min(This->held_frames, write_limit - written), This);
All of these are done in the attached patchset. As usual, it passes all tests with and without PA.
0002 In alsa_write_data I found the logic not clear enough. I suggest /* try to keep 3 ... */ max_period = 3 * max(); if(in_alsa >= max_period) return; and would get rid of the "while () write_limit+=;" loop which is bad with alsa_period=8.
I added an "if(write_limit == 0)" early return. I don't understand your other objection.
Overall, I don't like the complexity that 0002 adds to alsa_write_data. It would be preferable to use set_hw_params_buffer_size_max(3xboth_periods) because that's what the code in effect realizes! Often enough, set_period_near returns the period that ALSA chose, even before set_buffer_time|size or the final set_hw_params. (Another idea would be to iterate twice through set_hw_params. Such initialisation overhead is still preferable to overhead in the audio loop).
We can't set max(3xperiod) because then we will get too small a buffer on someone's system. We can't set exact 3xperiod because it will fail on someone's system. If we set min(3xperiod) we must also have the 3-period limiter in alsa_write_data because it will give us too large a buffer on someone's system. So that's what we have to do. (Aside: of what use is an API that makes no guarantees?)
Note that I tweaked the buffer size call slightly in this version. We now use set_buffer_size_min() instead of _near(). We also only request a minimum of 3 MMDevAPI periods, instead of max() between that & duration. It would be 3 of either period size, but as you pointed out, we would have to do two passes to get that information. While ALSA often surprises me, I'm sure they already have at least 3 ALSA periods in their buffer by default...... right?
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #85 from Jörg Höhle hoehle@users.sourceforge.net 2011-12-14 09:22:28 CST ---
We can't set max(3xperiod) because then [...] on someone's system.
I see. Too bad.
they already have at least 3 ALSA periods in their buffer by default
No idea. I've typically observed 4-6, but I feel ALSA ought to work with 2.
We now use set_buffer_size_min() instead of _near().
That's a bad move. near is a clear indicator of what we want. min or max are completely ignored unless the limit is hit during constraint resolution. As a result, with min only, PA will use its typical 2s buffer.
I've not yet thought about the smallest buffer size that winealsa needs to work.
+ if(period) + This->mmdev_period_rt = period; I'd like to eventually see if(!period || mode==SHARED) = DefaultPeriod I'm currently using testbot to obtain the min & max values.
"while () write_limit+=;"
I've had cases with alsa_period=8 frames vs 448 mmdevapi period. That needs quite a few iterations through the loop and should be replaced by a straight formula to directly compute the limit instead.
I've made some progress with PA. The sequence prepare + drop + prepare(!) manages to prevent the too large avail & delay values reported in comment #63 past an underrun, at least with the old Ubuntu Intrepid PA. There's still an issue that it may spit out a sequence of "handling underrun" for over 100ms until it gets back to normal behaviour.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #86 from Andrew Eikum aeikum@codeweavers.com 2011-12-14 09:39:23 CST --- (In reply to comment #85)
We now use set_buffer_size_min() instead of _near().
That's a bad move. near is a clear indicator of what we want. min or max are completely ignored unless the limit is hit during constraint resolution. As a result, with min only, PA will use its typical 2s buffer.
But _near() runs the same risk as _max(), that is, it allows ALSA to give us too small a buffer (Bug 28282). Other than wasting memory, what's the harm of a 2s buffer since we limit our writes in alsa_write_data() anyway?
- if(period)
This->mmdev_period_rt = period;
I'd like to eventually see if(!period || mode==SHARED) = DefaultPeriod I'm currently using testbot to obtain the min & max values.
This is patch 4 in the patchset.
"while () write_limit+=;"
I've had cases with alsa_period=8 frames vs 448 mmdevapi period. That needs quite a few iterations through the loop and should be replaced by a straight formula to directly compute the limit instead.
But we add max_period to write_limit in the loop body, so I'm pretty sure the loop can execute at most three times.
I've made some progress with PA. The sequence prepare + drop + prepare(!) manages to prevent the too large avail & delay values reported in comment #63 past an underrun, at least with the old Ubuntu Intrepid PA. There's still an issue that it may spit out a sequence of "handling underrun" for over 100ms until it gets back to normal behaviour.
Are these future concerns, or do you think they should block this patchset? I would really like to get it in ASAP.
We should also talk about attribution. I am fine with the following: 1/4 goes to Alexey 2/4 goes to Jörg 3/4 and 4/4 go to me (unless you want 4/4, Jörg, but it's pretty trivial) AJ tends to prefer patch that authors submit their own patches, so we'll have to either hope for an exception or submit them one-per-day or with coordinated patch counts in the subject. What a mess :)
http://bugs.winehq.org/show_bug.cgi?id=28723
Michael Cronenworth mike@cchtml.com changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |mike@cchtml.com
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #87 from Jörg Höhle hoehle@users.sourceforge.net 2011-12-15 03:48:17 CST ---
But _near() runs the same risk as _max(), that is, it allows ALSA to give us too small a buffer (Bug 28282).
Do you mean? fixme:alsa:AudioClient_Initialize ALSA buffer time is smaller than our period time. Expect underruns. (352 < 441) I thought this is caused by the bogus period=duration/10. With duration/10 gone, I believe ALSA will heed our advice.
Perhaps we should enforce set_buffer_size_min = 2 * mmdevapi_period in addition to set_buffer_near? But I've not yet thought about the theoretical lower limits that ensure a continuous stream.
Other than wasting memory, what's the harm of a 2s buffer since we limit our writes in alsa_write_data() anyway?
SW may select varying behaviour based on the hints it gets. IIRC MSDN says that duration is a hint that allows to select between fast & "don't care" behaviour; XAudio2 changes behaviour when period > 10.5ms. Perhaps PA too? That's why I consider it important to give adequate hints. In comment #72, I've myself proposed changing behaviour when duration<=80ms.
do you think they should block this patchset?
No. Mastering buffer size is another topic than mastering PA.
I can't take ownership of 0002 for as long as it contains the bogus if(period) logic that goes against my current knowledge. Either merge 0002 and 0004, or perhaps wait for my 0000 patch that sets period & duration limits properly. The missing piece from 0000 is the exact upper limit on period and duration sizes in exclusive mode (which do not look like what MSDN documents about Initialize).
My yesterday buffer size tests have shown: shared mode: duration is min. 3*periods on testbot excl. mode: duration is min. 8*periods! That shows how much the MS developers thought would be needed to provide stable output.
get it in ASAP
This Friday's 1.3.35 would be nice.
submit them one-per-day
I don't believe that's needed as e-mail does not guarantee ordering. If we all send numbered patches, the committer should apply them in order.
When that'll be done, more patches will be needed - IAC_Stop no snd_pcm_drop (in my queue) - alsa_period filler to ensure ALSA starts - alsa stalling detection - change OSS accordingly
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #88 from Andrew Eikum aeikum@codeweavers.com 2011-12-15 15:32:31 CST --- (In reply to comment #87)
But _near() runs the same risk as _max(), that is, it allows ALSA to give us too small a buffer (Bug 28282).
Do you mean? fixme:alsa:AudioClient_Initialize ALSA buffer time is smaller than our period time. Expect underruns. (352 < 441) I thought this is caused by the bogus period=duration/10. With duration/10 gone, I believe ALSA will heed our advice.
Perhaps we should enforce set_buffer_size_min = 2 * mmdevapi_period in addition to set_buffer_near? But I've not yet thought about the theoretical lower limits that ensure a continuous stream.
Other than wasting memory, what's the harm of a 2s buffer since we limit our writes in alsa_write_data() anyway?
SW may select varying behaviour based on the hints it gets. IIRC MSDN says that duration is a hint that allows to select between fast & "don't care" behaviour; XAudio2 changes behaviour when period > 10.5ms. Perhaps PA too? That's why I consider it important to give adequate hints. In comment #72, I've myself proposed changing behaviour when duration<=80ms.
I see. Perhaps the combination of _min() and _near() that you suggest is the best solution.
I can't take ownership of 0002 for as long as it contains the bogus if(period) logic that goes against my current knowledge. Either merge 0002 and 0004, or perhaps wait for my 0000 patch that sets period & duration limits properly. The missing piece from 0000 is the exact upper limit on period and duration sizes in exclusive mode (which do not look like what MSDN documents about Initialize).
Okay, I'll merge them and create another patchset tomorrow morning. Hopefully we can agree on it and get it off.
Alexey, are you okay with taking ownership of patch 1 and sending it in this weekend?
get it in ASAP
This Friday's 1.3.35 would be nice.
Probably too late for that, but I'm fine with next Monday.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #89 from Jörg Höhle hoehle@users.sourceforge.net 2011-12-15 16:32:38 CST --- Created attachment 37981 --> http://bugs.winehq.org/attachment.cgi?id=37981 Compute mmdevapi duration & period limits
This is the 0000 patch to compute period and duration limits like native does. I'm going to submit this tomorrow (adding OSS and MacOS). It will end the if(!period) issue. It requires a rebase & editing of the remaining 3 patches.
Alexey, I'd be pleased if you could grab all my mmdevapi .exe on testbot and run them on your HDA machine with both MS OS (= with and without the alignment restriction). The good news is that I now have a good understanding of native's buffer and period size (cf. the patch in testbot), when there's no alignment restriction:
bufsize = MulDiv(duration, pwfx->nSamplesPerSec, 10000000); fragment = MulDiv(period, pwfx->nSamplesPerSec, 10000000); parts = MulDiv(bufsize, 1, fragment); // not (duration, 1, period) if(mode==AUDCLNT_SHAREMODE_SHARED) ok(gbsize == bufsize, ... else ok(gbsize == (parts < 8 ? 8 : parts) * fragment, ...
http://bugs.winehq.org/show_bug.cgi?id=28723
Jörg Höhle hoehle@users.sourceforge.net changed:
What |Removed |Added ---------------------------------------------------------------------------- Attachment #37960|0 |1 is obsolete| | Attachment #37981|0 |1 is obsolete| |
--- Comment #90 from Jörg Höhle hoehle@users.sourceforge.net 2011-12-15 18:09:52 CST --- Created attachment 37982 --> http://bugs.winehq.org/attachment.cgi?id=37982 Rebased & edited patchset incl. 0000 renumbered to 0001
Authors like Andrew suggested in comment #86. I put back set_period_near into the last patch. But I'm not satisfied with the results. My render test produces a spurious xrun notification with plug:dmix that disappears when running with full logs or when using set_period_min. Where does that xrun come from? OTOH, set_near allows PA to produce better position values in Intrepid. Oh well, it really looks like min&max is not over yet.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #91 from Andrew Eikum aeikum@codeweavers.com 2011-12-16 09:11:51 CST --- (In reply to comment #89)
This is the 0000 patch to compute period and duration limits like native does. I'm going to submit this tomorrow (adding OSS and MacOS). It will end the if(!period) issue. It requires a rebase & editing of the remaining 3 patches.
Looks good, thanks.
We haven't heard from Alexey in a little while. If we don't hear from him by Monday, I'll send patch 2 in my name with a note crediting him for the algorithm.
(In reply to comment #90)
I put back set_period_near into the last patch. But I'm not satisfied with the results. My render test produces a spurious xrun notification with plug:dmix that disappears when running with full logs or when using set_period_min. Where does that xrun come from? OTOH, set_near allows PA to produce better position values in Intrepid. Oh well, it really looks like min&max is not over yet.
What about first setting _min(3*period) and then _near(3*period)? Then we get our desired minimum and set the size hint as well.
http://bugs.winehq.org/show_bug.cgi?id=28723
Bruno Jesus 00cpxxx@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |00cpxxx@gmail.com
--- Comment #92 from Bruno Jesus 00cpxxx@gmail.com 2011-12-16 09:34:30 CST --- (In reply to comment #91)
... We haven't heard from Alexey in a little while. If we don't hear from him by Monday, I'll send patch 2 in my name with a note crediting him for the algorithm.
You can also send the patch in his name like written in http://wiki.winehq.org/SubmittingPatches#head-999769efcab5caf7f727820e161f4d...
Shouldn't this bug be marked as NEW?
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #93 from Alexey Loukianov mooroon2@mail.ru 2011-12-16 09:34:50 CST --- (In reply to comment #91)
We haven't heard from Alexey in a little while. If we don't hear from him by Monday, I'll send patch 2 in my name with a note crediting him for the algorithm.
Sorry for being quiet, I'm a bit off from the computers this week due to making my preparations to having eyes LASIK surgery on Sunday this weekend. I do not have any claims on the authorship rights for algo, patch, e.t.c. Every attachment I had posted on the comments to this bug should be treated as a public domain and can be used freely without any charges/requirements/restrictions by anyone who wants it. Giving the credit in the commit message would be nice though.
P.S. Due to being off-computers I hadn't been able to test latest git + proposed version of the patch yet, going to do it as soon as would recover after the surgery (hopefully I'll be testing patch it when it would end up commited to git).
http://bugs.winehq.org/show_bug.cgi?id=28723
Andrew Eikum aeikum@codeweavers.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Status|UNCONFIRMED |NEW Ever Confirmed|0 |1
--- Comment #94 from Andrew Eikum aeikum@codeweavers.com 2011-12-16 10:09:04 CST --- Thanks for the update, Alexey. I'll take care of sending the patch off. Good luck with your surgery! I expect you will come back to find RAGE with working audio.
(In reply to comment #92)
Shouldn't this bug be marked as NEW?
I guess. Does anyone actually notice NEW, UNCONFIRMED, and ASSIGNED? I don't.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #95 from Andrew Eikum aeikum@codeweavers.com 2011-12-16 10:42:15 CST --- (In reply to comment #90)
I put back set_period_near into the last patch. But I'm not satisfied with the results. My render test produces a spurious xrun notification with plug:dmix that disappears when running with full logs or when using set_period_min. Where does that xrun come from? OTOH, set_near allows PA to produce better position values in Intrepid. Oh well, it really looks like min&max is not over yet.
I'm working on Bug 14717 and testing the changes there on top of your latest patchset. In Darwinia, I get frequent (at least 1 every 5s) xruns in ALSA with _near(). If I change it to _min(), the xruns go away. Using _min() and _near() together also makes the problem go away. I don't know how to test for the improved PA behavior you suggest that _near() provides.
http://bugs.winehq.org/show_bug.cgi?id=28723
Jörg Höhle hoehle@users.sourceforge.net changed:
What |Removed |Added ---------------------------------------------------------------------------- Severity|minor |normal
--- Comment #96 from Jörg Höhle hoehle@users.sourceforge.net 2011-12-16 11:35:26 CST --- I once wrote: Ever since I've seen EAGAIN returned by ALSA when trying to write into an almost full buffer, I'm convinced that the audio devices maintain a safety margin between the write and read positions and will *not* let you write as much as the upper limit buffer size - written frames says.
That might explain the underruns with a max buffer 3x period_size. Perhaps 4 times would be better. I'll try and analyse logs over the week-end.
Could you try set_buffer_min(period) // not sure 3x should be the minimum, try 2x? set buffer_near(3xperiod)? set buffer_near(4xperiod)?
I don't know how to test for the improved PA behavior
I'm simply running my render.c tests and looking at GCP and GP values.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #97 from Andrew Eikum aeikum@codeweavers.com 2011-12-16 13:34:08 CST --- (In reply to comment #96)
Could you try set_buffer_min(period) // not sure 3x should be the minimum, try 2x?
I tested with min(1x), min(2x) and min(3x) and the ALSA buffer is always about 1024 times the ALSA period length, i.e. huge.
set buffer_near(3xperiod)?
Here the buffer is exactly 3 times the ALSA period length, and has underruns.
set buffer_near(4xperiod)?
Here it's about 10 times the period length, and no underruns.
When I convince it to be 4 periods long, I also get no underruns.
Perhaps _min(4x), _near(4x)?
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #98 from Jörg Höhle hoehle@users.sourceforge.net 2011-12-17 18:14:52 CST --- Please always post the actual ALSA periods so we know the values. 1024 is huge as I've rarely seen buffers over 2s, that would imply a 2ms period -- possible (I've seen 1ms) but I'd expect set_period_near 10ms not to yield 2ms. Also, please say whether you used set_period or set_buffer, as I'm slowly getting confused. Have you been using plug:dmix, PA with handle_underrun or another device?
I tested with min(1x)
I didn't mean alone. I intended to mean min+near, as in set_period_near(10ms) set_buffer_min(2*mmdev_period) else return PERIOD_ERROR set_buffer_near(4*mmdev_period) (or set_periods(4) which means 4xalsa_period) There's also set_..._first
Given the EAGAIN behaviour and period size chunking behaviour, I'm not sure ALSA supports using buffer = 2 periods, unlike the ping pong game that mmdevapi and winmm like to play.
Also, I'm wondering why we should insist on such tiny buffers. 100ms should be equally fine from an audio POV. I'm still thinking about the lead-in idea from comment #58, esp. since we must not forget about bug #29056, "ALSA won't start without a full period".
I've not yet thought about the theoretical lower limits
That's what we need. No trial & error. What's going to happen when period_near returns > 10ms? e.g. 11 or 50ms? What when period_near returns < 10ms? e.g. 1 or 9 ms?
set buffer_near(3xperiod)?
Here the buffer is exactly 3 times the ALSA period length, and has underruns.
set buffer_near(4xperiod)?
Here it's about 10 times the period length, and no underruns.
3/4->3/10 Weird!
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #99 from Jörg Höhle hoehle@users.sourceforge.net 2011-12-19 04:25:21 CST --- Playing more with ALSA, I convinced myself that a simple combination of period_near and buffer_near is not enough to guarantee identical behaviour on all machines.
Consider dmix at 48000: set_period_near(10ms) -> 21333ms(!) set_buffer_near(30ms) -> 42666ms (= 2x ALSA period) Even buffer_near 40ms wouldn't make a difference...
Do you believe that the current winealsa.drv logic can drive an ALSA device from only 2 periods without underruns, considering that ALSA always wants a full period before submitting anything?
OTOH, using hw:0 set_period_near(10ms) -> 10ms set_buffer_near(30ms) -> 30ms presumably works much better. That's why we need to mention the actual period&buffer sizes when reporting that app X had an underrun.
set_period_near(10ms) set_hw_params_periods(3); or 4 may be a better choice for the current driver.
BTW, in comment #21, I expressed the idea that at the time the first event is received, the mixer already has shoveled one period of data. If you look at testbot job 15962, you'll see that GCP has decreased by one (480) or sometimes 2 periods (960 frames) after the first WaitFor(Event). It always had decreased in EVENTCALLBACK mode.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #100 from Andrew Eikum aeikum@codeweavers.com 2011-12-19 10:09:24 CST --- (In reply to comment #98)
Please always post the actual ALSA periods so we know the values. 1024 is huge as I've rarely seen buffers over 2s, that would imply a 2ms period -- possible (I've seen 1ms) but I'd expect set_period_near 10ms not to yield 2ms. Also, please say whether you used set_period or set_buffer, as I'm slowly getting confused. Have you been using plug:dmix, PA with handle_underrun or another device?
Certainly, sorry for being vague. I'm using PA with handle_underrun for the following. I gathered the following clicking the Test Audio button in winecfg. All of them have (with your period clamping patch applied, shared mode): alsa_period_us = This->mmdev_period_rt / 10; snd_pcm_hw_params_set_period_time_near(This->pcm_handle, This->hw_params, &alsa_period_us, NULL);
(1) set_buffer_size_min({any of 1,2,3,4} * mmdev_period_frames) set_buffer_size_near(4 * mmdev_period_frames)
ALSA period: 441 frames ALSA buffer: 1764 frames MMDevice period: 441 frames MMDevice buffer: 4410 frames
(2) set_buffer_size_min({any of 1,2,3} * mmdev_period_frames) set_buffer_size_near(3 * mmdev_period_frames)
err:alsa:AudioClient_Initialize ALSA period: 441 frames err:alsa:AudioClient_Initialize ALSA buffer: 1323 frames err:alsa:AudioClient_Initialize MMDevice period: 441 frames err:alsa:AudioClient_Initialize MMDevice buffer: 4410 frames
Also, I'm wondering why we should insist on such tiny buffers. 100ms should be equally fine from an audio POV. I'm still thinking about the lead-in idea from comment #58, esp. since we must not forget about bug #29056, "ALSA won't start without a full period".
This is why I was advocating for set_buffer_size_min(4xperiod) or similar. You raised the point that PA (and possibly other plugins) might behave better with smaller ALSA buffer sizes.
(In reply to comment #99)
set_period_near(10ms) set_hw_params_periods(3); or 4 may be a better choice for the current driver.
Indeed. I'll experiment with this today. It seems to most clearly express what we want.
With "set_period_near(10ms)" you actually mean "set_period_near(mmdev_period)", with your clamping patch applied, right?
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #101 from Andrew Eikum aeikum@codeweavers.com 2011-12-19 15:26:22 CST --- I sent off the patch based on Alexey's work as "winealsa.drv: Limit the data written to ALSA's buffer": http://source.winehq.org/patches/data/81970
Next up is your period matching patch, Jörg. To avoid confusion, it's probably best to submit it after tomorrow's commit cycle.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #102 from Jörg Höhle hoehle@users.sourceforge.net 2011-12-19 16:39:28 CST --- Well, I came to the conclusion that Alexey's is really 2 patches and should be split: one to have GCP implement delta increments and be updated only when write happens at event ticks, the other to limit the writes. As I said in comment #83, I don't like the alsa_write rate limiting part. Furthermore, I think that part would be strictly superfluous if set_hw_periods(3) or (4) gives good results.
To summarize set_period_near(mmdevapi_period) set_periods(4)
if(some_test) return INVALID_PERIOD_SIZE (excl.) or CREATE_ENDPOINT_FAILED (shared). if(other_test) FIXME("period size doesn't match, expect trouble").
To recap:
Why do we need a not too large buffer? a) Because PA appears to work better and b) because snd_pcm_drop upon Stop is bogus, so my next patch is going to drop it and let Stop enter XRUN state. So it's essential that ALSA didn't buffer too much, otherwise ALSA would continue to play for too long after Stop. After Stop In shared mode mmdevapi must hold: sum written = GetPosition + GetCurrentPadding
Can the driver deal with an ALSA period < 10ms? We'll feed it only every 10ms, hence we need a buffer large enough to hold more periods. This conflicts with the above set_periods(4), e.g. imagine ALSA period were 1ms. I've not yet tested how set_buffer_min(2xmmdevapi_period) + set_periods(4) play together. Perhaps: alsa_period = mmdevapi_period; set_period_near(alsa_period) // that's an in+out variable! if(alsa_period>=mmdevapi_period) set_periods(4); else set_buffer_size_near(4xmmdevapi_period); This supposes that set_period_near immediately returns a period, i.e. before set_hw_params().
Can the driver bear an ALSA period > 10ms? IMHO only with a silence lead-in patch. The worst-case scenario is: app writes 10ms at every event, but unlike Rage2 doesn't even prefill with 10ms. This is a slight variation of Alexey's test case from comment #55. I claim that this would work in native without underruns. Imagine a 50ms period. Only the silence lead-in gives enough freedom in streaming mode such that after 5x10ms have been sent, ALSA won't XRUN just before the next 5x10ms have been collected and sent. Currently I'm not sure whether a 10ms lead-in is enough, or a 40-50ms one required.
Furthermore, because of the bug in comment #69 that ALSA won't play a trailing chunk less than period size, we also need to write trailing silence... A trailing silence doesn't make the lead-in superfluous. I'd believe native can get away with a lead-in because its mixer (behaves as if it) is constantly running, but it still needs trailing silence to eventually have the HW play the final bits. Wine needs the lead-in because its mixer is not running and sending a continuous audio stream. So it's not so funny that we're back at the original wine-1.3.25 mmdevapi design which used to add silence, except that we now know better - that GetPosition must filter out our silence additions and - that it's incorrect to add more than mmdevapi 10ms silence at each event, even with ALSA period size 50ms.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #103 from Michael Cronenworth mike@cchtml.com 2011-12-19 22:49:09 CST --- (In reply to comment #90)
Created attachment 37982 [details] Rebased & edited patchset incl. 0000 renumbered to 0001
Hi guys. I'm not sure if you want any user reviews of your work, but I thought I'd chip in. I used this patch against 1.3.35 (Fedora 16 x86_64) and I was able to play Starcraft 2 and League of Legends without any sound chop (using PA and a hardware soundcard, not Intel HDA). Good work! (even if you are not happy about it)
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #104 from Alexey Loukianov mooroon2@mail.ru 2011-12-19 23:59:45 CST --- Nice to hear that, Michael, it's good that the patch works for you. ---- I'm done with the eyes surgery and now slowly recovering after it. Probably this evening I would be able to take a look into pretty decent amounts of comments that had been posted here since the beginning of the previous week. I'm also going to test the latest patchset (had it happen to land into git?) with Rage and a set of my own testcases and report back in case any failures found.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #105 from Jörg Höhle hoehle@users.sourceforge.net 2011-12-20 12:27:01 CST --- Created attachment 38049 --> http://bugs.winehq.org/attachment.cgi?id=38049 alternative patchset #5
Here's a patchset that leaves alsa_write as is, relying on the small ALSA buffer to limit writes. However, it also removes the "decrement padding by period size deltas" known from shared mode. This is bad and needs to be restored. GCP might return held_frames % mmdev_period_frames, except upon underrun or when stopped, but a too small GCP needs a larger mmdevapi buffer...
Patch 0003 contains the set_periods logic from comment #102. IMHO that's the one to send next, not the original set_buffer_near(3-4*mmdev_period).
I've also included the "don't use snd_pcm_drop" and the 2xsnd_pcm_drop hack for PulseAudio. PA is ***. I spent hours wondering why the tests failed, then I found out that PA wasn't working correctly anymore, even speaker-test produced silence.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #106 from Michael Cronenworth mike@cchtml.com 2011-12-20 21:06:04 CST --- After 24 hours and trying patch set 5 I have more results. Set 5 is about the same as set 4. Sound works properly about half of the time, but it is better than without the patches. There are times, usually under load so the buffers are probably underrun, that I hear "fuzz". Sometimes the fuzz is only for a few seconds, but most of the time the fuzz is constant once it starts. I have to quit the application and restart PA for Wine to sound proper again (restarting Wine is not enough). Non-wine apps sound fine even when Wine is distorted though. I see this message spammed to my terminal when the fuzz occurs:
ALSA lib pcm.c:7316:(snd_pcm_recover) underrun occurred
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #107 from Andrew Eikum aeikum@codeweavers.com 2011-12-21 07:51:58 CST --- (In reply to comment #106)
After 24 hours and trying patch set 5 I have more results. Set 5 is about the same as set 4. Sound works properly about half of the time, but it is better than without the patches. There are times, usually under load so the buffers are probably underrun, that I hear "fuzz". Sometimes the fuzz is only for a few seconds, but most of the time the fuzz is constant once it starts. I have to quit the application and restart PA for Wine to sound proper again (restarting Wine is not enough). Non-wine apps sound fine even when Wine is distorted though. I see this message spammed to my terminal when the fuzz occurs:
ALSA lib pcm.c:7316:(snd_pcm_recover) underrun occurred
Thanks for the update. To be clear, do you have this behavior with /both/ patchsets, or just with "alternative patchset #5"?
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #108 from Michael Cronenworth mike@cchtml.com 2011-12-21 08:10:54 CST --- (In reply to comment #107)
Thanks for the update. To be clear, do you have this behavior with /both/ patchsets, or just with "alternative patchset #5"?
Yes. However, the "fuzz" sounds different between the patch sets. #4 sounds like sound is sped up by a factor of two. #5 does not speed up sound, but sounds like there is blank space added in the sound playback for just a few ms. The #4 problem is the same problem that occurs (since the sound changes in .30) without any patch any time an app plays sound. It just happens dramatically less often with the patches applied.
Would recording the sound and having it on youtube be of any help?
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #109 from Jörg Höhle hoehle@users.sourceforge.net 2011-12-21 09:29:43 CST --- Michael, thank you. Your report is very dense so I've many questions.
restart PA
Although I've mentioned PA several times in this issue, I want it to not be about PA. Thus I'd be grateful if you could take PA out of the equation. I nevertheless want (even old) PA to work.
Andrew, what are the current mechanisms to select a particular device in winealsa? Back then there was the registry, nowadays I'm editing mmdevdrv.c:defname[] by hand to e.g. "plug:dmix" or "plughw:0,0". If there's none, that's a bug. There should be one, e.g. to select surround devices or whatever ALSA has to offer.
Do we have a specific Pulse Underrun bug entry? Not bug #10495
but most of the time the fuzz is constant once it starts
That means that my patch "2x snd_pcm_drop cause PA" is not effective enough.
#4 vs #5 fuzz differences What happens if you take the 2x snd_pcm_drop patch out of #5? What if you include it in #4? Is that what causes the differences?
restart PA for Wine (restarting Wine is not enough).
I've not found out a) when PA needs a complete restart; b) when snd_pcm_close+open("default") is enough (= other apps still sound fine) c) when 2x snd_pcm_drop suffices... b) and c) may be very close internally. In bug #25750, comment #6, Raymond mentions a PA disconnect/reconnect cycle.
I assume you use the newest PA + alsa_plugs >= 1.0.24. It's interesting that you still mention trouble. I'm using the outdated PA from Ubuntu Intrepid that is known to sometimes loose sound, sometimes after as much as 1h of play.
since sound changes in .30
What do you mean? Removal of DSound hw acceleration?
Would recording the sound
I don't care. Describing is enough to get a "picture". What helps is if you can state whether each and every trouble correlates with an underrun log. Then we'd need to know if the underrun was caused because audio was already disfunctional shortly before (normal padding&delay/position values) or whether high system load caused it.
Nevertheless, in this bug entry I'd prefer talking about how to support the Rage scenario - 10ms prefill (or even 0 prefill) - at most 10ms at each period event with a variety of ALSA devices, from hw-near ones with tiny buffers, to PA-alikes with huge ones and accessorily about underrun handling, or more generally, fault resilience, incl. detecting stalling of audio output.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #110 from Andrew Eikum aeikum@codeweavers.com 2011-12-21 11:52:37 CST --- (In reply to comment #109)
Andrew, what are the current mechanisms to select a particular device in winealsa? Back then there was the registry, nowadays I'm editing mmdevdrv.c:defname[] by hand to e.g. "plug:dmix" or "plughw:0,0". If there's none, that's a bug. There should be one, e.g. to select surround devices or whatever ALSA has to offer.
Well, we have the selectors for the default device in winecfg. That lists all of the devices that are discoverable by winealsa.drv. Unfortunately, there doesn't seem to be any way to get a list of devices from ALSA. I saw your recent mail to alsa-devel. Hopefully they respond with something useful. I expect they won't.
If not, I guess we could add (or re-add) a registry entry for undiscoverable devices. Perhaps a comma-separated list of device names. It really sucks that the user would have to input this.
Looking at VLC, they also use snd_card_next(), snd_ctl_pcm_next_device() and friends to do device enumeration. mplayer has a bunch of hard-coded defaults ("default", "surround40", "surround51", "plug:" prefixes for each of those), although there's no documented location of those device names as far as I'm aware.
Hooray ALSA... We should move this discussion to a separate bug.
Do we have a specific Pulse Underrun bug entry? Not bug #10495
I thought we did have a bug where we discussed adding handle_underrun (before your Bug 28040), especially in context of StarCraft 2, but I can't find one. Maybe that was discussed entirely on IRC.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #111 from Jörg Höhle hoehle@users.sourceforge.net 2011-12-21 15:54:36 CST --- Yours was a wise idea to add one patch per day. test.winehq gathers Linux data once per day. http://test.winehq.org/data/tests/winmm:wave.html shows Linux failures since the first patch. Hopefully they'll go away at the end of the series, yet http://test.winehq.org/data/962230064dc371cc32cf6d01bb2bf68d200a81fc/linux_f... looks random while http://test.winehq.org/data/962230064dc371cc32cf6d01bb2bf68d200a81fc/linux_a... looks systematic.
http://bugs.winehq.org/show_bug.cgi?id=28723
K1773R K1773R@darkgamex.ch changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |K1773R@darkgamex.ch
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #112 from Austin English austinenglish@gmail.com 2011-12-22 12:52:55 CST --- (In reply to comment #101)
Next up is your period matching patch, Jörg. To avoid confusion, it's probably best to submit it after tomorrow's commit cycle.
http://source.winehq.org/git/wine.git/commitdiff/6d17715b01dc09a2091e5d7ed49...
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #113 from Andrew Eikum aeikum@codeweavers.com 2011-12-22 14:01:55 CST --- I believe all of the patches for this bug are in Wine now, right?
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #114 from Jörg Höhle hoehle@users.sourceforge.net 2011-12-22 17:50:37 CST ---
#5 does not speed up sound, but sounds like there is blank space added in the sound playback for just a few ms.
This is desireable behaviour. When an underrun happens, there's nothing better to do but to try and have a blank as short as possible. I'll submit this patch too, even if it's a PA quirk.
Whether an underrun is avoidable is another story.
I'm somewhat frustrated. I've seen random xruns with Ubuntu Lucid booted, running nothing but the render tests (with PA...). Given my comment #51 and comment #61, I now have doubts whether Wine and/or the Linux OS can support the 10ms periods with the current guaranteed underrun when 30-40ms late. Look at the mmdevapi and the damn' 10ms, how many server round-trips, context and thread switches are involved in event signaling and sending data from the app -> dsound -> mmdevapi -> ALSA every 10ms? Do we require a dual core do produce sound nowadays?
BTW, I've attached to bug #27937 my current render & GetPosition tests. I welcome e-mails with logs from them on both native an UNIX machines.
all of the patches for this bug are in Wine now
If Alexey confirms Rage is working fine, this bug could be marked fixed. OTOH I still believe that the current code is too sensitive to underruns due to a missing lead-in in the Rage use-case, and the present bug is the one to discuss this. PA quirks should be discussed elsewhere, even though they are hard to separate from a user POV.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #115 from Jörg Höhle hoehle@users.sourceforge.net 2011-12-23 04:49:04 CST --- It's too early to close this underrun bug. The current design doesn't prevent it. Consider these scenarios: a) 10ms prefill -- This is the Rage use-case. write 10ms iff padding <= 10ms (=period)
b) without prefill. It's the same really. ALSA simply starts 10ms later as data comes in.
c) App takes a little time, calling ReleaseBuffer only 6ms after the event. That doesn't matter, the app has time to supply data until the next callback event.
d) The callback is delayed a little due to system load. This is the killer.
Consider a 10ms ALSA period. At each and every event, ALSA is on the verge of underrun, having consumed the meagre 10ms it was given. If the callback is late, an underrun is due.
Consider a 16ms ALSA period. ALSA won't start at the first event, because the period is not full. This actually does good as it is equivalent to a lead-in. However, at time t=26ms, ALSA has finished playing its 16ms period, whereas mmdevapi has only seen 30ms. 32ms is needed for the next period, hence an underrun is due. Or is likely, depending on the inner ALSA device workings, e.g. hw:0 may start to play the incomplete period and see it filled up at time t=30ms. Whereas we've seen in comment #63 that plug:dmix won't play the incomplete period, yet maybe its internal buffering will allow it to absorb the delay until t=30.
Here's a timeline: 0 10 20 30 40 50 60 70 80 90 100 110 120 |....|....|....|....|....|....|....|....|....|....|....|....|... 10ms |.......|.......|.......|.......|.......|.......|.......|.. 16ms ----- prefill 20----- 20ms are available at time t=10ms 30----- 30ms at t=20ms 40----- 40ms at t=30ms 50-----
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #116 from Jörg Höhle hoehle@users.sourceforge.net 2011-12-23 09:27:47 CST --- Created attachment 38086 --> http://bugs.winehq.org/attachment.cgi?id=38086 Delay start by waiting for 2 ALSA periods
Instead of a lead-in, we can delay the moment ALSA starts until there's more data in it. The more data in ALSA the later underruns happen. What I like about this solution is that apps that prefill more are not delayed. This patch causes no feature interaction with GetPosition, but needs resolution of bug #29056 before going into git. Wine must play short samples and residual frames. You should nevertheless try it now and see how it helps with the Rage use case at the cost of increased latency.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #117 from Jörg Höhle hoehle@users.sourceforge.net 2012-01-02 13:11:28 CST --- Any volunteer out there to try out the effect of the previous patch about ALSA's start_threshold on underruns?
BTW, I played with start_threshold and while hw: and plug:dmix are period oriented, ALSA's pulse plugin is not and will start after writing one frame with start_threshold=1, even without filling a full period.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #118 from Jörg Höhle hoehle@users.sourceforge.net 2012-01-03 10:46:59 CST --- I've thought more about timelines and start thresholds and came to the conclusion that winealsa needs start_threshold = alsa_period + mmdevapi_period + a little like 3ms or 1/3 mmdevapi_period The "a little" allows event delivery to be delayed by as much, when the system is highly loaded. Only then is underrun free behaviour designed into the code in the worst case scenario (comment #16 without even prefill).
This is based on the consideration that ALSA's avail is unlike mmdevapi's padding: Observe avail = buffer_size and you've hit an underun (with hw:0 and probably dmix). Observe GetCurrentPadding = 0 and you have one period time to supply data. Also, ALSA's hw:0 and dmix operate in period-sized chunks, so ALSA should always have at least own period of data that it can consume.
Currently, I'm unsure whether start_threshold is the way to go or whether to write a lead-in, because I don't know yet how to mix that with the need for trailing silence or when exactly to write trailing silence given the PulseAudio bugs that lead to unreliable snd_pcm_avail_update values.
http://bugs.winehq.org/show_bug.cgi?id=28723
Christian christian.frank@gmx.de changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |christian.frank@gmx.de
--- Comment #119 from Christian christian.frank@gmx.de 2012-01-03 13:46:43 CST --- i did not test your patch with rage, but it does not work for sc2 nor skyrim. still buffer underruns and crackling sound :( ..
im on ubuntu 11.10 with pure alsa and wine compiled from git.I have an soundblaster 512 with hw mixing. using emuk101 module.
I also noticed the following behaviour in addition (maybe additional bugs):
- I can not choose an output device in winecfg which is in use by another app at the same time (teamspeak in my case) - Testing the sound in winecfg crashes winecfg when the device is in use
Regards, Christian
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #120 from Michael Cronenworth mike@cchtml.com 2012-01-03 22:44:46 CST --- (In reply to comment #109)
Although I've mentioned PA several times in this issue, I want it to not be about PA. Thus I'd be grateful if you could take PA out of the equation. I nevertheless want (even old) PA to work.
My apologies, but if this is the case I will refrain from further comments. Fedora relies on PA now and removing it from my system would take time and cause issues with some software.
#4 vs #5 fuzz differences What happens if you take the 2x snd_pcm_drop patch out of #5? What if you include it in #4? Is that what causes the differences?
I have not run tests with and without some patches. Sorry.
I assume you use the newest PA + alsa_plugs >= 1.0.24. It's interesting that you still mention trouble. I'm using the outdated PA from Ubuntu Intrepid that is known to sometimes loose sound, sometimes after as much as 1h of play.
Yes. PA 0.9.23 and ALSA 1.0.24.
since sound changes in .30
What do you mean? Removal of DSound hw acceleration?
In Wine 1.3.29 (and previous releases) I had perfect sound under any condition. Wine >= .30 sound broke. As of .36 (that includes your patches so far) sound works better, but it still breaks under some conditions as mentioned previously.
we'd need to know if the underrun was caused because audio was already disfunctional shortly before (normal padding&delay/position values) or whether high system load caused it.
I do hear my HD tick and SC2 pauses for a few milliseconds, so yes, high load (at least high Wine load) does seem to create the underrun condition. Wine <= .29 seemed to handle this just fine.
http://bugs.winehq.org/show_bug.cgi?id=28723
Andreas Bierfert andreas.bierfert@lowlatency.de changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |andreas.bierfert@lowlatency | |.de
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #121 from Christian christian.frank@gmx.de 2012-01-06 04:03:22 CST --- Well, i noticed something very very strange:
After reinstalling and reenabling pulse audio on my system the crackling sound is history ..With pure alsa my sound is crackling e.g in Hard Reset (and any other games) but it works with pulse.
Not sure what could be the root cause for this :( ..Maybe game->wine->alsa->pulse->alsa->soundblaster behaves different than game->wine->alsa->soundblaster
Hope this helps somehow. If you need any other tests, poke me.
Regards, Christian
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #122 from Raymond superquad.vortex2@gmail.com 2012-01-08 20:12:15 CST --- (In reply to comment #121)
Well, i noticed something very very strange:
After reinstalling and reenabling pulse audio on my system the crackling sound is history ..With pure alsa my sound is crackling e.g in Hard Reset (and any other games) but it works with pulse.
Not sure what could be the root cause for this :( ..Maybe game->wine->alsa->pulse->alsa->soundblaster behaves different than game->wine->alsa->soundblaster
Hope this helps somehow. If you need any other tests, poke me.
Which soundlbaster do your have ?
This may because wine now use a smaller alsa buffer than PA server
if your soundblaster card 0
cat /proc/asound/card0/pcm0p/sub0/hw_params
and post the output when running alsa , pulseaudio
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #123 from Jörg Höhle hoehle@users.sourceforge.net 2012-01-20 16:32:36 CST --- Alexey, I hope you recovered well from your eyes surgery. I'd be glad if you could investigate whether native's mmdevapi *always* continues to signal events in EVENTCALLBACK mode after IAC_Start was first called. On testbot you'll find my render tests that prove that in shared mode, a) even after Stop or Reset, events continue to be triggered and b) SetEventHandle appears to return 0x8007007b on the second invocation, i.e. one cannot change the event target initially set. Can it ever cease prior to calling Release? Wine currently stops sending events after Stop. I just noticed that Wine's DSound uses timeKillEvent in Release() only, so native's mmdevapi would be a compatible source of events.
I've not tested exclusive mode, capture, behaviour after releasing some samples, waiting for several seconds or any other fancy combination.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #124 from Jörg Höhle hoehle@users.sourceforge.net 2012-01-23 04:12:08 CST --- One thing I've had in mind for several weeks now is whether we should switch wineALSA's default period to 20ms -- just like wineOSS and CoreAudio use -- until after we have a design that ensures running glitch-free[*] in the worst-case scenario and until after relase 1.4. This takes advantage of Alexey's observations in comment #19, namely that XAudio2 will use larger buffers when it sees such a period, causing less stress in Wine.
[*] The missing key currently is that winealsa starts too early: there's so little in the buffer that the next underrun is close.
What I don't know at all is how DSound would react to that. It looks like native DSound has been using 10ms ticks for over a decade. OTOH, all users of wineoss have been using exactly that 10/20ms combination since 1.3.25.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #125 from Raymond superquad.vortex2@gmail.com 2012-01-24 19:30:27 CST --- (In reply to comment #124)
One thing I've had in mind for several weeks now is whether we should switch wineALSA's default period to 20ms
alsa default "dmix" has a fixed period size which return 21.333 ms period time
I don't really understand the current code
This->alsa_bufsize_frames = This->mmdev_period_frames * 4;
and what is the reason to check (err < 0)
if((err = snd_pcm_hw_params_set_period_time_near(This->pcm_handle, This->hw_params, &alsa_period_us, NULL)) < 0) WARN("Unable to set period time near %u: %d (%s)\n", alsa_period_us, err, snd_strerror(err)); /* ALSA updates the output variable alsa_period_us */
This->mmdev_period_frames = MulDiv(fmt->nSamplesPerSec, This->mmdev_period_rt, 10000000);
This->bufsize_frames = MulDiv(duration, fmt->nSamplesPerSec, 10000000);
/* Buffer 4 ALSA periods if large enough, else 4 mmdevapi periods */ This->alsa_bufsize_frames = This->mmdev_period_frames * 4; if(err < 0 || alsa_period_us < period / 10) err = snd_pcm_hw_params_set_buffer_size_near(This->pcm_handle, This->hw_params, &This->alsa_bufsize_frames); else{ unsigned int periods = 4; err = snd_pcm_hw_params_set_periods_near(This->pcm_handle, This->hw_params, &periods, NULL); }
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #126 from Jörg Höhle hoehle@users.sourceforge.net 2012-01-25 09:19:44 CST --- I've attached to bug #27937 my newest set of render tests that you'll also find in testbot job #16643. It implements the XAudio2/Rage scenario as described by Alexey in comment #19.
Doing so, I found out that it was an error to ignore clock slew. It appears that CreateTimerQueue is no good time base. It never caused wake-ups after 10ms on my system. Using CreateTimerQueue(... mmdevapi_ms - 3 (=7ms) ...) produced wake-ups every 8ms approx., anything higher lead to 12ms. Never in-between. Of course, 12ms produced regular underruns, as XAudio2 only adds at most 10ms at every event. Frequency scaling from min to max showed no change.
IOW, to support the XAudio2 scenario = "add at most one period per event", we MUST deliver events *exactly* based on audio frame consumption, or faster! (or auto-adjust based on Release'd frames).
I don't really understand the current code
What part? Arguably, err < 0 after set_*_near should be fatal, because ALSA will oblige out of range parameters, unlike the non-near parent.
http://bugs.winehq.org/show_bug.cgi?id=28723
pete vazz28@yahoo.com changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |vazz28@yahoo.com
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #127 from Jörg Höhle hoehle@users.sourceforge.net 2012-02-01 03:00:57 CST --- The new render.c:test_worst_case is very useful for early disagnosis:
http://test.winehq.org/data/f0cfa7cf46d7c832fb84555529d1a1b7c49c46c0/linux_m... render.c:1587: Should play 1000ms continuous tone with fragment size 480. render.c:1631: Released 48000=100x480 -480 frames at 48000 worth 990ms in 1196ms render.c:1631: Released 48000=100x480 -480 frames at 48000 worth 990ms in 1201ms render.c:1631: Released 48000=100x480 -480 frames at 48000 worth 990ms in 1197ms
All such machines with period sizes near 12ms will stutter in Rage if my implementation of XAudio2 behaviour as described by Alexey in comment #19 is correct.
I've now augmented my tests to produce audible output. My Ubuntu Intrepid machine too has a 12ms CreateTimerQueue period and presents hiccups.
http://test.winehq.org/data/f0cfa7cf46d7c832fb84555529d1a1b7c49c46c0/linux_f... render.c:1587: Should play 1000ms continuous tone with fragment size 480. render.c:1631: Released 48000=100x480 -480 frames at 48000 worth 990ms in 2035ms render.c:1631: Released 48000=100x480 -480 frames at 48000 worth 990ms in 1648ms render.c:1631: Released 48000=100x480 -480 frames at 48000 worth 990ms in 1997ms 2 seconds is completely broken.
Some machines match 10ms: render.c:1587: Should play 1000ms continuous tone with fragment size 480. render.c:1631: Released 48000=100x480 -480 frames at 48000 worth 990ms in 1009ms They will stutter much less often, as 10ms events are not synchronized to wall time (or the DAC clock). Yet they will eventually stutter because of this slew. Unlike timeSetEvent, CreateTimerQueue does not take elapsed time into account when computing the next wake-up time.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #128 from Jörg Höhle hoehle@users.sourceforge.net 2012-02-04 02:45:08 CST --- Well, I've a simple explanation for the crackling sound that my audible worst_case/Xaudio2 test attached to bug #27937 produces. I always thought it would explain stuttering (as in the subject line of this bug) but I perceive it as crackling instead (as in the subject line of bug #28856).
If my test_worst case really replicates XAudio2 behaviour, that would explain crackling in all apps that use XAudio2 (i.e. w7 xact audio insead of xp DSound).
Case 1: render.c:Released 48000=100x480 -480 frames at 48000 worth 990ms in 1196ms Here CreateEventTimer delivers events every 12ms -- always late. As test_worst_case never writes more than one (10ms) period, audio hits an underrun every few periods. This cause a crackling sound partly covering the sine tone. Crackling is almost continuous.
Case 2: render.c:Released 48000=100x480 -480 frames at 48000 worth 990ms in 1009ms On this machine, CreateEventTimer delivers events every 10ms like we asked. However, because event delivery is not synchronized to the audio clock, there may be times where GetCurrentPadding is larger than one period. Every such time, test_worst_case won't write data. There will be a hole, resulting in an underrun and crackling some timer later.
Case 3: render.c:Released 40320=84x480 -960 frames at 48000 worth 820ms in 789ms Here I cheated on the 12ms timer machine, using CreateTimerQueue(period_ms -3), resulting in timer callbacks 8ms apart. In that case, not writing a period full of samples was not harmful, as it would likely be written at the next timer event. That's why only 84 periods were written in that case, despite receiving 100 events. The result is a continuous sine, without crackling.
I consider the algorithm "write one period if padding is less or equal to one period" to be stupid, because it doesn' take clock skew into account. Or it hints that native's doesn't deliver events like we do.
http://bugs.winehq.org/show_bug.cgi?id=28723
K1773R K1773R@darkgamex.ch changed:
What |Removed |Added ---------------------------------------------------------------------------- CC|K1773R@darkgamex.ch |
http://bugs.winehq.org/show_bug.cgi?id=28723
Sylvain Petreolle spetreolle@yahoo.fr changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |spetreolle@yahoo.fr
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #129 from Andrew Eikum aeikum@codeweavers.com 2012-02-06 09:55:48 CST --- Yeah, we ought to evaluate how we do our callbacks and timers. CreateTimerQueueTimer doesn't seem sufficient.
WinMM's timer code does its own management with standard Unix calls. Basically, it uses poll(3)'s timeout parameter to sleep between callbacks. We could do similar in our drivers, which might be more accurate than the TimerQueue code.
We could also hook into ALSA/OSS's callback mechanisms, but frankly giving more responsibility to these systems is the last thing I want to do at this point.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #130 from Jörg Höhle hoehle@users.sourceforge.net 2012-02-10 20:27:47 CST --- Created attachment 38788 --> http://bugs.winehq.org/attachment.cgi?id=38788 Patch to increase TimerQueue rate
This patch changes the TimerQueue rate from 12 to 8ms on my system. It fixes crackling output from my interactive mmdevapi render tests, as mentioned in comment #128. You guys watching bug #28856 and bug #28282, please try it out.
CreateTimerQueueTimer vs. winmm timeSetEvent. The difference is, winmm:time* uses: delta_time = timer->dwTriggerTime - GetTickCount(); thereby readjusting to elapsed time. CTQT doesn't: queue_current_time() + t->period deviating more and more. We need tests to check whether that is correct CTQ behaviour.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #131 from Raymond superquad.vortex2@gmail.com 2012-02-10 21:32:27 CST --- (In reply to comment #130)
Created attachment 38788 [details] Patch to increase TimerQueue rate
This patch changes the TimerQueue rate from 12 to 8ms on my system. It fixes crackling output from my interactive mmdevapi render tests, as mentioned in comment #128. You guys watching bug #28856 and bug #28282, please try it out.
CreateTimerQueueTimer vs. winmm timeSetEvent. The difference is, winmm:time* uses: delta_time = timer->dwTriggerTime - GetTickCount(); thereby readjusting to elapsed time. CTQT doesn't: queue_current_time() + t->period deviating more and more. We need tests to check whether that is correct CTQ behaviour.
Do you have a customized alsa default device ?
and how does your perform the test ?
The behaviour of dmix is quit different when there is another application already aplaying.
I have performed a test, it seem that the start_threshold has no effect if you are using dmix plugin, it start immmediately
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #132 from Jörg Höhle hoehle@users.sourceforge.net 2012-02-19 15:15:00 CST --- Alexey, Michael, Christian, you've been quiet lately. I'd like to know whether my timer rate hack from comment #130 helps this app. It helps against crackling sound from SC2 in bug #28856. The reason why I'm asking is that I consider the timer issue to be the last one that would prevent closing this bug, see comment #114.
- A lock-free ALSA driver would be nice, but is not essential. It is sketched in bug #29531. - The need to restart PA is undeniably a bug in PA/alsa_plugins. I've submitted bug reports #46296 and #46296 to PA, see Wine bug #28856. - The timer rate also affects OSS in bug #29585. - With a correct timer and the lead-in+hidden_frames concept currently in place, Wine should support glitch-free "worst-case" 10ms XAudio2 rendering for as long as one's PC HW supports that rate. E.g. flipping screens on my laptop causes interruptions so large that audio glitches. I don't see how to prevent that given mmdevapi's properties, the buffers and latencies are simply too small.
it seem that the start_threshold has no effect if you are using dmix plugin, it start immmediately
My results are opposite: PA starts immediately (with threshold=1). dmix strictly works in periods, both when starting and with trailing frames, not caring whether threshold equals 1 or period_size-1.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #133 from Christian christian.frank@gmx.de 2012-02-20 02:34:18 CST --- (In reply to comment #132)
Alexey, Michael, Christian, you've been quiet lately. I'd like to know whether my timer rate hack from comment #130 helps this app. It helps against crackling sound from SC2 in bug #28856. The reason why I'm asking is that I consider the timer issue to be the last one that would prevent closing this bug, see comment #114.
- A lock-free ALSA driver would be nice, but is not essential. It is sketched in bug #29531.
- The need to restart PA is undeniably a bug in PA/alsa_plugins. I've submitted bug reports #46296 and #46296 to PA, see Wine bug #28856.
- The timer rate also affects OSS in bug #29585.
- With a correct timer and the lead-in+hidden_frames concept currently in
place, Wine should support glitch-free "worst-case" 10ms XAudio2 rendering for as long as one's PC HW supports that rate. E.g. flipping screens on my laptop causes interruptions so large that audio glitches. I don't see how to prevent that given mmdevapi's properties, the buffers and latencies are simply too small.
it seem that the start_threshold has no effect if you are using dmix plugin, it start immmediately
My results are opposite: PA starts immediately (with threshold=1). dmix strictly works in periods, both when starting and with trailing frames, not caring whether threshold equals 1 or period_size-1.
Hi Jörg,
i just tested Starcraft and Skyrim with Wine 1.4 rc4 and everything works fine. No sound issues at all.Also Hard Reset, Wolfenstein and the new Jagged Alliance Back in Action have perfectly working sound. For me so far you did a GREAT JOB fixing all those ugly sound issues ! Many thanks for that !
This was tested with pure alsa on 11.10 and a sb 512 pci.
I can't get rage up and running right now in Wine.Will try this in the afternoon and see how it works.
Cu, Christian
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #134 from Jörg Höhle hoehle@users.sourceforge.net 2012-02-27 04:49:17 CST --- Initially I thought we need EVENTCALLBACK timing tests. It turns out that the current test_worst_case is good enough for reasoning on averages: render.c:2080: Should play 1000ms continuous tone with fragment size 441. render.c:2132: Released 44100=100x441 -441 frames at 44100 worth 990ms in 1006ms
- If the events were more than ~10ms apart, total time would be >> 1006ms, e.g. ~1200ms on my 12ms callback wakeup Intrepid system.
- If the events were < ~10ms apart, it would take less time or be unable to release data 100 times, e.g. 80 times on my Ubuntu system with 8ms wakeups, or leave more padding at the end.
Now is the rate 10ms or 10.1587? I'm researching opportunities for allowing a regular drain (e.g. decrease padding by one period each time) even in the presence of irregular or non-matching callback intervals, given that we have a few frames hidden within ALSA that hopefully will compensate that. Ideas & help welcome.
http://bugs.winehq.org/show_bug.cgi?id=28723
Jörg Höhle hoehle@users.sourceforge.net changed:
What |Removed |Added ---------------------------------------------------------------------------- Depends on| |30071
--- Comment #135 from Jörg Höhle hoehle@users.sourceforge.net 2012-03-04 01:31:03 CST --- I separated the CreateTimerQueueTimer issue into bug #30071 and added a dependency. Let's close this bug when that'll be fixed (or winexyz.drv does not depend on CTQT anymore, whatever comes first).
If you're curious, try out the patch there.
Note that a fixed CTQT does not prevent underruns caused by clock skew. If the audio device would draw e.g. 48010 frames while our clock counts 48000 within 10ms, the feeder would be too late on average and the ALSA buffer slowly empty. That should be the topic of another bug report. The audio drivers must move away from CTQT and use something that adapts intervals. I don't believe that ChangeTimerQueueTimer fits the bill.
http://bugs.winehq.org/show_bug.cgi?id=28723
Austin English austinenglish@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Target Milestone|1.4.0 |1.6.0
--- Comment #136 from Austin English austinenglish@gmail.com 2012-03-07 13:27:16 CST --- Moving milestone to 1.6.
http://bugs.winehq.org/show_bug.cgi?id=28723
Austin English austinenglish@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Target Milestone|1.6.0 |1.4.0
--- Comment #137 from Austin English austinenglish@gmail.com 2012-03-07 13:34:30 CST --- My mistake, 1.6 criteria are not set yet, so these need to stay in 1.4.
Sorry for the noise.
http://bugs.winehq.org/show_bug.cgi?id=28723
Oliver oliver.joos@hispeed.ch changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |oliver.joos@hispeed.ch
http://bugs.winehq.org/show_bug.cgi?id=28723
annihilation wiklauri1771@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |wiklauri1771@gmail.com
http://bugs.winehq.org/show_bug.cgi?id=28723
Bug 28723 depends on bug 30071, which changed state.
Bug 30071 Summary: Need a CreateTimerQueueTimer that is stable over time http://bugs.winehq.org/show_bug.cgi?id=30071
What |Old Value |New Value ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED
http://bugs.winehq.org/show_bug.cgi?id=28723
Jörg Höhle hoehle@users.sourceforge.net changed:
What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution| |FIXED
--- Comment #138 from Jörg Höhle hoehle@users.sourceforge.net 2013-03-02 16:24:25 CST ---
"everything works fine", "perfectly working sound" ...
... plus a lot of what's been suggested here has been implemented meanwhile, so I think we should close this issue. A closed bug does not prevent us from reading Alexey's description of the XA2 algorithm in comment #19 or anything else, including attachments.
Closing this does not deny the existence of a fundamental design flaw w.r.t. clock skew in our drivers. After all, the periodic decrease of padding and event signaling is based on a system clock, whereas samples are played by some audio device clock.
http://bugs.winehq.org/show_bug.cgi?id=28723
--- Comment #139 from Alexey Loukianov mooroon2@mail.ru 2013-03-03 05:54:52 CST --- (In reply to comment #138)
I think we should close this issue...
Yup, I have no objections. This bug had actually been transformed into something like "Wine's mmdev low level implementation has fundamental design flaw" long ago. As for the bug itself (i.e. about sound stutter in Rage) - it had been workaround for a most part long ago (i.e. stutters are so rare that one has to listen very carefully for a long period of time to notice them) thus closing this bug is a perfectly valid thing.
As for fundamental design flaw - it might be reasonable to open up another "generic placeholder" bug in case there are any plans on doing total mmdev implementation revamp which wouldn't suffer from clock skew problem (or would use some smart way to mitigate the problems imposed by drift).
http://bugs.winehq.org/show_bug.cgi?id=28723
Alexandre Julliard julliard@winehq.org changed:
What |Removed |Added ---------------------------------------------------------------------------- Status|RESOLVED |CLOSED
--- Comment #140 from Alexandre Julliard julliard@winehq.org 2013-03-15 14:46:17 CDT --- Closing bugs fixed in 1.5.26.