 
Hi,
If you don't pass all the mmdevapi tests with the fix in my git tree, it's a regression. I don't want to see held_Frames because it's a shadow buffer. If winmm and dsound don't work without it, they're wrong; fix those.
I won't comment on the two individual pulseaudio drivers. Unfortunately, neither of the two authors bothered to point me to a verbose log of the mmdevapi tests for my scrutiny.
The tests should pass like they do on native.
Regarding latency, I'm not aware of a flaw in Wine's winmm. HOWEVER, Wine's DSound needs a patch. Here is the reason.
To recap, latency is somehow related to the "distance" between what one hears and the PCM frames sent to the device. GetPosition is the only API call in that area in both winmm and mmdevapi.
Buffer size is a completely different thing. I find it useful to have in mind a cascade of audio filters performing some buffering each. Then it's obvious that an app only sees the frontmost buffer size and knows nothing about the others. High latencies imply that there must be some large buffer space(s), somewhere.
Period is an artificial entity related to how often the frontmost buffer is drained to feed the next buffer in the chain, in case that happens regularly. This is not guaranteed nor necessarily documented.
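To make the cascade picture concrete, here is a minimal sketch in C. The stage struct, both function names and the numbers in the note below are made up for illustration and are not Wine code. It only shows that latency adds up over every stage while the app sees the frontmost one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one buffering stage in the audio chain. */
struct stage {
    unsigned int queued_frames; /* frames currently held by this stage */
};

/* Latency of the whole chain in ms: every frame queued anywhere has to be
 * played before a newly written frame is heard. */
unsigned int chain_latency_ms(const struct stage *chain, size_t count,
                              unsigned int rate)
{
    unsigned long long frames = 0;
    size_t i;

    assert(rate > 0);
    for (i = 0; i < count; i++)
        frames += chain[i].queued_frames;
    return (unsigned int)(frames * 1000 / rate);
}

/* What the app can observe directly: only the frontmost stage. */
unsigned int front_latency_ms(const struct stage *chain, size_t count,
                              unsigned int rate)
{
    assert(rate > 0 && count > 0);
    return (unsigned int)((unsigned long long)chain[0].queued_frames * 1000 / rate);
}
```

With three stages holding 480, 4800 and 90720 frames at 48000 Hz, the front buffer accounts for 10 ms while the chain as a whole is 2000 ms behind.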
You get the idea: regardless of arbitrary (front) buffer sizes and periods not worth this name, try and send a stable flow of data to play music without glitches and let explosions be heard ASAP.
So much for the situation; now the issue. It appears easier to write code whose (frontmost) buffer space behaves like native than code that provides similar latency.
Alas, behaviour observable on MS-Windows differs from what we get with PulseAudio and some ALSA devices.
Tests seem to indicate a latency of around 30-70ms from mmdevapi with MS-Windows. That is good enough for games. OTOH if ALSA or PA gives us 2 seconds on Linux, that's a big cause of trouble. The authors of MS apps could never test them in an environment with such high latency. What happens with software not tested? It happens not to work.
Two solutions:
- Reduce actual latency or
- lie about latency.
Lying about latency causes apps to lose the ability to sync audio and video. However, given a choice between loss of lip sync and possible crashes or other weird behaviour because an app is executed in an environment that its developers never experienced, I consider the loss of lip sync to be less worrying.
We designed the winealsa driver to accommodate an arbitrary latency and accept a large variation of periods. (Some bug reports from Jack users seem to imply that it should accept even larger periods, e.g. 150ms, yet still pretend to use 10ms on the mmdevapi side).
We chose to have the winealsa driver not lie about latency. That should be left to higher-level APIs.
DirectSound is built around the "Direct" HW (lack of) abstraction: a circular buffer of samples is played by a DAC. The converted signal is immediately sent to the speaker. Hence:
1. GetPosition information in the DSound abstraction translates to a "playpos" -- well known.
2. No provision is made for additional buffering. The playpos must lie within the circular buffer. The reported free (writable) space shall never cross the playpos -- with subtle consequences.
Wine's DSound needs a patch to ensure this second property. Given an 80ms primary buffer, Wine's DSound must not pretend its playpos is 2 seconds late!
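A rough sketch of the second property, in C. The function name and the convention that equal positions mean "full" are assumptions of this example, not Wine's DSound code.

```c
#include <assert.h>

/* Frames an app may write into a circular buffer of buf_frames without
 * overtaking the play position. Both positions are frame offsets in
 * [0, buf_frames). Assumption of this sketch: playpos == writepos is
 * treated as "full", the conservative reading. */
unsigned int writable_frames(unsigned int playpos, unsigned int writepos,
                             unsigned int buf_frames)
{
    assert(buf_frames > 0);
    assert(playpos < buf_frames && writepos < buf_frames);
    /* forward distance from writepos to playpos, modulo the buffer size */
    return (playpos + buf_frames - writepos) % buf_frames;
}
```

For a 3840-frame buffer (80ms at 48000 Hz) with playpos 1000 and writepos 3000, 1840 frames are writable. The result is always smaller than the buffer, so this abstraction has no way to express a playpos that lags by 2 seconds.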
I think Wine should try and reduce latency but lie if that does not suffice.
What's needed?
1. Have DSound always use a buffer large enough for typical situations (I believe 100-200ms).
2. Clamp the reported position such that it won't leave the (virtual) primary buffer. As a result, DSound must pretend to play even when Wine is solely pumping the huge 2s of cascading SW audio buffers.
3. Work on reducing the latency of the cascading audio filters. That device-level work is independent of the DSound work.
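Point 2 could look roughly like the following C sketch. It assumes running totals of frames written by the app and frames actually played by the device; these counters and both function names are hypothetical and do not correspond to existing Wine fields.

```c
#include <assert.h>

/* Total frames "played" as DSound would report them. written and played
 * are running totals since the stream started. If the device lags by more
 * than one primary buffer, pretend playback is exactly one buffer behind
 * the write position. */
unsigned long long clamped_played(unsigned long long written,
                                  unsigned long long played,
                                  unsigned int buf_frames)
{
    assert(buf_frames > 0);
    assert(played <= written);
    if (written - played > buf_frames)
        return written - buf_frames;
    return played;
}

/* Play position inside the circular (virtual) primary buffer. */
unsigned int reported_playpos(unsigned long long written,
                              unsigned long long played,
                              unsigned int buf_frames)
{
    return (unsigned int)(clamped_played(written, played, buf_frames) % buf_frames);
}
```

With 100000 frames written, only 4000 actually played and a 3840-frame primary buffer, the reported total is 96160 and the playpos is 160. In other words, DSound claims to be playing although the frames are still sitting in the downstream buffers. That is exactly the lie about latency, with the lip-sync cost mentioned earlier.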
Native's 30-70ms has the benefit that it nicely fits within typical DSound primary buffer sizes. PulseAudio's 2s does not (yet). I believe a total 80ms latency would be acceptable with games.
Further areas of investigation and effort:
- Maybe clamp latency in the mmdevapi drivers after all? Huge latencies are known to cause deadlocks in apps, and we don't know what all apps or libraries built atop mmdevapi or winmm expect. (If yes, to what value? 150% buffer size?)
- Try and find MS setups with huge latencies. USB headphones are said to be candidates, but I've yet to see an excellent and trustworthy report about what happens in that case.
- Work on reducing the cascades of audio buffers
- Work on further decoupling mmdevapi periods from UNIX audio API ones
- Research the trade-off between glitch-free playback and buffering in UNIX. The current settings in Wine's audio are IMHO not good enough yet.
Regards, Jörg Höhle
 
Hey Joerg,
2012/6/26 Joerg-Cyril.Hoehle@t-systems.com:
[...]
Lowering latency is overrated: I can get 8 ms between play position and hardware queue length; you just need to be willing to patch dsound to do that. http://repo.or.cz/w/wine/multimedia.git/shortlog does just that, but there might be a reference leak somewhere in dsound; I haven't looked yet.
With events triggered by winepulse it seems to work, and a 20 ms buffer with 10 ms periods works fine. It's not really recommended on plain wine, though, since it lacks the rtkit patch to reduce jitter to close to 0 ms. winealsa lacks a proper event-driven mode; it just uses a floating timer based on nothing, so it would never be able to come close to that. However, dmix works fine if you only fill 3 periods. It would be trivial to fix winealsa for dmix to do the same, but that would probably cause it to fail tests, even though it wouldn't be against the spirit of the tests. Sometimes it's an error to pass tests. :-)
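The "only fill 3 periods" idea, as a rough C sketch. The function and parameter names are made up for illustration. How a driver would learn the currently queued amount is not shown, and this is not the code in the tree linked above.

```c
#include <assert.h>

/* Frames to hand to the device now so that at most max_periods periods
 * are queued downstream. queued is what the device already holds,
 * pending is what the app has made available. */
unsigned int frames_to_submit(unsigned int queued, unsigned int pending,
                              unsigned int period_frames,
                              unsigned int max_periods)
{
    unsigned int target, room;

    assert(period_frames > 0);
    target = period_frames * max_periods;
    if (queued >= target)
        return 0;               /* already full enough, write nothing */
    room = target - queued;
    return pending < room ? pending : room;
}
```

With 480-frame periods (10 ms at 48000 Hz) and a limit of 3 periods, a device holding 480 frames gets at most 960 more, however much the app has pending.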
~Maarten

