Hi,
What are the areas that would have the most impact if fixed? You are invited to participate and share your thoughts.
I believe we need to distinguish winmm/dsound/mmdevapi/OS.
mmdevapi:
A. lock-less timer callback design, bug #. But I don't think that is where we would get the most improvement, just from avoiding a few EnterCriticalSection calls.
B. Stability of time base, bug #. Perhaps major. My render "worst case" test showed that CreateTimerQueue never invoked callbacks every 10ms as asked, but rather every 8 or 12ms. I don't know if that's the Linux jiffies we see here.
C. too small ALSA buffer for the backend. I've sketched a "hidden frames" design that would allow using a larger ALSA or OSS buffer, but that needs a reliable estimate of how much ALSA has buffered. Also, that's at odds with DSound and XAudio2 which want short latency and presumably don't send much data in advance.
Every audio HW guy recommends using audio buffers as large as possible. Can't we do that, just because of the f**ing 10ms period?
D. timer frequency. Is it really important to match native's 10ms period? The UNIX world is trying to decrease the number of interrupts in order to preserve battery life, but we go backwards and move from a model which dynamically computed the next wake-up based on the number of submitted frames to a tiny fixed period. Are we crazy?
E. lead-in aka. "ALSA won't start", bug #
F. lead-out aka. finish playing trailing frames not modulo period size, bug #
G. other mmdevapi
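[Editor's sketch for item B] One rough way to look at time-base stability on the host is to request a fixed 10ms period and record the intervals actually observed. The snippet below is only an illustration under stated assumptions: it uses Python's time.sleep and time.monotonic on the host, not CreateTimerQueue or Wine, and the names measure_periods and summarize are made up for this example.

```python
import time


def measure_periods(period_s=0.010, count=50):
    """Sleep towards fixed deadlines and return the observed intervals in ms."""
    intervals = []
    deadline = time.monotonic()
    last = deadline
    for _ in range(count):
        deadline += period_s                  # absolute deadline avoids drift
        delay = deadline - time.monotonic()
        if delay > 0:
            time.sleep(delay)
        now = time.monotonic()
        intervals.append((now - last) * 1000.0)
        last = now
    return intervals


def summarize(intervals_ms):
    """Return min, max and mean of a list of intervals (all in ms)."""
    return {
        "min": min(intervals_ms),
        "max": max(intervals_ms),
        "mean": sum(intervals_ms) / len(intervals_ms),
    }


if __name__ == "__main__":
    print(summarize(measure_periods()))
```

If min and max come out around 8 and 12 while the mean stays near 10, the scheduler is meeting the average rate but not the individual deadlines, which is the pattern described in item B.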
winmm:
H. GetPosition is not (yet) == mmdevapi's GetPosition
I. other winmm
DSound:
J. time base? It uses timeSetEvent. What if using mmdevapi's event?
K. buffer size too small or not matching whatever needs?
L. issues with GetCurrentPadding/GetPosition. The "true position" may be too far away from the write pointer, e.g. PA's typical 2s latency. Deal with that, matching DSound's and the apps' expectations as well as Linux/BSD/OSX sound systems.
M. DSound's underlying model is a ring buffer. This does not match mmdevapi's. Should we bypass mmdevapi because all it adds is latency? Reinvent HW acceleration? Provide a hidden API in mmdevapi?
N. other DSound
XAudio2: Modern apps will use that because mmdevapi is too low-level, presumably DSound usage will decrease.
O. XAudio2 appears to use the "worst case small period size writes" known from the Rage bug #
P. other
Capture:
Q. whatever capture issues
UNIX host:
R. thread priority -- no "Pro Audio" / Real-Time priority
S. reliability of event/interrupt delivery over sustained intervals (hours, not minutes).
T. Wine SetEvent & server round-trip times
Other:
U. FMOD & whatever, any particular constraints?
V. other?
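[Editor's sketch for item D] The difference between a fixed 10ms period and a wake-up computed from the number of submitted frames can be put into numbers. This is a minimal model, assuming a 48000 Hz stream and a hypothetical 20ms safety margin; none of the function names come from Wine.

```python
def ms_of_audio(frames, rate):
    """Duration in ms of `frames` frames at `rate` frames per second."""
    return frames * 1000.0 / rate


def next_wakeup_ms(queued_frames, rate, safety_ms=20.0):
    """Dynamic model: sleep until only `safety_ms` of audio is left queued."""
    return max(0.0, ms_of_audio(queued_frames, rate) - safety_ms)


def wakeups_per_second(period_ms):
    """Fixed-period model: number of timer callbacks per second."""
    return 1000.0 / period_ms


if __name__ == "__main__":
    print(next_wakeup_ms(9600, 48000))   # 200ms queued -> sleep 180.0 ms
    print(wakeups_per_second(10.0))      # fixed 10ms period -> 100.0 per second
```

With 200ms submitted, the dynamic model sleeps 180ms before it has to act, whereas the fixed period wakes the thread 100 times per second regardless of how much is queued.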
Thanks for your contribution, Jörg Höhle
On 26-01-12 12:59, Joerg-Cyril.Hoehle@t-systems.com wrote:
Hi,
What are the areas that would have the most impact if fixed? You are invited to participate and share your thoughts.
I believe we need to distinguish winmm/dsound/mmdevapi/OS.
mmdevapi:
A. lock-less timer callback design, bug #. But I don't think that is where we would get the most improvement, just from avoiding a few EnterCriticalSection calls.
B. Stability of time base, bug #. Perhaps major. My render "worst case" test showed that CreateTimerQueue never invoked callbacks every 10ms as asked, but rather every 8 or 12ms. I don't know if that's the Linux jiffies we see here.
C. too small ALSA buffer for the backend. I've sketched a "hidden frames" design that would allow using a larger ALSA or OSS buffer, but that needs a reliable estimate of how much ALSA has buffered. Also, that's at odds with DSound and XAudio2 which want short latency and presumably don't send much data in advance.
Every audio HW guy recommends using audio buffers as large as possible. Can't we do that, just because of the f**ing 10ms period?
Sorry, we don't control delivery, and can't tell applications how to behave... Just bite the bullet and add a winepulse driver already, I'll even fix mine to work better if it had a chance of getting accepted, not having a driver for the default linux audio system is just silly..
D. timer frequency. Is it really important to match native's 10ms period? The UNIX world is trying to decrease the number of interrupts in order to preserve battery life, but we go backwards and move from a model which dynamically computed the next wake-up based on the number of submitted frames to a tiny fixed period. Are we crazy?
E. lead-in aka. "ALSA won't start", bug #
F. lead-out aka. finish playing trailing frames not modulo period size, bug #
G. other mmdevapi
winmm:
H. GetPosition is not (yet) == mmdevapi's GetPosition
I. other winmm
DSound:
J. time base? It uses timeSetEvent. What if using mmdevapi's event?
K. buffer size too small or not matching whatever needs?
L. issues with GetCurrentPadding/GetPosition. The "true position" may be too far away from the write pointer, e.g. PA's typical 2s latency. Deal with that, matching DSound's and the apps' expectations as well as Linux/BSD/OSX sound systems.
See V.
M. DSound's underlying model is a ring buffer. This does not match mmdevapi's. Should we bypass mmdevapi because all it adds is latency? Reinvent HW acceleration? Provide a hidden API in mmdevapi?
No need, you know how mmdevapi behaves, you can write it in such a way without adding latency by using GetCurrentPadding or the clock.
N. other DSound
XAudio2: Modern apps will use that because mmdevapi is too low-level, presumably DSound usage will decrease.
O. XAudio2 appears to use the "worst case small period size writes" known from the Rage bug #
P. other
Capture:
Q. whatever capture issues
UNIX host:
R. thread priority -- no "Pro Audio" / Real-Time priority
Not going to happen, ever. AJ nuked all my attempts at it,
dbus+rtkit watchdog version is here: http://repo.or.cz/w/wine/multimedia.git/commit/431e943193d0d916a7bb6be32b0c2...
S. reliability of event/interrupt delivery over sustained intervals (hours, not minutes).
...?
T. Wine SetEvent & server round-trip times
This is indeed an awful case, but with the wineserver designed the way it is there's no other way around it, I honestly wouldn't be surprised if this is a performance issue on its own for some games...
Other:
U. FMOD & whatever, any particular constraints?
Last I checked (ages ago) fmod just worked. Might be different since last rewrite though..
V. other?
(rant in general) Stop trying to support pulseaudio with winealsa, with all the efforts you would have had a fully functioning driver by now. See my tree for a start, but it doesn't appear to work in extremely low latency cases (winepulse -> pulseaudio -> jack with jack set up for 40 * 3 samples buffer), need to look at it more first.
Thanks for your contribution, Jörg Höhle
~Maarten
Maarten Lankhorst wrote:
you know how mmdevapi behaves
Just bite the bullet and add a winepulse driver already, I'll even fix mine to work better if it had a chance of getting accepted
I have no objection to a winepulse driver. Indeed we (not just me, hopefully) now know a lot about how mmdevapi behaves, i.e. what the driver(s) should do.
I just won't write it. PulseAudio started with a promise that there would be no need for a rewrite of every audio app because it would smoothly integrate with ALSA, broke that promise with poor quality plug-ins and the end of the story is that every app was "enhanced" to natively talk to PulseAudio? What a shame. What a summed waste of every project's resources!
with all the efforts you would have had a fully functioning driver by now.
I believe that working around bugs in PulseAudio/alsa_plugs only cost part of my (and Andrew's) time. One major part was learning mmdevapi which I knew nothing about 8 months ago, mostly by writing tests. Then advance partly into DSound and winmm devices whereas before, I was simply happy in my MCI niche.
I really hate it that the imminent release is stressing me such that I don't find enough time to perform the usual amount of QA on my patches. After all, I'm doing this work as a hobby, why suffer stress here? I did not decide that the time is ripe for a release and IMHO there are still major audio issues (yet bugzilla mostly lists ancient ones).
Please go ahead and make a good pulse driver, based on what we *now* know about mmdevapi. It might have benefits, e.g. latency control and session management. Hopefully a working GetPosition and underrun handling or else I recommend to not even start.
IMHO, it would not have been reasonable six months ago to start with >3 drivers simultaneously. Now we know much more about the mmdevapi target, so it makes sense nowadays.
The more drivers, the more people are needed to support them. For instance, wineoss has not yet received all bug fixes that went into winealsa or winecoreaudio (e.g. GetNextPacketSize vs. GetCurrentPadding). Who will be there for maintenance?
Regards, Jörg Höhle
Maarten,
thank you for participating!
Provide a hidden API in mmdevapi?
No need, you know how mmdevapi behaves, you can write it in such a way without adding latency by using GetCurrentPadding or the clock.
I don't know what you mean. What I mean is as follows:
mmdevapi has no notion of rewinding. Rewinding is what PulseAudio recommends to master latency: http://0pointer.de/blog/projects/guide-to-sound-apis.html "Use snd_pcm_rewind() if you need to react to user input quickly. Do not assume that snd_pcm_rewind() is available and works and to which degree."
Let's say DSound uses a 200ms primary buffer. It can (and I believe Wine's DSound did that) mix all of the playing secondary buffers and feed 200ms of samples to ALSA. Effect: 200ms audio playing with nothing else to do.
Now let's suppose after 50ms, Play is invoked on an explosion buffer. DSound can query the hw pos and remix the remaining 150ms (or 140 to play safe) to include that noise. With ALSA, it can do so
- either by using snd_pcm_rewind or
- via direct access to the ALSA buffer (mmap).
The key point is that DSound's model (the HW buffer) is compatible with both feeding arbitrary large amounts of data in advance *and* quickly adding sounds with as little latency as possible.
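[Editor's sketch] The remix described above can be modelled in a few lines: the whole ring is already mixed and queued, and a late sound is added only to the part the hardware has not reached yet, leaving a small safety gap after the play cursor. This is a pure-Python model with 16-bit sample values in a list; it does not call snd_pcm_rewind or touch an mmap'ed ALSA buffer, and mix_into_ring and its parameters are hypothetical names.

```python
def clamp16(value):
    """Saturate to the signed 16-bit sample range."""
    return max(-32768, min(32767, value))


def mix_into_ring(ring, play_pos, sound, safety_frames):
    """Add `sound` into `ring` starting `safety_frames` after the play cursor.

    Returns (start_index, frames_mixed). At most one lap of the ring minus the
    safety gap is touched, so frames about to be played are never modified.
    """
    size = len(ring)
    start = (play_pos + safety_frames) % size
    count = min(len(sound), size - safety_frames)
    for i in range(count):
        idx = (start + i) % size              # wrap around the ring
        ring[idx] = clamp16(ring[idx] + sound[i])
    return start, count


if __name__ == "__main__":
    ring = [0] * 20            # think of each slot as 10ms, 200ms in total
    print(mix_into_ring(ring, play_pos=5, sound=[100] * 3, safety_frames=1))
    print(ring)
```

In the usage example the play cursor is at slot 5 (50ms), one slot is skipped to play safe, and the new sound lands in slots 6 to 8 even though the full 200ms had already been submitted.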
Enter mmdevapi. No rewinding. A Release'd frame will be played, unless you Stop and Reset.
If you Release 200ms of data now, additional samples can only be heard afterwards. The solution so far: write next to nothing in advance -- 10ms! -- and rely on super fast interrupt & wake-up to reliably submit another 10ms just in time.
Assessment: failure. Wobbling sound and underruns reported in bugzilla. No wonder MS sells w7 with new machines only. The old ones can't stand the 10ms interrupt rate.
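[Editor's sketch] The trade-off can be illustrated with a toy model: the device consumes audio while the feeder thread sleeps, and each wake-up submits exactly one period. With only 10ms queued ahead, a single late wake-up causes an underrun; with 200ms queued ahead it does not, but any newly started sound is then heard about 200ms late. The interval values below are invented for illustration and the function name is hypothetical.

```python
def count_underruns(intervals_ms, period_ms=10.0, lead_ms=10.0):
    """Count underruns for a feeder that submits one period per wake-up.

    intervals_ms: time that actually elapsed between consecutive wake-ups.
    lead_ms:      amount of audio queued ahead before the first interval.
    """
    buffered = lead_ms
    underruns = 0
    for elapsed in intervals_ms:
        buffered -= elapsed          # device consumed this much while we slept
        if buffered < 0:
            underruns += 1
            buffered = 0.0           # device ran dry; restart from empty
        buffered += period_ms        # the callback submits one period
    return underruns


if __name__ == "__main__":
    jittery = [8.0, 12.0, 10.0, 25.0, 10.0]   # one wake-up arrives 15ms late
    print(count_underruns(jittery, lead_ms=10.0))
    print(count_underruns(jittery, lead_ms=200.0))
```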
Alternatives:
A. Every DSound secondary buffer gets its mmdevapi stream. DSound::Play immediately calls IAC::Start.
Some say this is not usable because every mmdevapi stream maps 1:1 to snd_pcm_open and it's rumoured that cards would not support the amount of simultaneous connections corresponding to the number of secondary buffers that DSound apps typically use (rumour has it over 20). Again, DSound is not my domain of expertise.
That's why I've been arguing in bug #29531 that mmdevapi implements its own mixer. ALSA would only see one connection. How to implement that mixer? Now we circled once and are back at the above point about rewind/mmap.
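[Editor's sketch] A mixer of the kind argued for here would pull one period from every open stream, sum the samples and hand a single result to the backend, so the backend sees one connection no matter how many streams exist. The model below is a sketch under simplifying assumptions (mono, 16-bit values in Python lists, all streams at the same rate); Stream and mix_period are hypothetical names, not Wine code.

```python
class Stream:
    """A queued mono stream with its own read position."""

    def __init__(self, samples):
        self.samples = list(samples)
        self.pos = 0

    def read(self, frames):
        """Return `frames` samples, padded with silence once the data runs out."""
        chunk = self.samples[self.pos:self.pos + frames]
        self.pos += len(chunk)
        return chunk + [0] * (frames - len(chunk))


def mix_period(streams, frames):
    """Sum one period from every stream and saturate to 16 bits."""
    out = [0] * frames
    for stream in streams:
        for i, value in enumerate(stream.read(frames)):
            out[i] += value
    return [max(-32768, min(32767, v)) for v in out]


if __name__ == "__main__":
    a = Stream([1000, 2000, 3000])
    b = Stream([30000, 30000])
    print(mix_period([a, b], 2))
    print(mix_period([a, b], 2))
```

Note that this forward-only mixer still has the problem raised above: once a mixed period is handed to the backend, a stream started later cannot be added to it without rewind or mmap access.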
B. Special DSound hooks that bypass mmdevapi's streaming design, allowing to both write 200ms of data in advance and overwrite part of it as needed -- hidden API or COM interfaces.
C. Other?
Well, that was the old DSound. XAudio2 ("the rage bug #28723") apparently goes the "every 10ms" route, hence Wine needs to support that in any case.
I've repeatedly argued that the fact that an app may use 10ms (and risk gfx and sfx glitches and write 3GHz quadcore as min. requirement on the back of the DVD cover) should not cause apps using 200ms buffers to suffer glitches. How to meet both ends?
Regards, Jörg Höhle
Hey Joerg,
On 27-01-12 16:35, Joerg-Cyril.Hoehle@t-systems.com wrote:
Maarten,
thank you for participating!
Provide a hidden API in mmdevapi?
No need, you know how mmdevapi behaves, you can write it in such a way without adding latency by using GetCurrentPadding or the clock.
I don't know what you mean. What I mean is as follows:
mmdevapi has no notion of rewinding. Rewinding is what PulseAudio recommends to master latency: http://0pointer.de/blog/projects/guide-to-sound-apis.html "Use snd_pcm_rewind() if you need to react to user input quickly. Do not assume that snd_pcm_rewind() is available and works and to which degree."
Let's say DSound uses a 200ms primary buffer. It can (and I believe Wine's DSound did that) mix all of the playing secondary buffers and feed 200ms of samples to ALSA. Effect: 200ms audio playing with nothing else to do.
Now let's suppose after 50ms, Play is invoked on an explosion buffer. DSound can query the hw pos and remix the remaining 150ms (or 140 to play safe) to include that noise. With ALSA, it can do so
- either by using snd_pcm_rewind or
- via direct access to the ALSA buffer (mmap).
The key point is that DSound's model (the HW buffer) is compatible with both feeding arbitrary large amounts of data in advance *and* quickly adding sounds with as little latency as possible.
Enter mmdevapi. No rewinding. A Release'd frame will be played, unless you Stop and Reset.
If you Release 200ms of data now, additional samples can only be heard afterwards. The solution so far: write next to nothing in advance -- 10ms! -- and rely on super fast interrupt & wake-up to reliably submit another 10ms just in time.
Assessment: failure. Wobbling sound and underruns reported in bugzilla. No wonder MS sells w7 with new machines only. The old ones can't stand the 10ms interrupt rate.
I'm fairly confident older machines handle that properly too. Just requires properly written drivers. Windows vista was a lot more bloated than 7. I wouldn't count on rewinding for alsa either, I'm not even sure if dmix finally supports it or not, buffer slightly more data instead. I think this is what IAudioClient::GetStreamLatency is meant for.
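[Editor's sketch] One reading of "buffer slightly more data instead" is to keep the amount of queued audio near a fixed target rather than filling the whole buffer: on each wake-up, query the padding and write only what is needed to get back to the target. The model below treats the values that IAudioClient::GetBufferSize and IAudioClient::GetCurrentPadding would return as plain integers; target_frames is a hypothetical tuning parameter and frames_to_write is not an existing Wine function.

```python
def frames_to_write(buffer_frames, padding_frames, target_frames):
    """Frames to submit so that queued audio returns to `target_frames`.

    buffer_frames:  total buffer size in frames
    padding_frames: frames currently queued and not yet played
    target_frames:  desired amount of queued audio (bounds the added latency)
    """
    free = buffer_frames - padding_frames     # room left in the buffer
    want = target_frames - padding_frames     # shortfall relative to the target
    return max(0, min(free, want))


if __name__ == "__main__":
    # 48 kHz: 200ms buffer, 10ms still queued, aim for 20ms queued.
    print(frames_to_write(9600, 480, 960))
```

A new sound submitted this way is delayed by roughly the target, not by the full buffer length, at the price of less slack when a wake-up arrives late.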
Alternatives:
A. Every DSound secondary buffer gets its mmdevapi stream. DSound::Play immediately calls IAC::Start.
Some say this is not usable because every mmdevapi stream maps 1:1 to snd_pcm_open and it's rumoured that cards would not support the amount of simultaneous connections corresponding to the number of secondary buffers that DSound apps typically use (rumour has it over 20). Again, DSound is not my domain of expertise.
Doesn't work on streams where you can alter the frequency..
That's why I've been arguing in bug #29531 that mmdevapi implements its own mixer. ALSA would only see one connection. How to implement that mixer? Now we circled once and are back at the above point about rewind/mmap.
B. Special DSound hooks that bypass mmdevapi's streaming design, allowing to both write 200ms of data in advance and overwrite part of it as needed -- hidden API or COM interfaces.
C. Other?
Well, that was the old DSound. XAudio2 ("the rage bug #28723") apparently goes the "every 10ms" route, hence Wine needs to support that in any case.
I've repeatedly argued that the fact that an app may use 10ms (and risk gfx and sfx glitches and write 3GHz quadcore as min. requirement on the back of the DVD cover) should not cause apps using 200ms buffers to suffer glitches. How to meet both ends?
Or just fix things properly and use rtkit for time critical threads, the real problem would be context switches with wineserver. The extra task switches and reliance on the scheduler to get things right could cause more problems..
~Maarten