Thanks Alexander. Thoughts below...
On Sat, May 19, 2012 at 09:09:35PM +0600, Alexander E. Patrakov wrote:
> There are two ways to implement a high-performance resampler, and I have prepared (conflicting, pick no more than one) patches for both:
>
> 1 (this patch): Use a shorter FIR with the existing code. This has the advantage of higher quality (unwanted frequencies are at least attempted to be rejected) and almost no new code.
>
> 2 (the other patch): Write new code. E.g., linear interpolation. This is what Windows XP does at its lowest quality setting, and it eats less CPU than variant 1.
>
> Do you have an opinion on which of these patches to use? The low-quality FIR has the advantage of not introducing another codepath. On the other hand, the linear resampler codepath is very simple, and even easier on the CPU.
I'm leaning towards the linear resampler for its larger CPU usage benefits.
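To make the cost difference concrete, here is a minimal sketch of linear-interpolation resampling. It is written in Python purely for illustration and is not the code from either patch; the function name and signature are hypothetical. Each output sample costs one multiply and a few additions, whereas a FIR needs one multiply-add per tap.

```python
def resample_linear(samples, in_rate, out_rate):
    """Resample a mono sequence by interpolating between neighbouring samples.

    Hypothetical helper for illustration only; not taken from the patches.
    """
    if not samples:
        return []
    # Number of output samples covering the same duration as the input.
    n_out = len(samples) * out_rate // in_rate
    out = []
    for i in range(n_out):
        # Position of output sample i on the input sample axis.
        pos = i * in_rate / out_rate
        idx = int(pos)
        frac = pos - idx
        a = samples[idx]
        # Hold the last sample when the right-hand neighbour is past the end.
        b = samples[idx + 1] if idx + 1 < len(samples) else a
        out.append(a + (b - a) * frac)
    return out


# Doubling the rate of a two-sample ramp.
print(resample_linear([0.0, 1.0], 1, 2))
```

The price is quality: nothing here filters out frequencies above the new Nyquist limit, which is exactly the trade-off against the short FIR.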
> Also note that, as evidenced by the debugging patch, a Core 2 Duo E6420 @ 2.13 GHz _can_ resample more than 32 streams simultaneously from various weird rates to 48000 Hz. As GTA:SA reportedly creates only 16 secondary buffers, it _should_ have more than enough CPU time to mix them. IMHO, this makes bug #30639 look somewhat strange: on GyB's computer, GTA:SA stutters, while Darwinia (which looks more demanding about sound) doesn't. It may well be that in fact none of my patches are needed, and that the real bug is that the CPU-intensive cp_fields() function is called from the wrong thread or process. I don't have the expertise needed to debug this.
I did some research on this. Darwinia creates up to 32 buffers, like you said. GTA:SA creates and destroys buffers as needed, and I saw it go as high as 31 in a quick test. Darwinia's buffer frequencies range from 40 to 90 kHz and are resampled to 22050 Hz, while GTA:SA's range from about 10 to 20 kHz and are resampled to 48 kHz.
So in each time step, GTA:SA requires about 1000-2000 get_current_sample() calls, but 4800 FIR convolutions per buffer.
Darwinia requires 4000-9000 get_current_sample() calls, but only about 2200 convolutions per buffer.
I suspect the convolutions are considerably more expensive than the get_current_sample() calls, so I would actually expect GTA:SA to be the more taxing of the two on the CPU. That should explain what's going on here.
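The per-buffer figures above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the 0.1 s step is an assumption inferred from the numbers (48000 Hz giving 4800 convolutions), not something stated in the code, and the helper name is made up. The step length cancels out when comparing the two games anyway.

```python
# Assumption: step length inferred from "48 kHz -> 4800 convolutions".
STEP_SECONDS = 0.1


def work_per_buffer(in_rate_hz, out_rate_hz, step=STEP_SECONDS):
    """Return (input sample fetches, output convolutions) for one buffer per step.

    Hypothetical helper: fetches scale with the buffer's own rate,
    convolutions scale with the output (primary buffer) rate.
    """
    return round(in_rate_hz * step), round(out_rate_hz * step)


# Input rate ranges and output rates as observed in the quick tests.
games = [
    ("GTA:SA", (10_000, 20_000), 48_000),
    ("Darwinia", (40_000, 90_000), 22_050),
]

for name, in_rates, out_rate in games:
    for in_rate in in_rates:
        fetches, convolutions = work_per_buffer(in_rate, out_rate)
        print(f"{name}: {in_rate} Hz -> {out_rate} Hz: "
              f"{fetches} fetches, {convolutions} convolutions")
```

If a convolution with N taps costs roughly N times a single fetch, the convolution count dominates, and GTA:SA does a bit more than twice as many of them per buffer as Darwinia.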
We could test this on GyB's machine by setting DefaultSampleRate=22050 and hacking <dlls/dsound/primary.c:primarybuffer_SetFormat> to return S_OK without actually changing the primary buffer's format. That should give GTA:SA similar cp_fields performance to Darwinia, and I expect it would fix the lag issue.
Andrew