This implements a number of optimizations, in particular:

- Swapping the resampling loops when downsampling, which lets the FIR step stay fixed regardless of the resampling ratio.
- Rearranging the FIR array elements to make the access sequential.
- Adding SSE versions of the resampling functions.

Together, these reduce `cp_fields_resample` execution time by more than 5x.

Resampling quality should be unchanged or slightly better, thanks to a more precise `rem` calculation and the removal of the FIR step rounding, although I have not measured it yet.

--
v4: dsound: Get all channel samples in one go.
    dsound: Put all channel samples in one go.
    dsound: Get rid of get_aux and call the functions directly.
    dsound: Get rid of put_aux and call the functions directly.
    dsound: Add a 32-bit AVX+FMA3 version of downsample.
    dsound: Add a 32-bit AVX+FMA3 version of upsample.
    dsound: Add a 32-bit SSE version of downsample.
    dsound: Add a 32-bit SSE version of upsample.
    dsound: Use #define for fir.h constants.
    dsound: Use a 0.32 fixed point number to represent the resampling ratio.
    dsound: Replace multiplications by fir_step and fir_width with bit shifts.
    dsound: Premultiply the input value by firgain and the interpolation weights in downsample.
    dsound: Transpose the FIR array to make the element access sequential.
    dsound: Calculate firgain more accurately.
    dsound: Calculate required_input more accurately.
    dsound: Swap around the two nested loops in downsample.
    dsound: Don't invert the remainder twice in upsample.
    dsound: Use a fixed upsampling loop boundary.
    dsound: Don't pass dsbfirstep to upsample.
    dsound: Don't apply firgain in upsample.
    dsound: Split resample into separate downsample and upsample functions.
    dsound: Factor out resampling.
    dsound: Remove asserts from the resampling loop.
    dsound: Resample into a temporary buffer.
    dsound: Resample one channel at a time.
    dsound: Get rid of fir_copy.
    dsound: Use signed int to calculate indices during resampling.
    dsound: Multiply by dsbfirstep after calculating the modulus.
    dsound: Use the modulus operator instead of divide-multiply-subtract.
    dsound: Do the subtraction before converting to float to improve rem precision.
    dsound: Don't use double-precision arithmetic in the resampler.
    dsound: Remove the unused freqneeded field.
    dsound: Use a better FIR filter generated with Parks-McClellan algorithm.

This merge request has too many patches to be relayed via email.
Please visit the URL below to see the contents of the merge request.
https://gitlab.winehq.org/wine/wine/-/merge_requests/9928
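
For readers who want a concrete picture of the FIR array rearrangement mentioned in the description, here is a minimal sketch. It is not the code from the merge request: the sizes `FIR_STEP` and `FIR_WIDTH`, the function names, and the exact index formulas are all assumptions made for illustration. It only shows the general idea that transposing a polyphase filter table turns a strided walk over the taps of one phase into a sequential one, which is also the access pattern that SIMD loads prefer.

```c
#include <assert.h>

/* Hypothetical sizes for illustration; not the values from dsound's fir.h. */
#define FIR_STEP  4   /* number of filter phases (sub-sample positions) */
#define FIR_WIDTH 3   /* number of taps used per output sample */

/* Assumed original layout: tap t of phase p is at index t * FIR_STEP + p,
 * so reading the taps of one phase jumps FIR_STEP elements each time. */
static float dot_strided(const float *fir, const float *in, int phase)
{
    float sum = 0.0f;
    assert(phase >= 0 && phase < FIR_STEP);
    for (int t = 0; t < FIR_WIDTH; t++)
        sum += fir[t * FIR_STEP + phase] * in[t];
    return sum;
}

/* Build the transposed table: tap t of phase p moves to p * FIR_WIDTH + t. */
static void transpose_fir(const float *fir, float *fir_t)
{
    for (int p = 0; p < FIR_STEP; p++)
        for (int t = 0; t < FIR_WIDTH; t++)
            fir_t[p * FIR_WIDTH + t] = fir[t * FIR_STEP + p];
}

/* With the transposed table, the taps of one phase are contiguous. */
static float dot_sequential(const float *fir_t, const float *in, int phase)
{
    float sum = 0.0f;
    assert(phase >= 0 && phase < FIR_STEP);
    const float *row = fir_t + phase * FIR_WIDTH;
    for (int t = 0; t < FIR_WIDTH; t++)
        sum += row[t] * in[t];
    return sum;
}

/* Returns 1 if both layouts produce the same sum for every phase. */
static int layouts_agree(void)
{
    float fir[FIR_STEP * FIR_WIDTH], fir_t[FIR_STEP * FIR_WIDTH];
    const float in[FIR_WIDTH] = { 1.0f, 2.0f, 3.0f };

    for (int i = 0; i < FIR_STEP * FIR_WIDTH; i++)
        fir[i] = (float)i;  /* small integers, exactly representable */
    transpose_fir(fir, fir_t);

    for (int p = 0; p < FIR_STEP; p++)
        if (dot_strided(fir, in, p) != dot_sequential(fir_t, in, p))
            return 0;
    return 1;
}
```

Calling `layouts_agree()` checks that the two layouts give the same filter output, so in this sketch the transposition changes only the memory access pattern and not the result.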