https://bugs.winehq.org/show_bug.cgi?id=46725
--- Comment #14 from Paul Gofman <gofmanp@gmail.com> ---
(In reply to Ethan Lee from comment #13)
> That all sounds correct - where things get complicated is WASAPI shared mode's method of calculating the buffer size. It's actually a lot like XAudio2 itself where the sound server is running at a specific frequency, and then when you open the mastering voice with another frequency, it has to calculate the frames needed to fit the server's quantum size.

I might very well be missing some basics here, but are you sure it has anything to do with devices at all on Windows? What I observe in my tests is that the processing buffer length depends only on the frequency specified in the CreateSubmixVoice() call. So if the device frequency is different, Windows XAudio2 is probably doing some resampling / rebuffering?
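
To make the arithmetic under discussion concrete, here is a rough sketch in C. It assumes the 10 ms processing quantum that xaudio2.h defines (XAUDIO2_QUANTUM_NUMERATOR / XAUDIO2_QUANTUM_DENOMINATOR = 1/100 s). The helper names frames_per_quantum() and frames_to_cover() are hypothetical and are not taken from Wine, FAudio or XAudio2, and the round-up behaviour is an assumption rather than something confirmed against native.

```c
/* Assumption: one processing quantum is 1/100 s (10 ms), as in xaudio2.h. */
#define QUANTUM_DENOMINATOR 100u

/* Frames in one quantum at rate_hz, depending only on the voice rate.
 * Rounding up for rates not divisible by 100 is an assumption. */
static unsigned int frames_per_quantum(unsigned int rate_hz)
{
    return (rate_hz + QUANTUM_DENOMINATOR - 1) / QUANTUM_DENOMINATOR;
}

/* Frames at voice_rate needed to cover server_frames frames at server_rate,
 * rounded up. This is the "fit the server's quantum" variant. */
static unsigned int frames_to_cover(unsigned int server_frames,
                                    unsigned int server_rate,
                                    unsigned int voice_rate)
{
    unsigned long long n = (unsigned long long)server_frames * voice_rate;
    return (unsigned int)((n + server_rate - 1) / server_rate);
}
```

With these definitions frames_per_quantum(48000) gives 480 and frames_per_quantum(44100) gives 441 regardless of the device rate. frames_to_cover(448, 44100, 48000) gives 488, so a device period that is not exactly 10 ms would yield a different buffer length under the second model. Comparing the two against the lengths seen in the test should show which model native follows.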