Jinoh Kang (@iamahuman) commented about dlls/windows.media.speech/recognizer.c:
+        goto cleanup;
+
+    if (SUCCEEDED(hr = IMMDevice_GetId(mm_device, &str)))
+    {
+        TRACE("selected capture device ID: %s\n", debugstr_w(str));
+        CoTaskMemFree(str);
+    }
+
+    if (FAILED(hr = IAudioClient_GetMixFormat(session->audio_client, (WAVEFORMATEX **)&wfx)))
+        goto cleanup;
+
+    wfx->wFormatTag = WAVE_FORMAT_PCM;
+    wfx->nChannels = 1;
+    wfx->nSamplesPerSec = 16000;
+    wfx->nBlockAlign = 2;
+    wfx->wBitsPerSample = 16;

Ditto. Something like (with appropriate defines):
```suggestion:-1+0
    wfx->wBitsPerSample = WINE_VOSK_BITS_PER_SAMPLE;
    wfx->nBlockAlign = (wfx->wBitsPerSample + 7) / 8 * wfx->nChannels;
```

-- 
https://gitlab.winehq.org/wine/wine/-/merge_requests/1948#note_21045
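
For reference, here is a minimal standalone sketch of the derivation suggested above. It is not the code from the merge request. `struct pcm_format`, `pcm_format_init`, `EXAMPLE_CHANNELS` and `EXAMPLE_SAMPLE_RATE` are hypothetical names used only for illustration. `WINE_VOSK_BITS_PER_SAMPLE` is the name from the suggestion, and its value of 16 is assumed from the hard-coded `wBitsPerSample` in the quoted patch. The average byte rate line follows the usual PCM convention (sample rate times block alignment) and is an addition of this sketch, not part of the suggestion.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, taken from the constants hard-coded in the quoted patch. */
#define WINE_VOSK_BITS_PER_SAMPLE 16
#define EXAMPLE_CHANNELS 1        /* hypothetical name */
#define EXAMPLE_SAMPLE_RATE 16000 /* hypothetical name */

/* Hypothetical stand-in for the WAVEFORMATEX fields touched by the patch. */
struct pcm_format
{
    uint16_t channels;
    uint32_t samples_per_sec;
    uint32_t avg_bytes_per_sec;
    uint16_t block_align;
    uint16_t bits_per_sample;
};

/* Fill in a PCM format, deriving the dependent fields instead of hard-coding them. */
static void pcm_format_init(struct pcm_format *fmt, uint16_t channels,
                            uint32_t rate, uint16_t bits)
{
    assert(channels > 0 && bits > 0);

    fmt->channels = channels;
    fmt->samples_per_sec = rate;
    fmt->bits_per_sample = bits;

    /* Round the sample size up to whole bytes, then multiply by the channel count. */
    fmt->block_align = (uint16_t)((bits + 7) / 8 * channels);

    /* Conventional PCM byte rate: frames per second times bytes per frame. */
    fmt->avg_bytes_per_sec = rate * fmt->block_align;
}
```

With 16 bits per sample and one channel, `block_align` comes out as 2, matching the value hard-coded in the patch, and the result stays consistent if the defines change later.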