It turns out that WinHttpReceiveResponse() completes synchronously in async mode (unless a recursive request for handling authorization or a redirect is involved). Some apps depend on that and do not wait for WINHTTP_CALLBACK_STATUS_HEADERS_AVAILABLE: they call WinHttpQueryHeaders() or WinHttpWebSocketCompleteUpgrade() right after WinHttpReceiveResponse() and rely on it having finished synchronously.
My initial out-of-tree testing shows that no network communication is performed during the WinHttpReceiveResponse() call (when no recursive request is involved). I tested that by inserting a wait between WinHttpSendRequest and WinHttpReceiveResponse and disabling the network connection during the wait. WinHttpReceiveResponse still succeeds on Windows.
I think the above means that the response is actually received from the server during WinHttpSendRequest. WinHttpReceiveResponse is not a complete no-op, however. As shown by the existing tests, the notifications related to receiving the response are still delivered during WinHttpReceiveResponse (albeit in the same thread). WinHttpReceiveResponse also affects the request state: querying headers or upgrading to a websocket without calling WinHttpReceiveResponse does not succeed.
When a redirect is involved, the WINHTTP_CALLBACK_STATUS_RECEIVING_RESPONSE, WINHTTP_CALLBACK_STATUS_RESPONSE_RECEIVED and WINHTTP_CALLBACK_STATUS_REDIRECT notifications are all delivered synchronously from the calling thread. Then the send and receive notifications for the new request are delivered from a new thread.
An interesting case is when WinHttpReceiveResponse is called from SendRequest callbacks in async mode. If WinHttpReceiveResponse is called from WINHTTP_CALLBACK_STATUS_SENDING_REQUEST or WINHTTP_CALLBACK_STATUS_REQUEST_SENT (i.e., when the request is not complete yet), it unexpectedly succeeds and shows the following message sequence (which is partially reflected in the tests I am adding):
- calling WinHttpReceiveResponse from WINHTTP_CALLBACK_STATUS_SENDING_REQUEST (which is already called on the async thread on Win10, thread A). Win8 queues WINHTTP_CALLBACK_STATUS_SENDING_REQUEST synchronously and goes async a bit later.
- WINHTTP_CALLBACK_STATUS_RECEIVING_RESPONSE, thread A;
- WinHttpReceiveResponse() returns to the caller WINHTTP_CALLBACK_STATUS_REQUEST_SENT callback; returning from user callback;
- WINHTTP_CALLBACK_STATUS_REQUEST_SENT, thread A;
- WINHTTP_CALLBACK_STATUS_SENDREQUEST_COMPLETE (another thread, although the sequence is probably synced; I am not implementing this part and am calling this callback from the same thread A);
- WINHTTP_CALLBACK_STATUS_RESPONSE_RECEIVED in thread A;
- WINHTTP_CALLBACK_STATUS_HEADERS_AVAILABLE in thread A.
So the receive_response() state RECEIVE_RESPONSE_SEND_INCOMPLETE is primarily needed to handle this case.
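The ordering above can be modelled with a rough, portable sketch. This is a toy model and not the actual Wine code: only the state name RECEIVE_RESPONSE_SEND_INCOMPLETE comes from the description, while every `toy_*` name, the other two states and the single-threaded flow are assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

/* Toy model only; it mirrors the notification order described above, not Wine's code. */
enum toy_receive_state
{
    TOY_RECEIVE_NOT_CALLED,           /* hypothetical */
    RECEIVE_RESPONSE_SEND_INCOMPLETE, /* name taken from the description */
    TOY_RECEIVE_DONE,                 /* hypothetical */
};

struct toy_request
{
    int send_complete;
    enum toy_receive_state state;
    const char *log[16];              /* notifications in delivery order */
    int count;
};

typedef void (*toy_callback)( struct toy_request *req );

static void toy_notify( struct toy_request *req, const char *status )
{
    assert( req->count < 16 );
    req->log[req->count++] = status;
}

static int toy_log_is( const struct toy_request *req, int idx, const char *status )
{
    return idx < req->count && !strcmp( req->log[idx], status );
}

static void toy_finish_receive( struct toy_request *req )
{
    toy_notify( req, "RESPONSE_RECEIVED" );
    toy_notify( req, "HEADERS_AVAILABLE" );
    req->state = TOY_RECEIVE_DONE;
}

/* Stands in for WinHttpReceiveResponse(). */
static void toy_receive_response( struct toy_request *req )
{
    toy_notify( req, "RECEIVING_RESPONSE" );
    if (!req->send_complete)
    {
        /* Called from a send callback: remember it and finish after the send. */
        req->state = RECEIVE_RESPONSE_SEND_INCOMPLETE;
        return;
    }
    toy_finish_receive( req );
}

/* Stands in for the async part of WinHttpSendRequest(); on_sending plays the
 * role of the user callback for WINHTTP_CALLBACK_STATUS_SENDING_REQUEST. */
static void toy_send_request( struct toy_request *req, toy_callback on_sending )
{
    toy_notify( req, "SENDING_REQUEST" );
    if (on_sending) on_sending( req );
    toy_notify( req, "REQUEST_SENT" );
    req->send_complete = 1;
    toy_notify( req, "SENDREQUEST_COMPLETE" );
    if (req->state == RECEIVE_RESPONSE_SEND_INCOMPLETE) toy_finish_receive( req );
}
```

Calling `toy_send_request( &req, toy_receive_response )` on a zero-initialized request records the same order as the list above. Calling `toy_send_request( &req, NULL )` followed by `toy_receive_response( &req )` models the usual case, where the receive notifications are delivered from within the receive call itself.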
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/1582
--
v5: ntdll: Inline __wine_unix_call(_fast) dispatch in the syscall dispatcher.
ntdll: Restore frame in return path of the x86 syscall dispatchers.
winecrt0: Inline PE __wine_unix_call(_fast) function calls.
ntdll: Only save non-volatile FPU registers for -nofpu syscalls.
opengl32: Use __wine_unix_call_fast instead of __wine_unix_call.
ntdll: Introduce a new __wine_unix_call_fast syscall.
ntdll: Use -nofpu for NtQueryPerformanceCounter and NtYieldExecution.
winebuild: Introduce a new -nofpu syscall spec flag.
ntdll: Add support for syscall flags in the service CounterTable.
ntdll: Avoid double indirection to get x86_64 syscall_frame pointer.
ntdll: Check SYSCALL_HAVE_WRFSGSBASE syscall flag only for wrfsbase.
ntdll: Swap %eax and %edx registers in the i386 syscall dispatcher.
ntdll: Check syscall table and syscall number before saving FPU.
ntdll: Use named labels for jumps in the syscall dispatcher.
https://gitlab.winehq.org/wine/wine/-/merge_requests/1324
In preparation for https://gitlab.winehq.org/wine/wine/-/merge_requests/1324.
This also begins preparation for a slightly different route than the one the MR currently takes, with syscall flags eventually stored in the CounterTable rather than overloading the unused bits of the syscall number.
To do that, we check the syscall number and load the syscall table (keeping it in %rbx/%ebx) earlier. This assumes that %rbx isn't modified in between, for instance by the eventual `SYS_arch_prctl` syscall, but I believe that is the case?
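As a rough illustration of the idea in C (the real dispatcher is assembly, and this is not its code): the syscall number is validated and the table selected first, so that per-syscall flags can be read from the table before deciding whether to save the FPU state. All `toy_*` names, the `TOY_SYSCALL_FLAG_NOFPU` value, the flag storage layout and the bit split of the syscall number are assumptions for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bit, for illustration only. */
#define TOY_SYSCALL_FLAG_NOFPU 0x1u

/* Toy stand-in for a service table; the layout is illustrative. */
struct toy_service_table
{
    void         **ServiceTable;  /* one function pointer per syscall */
    unsigned int  *CounterTable;  /* assumed here to hold per-syscall flags */
    unsigned int   ServiceLimit;  /* number of valid syscalls in this table */
};

/* Look up a syscall in an array of four tables.
 * Returns 0 and fills *func and *flags on success, -1 if the number is invalid. */
static int toy_lookup_syscall( const struct toy_service_table *tables, unsigned int num,
                               void **func, unsigned int *flags )
{
    unsigned int table_idx = (num >> 12) & 3;  /* assumed: bits 12-13 select the table */
    unsigned int id = num & 0xfff;             /* assumed: low 12 bits index into it */
    const struct toy_service_table *table;

    assert( tables != NULL );
    table = &tables[table_idx];

    /* Check the number before doing any expensive work such as saving FPU state. */
    if (id >= table->ServiceLimit) return -1;

    *func = table->ServiceTable[id];
    *flags = table->CounterTable ? table->CounterTable[id] : 0;
    return 0;
}
```

With flags kept next to the table like this, the syscall number itself needs no extra bits, and a caller could skip the FPU save when `flags & TOY_SYSCALL_FLAG_NOFPU` is set.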
--
v2: ntdll: Check SYSCALL_HAVE_WRFSGSBASE syscall flag only for wrfsbase.
ntdll: Swap %eax and %edx registers in the i386 syscall dispatcher.
ntdll: Check syscall table and syscall number before saving FPU.
ntdll: Use named labels for jumps in the syscall dispatcher.
https://gitlab.winehq.org/wine/wine/-/merge_requests/1437
The latter is not guaranteed to be equal across events, and in practice may not be.
--
v2: winebus.sys: Search for added devices by devnode path in process_monitor_event().
https://gitlab.winehq.org/wine/wine/-/merge_requests/1510