From: Jinoh Kang jinoh.kang.kr@gmail.com
Ensure proper atomic update, so that the application do not read stale information from IO_STATUS_BLOCK.
Also, ensure IOSB Status loads are not reordered with Information loads from the consumer side as well, specifically in the following functions:
- GetOverlappedResultEx - WSAGetOverlappedResult --- dlls/kernelbase/file.c | 62 ++++++++++++++++++++++++++++- dlls/ntdll/unix/unix_private.h | 71 ++++++++++++++++++++++++++++++++-- dlls/ws2_32/socket.c | 62 ++++++++++++++++++++++++++++- 3 files changed, 190 insertions(+), 5 deletions(-)
diff --git a/dlls/kernelbase/file.c b/dlls/kernelbase/file.c index 3f3c3649c31..6034e744d81 100644 --- a/dlls/kernelbase/file.c +++ b/dlls/kernelbase/file.c @@ -3160,7 +3160,67 @@ BOOL WINAPI DECLSPEC_HOTPATCH GetOverlappedResultEx( HANDLE file, OVERLAPPED *ov
TRACE( "(%p %p %p %lu %d)\n", file, overlapped, result, timeout, alertable );
- status = io->iosb.u.Status; + /* + * When a system procedure that peforms an overlapped I/O operation returns + * before the I/O has completed, the system sets the Status field of the + * associated (passed as an argument) IO_STATUS_BLOCK to STATUS_PENDING to + * indicate that it is still pending. After the I/O operation is + * completed, the system notifies the application of the I/O completion + * through the following steps: + * + * 1. The output buffer, if any, is populated with the data retrieved from + * the I/O operation. + * + * 2. The Information field of the associated IO_STATUS_BLOCK is written to + * indicate e.g. the number of bytes transferred. + * + * 3. The Status field of the IO_STATUS_BLOCK is written to indicate the + * status of the I/O operation. + * + * 4. Finally, the system signals the application via synchronization APIs + * (waitable objects), I/O completion ports, and/or APCs. + * + * Therefore, if the Status field is set to any value other than + * STATUS_PENDING, we can safely assume that the I/O operation has + * completed. + * + * Here is a catch: memory accesses (read and write) can be reordered by the + * compiler or a processor using weak memory ordering, which affects the + * memory values observed by other threads running on different processors. + * Specifically, it's possible that the reads from the Status field gets + * reordered *after* the reads from the Information field. If we race with + * I/O completion, this results in the application reading a stale + * Information value. In fact, any memory operation (read or write) in the + * current thread could be reordered before the Status read in theory. + * + * CPU 0 (set_async_iosb) CPU 1 (GetOverlappedResultEx) + * =========================== =========================== + * READ(Information) <-- (reordered!) + * WRITE(Information) + * WRITE(Status) + * READ(Status) + * + * In order to prevent such data races, we place appropriate barriers to + * limit reordering so that overlapped I/O operations work correctly in + * multi-threaded applications. A release barrier placed just before the + * Status write is paired with an acquire barrier placed just after the + * Status read. This ensures that all writes from the completed I/O + * operation are seen after GetOverlappedResultEx observes the completion + * (i.e. Status != STATUS_PENDING). + * + * CPU 0 (set_async_iosb) CPU 1 (GetOverlappedResultEx) + * =========================== =========================== + * WRITE(Information) + * <release barrier> <-----------. + * WRITE(Status) | + * | + * | READ(Status) + * (paired with) `-> <acquire barrier> + * READ(Information) + * + * Paired with write-release of IOSB Status in ntdll's set_async_iosb. + */ + status = ReadAcquire( &io->iosb.u.Status ); if (status == STATUS_PENDING) { if (!timeout) diff --git a/dlls/ntdll/unix/unix_private.h b/dlls/ntdll/unix/unix_private.h index 73b9ed76de0..46854e2135c 100644 --- a/dlls/ntdll/unix/unix_private.h +++ b/dlls/ntdll/unix/unix_private.h @@ -361,6 +361,8 @@ static inline BOOL in_wow64_call(void)
static inline void set_async_iosb( client_ptr_t iosb, NTSTATUS status, ULONG_PTR info ) { + NTSTATUS *status_ptr; + if (!iosb) return;
if (in_wow64_call()) @@ -370,19 +372,82 @@ static inline void set_async_iosb( client_ptr_t iosb, NTSTATUS status, ULONG_PTR NTSTATUS Status; ULONG Information; } *io = wine_server_get_ptr( iosb ); - io->Status = status; + status_ptr = &io->Status; io->Information = info; } else { IO_STATUS_BLOCK *io = wine_server_get_ptr( iosb ); #ifdef NONAMELESSUNION - io->u.Status = status; + status_ptr = &io->u.Status; #else - io->Status = status; + status_ptr = &io->Status; #endif io->Information = info; } + + /* + * When a system procedure that peforms an overlapped I/O operation returns + * before the I/O has completed, the system sets the Status field of the + * associated (passed as an argument) IO_STATUS_BLOCK to STATUS_PENDING to + * indicate that it is still pending. After the I/O operation is + * completed, the system notifies the application of the I/O completion + * through the following steps: + * + * 1. The output buffer, if any, is populated with the data retrieved from + * the I/O operation. + * + * 2. The Information field of the associated IO_STATUS_BLOCK is written to + * indicate e.g. the number of bytes transferred. + * + * 3. The Status field of the IO_STATUS_BLOCK is written to indicate the + * status of the I/O operation. + * + * 4. Finally, the system signals the application via synchronization APIs + * (waitable objects), I/O completion ports, and/or APCs. + * + * Therefore, if the Status field is set to any value other than + * STATUS_PENDING, GetOverlappedResultEx can safely assume that the I/O + * operation has completed. + * + * Here is a catch: memory accesses (read and write) can be reordered by the + * compiler or a processor using weak memory ordering, which affects the + * memory values observed by other threads running on different processors. + * Specifically, it's possible that the writes to the Status field gets + * reordered *before* the writes to the Information field. If we race with + * calls to GetOverlappedResultEx or WSAGetOverlappedResultx, this results + * in the application reading a stale Information value. In fact, any + * memory operation (read or write) in the current thread could be reordered + * after the Status write in theory. + * + * CPU 0 (set_async_iosb) CPU 1 (GetOverlappedResult) + * =========================== =========================== + * WRITE(Status) + * READ(Status) + * READ(Information) <-- (stale read!) + * WRITE(Information) <-- (reordered from above) + * + * In order to prevent such data races, we place appropriate barriers to + * limit reordering so that overlapped I/O operations work correctly in + * multi-threaded applications. A release barrier placed just before the + * Status write is paired with an acquire barrier placed just after the + * Status read. This ensures that all writes from the completed I/O + * operation are seen after GetOverlappedResultEx observes the completion + * (i.e. Status != STATUS_PENDING). + * + * CPU 0 (set_async_iosb) CPU 1 (GetOverlappedResult) + * =========================== =========================== + * WRITE(Information) + * <release barrier> <-----------. + * WRITE(Status) | + * | + * | READ(Status) + * (paired with) `-> <acquire barrier> + * READ(Information) + * + * Paired with GetOverlappedResultEx and WSAGetOverlappedResult. + */ + WriteRelease( status_ptr, status ); }
static inline client_ptr_t iosb_client_ptr( IO_STATUS_BLOCK *io ) diff --git a/dlls/ws2_32/socket.c b/dlls/ws2_32/socket.c index be491a32b97..ef65f887c6a 100644 --- a/dlls/ws2_32/socket.c +++ b/dlls/ws2_32/socket.c @@ -3706,7 +3706,67 @@ BOOL WINAPI WSAGetOverlappedResult( SOCKET s, LPWSAOVERLAPPED lpOverlapped, return FALSE; }
- status = io->iosb.u.Status; + /* + * When a system procedure that peforms an overlapped I/O operation returns + * before the I/O has completed, the system sets the Status field of the + * associated (passed as an argument) IO_STATUS_BLOCK to STATUS_PENDING to + * indicate that it is still pending. After the I/O operation is + * completed, the system notifies the application of the I/O completion + * through the following steps: + * + * 1. The output buffer, if any, is populated with the data retrieved from + * the I/O operation. + * + * 2. The Information field of the associated IO_STATUS_BLOCK is written to + * indicate e.g. the number of bytes transferred. + * + * 3. The Status field of the IO_STATUS_BLOCK is written to indicate the + * status of the I/O operation. + * + * 4. Finally, the system signals the application via synchronization APIs + * (waitable objects), I/O completion ports, and/or APCs. + * + * Therefore, if the Status field is set to any value other than + * STATUS_PENDING, we can safely assume that the I/O operation has + * completed. + * + * Here is a catch: memory accesses (read and write) can be reordered by the + * compiler or a processor using weak memory ordering, which affects the + * memory values observed by other threads running on different processors. + * Specifically, it's possible that the reads from the Status field gets + * reordered *after* the reads from the Information field. If we race with + * I/O completion, this results in the application reading a stale + * Information value. In fact, any memory operation (read or write) in the + * current thread could be reordered before the Status read in theory. + * + * CPU 0 (set_async_iosb) CPU 1 (WSAGetOverlappedResult) + * =========================== =========================== + * READ(Information) <-- (reordered!) + * WRITE(Information) + * WRITE(Status) + * READ(Status) + * + * In order to prevent such data races, we place appropriate barriers to + * limit reordering so that overlapped I/O operations work correctly in + * multi-threaded applications. A release barrier placed just before the + * Status write is paired with an acquire barrier placed just after the + * Status read. This ensures that all writes from the completed I/O + * operation are seen after WSAGetOverlappedResult observes the completion + * (i.e. Status != STATUS_PENDING). + * + * CPU 0 (set_async_iosb) CPU 1 (WSAGetOverlappedResult) + * =========================== =========================== + * WRITE(Information) + * <release barrier> <-----------. + * WRITE(Status) | + * | + * | READ(Status) + * (paired with) `-> <acquire barrier> + * READ(Information) + * + * Paired with write-release of IOSB Status in ntdll's set_async_iosb. + */ + status = ReadAcquire( &io->iosb.u.Status ); if (status == STATUS_PENDING) { if (!fWait)