Version 1 of the patch by Keno Fischer.
-- v2: ntdll/server: Make robust to spurious short writes
From: Keno Fischer <keno@juliacomputing.com>
It is possible for the write/writev functions in send_request to return short writes, even in non-error conditions. There are several situations where this might happen. Examples are:
- SIGSTOP/SIGCONT (either explicitly or via ptrace attach)
- cgroup freezes and similar mechanisms
- system suspends
- External debuggers or profilers
In general, Linux makes very few guarantees about syscall restarts. In some cases (in particular when no bytes have been transferred at all), the Linux kernel will automatically restart the system call, but once any bytes have been transferred, the result will be a short write with no automatic restart.
Make wine robust to this corner case by properly restarting a short write with adjusted buffers.
Signed-off-by: Keno Fischer <keno@juliacomputing.com>
---
 dlls/ntdll/unix/server.c | 41 +++++++++++++++++++++++++++++++++-------
 1 file changed, 34 insertions(+), 7 deletions(-)

diff --git a/dlls/ntdll/unix/server.c b/dlls/ntdll/unix/server.c
index a6e07215479..9439e5dbcc6 100644
--- a/dlls/ntdll/unix/server.c
+++ b/dlls/ntdll/unix/server.c
@@ -182,13 +182,25 @@ static DECLSPEC_NORETURN void server_protocol_perror( const char *err )
 static unsigned int send_request( const struct __server_request_info *req )
 {
     unsigned int i;
-    int ret;
+    int ret = 0;
 
+    int to_write = sizeof(req->u.req) + req->u.req.request_header.request_size;
     if (!req->u.req.request_header.request_size)
     {
-        if ((ret = write( ntdll_get_thread_data()->request_fd, &req->u.req,
-                          sizeof(req->u.req) )) == sizeof(req->u.req)) return STATUS_SUCCESS;
-
+        const char *write_ptr = (const char *)&req->u.req;
+        for (;;) {
+            ret = write( ntdll_get_thread_data()->request_fd, (void*)write_ptr,
+                         to_write );
+            if (ret == to_write) return STATUS_SUCCESS;
+            else if (ret < 0) break;
+            /* Short write. Most signals are blocked at this point, but it is
+               still possible to experience a syscall restart due to, e.g.
+               a SIGSTOP, cgroup freeze or external debug/profile tooling.
+               This is not an error. Simply adjust the remaining write length
+               and buffer and start again. */
+            to_write -= ret;
+            write_ptr += ret;
+        }
     }
     else
     {
@@ -201,11 +213,26 @@ static unsigned int send_request( const struct __server_request_info *req )
             vec[i+1].iov_base = (void *)req->data[i].ptr;
             vec[i+1].iov_len = req->data[i].size;
         }
-        if ((ret = writev( ntdll_get_thread_data()->request_fd, vec, i+1 )) ==
-            req->u.req.request_header.request_size + sizeof(req->u.req)) return STATUS_SUCCESS;
+
+        for (;;) {
+            ret = writev( ntdll_get_thread_data()->request_fd, vec, i+1 );
+            if (ret == to_write) return STATUS_SUCCESS;
+            else if (ret < 0) break;
+            /* Short write as above. Adjust buffer lengths and start again. */
+            to_write -= ret;
+            for (unsigned int j = 0; j < i+1; j++) {
+                if (ret >= vec[j].iov_len) {
+                    ret -= vec[j].iov_len;
+                    vec[j].iov_len = 0;
+                } else {
+                    vec[j].iov_base = (char *)vec[j].iov_base + ret;
+                    vec[j].iov_len -= ret;
+                    break;
+                }
+            }
+        }
     }
 
-    if (ret >= 0) server_protocol_error( "partial write %d\n", ret );
     if (errno == EPIPE) abort_thread(0);
     if (errno == EFAULT) return STATUS_ACCESS_VIOLATION;
     server_protocol_perror( "write" );
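
For readers who want to try the buffer adjustment from the writev branch outside of Wine, here is a minimal standalone sketch. The helper names `advance_iov` and `writev_all` are hypothetical and do not exist in Wine; the sketch only assumes POSIX `writev` and, like the patch, treats any negative return value as an error without special handling for `EINTR`.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper (not part of Wine): skip `written` bytes at the
   front of vec[0..count). Fully consumed entries get iov_len 0, a partly
   consumed entry is moved forward. Returns the bytes still to be sent. */
static size_t advance_iov( struct iovec *vec, int count, size_t written )
{
    size_t remaining = 0;
    int i;

    for (i = 0; i < count; i++)
    {
        if (written >= vec[i].iov_len)
        {
            written -= vec[i].iov_len;
            vec[i].iov_len = 0;
        }
        else
        {
            vec[i].iov_base = (char *)vec[i].iov_base + written;
            vec[i].iov_len -= written;
            written = 0;
        }
        remaining += vec[i].iov_len;
    }
    /* the caller must not report more bytes than were queued */
    assert( written == 0 );
    return remaining;
}

/* Hypothetical wrapper (not part of Wine): call writev() again after a
   short write until everything is sent. Modifies vec. Returns 0 on
   success, -1 with errno set by writev() on error. */
static int writev_all( int fd, struct iovec *vec, int count )
{
    size_t remaining = advance_iov( vec, count, 0 );

    while (remaining)
    {
        ssize_t ret = writev( fd, vec, count );
        if (ret < 0) return -1;
        remaining = advance_iov( vec, count, (size_t)ret );
    }
    return 0;
}
```

Unlike the loop in the patch, this sketch recomputes the remaining length on every pass instead of tracking a separate `to_write` counter. That is only a design choice for the example, not a suggested change to the patch.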
This is [version 2](https://gitlab.winehq.org/bernhardu/wine/-/commit/925172934df8c6e96846864dbf...) of the patch by Keno Fischer to allow partial writes.
For debugging I record Wine processes with [rr-debugger](https://github.com/rr-debugger/rr). Recording a vanilla build or the Debian packages from WineHQ fails with the same message, `wine client error:13c: partial write 65536`. It would be nice to be able to use the WineHQ packages for debugging now and then.
If this touches too central a code path, is it perhaps a candidate for staging?
Bugzilla has a few entries that may be related, or that at least mention this error message:
* [11265](https://bugs.winehq.org/show_bug.cgi?id=11265)
* [16162](https://bugs.winehq.org/show_bug.cgi?id=16162)
* [24447](https://bugs.winehq.org/show_bug.cgi?id=24447)
* [39648](https://bugs.winehq.org/show_bug.cgi?id=39648)
* [43949](https://bugs.winehq.org/show_bug.cgi?id=43949)
* [54975](https://bugs.winehq.org/show_bug.cgi?id=54975)
[PATCH] ntdll/server: Make robust to spurious short writes
* [15 Dec 2021 10:22 Keno Fischer](https://list.winehq.org/mailman3/hyperkitty/list/wine-devel@winehq.org/messa...)
* [15 Dec 2021 10:44 Gijs Vermeulen](https://list.winehq.org/mailman3/hyperkitty/list/wine-devel@winehq.org/messa...)
* [15 Dec 2021 10:48 Keno Fischer](https://list.winehq.org/mailman3/hyperkitty/list/wine-devel@winehq.org/messa...)
* [15 Dec 2021 11:02 Alexandre Julliard](https://list.winehq.org/mailman3/hyperkitty/list/wine-devel@winehq.org/messa...)
* [16 Dec 2021 01:11 Keno Fischer](https://list.winehq.org/mailman3/hyperkitty/list/wine-devel@winehq.org/messa...)
[PATCH v2] ntdll/server: Make robust to spurious short writes ([pipermail link](https://www.winehq.org/pipermail/wine-devel/2021-December/203292.html), no longer working)
* [16 Dec 2021 02:09 Keno Fischer](https://list.winehq.org/mailman3/hyperkitty/list/wine-devel@winehq.org/messa...)
* [16 Dec 2021 02:48 Marvin](https://list.winehq.org/mailman3/hyperkitty/list/wine-devel@winehq.org/messa...)
On Sun Apr 20 16:20:39 2025 +0000, Maotong Zhang wrote:
I am very interested in using this rr-debugger. Unfortunately, I was unable to use rr-debugger successfully.
On Mon Apr 21 15:40:04 2025 +0000, Bernhard Übelacker wrote:
Hello Zhang, I was able to use rr-debugger in the past under the following conditions:
- rr must be able to record at all, e.g. `rr record true` works.
- The patch from this MR has to be applied to Wine, otherwise `wine client error:13c: partial write 65536` appears.
- It was working with a GCC mingw build, old-style wow64 (separate configure and build steps for 32-bit and 64-bit). (Currently I have an LLVM new-style wow64 build, which triggers some fault in rr.)
- Either the whole set of Wine processes has to be inside the recording, or at least wineserver and the process of interest, but the latter makes starting and stopping Wine more complicated.
- `wine*-preloader` has to be removed, otherwise gdb cannot load debug information for `*.so` files.
- When debugging only the Unix side (`*.so`), the regular gdb should be sufficient.
- For the PE side (dll/exe), a patched [gdb](https://github.com/JuliaComputing/gdb-solib-wine) (or the rebase onto a more recent [gdb](https://github.com/bernhardu/gdb-solib-wine) version) is needed, so that debug information for dll/exe files gets loaded using information from the "Windows dynamic loader".
- gdb loads these files from the wineprefix here, so the files there need to match the build directory, e.g. by running `wine wineboot --update --force`.

These are many conditions that need to be met, but when everything works it's nice to be able to debug early startup or exceptions, forward and reverse. Too bad gdb does not understand PDBs. Backtraces stop at "syscall" boundaries.
Thank you very much, I am researching it.