It's pretty easy to elaborate - I have a minimal Rust program (`fn main() { println!("hello") }`) compiled using mingw. The program fails to load, printing:

```
0100:err:virtual:virtual_setup_exception stack overflow 2144 bytes in thread 0100 addr 0x17005a529 stack 0x207a0 (0x20000-0x21000-0x220000)
```

This problem does not occur when I copy the binary off the 9p mount to the local disk, i.e. it runs successfully and prints `hello`.
> As it doesn't look like Wine sets O_NONBLOCK for files which are read in a way your patch concerns.
I do think wine is setting `O_NONBLOCK` in a way that affects the loader. After spending some time with gdb and winedbg it appears the sequence of events is approximately (bearing in mind this is my first real foray into wine):
- `__wine_main`
- `start_main_thread`
- `init_startup_info`
- (possibly `build_initial_params` ->) `load_main_exe`
- `open_main_image`
- `open_dll_file`
- `open_unix_file`
- `server:DECL_HANDLER(create_file)`
- `server:create_file` -> `fd = open_fd(..., flags | O_NONBLOCK, ...)`
- `virtual_map_module`
- `virtual_map_image`
- `server_get_unix_fd`
- `map_image_into_view`
- `map_pe_header` + `map_file_into_view`
- `pread`
i.e. all file descriptors that wineserver creates from `create_file` have `O_NONBLOCK` set on them - and one of these file descriptors is used for mapping in the PE header.
Happening this early in the loading process lines up with the process not even starting, and it matches up with my strace of the problematic process (below), where:
- the fd is returned from the server via `recvmsg` (`server_get_unix_fd`)
- `fcntl` is called by `receive_fd`
- the pattern of `mmap` then `pread` (which continues longer than the pasted log) looks like the per-section calls to `map_file_into_view`
- you can see multiple short read returns from `pread`
(this trace is a single process snippet grepped out of a `strace -f` log where other processes had interleaved syscalls, hence the many `<unfinished...>` - there were no signals though)
```
[pid 3258] recvmsg(12, <unfinished ...>
[pid 3258] <... recvmsg resumed>{msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="$\0\0\0", iov_len=4}], msg_iovlen=1, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[5]}], msg_controllen=24, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 4
[pid 3258] fcntl(5, F_SETFD, FD_CLOEXEC <unfinished ...>
[pid 3258] <... fcntl resumed>) = 0
[pid 3258] rt_sigprocmask(SIG_SETMASK, [HUP INT USR1 USR2 ALRM CHLD IO], <unfinished ...>
[pid 3258] <... rt_sigprocmask resumed>NULL, 8) = 0
[pid 3258] rt_sigprocmask(SIG_BLOCK, [HUP INT USR1 USR2 ALRM CHLD IO], <unfinished ...>
[pid 3258] <... rt_sigprocmask resumed>[HUP INT USR1 USR2 ALRM CHLD IO], 8) = 0
[pid 3258] mmap(0x140000000, 5025792, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
[pid 3258] <... mmap resumed>) = 0x140000000
[pid 3258] mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>
[pid 3258] <... mmap resumed>) = 0x7fe4b9d00000
[pid 3258] fstat(5, <unfinished ...>
[pid 3258] <... fstat resumed>{st_mode=S_IFREG|0775, st_size=5422023, ...}) = 0
[pid 3258] mmap(0x140000000, 1536, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 5, 0 <unfinished ...>
[pid 3258] <... mmap resumed>) = 0x140000000
[pid 3258] mmap(0x140001000, 765952, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0 <unfinished ...>
[pid 3258] <... mmap resumed>) = 0x140001000
[pid 3258] pread64(5, <unfinished ...>
[pid 3258] <... pread64 resumed>"\303ff.\17\37\204\0\0\0\0\0\17\37@\0H\203\354(H\213\5\365>\16\0001\311\307\0\1"..., 765952, 1536) = 126976
[pid 3258] mmap(0x1400bc000, 512, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0 <unfinished ...>
[pid 3258] <... mmap resumed>) = 0x1400bc000
[pid 3258] pread64(5, <unfinished ...>
[pid 3258] <... pread64 resumed>"\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\0\0\0\0\0\0\0\202\3@\1\0\0\0"..., 512, 767488) = 512
[pid 3258] mmap(0x1400bd000, 163840, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0 <unfinished ...>
[pid 3258] <... mmap resumed>) = 0x1400bd000
[pid 3258] pread64(5, <unfinished ...>
[pid 3258] <... pread64 resumed>"invalid args\0\0\0\0\0\320\v@\1\0\0\0\f\0\0\0\0\0\0\0"..., 163840, 768000) = 126976
[pid 3258] mmap(0x1400e5000, 25600, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0 <unfinished ...>
[pid 3258] <... mmap resumed>) = 0x1400e5000
[pid 3258] pread64(5, <unfinished ...>
[pid 3258] <... pread64 resumed>"\0\20\0\0\1\20\0\0\0\300\16\0\20\20\0\0>\21\0\0\4\300\16\0@\21\0\0\211\21\0\0"..., 25600, 931840) = 25600
```
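To make the short reads concrete: the first `pread64` in the trace asks for 765952 bytes and gets 126976 back, so the rest of that section is never filled in. Below is a rough sketch of how a caller could tolerate that by looping until the buffer is full. `pread_full` is a hypothetical helper name, not an existing Wine function, and I'm not claiming this is exactly what the MR does. It also treats `EAGAIN` as a plain error, which is an assumption on my part.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not Wine code: keep calling pread() until `count`
 * bytes have been read, end of file is reached, or a real error occurs.
 * Returns the number of bytes read, or -1 with errno set. */
static ssize_t pread_full(int fd, void *buf, size_t count, off_t offset)
{
    size_t done = 0;

    while (done < count)
    {
        ssize_t ret = pread(fd, (char *)buf + done, count - done,
                            offset + (off_t)done);

        if (ret > 0) done += (size_t)ret;    /* short read: resume where it stopped */
        else if (ret == 0) break;            /* end of file */
        else if (errno != EINTR) return -1;  /* genuine error (EAGAIN included) */
        /* EINTR: just retry */
    }
    return (ssize_t)done;
}
```

A caller would then compare the return value against the requested size, instead of assuming a single `pread` filled the whole section.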
I don't think removing the `O_NONBLOCK` from [here](https://gitlab.winehq.org/wine/wine/-/blob/b5e19a33c9360784961918a364175ab2a...) is viable - it's been there for a *very* long time (https://gitlab.winehq.org/wine/wine/-/commit/0562539d18638e5afdd81b4d894c880...) and there are surely far-reaching implications.
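I did consider disabling `O_NONBLOCK` on the fd when it's received from the server in `virtual_map_image`. Unfortunately, `man 7 unix` documents an fd received via `SCM_RIGHTS` as behaving as though it had been created with `dup`, so it shares its file status flags (including `O_NONBLOCK`) with the fd held in wineserver. Clearing the flag on the client side would therefore also clear it for the server, which defeats the point.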
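For anyone who wants to check the flag-sharing behaviour, here's a minimal sketch using plain `dup` rather than an actual `SCM_RIGHTS` transfer (relying on the man page's statement that the two behave alike). The helper name is made up for illustration and it assumes a Linux-like system.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper, not Wine code.
 * Opens `path` with O_NONBLOCK, dup()s the fd, clears O_NONBLOCK on the
 * copy only, then looks at the flags of the original fd.
 * Returns 1 if the original lost O_NONBLOCK too, 0 if it kept it,
 * -1 on error. */
static int nonblock_cleared_via_dup(const char *path)
{
    int ret = -1, flags, copy;
    int fd = open(path, O_RDONLY | O_NONBLOCK);

    if (fd < 0) return -1;
    if ((copy = dup(fd)) < 0) goto done;

    /* The flag set at open() time should already be visible on the copy. */
    flags = fcntl(copy, F_GETFL);
    if (flags == -1 || !(flags & O_NONBLOCK)) goto done;

    /* Clear it through the copy only. */
    if (fcntl(copy, F_SETFL, flags & ~O_NONBLOCK) == -1) goto done;

    /* Now query the original descriptor. */
    flags = fcntl(fd, F_GETFL);
    if (flags != -1) ret = !(flags & O_NONBLOCK);

done:
    if (copy >= 0) close(copy);
    close(fd);
    return ret;
}
```

On Linux I'd expect `nonblock_cleared_via_dup("/dev/null")` to return 1, since `F_SETFL` acts on the shared open file description rather than on the individual descriptor.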
Open to other ideas - so far I'm still inclined towards the current general approach in this MR as the most self-contained approach to the problem.