I started out trying to pedantically clarify the question about "change in executable format", and ended up writing yet another summary of PE conversion. Hopefully someone will find this useful...
On 5/14/22 17:49, Charles Davis wrote:
> > How does a change in executable format get around the WindowServer limitation?
> Simple: Wine will now run 32-bit programs as a 64-bit process, using 64-bit system interfaces. This is, in fact, how Windows itself supports 32-bit programs on a 64-bit kernel.
> The program-facing parts of Wine (*.dll files) will be in PE32 format. But the host-facing parts (*.so files) will still be in the native binary format of the host system--in the case of Mac OS, that's 64-bit Mach-O. The PE parts enter the Unix parts by making a "syscall" (i.e. not a real host syscall, just a sequence that looks a lot like an NT syscall to user-mode code), so from the program's point of view, all the host-specific code runs in "kernel mode." 32-bit programs are supported the same way they are in Windows, namely, with the Windows-on-Win64 subsystem, using thunks inside special WOW64 DLLs (wow64*.dll).
It might be worth pointing out, just to further clarify Chip's answer, that the change in executable format does not *per se* get around any limitations.
What gets around the limitation is the ability to load a 32-bit code segment into the program, use it to run 32-bit application code, and then switch back to the 64-bit segment in order to call into 64-bit host libraries (such as, in this case, Quartz).
Using the PE executable format is not actually necessary to accomplish this (even for proper WoW64 support, I believe, though someone should correct me if I'm wrong; at the very least, there's no obvious reason why it would be necessary). However:
* We need to define a boundary in Wine between 32-bit and 64-bit code. Switching between them requires a far jump (a.k.a. ljmp). We need to choose what code is compiled as 32-bit, and what code is compiled as 64-bit.
* Unrelatedly, some programs expect Win32 DLLs to be in PE format on disk, usually for the purposes of digital restrictions management or anti-tamper. On the other hand, we need some code to be in host (.so/.dylib) format, so that it can link to other host code. [At the very least we need dlopen().]
* At the same time, and still unrelatedly, there are a large number of more minor differences between what Win32 programs expect about their execution environment, and what Unix libraries expect about their execution environment:
(a) Win32 programs expect the TEB to be reachable through %fs or %gs (depending on architecture), but Unix C libraries tend to use those same segment registers for their own per-thread data. glibc mostly coöperates with us in this respect by using whichever register Windows doesn't use. However, Mac libc does not (it does leave some commonly used TEB offsets untouched, but that has not always been enough). Additionally, bug 47198 concerns a rather creative anti-cheat that effectively means we need to reserve *both*.
(b) Win32 programs manually specify the amount of stack they use, or even set up stacks manually by changing the %esp register (Cygwin is a primary offender). However, Unix libraries may demand much larger stacks. We get around this currently by allocating larger stacks than the Win32 program requests if possible. [I'm not sure I know of any cases where this isn't enough or causes problems?]
(c) Win32 programs expect that the stack will be committed only by touching the guard page, whereas Unix libraries don't expect that they need to do this. We get around this by committing the whole stack from the beginning, but bug 47808 is caused by Cygwin allocating its own stack which is *not* fully committed. Both this, and the above, can be solved by having a separate Unix stack, and executing Unix code on that stack.
(d) Win32 programs like debuggers can insert breakpoints anywhere, or, more importantly, break a running program regardless of where it's running. For various reasons, this causes problems if the thread is in the middle of Unix code. I don't know of any public bug reports for this, but there's a demand for using native debuggers such as Visual Studio, at least enough that CodeWeavers is interested in fixing it. This can be solved by effectively masking off suspend requests while inside of a Unix call.
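To make (d) a bit more concrete, here is a minimal sketch in C of what "masking off suspend requests while inside of a Unix call" could look like. Everything here is hypothetical: the struct, the function names, and in particular the choice to defer a masked request until the Unix call returns are my assumptions for illustration, not Wine's actual code. It is also single-threaded bookkeeping only; a real implementation would need proper synchronization.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-thread bookkeeping; all names are illustrative only. */
struct thread_state
{
    bool in_unix_call;     /* set while the thread executes Unix ("kernel") code */
    bool suspend_pending;  /* a suspend request arrived while it was masked */
    bool suspended;        /* the thread is actually stopped */
};

/* Called by the thunk just before jumping into Unix code. */
static void enter_unix_call(struct thread_state *t)
{
    assert(!t->in_unix_call);
    t->in_unix_call = true;
}

/* Called on behalf of a debugger (or anything else) that wants to stop the thread. */
static void request_suspend(struct thread_state *t)
{
    if (t->in_unix_call)
        t->suspend_pending = true;  /* masked: remember it for later */
    else
        t->suspended = true;        /* user code: stop right away */
}

/* Called by the thunk just after returning from Unix code. */
static void leave_unix_call(struct thread_state *t)
{
    assert(t->in_unix_call);
    t->in_unix_call = false;
    if (t->suspend_pending)
    {
        /* Assumption: a deferred request takes effect at the boundary. */
        t->suspend_pending = false;
        t->suspended = true;
    }
}
```

With this bookkeeping, a request made between enter_unix_call() and leave_unix_call() only takes effect once the thread is back in Win32 code, so the debugger never observes the thread stopped in the middle of a Unix library.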
The ultimate effect is that, for various reasons, we want to define boundaries, and it ends up actually making sense to make all of these boundaries the *same* boundary. There are a few reasons for this:
* Perhaps most importantly, all of the above require some nontrivial thunking. We need to do some work to change from PE code to .so code, or from 32-bit code to 64-bit code, or from Win32 code to Unix code. In particular:
- changing from 32-bit code to 64-bit code requires a far/long jump, as has been stated;
- changing from PE code to .so code requires some glue to determine *where* to jump to, as you can't just link from one to the other;
- changing from Win32 code to Unix code requires swapping %fs and/or %gs, switching stacks, marking down that suspend requests are masked, etc.
We need to define where these transitions take place so that we can perform that thunking, and ultimately that definition takes some nontrivial effort. We need to manually write thunks for every function that is thunked. If we can write the thunk once, and do all of the above transitions at the same time, that saves us considerable effort.
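As a rough illustration of "write the thunk once", here is a small C sketch of a table-driven call boundary. The names (unix_call, unix_call_table, add_params, pe_add and so on) are made up for this example and should not be read as Wine's real interface. The idea is just that each thunked function packs its arguments into a struct and goes through one generic dispatcher, which is the single place where the segment switch, stack switch, and register swapping would happen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not Wine's actual interface. */
typedef int32_t status_t;
#define STATUS_OK        ((status_t)0)
#define STATUS_BAD_CALL  ((status_t)-1)

/* Every Unix-side entry point has the same shape: one pointer to packed arguments. */
typedef status_t (*unix_entry_t)(void *args);

/* Argument block shared by both sides of the boundary. */
struct add_params
{
    int32_t a;
    int32_t b;
    int32_t result;  /* filled in by the Unix side */
};

/* Call codes: the "glue" that tells the dispatcher where to jump. */
enum unix_call_code
{
    UNIX_ADD,
    UNIX_CALL_COUNT
};

/* Unix side: would be compiled into the .so and free to link to host libraries. */
static status_t unix_add(void *args)
{
    struct add_params *params = args;
    assert(params != NULL);
    params->result = params->a + params->b;
    return STATUS_OK;
}

static const unix_entry_t unix_call_table[UNIX_CALL_COUNT] =
{
    [UNIX_ADD] = unix_add,
};

/* The single generic transition point. A real dispatcher would also switch
 * code segment, stack, and %fs/%gs here; this sketch only does the lookup. */
static status_t unix_call(unsigned int code, void *args)
{
    if (code >= UNIX_CALL_COUNT || !unix_call_table[code])
        return STATUS_BAD_CALL;
    return unix_call_table[code](args);
}

/* PE side: the per-function thunk only packs arguments and picks a code. */
static int32_t pe_add(int32_t a, int32_t b)
{
    struct add_params params = { a, b, 0 };
    if (unix_call(UNIX_ADD, &params) != STATUS_OK)
        return 0;
    return params.result;
}
```

The per-function part (pe_add plus unix_add) stays small and mechanical, while everything expensive or subtle lives in unix_call and only has to be written and debugged once.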
* It ends up making a lot of sense to think of Win32 code as "user" code, and Unix code as "kernel" code, especially when considering the problem of debuggers. A user-space debugger should be allowed to mess with any user-space code, but should effectively have suspend requests masked during kernel code, which matches what we want quite well. Similarly, real kernels will change stacks to execute kernel code.
* "User" code can always be expressed in PE format—after all, it only needs to link to other user libraries, plus the glue to the "kernel". On the other hand, it's not easy for Win32 programs to actually access kernel code in order to validate it against the on-disk form, so we don't actually need "kernel" code to be in PE format. If we write "kernel" code such that it never needs to call into a "user" library (which ends up being relatively easy), we can compile it in .so format and have it link to Unix libraries.
* The split between user and kernel code is *basically* done at the same place as the split between 32-bit and 64-bit code on Windows. There's a bit here that's handwaved, but it requires a detailed explanation of how WoW64 works on Windows.
The ultimate effect is that at this point, what's called "PE conversion" really only has *partially* to do with the PE format, and what's involved in the process of PE conversion is mostly defining the above thunks, and making sure that all code is on one side of the split or the other. The other part is, of course, writing the generic thunking code—the parts that switch segments and stacks, set up the dynamic library glue, and so on.
The "writing the generic glue" part is, at this point, almost entirely done. The "splitting up the code and writing thunks" part is *mostly* done [considering where we started, it's been quite a long journey!] but there are still some bits left, and of course they are some of the trickiest parts of the Wine code base to split up.
Hope this was helpful for someone. I know Erich wrote a similar writeup, but there were a couple of bits missing from it that I wanted to clarify as well.
ἔρρωσθε, Zeb