Hi all,
Reading everything I can find about Wine on macOS, seems like there still isn't a good solution for 32bit on Catalina and beyond.[1] I'm currently running Catalina and don't intend to upgrade my Mac or the OS anytime soon.
My goal is to compile everything on Catalina and to run 32bit Wine on Catalina. I can compile 64bit Wine without a problem and 32bit Wine compiles against the MacOSX10.13.sdk.
I can run 32bit Wine with the `no32exec=0' boot argument and DYLD_ROOT_PATH pointing at a directory containing `/System' & `/usr/lib' from a machine running 10.13. It crashes because of kernel changes that are incompatible with libraries from High Sierra. Specifically, (10.15) libdispatch redefined work queue priorities, so the (10.15) kernel starts the work queue thread with an invalid priority as understood by (10.13) libdispatch.
Exception Type:        EXC_BAD_INSTRUCTION (SIGILL)
Exception Codes:       0x0000000000000001, 0x0000000000000000
Exception Note:        EXC_CORPSE_NOTIFY
Termination Signal:    Illegal instruction: 4
Termination Reason:    Namespace SIGNAL, Code 0x4
Terminating Process:   exc handler [22976]
Application Specific Information:
dyld2 mode
BUG IN CLIENT OF LIBDISPATCH: Corrupted priority
[...]
Thread 1 Crashed:
0   libdispatch.dylib        0x0643aef5 _dispatch_worker_thread3 + 180
1   libsystem_pthread.dylib  0x068c3fa5 _pthread_wqthread + 1356
2   libsystem_pthread.dylib  0x068c3a32 start_wqthread + 34
Initially, I thought using DYLD_ROOT_PATH with 10.13 system directories was the way to go. So I modified libdispatch to use the new priority definitions and replaced `/usr/lib/system/libdispatch.dylib' in the DYLD_ROOT_PATH with my modified version. This got past the work queue crash, but a few other crashes emerged that needed to be squashed in libdispatch. Eventually I made it to NSApplication initialization.
Unfortunately, the application initialization crashes when it tries to initialize the user defaults subsystem. I was hoping I would only have to solve a few crashes in some of the open source libraries. Of course this wasn't the case.
Since I had made it past the initial app bootstrapping and into ObjC frameworks I was no longer in open source code. I could have kept playing Whac-A-Mole, as it is especially easy to monkey patch ObjC code with method swizzling. I considered going down this path, but before going there, I decided to try a DYLD_ROOT_PATH with system directories from 10.14.
I'm glad I tried using 10.14 dylibs before sinking more time into patching and hacking my way through 10.13 dylibs. The 10.14 dylibs don't crash; instead, the WindowServer refuses to allow a connection from a 32bit process.
_RegisterApplication(), FAILED TO REGISTER PROCESS WITH CPS/CoreGraphics in WindowServer, err=-304
I suspected this might be an AppKit limitation so I tried using the deprecated Carbon libs and got the same error. Then I tried using private WindowServer APIs directly, which didn't error in the same way, but also didn't draw anything.[2] Compiling the same code for 64bit, however, draws as expected.
A way forward for 32bit Wine is possible. Just do drawing in a 64bit process. It might sound like a hack, but drawing or interacting with the WindowServer is already done using IPC—it is just transparent to the developer. I already made a proof of concept, so I know the technique is solid.
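To make the idea concrete, here is a minimal sketch of such a split. It is not the proof of concept mentioned above. It assumes a hypothetical wire format (an opcode plus four 32-bit integers) and uses a socket pair inside one Python process to stand in for the 32bit client and the 64bit drawing helper; a real helper would call into AppKit or CoreGraphics where the comment says so.

```python
import socket
import struct
import threading

# Hypothetical wire format: opcode followed by x, y, width, height.
# Fixed-width little-endian fields mean the 32bit and 64bit sides agree
# on the layout regardless of their native pointer size.
MSG = struct.Struct("<Iiiii")
OP_FILL_RECT = 1
OP_QUIT = 2


def recv_exact(sock, size):
    """Read exactly `size` bytes, or raise if the peer closes early."""
    buf = b""
    while len(buf) < size:
        chunk = sock.recv(size - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf


def drawing_helper(sock, drawn):
    """Stand-in for the 64bit process: decode commands and 'draw' them."""
    while True:
        op, x, y, w, h = MSG.unpack(recv_exact(sock, MSG.size))
        if op == OP_QUIT:
            break
        if op == OP_FILL_RECT:
            drawn.append((x, y, w, h))  # a real helper would draw here


def run_demo():
    client, helper = socket.socketpair()
    drawn = []
    worker = threading.Thread(target=drawing_helper, args=(helper, drawn))
    worker.start()
    # Stand-in for the 32bit process: it only serializes requests.
    client.sendall(MSG.pack(OP_FILL_RECT, 10, 20, 300, 200))
    client.sendall(MSG.pack(OP_FILL_RECT, 0, 0, 64, 64))
    client.sendall(MSG.pack(OP_QUIT, 0, 0, 0, 0))
    worker.join()
    client.close()
    helper.close()
    return drawn
```

The point of the fixed-width message is that neither side ever sees the other's pointers or native struct layouts, so the two processes can differ in word size.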
Additionally, on Catalina at least, this removes the need for the DYLD_ROOT_PATH since the 32bit process would, in theory, only need to link with libSystem and it ships with 32bit binaries.
$ sw_vers -productVersion
10.15.7
$ lipo -info /usr/lib/libSystem.dylib
Architectures in the fat file: /usr/lib/libSystem.dylib are: x86_64 i386
And by extension, most everything in the umbrella framework `/usr/lib/system/*' also contains 32bit binaries.
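The check that `lipo -info` performs can also be sketched by reading the universal ("fat") header directly. This is a minimal sketch: it handles only the 32-bit fat header (magic 0xcafebabe), knows just the two CPU types relevant here, and the function name `fat_architectures` is made up for illustration.

```python
import struct

FAT_MAGIC = 0xCAFEBABE            # universal binary header, big-endian
CPU_TYPE_I386 = 7
CPU_TYPE_X86_64 = 7 | 0x01000000  # same CPU family with the 64-bit ABI flag
NAMES = {CPU_TYPE_I386: "i386", CPU_TYPE_X86_64: "x86_64"}


def fat_architectures(data):
    """Return the architecture names listed in a fat Mach-O header."""
    magic, count = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a universal (fat) binary")
    archs = []
    for i in range(count):
        # Each fat_arch entry: cputype, cpusubtype, offset, size, align.
        cputype, _sub, _off, _size, _align = struct.unpack_from(
            ">iiIII", data, 8 + i * 20)
        archs.append(NAMES.get(cputype, hex(cputype)))
    return archs
```

Passing it the first few kilobytes of a library from `/usr/lib/system/' is enough, since the header sits at the start of the file.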
My intention is to put the winemac.drv in its own 64bit process. I don't really have a question; this is more a statement of intent, but I'd welcome any feedback, thoughts, or concerns. If nothing else, this email memorializes the things someone is likely to encounter if they try to make 32bit Wine, or any other 32bit app, run on Catalina.
Eddie
[1] I'm aware of CrossOver, etc.
[2] I did eventually get one of the private functions to return -304 which is probably significant as it matches the _RegisterApplication() error.
On Wednesday, 20 April 2022, 15:55:13 EAT, Eddie Hillenbrand wrote:
My goal is to compile everything on Catalina and to run 32bit Wine on Catalina. I can compile 64bit Wine without a problem and 32bit Wine compiles against the MacOSX10.13.sdk.
Whoa, you put a lot of effort into this and got some results, and I am surprised that Catalina actually can create 32 bit processes.
That said, what you are doing is not a good way forward. Even if we could make it work, and it magically worked on newer macOS too, doing something Apple doesn't support won't work in the long run.
However, the PE split is making progress in Wine, and winex11.drv is now split into a .so and .dll file. winemac.drv will be next. According to statements from Alexandre on IRC you can already run command line apps in a CrossOver-like 32on64 mode. Once the graphics driver is in place GUI apps won't be far away. And it will be much cleaner than the solution CrossOver is using, since it won't need a custom C compiler.
So have some patience, a proper solution that works on newer OSes and even inside Rosetta 2 is coming. And at some future time we can even include our own CPU emulator for ARM support on Linux and for a future without Rosetta 2.
The current winemac.drv blocker will need to be resolved before this conversion happens as right now it’s only working on Mojave and later.
On Fri, May 13, 2022 at 7:56 AM Stefan Dösinger stefandoesinger@gmail.com wrote:
On Wednesday, 20 April 2022, 15:55:13 EAT, Eddie Hillenbrand wrote:
My goal is to compile everything on Catalina and to run 32bit Wine on Catalina. I can compile 64bit Wine without a problem and 32bit Wine compiles against the MacOSX10.13.sdk.
Whoa, you put a lot of effort into this and got some results, and I am surprised that Catalina actually can create 32 bit processes.
That said, what you are doing is not a good way forward. Even if we could make it work, and it magically worked on newer macOS too, doing something Apple doesn't support won't work in the long run.
However, the PE split is making progress in Wine, and winex11.drv is now split into a .so and .dll file. winemac.drv will be next. According to statements from Alexandre on IRC you can already run command line apps in a CrossOver-like 32on64 mode. Once the graphics driver is in place GUI apps won't be far away. And it will be much cleaner than the solution CrossOver is using, since it won't need a custom C compiler.
So have some patience, a proper solution that works on newer OSes and even inside Rosetta 2 is coming. And at some future time we can even include our own CPU emulator for ARM support on Linux and for a future without Rosetta 2.
Stefan Dösinger stefandoesinger@gmail.com writes:
On Wednesday, 20 April 2022, 15:55:13 EAT, Eddie Hillenbrand wrote:
My goal is to compile everything on Catalina and to run 32bit Wine on Catalina. I can compile 64bit Wine without a problem and 32bit Wine compiles against the MacOSX10.13.sdk.
However, the PE split is making progress in Wine, and winex11.drv is now split into a .so and .dll file. winemac.drv will be next. [...] Once the graphics driver is in place GUI apps won't be far away.
How do I get involved and help with that effort? How does a change in executable format get around the WindowServer limitation?
On Sat, May 14, 2022 at 4:04 PM Eddie Hillenbrand eddie@graphdyne.com wrote:
Stefan Dösinger stefandoesinger@gmail.com writes:
On Wednesday, 20 April 2022, 15:55:13 EAT, Eddie Hillenbrand wrote:
My goal is to compile everything on Catalina and to run 32bit Wine on Catalina. I can compile 64bit Wine without a problem and 32bit Wine compiles against the MacOSX10.13.sdk.
However, the PE split is making progress in Wine, and winex11.drv is now split into a .so and .dll file. winemac.drv will be next. [...] Once the graphics driver is in place GUI apps won't be far away.
How do I get involved and help with that effort?
https://www.winehq.org/getinvolved
Studying how it was done in winex11.drv may be instructive.
How does a change in executable format get around the WindowServer limitation?
Simple: Wine will now run 32-bit programs as a 64-bit process, using 64-bit system interfaces. This is, in fact, how Windows itself supports 32-bit programs on a 64-bit kernel.
The program-facing parts of Wine (*.dll files) will be in PE32 format. But the host-facing parts (*.so files) will still be in the native binary format of the host system--in the case of Mac OS, that's 64-bit Mach-O. The PE parts enter the Unix parts by making a "syscall" (i.e. not a real host syscall, just a sequence that looks a lot like an NT syscall to user-mode code), so from the program's point of view, all the host-specific code runs in "kernel mode." 32-bit programs are supported the same way they are in Windows, namely, with the Windows-on-Win64 subsystem, using thunks inside special WOW64 DLLs (wow64*.dll).
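A rough model of that "syscall" boundary, as a sketch only: the PE side never calls host code directly but passes a function code and an argument block through a single entry point, and the Unix side looks the code up in a table. The names below (`unix_call`, `UNIX_FUNCS`, and the two example functions) are invented for illustration and are not Wine's actual interfaces.

```python
from enum import IntEnum


class UnixFunc(IntEnum):
    """Hypothetical function codes shared by both sides of the boundary."""
    CREATE_WINDOW = 0
    DESTROY_WINDOW = 1


# --- "Unix side": may freely use host libraries -------------------------
_windows = {}


def _create_window(args):
    handle = len(_windows) + 1
    _windows[handle] = (args["width"], args["height"])
    args["handle"] = handle      # results travel back in the argument block
    return 0                     # status code, 0 meaning success


def _destroy_window(args):
    # Status 0 if the handle existed, 1 otherwise.
    return 0 if _windows.pop(args["handle"], None) else 1


UNIX_FUNCS = {
    UnixFunc.CREATE_WINDOW: _create_window,
    UnixFunc.DESTROY_WINDOW: _destroy_window,
}


def unix_call(code, args):
    """Single entry point: the only way 'PE' code reaches 'Unix' code."""
    return UNIX_FUNCS[UnixFunc(code)](args)


# --- "PE side": knows only codes and argument blocks ---------------------
def pe_create_window(width, height):
    args = {"width": width, "height": height, "handle": None}
    status = unix_call(UnixFunc.CREATE_WINDOW, args)
    if status != 0:
        raise OSError("unix call failed")
    return args["handle"]
```

Because everything funnels through one entry point, that entry point is also the natural place to change word size, stacks, and registers.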
I started out trying to pedantically clarify the question about "change in executable format", and ended up writing yet another summary of PE conversion. Hopefully someone will find this useful...
On 5/14/22 17:49, Charles Davis wrote:
How does a change in executable format get around the WindowServer limitation?
Simple: Wine will now run 32-bit programs as a 64-bit process, using 64-bit system interfaces. This is, in fact, how Windows itself supports 32-bit programs on a 64-bit kernel.
The program-facing parts of Wine (*.dll files) will be in PE32 format. But the host-facing parts (*.so files) will still be in the native binary format of the host system--in the case of Mac OS, that's 64-bit Mach-O. The PE parts enter the Unix parts by making a "syscall" (i.e. not a real host syscall, just a sequence that looks a lot like an NT syscall to user-mode code), so from the program's point of view, all the host-specific code runs in "kernel mode." 32-bit programs are supported the same way they are in Windows, namely, with the Windows-on-Win64 subsystem, using thunks inside special WOW64 DLLs (wow64*.dll).
It might be worth pointing out, just to further clarify Chip's answer, that the change in executable format does not *per se* get around any limitations.
What gets around the limitation is the ability to load a 32-bit code segment into the program, use it to run 32-bit application code, and then switch back to the 64-bit segment in order to call into 64-bit host libraries (such as, in this case, Quartz).
Using the PE executable format is not actually necessary to accomplish this (even for proper WoW64 support, I believe, though someone should correct me if I'm wrong. At least, there is no obvious reason it would be necessary.) However:
* We need to define a boundary in Wine between 32-bit and 64-bit code. Switching between them requires a far jump (a.k.a. ljmp). We need to choose what code is compiled as 32-bit, and what code is compiled as 64-bit.
* Unrelatedly, some programs expect Win32 DLLs to be in PE format on disk, usually for the purposes of digital restrictions management or anti-tamper. On the other hand, we need some code to be in host (.so/.dylib) format, so that it can link to other host code. [At the very least we need dlopen().]
* At the same time, and still unrelatedly, there are a large number of more minor differences between what Win32 programs expect about their execution environment, and what Unix libraries expect about their execution environment:
(a) Win32 programs expect the TEB to be in %fs or %gs (depending on architecture), but Unix C libraries tend to use it for their own per-thread data. glibc mostly coöperates with us in this respect by using whichever register Windows doesn't use. However, Mac libc does not (it does leave some commonly used TEB offsets untouched, but that has not always been enough). Additionally, bug 47198 concerns a rather creative anti-cheat that effectively means we need to reserve *both*.
(b) Win32 programs manually specify the amount of stack they use, or even set up stacks manually by changing the %esp register (Cygwin is a primary offender). However, Unix libraries may demand much larger stacks. We get around this currently by allocating larger stacks than the Win32 program requests if possible. [I'm not sure I know of any cases where this isn't enough or causes problems?]
(c) Win32 programs expect that the stack will be committed only by touching the guard page, whereas Unix libraries don't expect that they need to do this. We get around this by committing the whole stack from the beginning, but bug 47808 is caused by Cygwin allocating its own stack which is *not* fully committed. Both this, and the above, can be solved by having a separate Unix stack, and executing Unix code on that stack.
(d) Win32 programs like debuggers can insert breakpoints anywhere, or, more importantly, break a running program regardless of where it's running. For various reasons, this causes problems if the thread is in the middle of Unix code. I don't know of any public bug reports for this, but there's a demand for using native debuggers such as Visual Studio, at least enough that CodeWeavers is interested in fixing it. This can be solved by effectively masking off suspend requests while inside of a Unix call.
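Point (d) can be modelled with a small sketch: a suspend request that arrives while a thread is inside a Unix call is recorded and honoured only when the call returns. The class and method names are hypothetical, and the "suspension" is just a flag rather than a real thread suspend.

```python
import contextlib


class ThreadState:
    """Toy model of one thread's suspend bookkeeping (names are made up)."""

    def __init__(self):
        self.in_unix_call = False
        self.suspend_pending = False
        self.suspended = False

    def request_suspend(self):
        """Called by a 'debugger'. Returns True if it took effect now."""
        if self.in_unix_call:
            self.suspend_pending = True   # defer until Unix code returns
            return False
        self.suspended = True
        return True

    @contextlib.contextmanager
    def unix_call(self):
        """Mask suspend requests for the duration of a Unix call."""
        self.in_unix_call = True
        try:
            yield
        finally:
            self.in_unix_call = False
            if self.suspend_pending:      # deliver the deferred request
                self.suspend_pending = False
                self.suspended = True
```

A request made inside `with state.unix_call():` returns False and leaves `suspended` unset until the block exits, which mirrors how a kernel hides its own execution from a user-space debugger.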
The ultimate effect is that, for various reasons, we want to define boundaries, and it ends up actually making sense to make all of these boundaries the *same* boundary. There are a few reasons for this:
* Perhaps most importantly, all of the above require some nontrivial thunking. We need to do some work to change from PE code to .so code, or from 32-bit code to 64-bit code, or from Win32 code to Unix code. In particular:
- changing from 32-bit code to 64-bit code requires a far/long jump, as has been stated;
- changing from PE code to .so code requires some glue to determine *where* to jump to, as you can't just link from one to the other;
- changing from Win32 code to Unix code requires swapping %fs and/or %gs, switching stacks, marking down that suspend requests are masked, etc.
We need to define where these transitions take place so that we can perform that thunking, and ultimately that definition takes some nontrivial effort. We need to manually write thunks for every function that is thunked. If we can write the thunk once, and do all of the above transitions at the same time, that saves us considerable effort.
* It ends up making a lot of sense to think of Win32 code as "user" code, and Unix code as "kernel" code, especially when considering the problem of debuggers. A user-space debugger should be allowed to mess with any user-space code, but should effectively have suspend requests masked during kernel code, which matches what we want quite well. Similarly, real kernels will change stacks to execute kernel code.
* "User" code can always be expressed in PE format—after all, it only needs to link to other user libraries, plus the glue to the "kernel". On the other hand, it's not easy for Win32 programs to actually access kernel code in order to validate it against the on-disk form, so we don't actually need "kernel" code to be in PE format. If we write "kernel" code such that it never needs to call into a "user" library (which ends up being relatively easy), we can compile it in .so format and have it link to Unix libraries.
* The split between user and kernel code is *basically* done at the same place as the split between 32-bit and 64-bit code on Windows. There's a bit here that's handwaved, but it requires a detailed explanation of how WoW64 works on Windows.
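The argument for a single boundary can be sketched as follows: if each transition is an enter/leave pair, one generic wrapper can apply all of them around any Unix function, so the per-function thunk only has to marshal arguments. This is an illustrative model with invented names. The real transitions (far jump, register swap, stack switch) cannot be expressed in Python and are represented here by log entries, and the order chosen is an assumption of the sketch.

```python
import contextlib
import functools

log = []


@contextlib.contextmanager
def transition(name):
    """Stand-in for one boundary crossing; real code would do the work."""
    log.append("enter " + name)
    try:
        yield
    finally:
        log.append("leave " + name)


# Order is an assumption of this sketch, not taken from Wine.
TRANSITIONS = ("64-bit segment", "unix %fs/%gs", "unix stack", "suspend mask")


def unix_thunk(func):
    """Wrap `func` so every call crosses all boundaries in one place."""
    @functools.wraps(func)
    def wrapper(*args):
        with contextlib.ExitStack() as stack:
            for name in TRANSITIONS:
                stack.enter_context(transition(name))
            return func(*args)
    return wrapper


@unix_thunk
def get_screen_size():
    log.append("in unix code")
    return (1920, 1080)
```

Calling `get_screen_size()` records the four "enter" steps, the body, and then the four "leave" steps in reverse order, without the function itself knowing about any of them.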
The ultimate effect is that at this point, what's called "PE conversion" really only has *partially* to do with the PE format, and what's involved in the process of PE conversion is mostly defining the above thunks, and making sure that all code is on one side of the split or the other. The other part is, of course, writing the generic thunking code—the parts that switch segments and stacks, set up the dynamic library glue, and so on.
The "writing the generic glue" part is, at this point, almost entirely done. The "splitting up the code and writing thunks" is *mostly* done [considering where we started, it's been quite a long journey!] but there's still some bits left, and of course they are some of the trickiest parts of the Wine code base to split up.
Hope this was helpful for someone. I know Erich wrote a similar writeup, but there were a couple of missing bits in that I wanted to clarify as well.
ἔρρωσθε, Zeb
On Sat, May 14, 2022 at 6:29 PM Zebediah Figura zfigura@codeweavers.com wrote:
... Using the PE executable format is not actually necessary to accomplish this (even for proper WoW64 support, I believe—though someone should correct me if I'm wrong. At least it's not necessary for obvious reasons.) ...
This is correct. Out of morbid curiosity (back when I still had some free time) I wrote a prototype "Linux WoW64" wrapper, and I can say that this 100% works without doing anything in PE. I say morbid curiosity because I made this tool able to simultaneously load 32-bit and 64-bit libraries in the same program and create 32-bit thunks from a library's headers, and that whole process is ... interesting.
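The kind of thunk that has to be derived from headers can be illustrated with a structure whose layout differs between the two ABIs. A minimal sketch, assuming a hypothetical C struct `{ void *data; long length; int flags; }` passed from an ILP32 caller to an LP64 callee; it is not taken from the prototype described here, and a real tool would also have to make the pointer itself meaningful on the other side, which is skipped.

```python
import struct

# Hypothetical C struct: { void *data; long length; int flags; }
# "<" disables native padding, so padding is spelled out explicitly.
LAYOUT_32 = struct.Struct("<Iii")      # ILP32: 4 + 4 + 4 = 12 bytes
LAYOUT_64 = struct.Struct("<Qqi4x")    # LP64: 8 + 8 + 4 + 4 pad = 24 bytes


def thunk_32_to_64(blob):
    """Re-pack a 32-bit struct image into the 64-bit layout."""
    data, length, flags = LAYOUT_32.unpack(blob)
    # Zero-extend the pointer, sign-extend the long, copy the int.
    return LAYOUT_64.pack(data, length, flags)


def thunk_64_to_32(blob):
    """Re-pack the other way; raises struct.error if a value does not fit."""
    data, length, flags = LAYOUT_64.unpack(blob)
    return LAYOUT_32.pack(data, length, flags)
```

Every structure that crosses the boundary needs a pair like this, which is why generating them from headers is attractive and why the process gets complicated quickly.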
...
Hope this was helpful for someone. I know Erich wrote a similar writeup, but there were a couple of missing bits in that I wanted to clarify as well.
Heh, good memory - my "writeup" was more of a TLDR for end-users. This is _far_ more complete and covers a lot of the technical issues that get solved along the way.
Best, Erich
On 5/16/22 12:24, Erich E. Hoover wrote:
On Sat, May 14, 2022 at 6:29 PM Zebediah Figura zfigura@codeweavers.com wrote:
... Using the PE executable format is not actually necessary to accomplish this (even for proper WoW64 support, I believe—though someone should correct me if I'm wrong. At least it's not necessary for obvious reasons.) ...
This is correct. Out of morbid curiosity (back when I still had some free time) I wrote a prototype "Linux WoW64" wrapper, and I can say that this 100% works without doing anything in PE. I say morbid curiosity because I made this tool able to simultaneously load 32-bit and 64-bit libraries in the same program and create 32-bit thunks from a library's headers, and that whole process is ... interesting.
Right, conceptually it shouldn't matter. In practice, with the way things are currently done in Wine, there might be some non-obvious reason that WoW64 breaks if MinGW isn't available.
...
Hope this was helpful for someone. I know Erich wrote a similar writeup, but there were a couple of missing bits in that I wanted to clarify as well.
Heh, good memory - my "writeup" was more of a TLDR for end-users. This is _far_ more complete and covers a lot of the technical issues that get solved along the way.
Oops, I actually meant *Eric* Pouech [1]; I was unaware you had written up anything ;-)
[1] https://www.winehq.org/pipermail/wine-devel/2022-April/213677.html
On Mon, May 16, 2022 at 11:36 AM Zebediah Figura zfigura@codeweavers.com wrote:
... Oops, I actually meant *Eric* Pouech [1]; I was unaware you had written up anything ;-)
[1] https://www.winehq.org/pipermail/wine-devel/2022-April/213677.html
Yeah, I wondered if that were the case - but I thought you might be referencing this [1] from last year.
Best, Erich
[1] https://www.winehq.org/pipermail/wine-devel/2021-August/193082.html