Handling page faults in syscalls
I just had a frustrating time debugging why winecfg was crashing in my freshly built tree, and discovered something that many of you probably already know but is new to me: Wine swallows page faults that happen inside a syscall and just unwinds the stack to the caller, returning an error, with only a trace-level log message. This seems surprising: normally a page fault indicates a bug and needs to be reported. It is also scary: the code behind the syscall boundary can be anything, not necessarily part of the Wine project, possibly even the platform libc, and therefore not safe to unwind through. My guess is that these are caught instead of killing the process because that's the expected behavior of an NT syscall when you pass it a bad pointer: a bad dereference of a userspace pointer is handled and turned into a fault error code. Is this right? But then what would actually make sense is to write separate functions for safely dereferencing user pointers, like Linux has. What should ideally be happening here? ~Theodore
On Tuesday, 23 December 2025, 02:12:57 East Africa Time, tblodt--- via Wine-devel wrote:
My guess is that these are caught instead of killing the process because that's the expected behavior of an NT syscall when you pass it a bad pointer: a bad dereference of a userspace pointer is handled and turned into a fault error code.
Some of the unexpected syscall faults I debugged happened because of incorrectly set up host libraries - libGL segfaulting when trying to create a context, gnutls or kerberos crashing when trying to load them. Those are not necessary for the base functionality of Wine, so we don't want to crash and burn because of them. I don't know if that's the real reason for the silent swallowing of SIGSEGVs. It is something to keep in mind though when changing the behavior. I ran into cases myself where I wish I had gotten more obvious notifications. The Vulkan code has assert(!status) after most calls once basic initialization is done, which makes a SIGSEGV inside a syscall fatal. On the other hand, a Windows-side STATUS_ACCESS_VIOLATION also won't print anything by default; there are many applications that generate a ton of them in their normal operation.
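The two policies described here, tolerating a failed call for an optional component versus asserting on the status for a required one, can be contrasted in a short C sketch. Everything in it is hypothetical: `sketch_backend_call` merely stands in for a call whose internal fault has already been converted into a non-zero status, and none of the names come from Wine.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical status values, for illustration only. */
#define SKETCH_OK     0
#define SKETCH_FAULT (-1)

/* Stand-in for a call whose internal fault was swallowed and turned
 * into an error status. */
static int sketch_backend_call(int simulate_fault)
{
    return simulate_fault ? SKETCH_FAULT : SKETCH_OK;
}

/* Optional component: report the failure and let the caller carry on. */
int init_optional_feature(int simulate_fault)
{
    int status = sketch_backend_call(simulate_fault);

    if (status)
        fprintf(stderr, "optional feature unavailable (status %d)\n", status);
    return status;
}

/* Required component: a non-zero status aborts the process, so a
 * swallowed fault becomes fatal again. Note that assert() compiles to
 * nothing when NDEBUG is defined. */
void use_required_feature(int simulate_fault)
{
    int status = sketch_backend_call(simulate_fault);

    assert(!status);
    (void)status;
}
```

The trade-off is the one raised in the thread: the first style keeps Wine running when an optional host library is broken, the second makes the breakage impossible to miss.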
Some of the unexpected syscall faults I debugged happened because of incorrectly set up host libraries - libGL segfaulting when trying to create a context, gnutls or kerberos crashing when trying to load them.
It turned out that the thing I was chasing originally was actually such a case (I pointed DYLD_LIBRARY_PATH at homebrew, which somehow made macOS frameworks load a homebrew library instead of the framework's own, and that library was missing some symbols the framework depended on, and... https://github.com/apple-oss-distributions/dyld/blob/3d96227e8b4626b8e6aa4b8...) I would really have preferred it if it had crashed outright. The way I think of it, the responsibility for fixing this kind of problem is on the packager, and it would be much better to fail noisily so it can be caught earlier by the people who should be fixing it.
On the other hand, a Windows-side STATUS_ACCESS_VIOLATION also won't print anything by default; there are many applications that generate a ton of them in their normal operation.
Do you mean the analog of EFAULT, or installing a SIGSEGV handler, or something else? Those are indeed "normal" and shouldn't be reported as a problem, but they are also not what I'm talking about :p ~Theodore
On 23.12.2025 at 10:41, tblodt@icloud.com wrote:
The way I think of it, the responsibility for fixing this kind of problem is on the packager, and it would be much better to fail noisily so it can be caught earlier by the people who should be fixing it.
On Linux, external libraries are usually provided by the distribution. I think a concern is that we don't want to spam the user if e.g. the 32-bit LDAP library fails to load. For things like GL/Vulkan/d3d there is usually a more obvious message if the application actually tries to use them rather than just accidentally linking to ddraw. That said, I am not the one who wrote the syscall infrastructure, so I am just guessing here.
participants (2)
- Stefan Dösinger
- tblodt@icloud.com