https://bugs.winehq.org/show_bug.cgi?id=48291
Andrew Wesie awesie@gmail.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |awesie@gmail.com
--- Comment #51 from Andrew Wesie awesie@gmail.com ---
(In reply to Paul Gofman from comment #37)
(In reply to David Torok from comment #36)
I've been thinking about ways to go about this, and thought we could rewrite the seccomp filter to trap only on virtual addresses outside of glibc. (see the attachment)
What are your thoughts, would that work well for us?
Please note the following:
- Linux programs and libraries make syscalls directly whenever they want, often bypassing the glibc wrappers. Some syscalls don't even have such wrappers. Wine makes a number of syscalls directly, and so do a lot of native libraries. So the address range from which you can get a native syscall is absolutely not limited to the glibc code segments.
I think filtering on the instruction pointer could work on 64-bit, because you don't need to reuse address space; 32-bit is a different matter. Wine already runs out of address space with some programs.
You can load one seccomp filter per Windows DLL / exec mapping and trap if the instruction pointer is within that mapping's address range. For better performance, if you can reserve a "large enough" area of memory for these mappings, you would only need one seccomp filter in the fast case. This would avoid the problem of trying to guess which Linux libraries use syscalls.
On 32-bit, you might be able to get away with using the vDSO to detect Linux syscalls. I know that glibc will use the vDSO for syscalls if possible, but I am not sure about other libraries we may be concerned about.
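
A rough sketch of the single-filter fast case on 64-bit, to make the idea concrete. Everything named here (build_ip_range_filter, IP_FILTER_LEN, the IP_LO/IP_HI macros) is hypothetical and not taken from Wine or from the attachment. It assumes a little-endian 64-bit target and a reserved region that does not cross a 4 GiB boundary, since classic BPF only compares 32 bits at a time. A real filter would probably also want to check the arch field first; that is left out here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

/* Offsets of the two 32-bit halves of seccomp_data.instruction_pointer.
 * ASSUMPTION: little-endian layout (e.g. x86-64). */
#define IP_LO ((uint32_t)offsetof(struct seccomp_data, instruction_pointer))
#define IP_HI ((uint32_t)offsetof(struct seccomp_data, instruction_pointer) + 4)

#define IP_FILTER_LEN 7

/* Hypothetical helper: fill `out` (room for IP_FILTER_LEN entries) with a
 * filter that returns SECCOMP_RET_TRAP for syscalls issued from [start, end)
 * and SECCOMP_RET_ALLOW otherwise.
 * ASSUMPTION: start and end - 1 share the same upper 32 bits.
 * Returns the number of instructions written, or 0 for an unsupported range. */
static size_t build_ip_range_filter(struct sock_filter *out,
                                    uint64_t start, uint64_t end)
{
    if (end <= start || (start >> 32) != ((end - 1) >> 32))
        return 0;

    uint32_t hi       = (uint32_t)(start >> 32);
    uint32_t lo_start = (uint32_t)start;
    uint32_t lo_last  = (uint32_t)(end - 1);

    struct sock_filter prog[IP_FILTER_LEN] = {
        /* 0: load the upper half of the instruction pointer */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, IP_HI),
        /* 1: different 4 GiB window -> jump to 6 (allow) */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, hi, 0, 4),
        /* 2: load the lower half of the instruction pointer */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, IP_LO),
        /* 3: below the start of the region -> jump to 6 (allow) */
        BPF_JUMP(BPF_JMP | BPF_JGE | BPF_K, lo_start, 0, 2),
        /* 4: above the last byte of the region -> jump to 6 (allow) */
        BPF_JUMP(BPF_JMP | BPF_JGT | BPF_K, lo_last, 1, 0),
        /* 5: inside the reserved region: deliver SIGSYS */
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_TRAP),
        /* 6: anywhere else: native Linux code, let it through */
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };

    memcpy(out, prog, sizeof(prog));
    return IP_FILTER_LEN;
}
```

Installing it would be the usual prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) followed by prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &fprog) with a struct sock_fprog pointing at the array, plus a SIGSYS handler to do the actual emulation.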
- glibc (and other libraries) can be loaded at different addresses, and you can't ever change seccomp filters: once set, they are inherited by the child processes and can't be removed or replaced; only new filters can be added. Maybe there is an exception for processes having admin caps (CAP_SYS_ADMIN), but that's not something I would mess with in Wine.
Honestly, Wine should not generally be using exec* system calls. I am not saying that it doesn't, just that if we think about the semantics of creating a new process on Windows, the new process should be forked from wineserver or something.
Since seccomp appears to be per-thread, maybe you could create an unconstrained thread that will be used for fork and exec*. This would avoid adding more cruft to wineserver or additional IPC.
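
To illustrate the helper-thread idea, here is a minimal sketch of only the delegation part. It does not install any seccomp filter; the assumption is that start_spawn_helper() runs before the filter is installed on the other threads, so the helper (and any child it forks) stays unfiltered. All names (spawn_request, spawn_helper, spawn_via_helper, start_spawn_helper) are made up for this example and are not existing Wine code.

```c
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical request slot shared by the filtered threads and the helper. */
struct spawn_request {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    char *const    *argv;   /* command to run; NULL while idle */
    int             done;   /* set by the helper once status is valid */
    int             status; /* waitpid() status, or -1 if fork failed */
};

static struct spawn_request req = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, NULL, 0, 0
};

/* Helper thread body. ASSUMPTION: started before any filter is installed,
 * so fork() and execv() issued here are never trapped. */
static void *spawn_helper(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&req.lock);
    for (;;) {
        while (req.argv == NULL)
            pthread_cond_wait(&req.cond, &req.lock);

        char *const *argv = req.argv;
        int status = -1;
        pid_t pid = fork();
        if (pid == 0) {
            execv(argv[0], argv);
            _exit(127);             /* exec failed */
        }
        if (pid > 0)
            waitpid(pid, &status, 0);

        req.status = status;
        req.argv = NULL;
        req.done = 1;
        pthread_cond_broadcast(&req.cond);
    }
}

/* Start the detached helper; returns 0 on success (pthread_create's result). */
static int start_spawn_helper(void)
{
    pthread_t tid;
    int err = pthread_create(&tid, NULL, spawn_helper, NULL);
    if (err == 0)
        pthread_detach(tid);
    return err;
}

/* Called from a filtered thread: hand argv to the helper and block until
 * the child has exited. Returns the waitpid() status. */
static int spawn_via_helper(char *const argv[])
{
    pthread_mutex_lock(&req.lock);
    while (req.argv != NULL || req.done)    /* wait for a free slot */
        pthread_cond_wait(&req.cond, &req.lock);

    req.argv = argv;
    pthread_cond_broadcast(&req.cond);
    while (!req.done)
        pthread_cond_wait(&req.cond, &req.lock);

    int status = req.status;
    req.done = 0;                           /* release the slot */
    pthread_cond_broadcast(&req.cond);
    pthread_mutex_unlock(&req.lock);
    return status;
}
```

This sketch blocks until the child exits only to keep it short; a real version would presumably return the pid and let the caller wait. It relies on fork() duplicating only the calling thread, so the child starts from the helper's seccomp state rather than the requester's.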