New subject: Announcing security hardened kernels for testing

5 Jan 2005

Hi,
Thanks for the great info. I'll CC this to wine-devel as I think it's of
general interest, I hope you don't mind.
For context, PaX is a set of security patches for Linux which lock down
the system in a similar manner to exec-shield and SELinux. I say similar
manner, because PaX seems to go further than these systems do - in fact
from what I've read it seems to be the 'gold standard' in security
patches.
I'll quote the whole email and reply inline. This thread started on
ubuntu-devel after one Ubuntu developer said they were experimenting
with PaX, and I asked what the differences were between it and exec-
shield (with which the community seems to have more experience) and why
it was chosen. So I was pointed towards this thread:
http://lists.debian.org/debian-devel/2003/11/msg00206.html
in which the PaX author and Ingo Molnar, who wrote exec-shield, discuss
the differences.
On Wed, 2005-01-05 at 13:37 +0100, pageexec@freemail.hu wrote:
...
Hello,
just ran across this thread on the ubuntu-devel list and have
a few observations:

PaX cares about backwards compatibility as much as it cares about
security, the best compromise we could make is that one can mark
executables to be exempt from PaX enforcements (and you should
have known about this as we'd talked about PaX+wine last year...).

OK. I don't remember this thread, I'm afraid, but I do recall that you
can exempt particular programs from PaX, so a distribution that wanted
to integrate it would have to mark Wine as exempt by default. Presumably
if WineHQ/CodeWeavers were to ship binary packages we'd have to do the
same to work on such a distribution. But it's just an ELF flag, right?
...

as of the 20041201 snapshot of wine, it needs to be exempt from
at least ASLR [1], because it still makes some invalid assumptions
about the address space:

the highest mapping in the address space may not be the stack,
nor is the highest mapping (be that the stack or something
else) supposed to extend to the end of the userland address
space. the end result of this assumption is that some piece
of code in the preloader enters an infinite loop requesting
(but never getting) anon mappings above TASK_SIZE (0xc0000000
typically). excerpt from an strace:

mmap2(0xbffe0000, 262144, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x77ec0000
munmap(0x77ec0000, 262144)              = 0
mmap2(0xbffe0000, 131072, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0xbffe0000
mmap2(0xc0000000, 131072, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x77ee0000
munmap(0x77ee0000, 131072)              = 0
mmap2(0xc0000000, 65536, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x77ef0000
munmap(0x77ef0000, 65536)               = 0
the last lines then repeat indefinitely as the kernel would
never give out the requested address, even with MAP_FIXED.

We do this because some Windows programs and DLLs cannot cope with
getting pointers above 2 GB, so we need to ensure that the kernel does
not give us mappings above this point. The only way to do this currently
is an iterative reservation that maps as much of that address space as
possible up front, which is what you're seeing here.
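
To make the iterative reservation concrete, here is a minimal sketch in
C. It is not the actual Wine preloader code: reserve_range and min_chunk
are hypothetical names, and it assumes start is page-aligned and
min_chunk is a power of two and a multiple of the page size. It asks for
each address as a hint only (no MAP_FIXED), halves the request when the
kernel puts the mapping somewhere else, and skips ahead once the request
drops below min_chunk, so it cannot spin forever on an address the
kernel will never hand out.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical helper, not Wine's code: try to reserve [start, end) with
 * PROT_NONE mappings using address hints only (no MAP_FIXED).
 * Returns the number of bytes reserved at the requested addresses. */
size_t reserve_range(uintptr_t start, uintptr_t end, size_t min_chunk)
{
    size_t reserved = 0;
    size_t size;

    /* min_chunk must be a non-zero power of two */
    assert(min_chunk != 0 && (min_chunk & (min_chunk - 1)) == 0);
    if (start >= end) return 0;
    size = end - start;

    while (start < end) {
        void *p;

        if (size > end - start) size = end - start;
        size &= ~(min_chunk - 1);            /* keep requests chunk-aligned */
        if (size < min_chunk) {
            /* Nothing obtainable here: skip one chunk rather than retry forever. */
            if (end - start <= min_chunk) break;
            start += min_chunk;
            size = end - start;
            continue;
        }

        p = mmap((void *)start, size, PROT_NONE,
                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
        if (p == (void *)start) {            /* hint honoured: keep it, move on */
            reserved += size;
            start += size;
            continue;
        }
        if (p != MAP_FAILED) munmap(p, size); /* mapped elsewhere: give it back */
        size /= 2;                            /* retry with a smaller request */
    }
    return reserved;
}
```

The return value only says how much was obtained at the requested
addresses; a caller would still have to cope with holes where the kernel
refused the hint.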
...

the above mentioned infinite loop also highlighted another bad
assumption wine makes: mmap() without MAP_FIXED but with a non-0
hint is under no obligation to observe the hint and give you a
mapping at that address, under PaX it doesn't do so explicitly.
excerpt from an strace:

mmap2(0x81000000, 1034813440, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x3a420000
munmap(0x3a420000, 1034813440)          = 0
mmap2(0x81000000, 517406720, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x81000000
mmap2(0x9fd70000, 517406720, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x59190000
munmap(0x59190000, 517406720)           = 0
mmap2(0x9fd70000, 258670592, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x9fd70000
as you can see, wine insists on an address until it gets it,
without using MAP_FIXED.

We have no choice in the matter. I think we can't use MAP_FIXED, as
that would risk blowing away any mappings already made above the 2 GB
boundary. Actually this code was originally written to support the 4G/4G
VM patch that was put into Fedora for a while (it's gone now).
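
The MAP_FIXED risk can be shown with a small, self-contained C sketch.
The function name map_fixed_replaces_existing is made up for this
illustration. It maps a page, writes a marker byte into it, then maps
the same address again with MAP_FIXED and checks whether the marker
survived.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Illustration only. Returns 1 if a MAP_FIXED request silently replaced an
 * existing mapping, 0 if it did not, -1 if a mapping call failed. */
int map_fixed_replaces_existing(void)
{
    long page = sysconf(_SC_PAGESIZE);
    unsigned char *first, *second;
    int replaced;

    assert(page > 0);

    first = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (first == MAP_FAILED) return -1;
    first[0] = 0x42;                          /* marker in the old mapping */

    /* Same address, this time with MAP_FIXED: the old page is discarded. */
    second = mmap(first, (size_t)page, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (second == MAP_FAILED) {
        munmap(first, (size_t)page);
        return -1;
    }

    /* A fresh anonymous page reads as zero, so the marker is gone. */
    replaced = (second == first && second[0] == 0);
    munmap(second, (size_t)page);
    return replaced;
}
```

No error is reported when the old mapping is replaced, which is why a
hint without MAP_FIXED is the safer request when something may already
live at the target address.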
...

there's at least /usr/lib/wine/ntdll.dll.so which is marked with
an executable PT_GNU_STACK program header, suggesting that it
needs an executable stack (or there's some build problem).

Last time I looked, documentation on exactly what triggers this flag was
scarce or non-existent. I remember asking Ingo if inline assembly still
generated it and the answer back then was no, but I have no idea why gcc
has decided it's needed now. If you look at ntdll in the sources:
http://source.winehq.org/source/dlls/ntdll/
It's fairly harmless: there is some assembly in there, but I don't
remember seeing any code which assumed an executable stack.
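
For anyone who wants to check a binary for this marking themselves, the
following C sketch reads the ELF program headers and reports whether
PT_GNU_STACK carries the executable flag (PF_X). The function name
gnu_stack_status is hypothetical, and the sketch assumes a Linux libc
providing <elf.h> and <link.h> and a file of the same word size and byte
order as the machine it runs on.

```c
#include <assert.h>
#include <elf.h>
#include <link.h>   /* ElfW() picks Elf32_* or Elf64_* for the native word size */
#include <stdio.h>
#include <string.h>

/* Illustration only. Returns 1 if PT_GNU_STACK is present with PF_X
 * (executable stack requested), 0 if present without PF_X, and -1 if the
 * file cannot be parsed as a native-class ELF file or has no such header. */
int gnu_stack_status(const char *path)
{
    ElfW(Ehdr) eh;
    ElfW(Phdr) ph;
    int result = -1;
    unsigned i;
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "rb");
    if (!f) return -1;

    if (fread(&eh, sizeof eh, 1, f) != 1) goto done;
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0) goto done;
    if (eh.e_ident[EI_CLASS] != (sizeof(void *) == 8 ? ELFCLASS64 : ELFCLASS32))
        goto done;
    if (eh.e_phentsize != sizeof ph) goto done;

    /* Walk the program header table looking for PT_GNU_STACK. */
    for (i = 0; i < eh.e_phnum; i++) {
        long off = (long)(eh.e_phoff + (unsigned long)i * eh.e_phentsize);
        if (fseek(f, off, SEEK_SET) != 0) break;
        if (fread(&ph, sizeof ph, 1, f) != 1) break;
        if (ph.p_type == PT_GNU_STACK) {
            result = (ph.p_flags & PF_X) ? 1 : 0;
            break;
        }
    }

done:
    fclose(f);
    return result;
}
```

Calling gnu_stack_status("/usr/lib/wine/ntdll.dll.so") on an affected
system would be the obvious use, with the path taken from the report
above.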
...
this alone would make wine fail under a PaX kernel as PT_GNU_STACK is
completely ignored there (because it's the wrong solution for the
wrong problem), nor is it allowed to generate code at runtime
(this applies to apps on which PaX is enforced of course, one can
always disable these on a per-executable basis).

I'm afraid Wine cannot operate in an environment that doesn't allow us
to map pages as executable and fill them with generated code. This
technique is:
a) Used by some Windows programs
b) Used by the Wine DLL loader
c) Required to implement DCOM universal interface proxies
So if PaX denies this as a matter of course then it will never work.
Having read the thread with Ingo I must say I agree with him that
runtime code generation is a legitimate technique and not a bug.
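
As a minimal sketch of what "map pages as executable and fill them with
generated code" means in practice, the C function below writes a tiny
x86 routine into an anonymous mapping, asks for it to be made writable
and executable with mprotect(), and calls it. This is not Wine code: the
name run_generated_code is invented for the example, the machine code is
x86-only (other architectures simply return -1), and the mprotect() call
is the kind of request that the strace later in this mail shows being
refused with EACCES.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Illustration only. Returns 42 if the generated code ran, or -1 if the
 * architecture is not x86 or the kernel refused the mapping/protection. */
int run_generated_code(void)
{
#if defined(__i386__) || defined(__x86_64__)
    /* mov eax, 42 ; ret */
    static const unsigned char code[] = { 0xb8, 0x2a, 0x00, 0x00, 0x00, 0xc3 };
    long page = sysconf(_SC_PAGESIZE);
    int (*fn)(void);
    void *buf;
    int value;

    assert(page > 0 && sizeof code <= (size_t)page);

    buf = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) return -1;

    memcpy(buf, code, sizeof code);           /* "generate" the code */

    /* Ask for writable + executable memory; a policy that forbids runtime
     * code generation is expected to refuse this step. */
    if (mprotect(buf, (size_t)page, PROT_READ | PROT_WRITE | PROT_EXEC) != 0) {
        munmap(buf, (size_t)page);
        return -1;
    }

    fn = (int (*)(void))(uintptr_t)buf;
    value = fn();                             /* run the generated routine */
    munmap(buf, (size_t)page);
    return value;
#else
    return -1;
#endif
}
```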
...
i also have memories from about a year ago that kernel32.so had
some executable code snippets (some thunking code?) in .data or
some other otherwise non-executable area, that of course wouldn't
(and didn't) work under PaX either. back then Alexandre Julliard
suggested that this wasn't easy to rewrite (by also making the
now static code text reloc free) - has this been done since then?

I don't think so, but I don't remember this thread either.
...
i also have strace excerpts that show how wine wanted to create
writable and executable memory, suggesting that it still wants to
generate code at runtime and this is how it fails under PaX (which
prevents runtime code generation by default):
mmap2(NULL, 1179648, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fee0000
munmap(0x7fff0000, 65536)               = 0
mprotect(0x7fee0000, 65536, PROT_READ|PROT_WRITE|PROT_EXEC) = -1 EACCES (Permission denied)
munmap(0x7fee0000, 1114112)             = 0
write(2, "wine: failed to create the proce"..., 40wine: failed to create the process heap
) = 40
on a sidenote, XP SP2 makes the default heap non-executable as
well, so if the above is the result of wanting to be compatible
with Windows, you may want to rethink it for the future.

I'm not sure why it is, but yes I expect it's because some programs rely
on it. Service Pack 2 may well make the default process heap NX but it
also has a huge infrastructure in place to deal with backwards
compatibility concerns, including a large database of badly behaved
apps, user-accessible GUIs to disable the protections and I believe it
also has code to catch NX faults (on hardware that supports that) and
ask the user if they wish to disable the protections for that
application.
...

it would be nice if wine-preloader and wine-pthread had a configurable
base address, the current default makes them impossible to use
under the faster non-executable method of PaX/i386 (which halves
the userland address space, [2]).

The base addresses have to be fixed, otherwise the kernel may place the
binaries in the middle of a reserved area, which would cause
initialisation to fail. This cannot be changed.
...
so, right now wine can't run with a randomized address space, i have
yet to test if it can get away without generating code at runtime
and/or having writable/executable memory.

It can't, and I don't see any way to make it able to operate under such
conditions in future. Is there a way to brand the binaries as excluded
from PaX at build time without a special tool? If not, would you be
willing to submit a build system patch to detect the branding tool on
PaX systems and use it on the relevant binaries automatically?
thanks -mike
