http://bugs.winehq.org/show_bug.cgi?id=10467
--- Comment #6 from Anastasius Focht focht@gmx.net 2007-11-18 12:07:37 ---
Created an attachment (id=9238) --> (http://bugs.winehq.org/attachment.cgi?id=9238)
patch to process stack guard page to allow CLR exceptions from .NET runtime to succeed
Hello,
this is probably the most important patch of all. Small but full of consequences ... took me some time to find out what's going on ;-)
At the current state of wine (even with the previous patches applied), any CLR exception thrown from managed/JIT code will cause .NET apps to segfault. Exceptions thrown from managed code are not bad by default. In fact they are an important program flow mechanism. Almost every .NET app throws some in its lifetime. Well, I'll skip the gory details here of how managed exceptions work at OS level...
At a certain point after the native callstack unwinding took place, the managed code (JIT) undergoes some complex unwinding process to restore the JIT virtual machine to correct state.
Basically the JIT code manager crawls the managed code stacks and rebuilds frames to resume execution at some point. During this stack crawl, some prerequisites are checked to make sure the stack is in a "good" state. One of them is the presence of stack guard pages. Unfortunately this is an area where the current wine implementation fails...
On Windows, stack guard pages are used for dynamic thread stack growth. When the app touches the guard page, Windows commits that page and the next uncommitted page becomes the new guard page. Automatic stack growth works only through the guard page, and stack memory usually grows in page-size (4K) increments.
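To make the growth rule concrete, here is a small, simplified model in C. It is not Windows or wine code: `struct model_stack`, `model_touch()` and the addresses used are hypothetical names and values for illustration, and it assumes everything above the guard page is already committed.

```c
#include <stdint.h>

#define MODEL_PAGE_SIZE 0x1000u   /* 4K pages, as described above */

/* Hypothetical model of a downward-growing thread stack. */
struct model_stack {
    uint32_t guard_page;   /* base address of the current guard page */
    uint32_t lowest_page;  /* lowest page the stack may ever use */
};

/* Returns 1 if the access succeeds (possibly after growing the stack),
 * 0 if it lands below the guard page, where nothing grows automatically. */
static int model_touch(struct model_stack *s, uint32_t addr)
{
    uint32_t page = addr & ~(MODEL_PAGE_SIZE - 1u);

    if (page > s->guard_page)
        return 1;                       /* assumed already committed */
    if (page < s->guard_page)
        return 0;                       /* skipped the guard: no growth */
    if (s->guard_page == s->lowest_page)
        return 0;                       /* nothing left to grow into */
    s->guard_page -= MODEL_PAGE_SIZE;   /* next page becomes the new guard */
    return 1;
}
```

Touching an address inside the guard page moves the guard down by one page; touching anything below it fails, which is the "works only through the guard page" part.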
The .NET runtime code checks the process stack regions for existing guard pages. This fails for wine: although a guard page is set up, it's located at a different address and the flags do not match (PAGE_NOACCESS vs. PAGE_GUARD). This leads to a somewhat catastrophic result ;-)
Imagine the following code snippet (I added fancy function names for better readability), assuming no page guards have been found.
--- snip ---
call get_current_stack_pointer_in_eax
mov esi, eax
and esi, 0FFFFF000h
mov edi, 1000h
mov ecx, ebx
sub esi, edi
call get_managed_thread_top_stack_addr_in_eax
mov ebx, eax
cmp esi, ebx
jb we_got_a_problem_no_place_for_page_guard
push esi
call setup_page_guard
..
call verify_if_page_guard_is_really_there
--- snip ---
Basically it sets up a page guard in next stack page (rounded) using VirtualProtect( .. PAGE_GUARD) (within setup_page_guard()).
At this point the rounding of the page guard address leaves an estimated 0x250 bytes of stack between the VirtualProtect() call and the designated guard address. No problem for Windows: even if the page guard is touched by nested API calls, it's automagically fixed by the OS.
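The address arithmetic of the snippet can be restated in C as a rough sketch. The function names and the sample stack pointer are made up for illustration; only the mask, the page size and the unsigned compare come from the disassembly. The headroom helper assumes that the remaining space is simply the stack pointer's offset within its own page.

```c
#include <stdint.h>

#define PAGE_SIZE_X86 0x1000u   /* the 1000h constant from the snippet */

/* "and esi, 0FFFFF000h" followed by "sub esi, edi":
 * round the stack pointer down to its page, then step one page lower. */
static uint32_t guard_candidate(uint32_t stack_pointer)
{
    return (stack_pointer & ~(PAGE_SIZE_X86 - 1u)) - PAGE_SIZE_X86;
}

/* "cmp esi, ebx / jb ...": the candidate is usable only if it is not
 * below the limit returned by the second helper (unsigned compare). */
static int guard_candidate_fits(uint32_t stack_pointer, uint32_t limit)
{
    return guard_candidate(stack_pointer) >= limit;
}

/* Bytes of stack left between the stack pointer and the top of the
 * candidate guard page (an illustrative helper, not in the snippet). */
static uint32_t headroom_above_guard(uint32_t stack_pointer)
{
    return stack_pointer - (guard_candidate(stack_pointer) + PAGE_SIZE_X86);
}
```

With a hypothetical stack pointer of 0x0033F250 the candidate guard page is 0x0033E000 and the headroom is 0x250 bytes, which is the order of magnitude mentioned above. Every nested call made while setting up the guard has to fit into that space.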
The wine sequence: VirtualProtect -> VirtualProtectEx -> NtProtectVirtualMemory -> VIRTUAL_SetProt -> mprotect *kaboooooom*
It immediately segfaults when calling mprotect, aborting the program. Before mprotect(), the stack space has around 0x100 bytes left to designated guard address (some locals within NtProtectVirtualMemory eat space). Somewhere in glibc code the remaining stack space to designated guard address is eaten. When mprotect() is executed, the "guard" page is immediately touched by glibc code itself which leads to segmentation fault. Interesting not-so-obvious problem ;-)
Unfortunately this place can't be fixed in wine. I removed some locals to gain more stack space but to no avail. 0x200 bytes are not enough for the call sequence to succeed. Bypassing the guard region with a large buffer on the stack before calling mprotect() won't work either: as soon as the stack pointer is re-adjusted on function exit, it crosses the guard page address and, even without touching the page, this causes an abort. Dead end ... even if it succeeds in setting up the guard in this case, other code might cause PAGE_GUARD violations, depending on call stack depth, which are not automagically handled in wine.
Fortunately I found a solution. When the .NET runtime looks for guard pages, it starts at the bottom of the stack and walks its way up (VirtualQuery). When it finds such a guard page, the problematic code which re-enables guard pages is not executed. So wine has to make sure there is an initial guard page the .NET runtime likes ;-)
The attached patch fixes this. Notice it uses NtCurrentTeb()->Tib.StackLimit rather than the current NtCurrentTeb()->DeallocationStack to set up the guard page address. This is required because the .NET runtime starts its page guard walk at Tib.StackLimit and continues until it reaches the current stack pointer area. Also the PAGE_GUARD flag needs to be explicitly set.
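The walk can be pictured with a small portable model in C. This is neither the runtime's code nor the patch: `find_guard_page()` and the `protect` array (one protection value per page, starting at the stack limit) are hypothetical stand-ins for what repeated VirtualQuery calls would report. The constant values match the Windows protection constants of the same names.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_NOACCESS  0x001u
#define MODEL_PAGE_READWRITE 0x004u
#define MODEL_PAGE_GUARD     0x100u
#define MODEL_PAGE_SIZE      0x1000u

/* protect[i] is the protection of the page at
 * stack_limit + i * MODEL_PAGE_SIZE. Returns the address of the first
 * page carrying PAGE_GUARD between stack_limit and the page holding
 * stack_pointer, or 0 if there is none. */
static uint32_t find_guard_page(const uint32_t *protect, size_t page_count,
                                uint32_t stack_limit, uint32_t stack_pointer)
{
    for (size_t i = 0; i < page_count; i++) {
        uint32_t page = stack_limit + (uint32_t)i * MODEL_PAGE_SIZE;

        if (page > stack_pointer)
            break;                  /* walked past the live stack area */
        if (protect[i] & MODEL_PAGE_GUARD)
            return page;
    }
    return 0;
}
```

In this model a first page marked only MODEL_PAGE_NOACCESS yields 0 (nothing found), while MODEL_PAGE_READWRITE | MODEL_PAGE_GUARD at the stack limit is found right away. That is the difference the patch is after.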
This should allow most .NET software to start or at least not to segfault anymore. ;-) Even if unrecoverable .NET errors are encountered, you now get a detailed description on the console of what went wrong and can fix it accordingly (like a specific TrueType font missing or whatever).
Regards