On 19.01.2016 12:20, Jacek Caban wrote:
Signed-off-by: Jacek Caban jacek@codeweavers.com
tools/winebuild/build.h | 1 + tools/winebuild/parser.c | 1 + tools/winebuild/spec32.c | 59 ++++++++++++++++++++++++++++++++++++++++ tools/winebuild/winebuild.man.in | 3 ++ 4 files changed, 64 insertions(+)
I would also like to get this issue fixed, so I wouldn't complain if Alexandre really prefers the JIT approach, but I personally still do not really see a big advantage compared to static code generation. Let me quickly explain again why. I'll also send another mail to point a couple of issues with the current patch, but here again first the general discussion.
* We have no way to implement support for 64-bit because SYSCALL instructions are directly part of the thunk. Unless we want to remove the "not" in the definition of "Wine is not an emulator", its technically impossible to intercept those Syscalls and we cannot implement it.
* 32-bit Windows 8 thunks also cannot be implemented, with the same reason as 64-bit thunks. To see that, just take a look at the expected opcodes:
// 00 b825000000 mov eax,25h // 05 e803000000 call eip+3 // 0a c22c00 ret 2Ch // 0d 8bd4 mov edx,esp // 0f 0f34 sysenter // 11 c3 ret // 12 8bff mov edi,edi
And at the very strict verification:
[...] CallEip != function_code.call_eip || function_code.call_offset != 3 [...] kSysenter != function_code.sysenter
* Based on my knowledge, implementing WOW64 doesn't help much because we have no app which relies on it. Although their code technically would support it, it seems like Chromium and Steam never uses it in practice. At least during our testing so far, we haven't found a way to fix that.
* Last but not least, one of the reasons why I haven't submitted my own approach so far: The whole idea only works reliable as long as we do NOT forward Ldr* functions to syscall thunks. If we do, like it should work on Windows, and like it is most likely required by other apps, there are race conditions all over the place and Steam webstore only works with a probability of about 50% or less. I haven't figured out what causes it yet.
@ Alexandre: Do you have a preferrence? Suggestions we had so far:
- JIT compilation of thunks (only x86) - Winebuild generated static thunks (see http://ix.io/nxT ) - Macro-generated thunks (see staging repo)
To my knowledge, all have the same limitations: x86, non-wow64 and race-conditions when forwarding Ldr* functions through the syscall thunks.
Best regards, Sebastian
Sebastian Lackner sebastian@fds-team.de writes:
@ Alexandre: Do you have a preferrence? Suggestions we had so far:
- JIT compilation of thunks (only x86)
- Winebuild generated static thunks (see http://ix.io/nxT )
- Macro-generated thunks (see staging repo)
To my knowledge, all have the same limitations: x86, non-wow64 and race-conditions when forwarding Ldr* functions through the syscall thunks.
As long as we have these limitations, I don't feel like committing any of them. I'm not convinced that there's a reasonable way to make that sort of thing work, and maintaining a complex piece of code that doesn't truly solve the problem isn't very appealing. Has anybody tried submitting a patch to Chrome to disable the sandbox under Wine?
On 1/19/16 6:29 AM, Alexandre Julliard wrote:
Sebastian Lackner sebastian@fds-team.de writes:
As long as we have these limitations, I don't feel like committing any of them. I'm not convinced that there's a reasonable way to make that sort of thing work, and maintaining a complex piece of code that doesn't truly solve the problem isn't very appealing. Has anybody tried submitting a patch to Chrome to disable the sandbox under Wine?
Of course Jacek will know better than I do, but I thought something like this was also necessary for the Office 2013 online installer. Perhaps it's different, though.
On 01/19/16 15:44, Josh DuBois wrote:
On 1/19/16 6:29 AM, Alexandre Julliard wrote:
Sebastian Lackner sebastian@fds-team.de writes:
As long as we have these limitations, I don't feel like committing any of them. I'm not convinced that there's a reasonable way to make that sort of thing work, and maintaining a complex piece of code that doesn't truly solve the problem isn't very appealing. Has anybody tried submitting a patch to Chrome to disable the sandbox under Wine?
Of course Jacek will know better than I do, but I thought something like this was also necessary for the Office 2013 online installer. Perhaps it's different, though.
This would help, yes, but I later found that a lot less invasive solution is enough for the Office 2013. However, I didn't send a few of required patches yet mostly because if we had syscall emulation in Wine, some of them wouldn't be needed.
Thanks, Jacek
On 01/19/16 13:29, Alexandre Julliard wrote:
Sebastian Lackner sebastian@fds-team.de writes:
@ Alexandre: Do you have a preferrence? Suggestions we had so far:
- JIT compilation of thunks (only x86)
- Winebuild generated static thunks (see http://ix.io/nxT )
- Macro-generated thunks (see staging repo)
To my knowledge, all have the same limitations: x86, non-wow64 and race-conditions when forwarding Ldr* functions through the syscall thunks.
As long as we have these limitations, I don't feel like committing any of them. I'm not convinced that there's a reasonable way to make that sort of thing work, and maintaining a complex piece of code that doesn't truly solve the problem isn't very appealing. Has anybody tried submitting a patch to Chrome to disable the sandbox under Wine?
Are you suggesting disabling the whole sandbox on Wine? How about something less radical? There are a few things that we could try to change in Chrome that would still require changes in Wine, but would give us a working sandboxing and wouldn't have to be Wine-specific (at least not in an explicit way). A few ideas would be:
- Make checks less strict. If checks for sysenter would be removed or changed to allow some sort of regular call, it would allow us to implement problematic calls in Wine. That would be a trivial change in Chome, making it more likely that they would take it.
- Change code inspecting functions to try to match all known function bodies instead of expecting exactly one of them depending on Windows version. With this change, implementing one variant in Wine would solve the problem for us and we wouldn't need win8 and wow64 variants (although it wouldn't be enough for 64-bit).
- Recognize hotpatchable functions in Chrome as valid syscalls This should be easy to do on their side and we can easily make all Nt* calls in Wine hotpatchable. Sadly, we don't have support for this on Mac (and clang in general) and 64-bit yet, but that would mostly increase importance of an already existing problem.
All such solutions require changes in both Chrome and Wine, which makes it tricky to decide on a solution. What's your opinion?
Thanks, Jacek
Am 19.01.2016 um 16:18 schrieb Jacek Caban:
Sadly, we don't have support for this on Mac (and clang in general) and 64-bit yet, but that would mostly increase importance of an already existing problem.
We have hotpatching support in the official WineHQ OS X builds. I implemented the support for the ms_hook_prologue attribute in clang. Take a look at our packaging repository (https://github.com/wine-compholio/wine-packaging/tree/master/macosx/clang-na...) for the patches.
On 01/19/16 16:28, Michael Müller wrote:
Am 19.01.2016 um 16:18 schrieb Jacek Caban:
Sadly, we don't have support for this on Mac (and clang in general) and 64-bit yet, but that would mostly increase importance of an already existing problem.
We have hotpatching support in the official WineHQ OS X builds. I implemented the support for the ms_hook_prologue attribute in clang. Take a look at our packaging repository (https://github.com/wine-compholio/wine-packaging/tree/master/macosx/clang-na...) for the patches.
That's great, good to hear that. How is upstreaming going? Is clang willing to take those patches?
Thanks, Jacek
Jacek Caban jacek@codeweavers.com writes:
All such solutions require changes in both Chrome and Wine, which makes it tricky to decide on a solution. What's your opinion?
My concern is not just about hooking the system calls, it's what happens once they are hooked. Sebastian said that hooks inside the Ldr* functions don't work right. I'm worried that supporting this properly would require that our internal code paths follow the Windows ones closely, which is neither feasible nor desirable.
This could of course also be changed in Chrome so that it could deal with the way Wine's ntdll is implemented, but that would probably be more intrusive.
On 01/19/16 17:06, Alexandre Julliard wrote:
Jacek Caban jacek@codeweavers.com writes:
All such solutions require changes in both Chrome and Wine, which makes it tricky to decide on a solution. What's your opinion?
My concern is not just about hooking the system calls, it's what happens once they are hooked. Sebastian said that hooks inside the Ldr* functions don't work right. I'm worried that supporting this properly would require that our internal code paths follow the Windows ones closely, which is neither feasible nor desirable.
This could of course also be changed in Chrome so that it could deal with the way Wine's ntdll is implemented, but that would probably be more intrusive.
It's hard to discuss what Sebastian reports without understanding the problem. For what I know, his variant of patches exposes way more hookable calls than it should. It's not a secret that on Windows Nt* functions will not call any other exported functions. This is an implementation detail that I don't consider too internal to follow. The fact that those hooks may be called recursively with Sebastian's patches makes me think that it's the most likely reason of the problems.
For other internal behaviour, we already have cases where they do matter. My recent example is Office 2013 online installer (which has sandbox-like code that among others merges some special dirs, sometimes containing DLLs, into file system). It's obvious that we need NtOpenFile-alike in loader to open a file in loader and all changes that it required was to pass SYNCHRONIZE flag. For all I know native may use different APIs there, but all is fine using this call as long as we pass an argument that is sane and makes sense anyway.
That said, I fully agree that we don't want to follow Windows too closely, but I don't think we're at this point yet. Also, my variant of the patch (others could be reworked to follow that as well) exposes even less internal behaviour than existing code, because we need to explicitly mark parts that would go through exported thunks.
Here is my proposal for fixing the situation:
Short term we use one of variants of patches for syscall thunks. This will give an immediate fix for at lest some (most common?) Wine configs.
Long term (as in we do it now, but we will benefit way later due to latency of Chrome release and the time needed for embedders to pick it), we do changes in Chrome. My thinking is that we could make runtime matching of all known thunks in Chrome instead of allowing only one variant in given config (this would fix Wine with all Windows versions set as well as wow64). For 64-bit we could make the check less strict (allowing jump in place of mov eax,52h) or invent other, Wine-specific, solution.
Thanks, Jacek
On 21.01.2016 15:43, Jacek Caban wrote:
On 01/19/16 17:06, Alexandre Julliard wrote:
Jacek Caban jacek@codeweavers.com writes:
All such solutions require changes in both Chrome and Wine, which makes it tricky to decide on a solution. What's your opinion?
My concern is not just about hooking the system calls, it's what happens once they are hooked. Sebastian said that hooks inside the Ldr* functions don't work right. I'm worried that supporting this properly would require that our internal code paths follow the Windows ones closely, which is neither feasible nor desirable.
This could of course also be changed in Chrome so that it could deal with the way Wine's ntdll is implemented, but that would probably be more intrusive.
It's hard to discuss what Sebastian reports without understanding the problem. For what I know, his variant of patches exposes way more hookable calls than it should. It's not a secret that on Windows Nt* functions will not call any other exported functions. This is an implementation detail that I don't consider too internal to follow. The fact that those hooks may be called recursively with Sebastian's patches makes me think that it's the most likely reason of the problems.
Could you please point out where you think my patchset doesn't match Windows behavior / leads to recursive calls which should not occur? Or, have you already tried improving your patchset, to implement forwarding of Ldr* and Rtl* functions through the thunks, like it is supposed to work on Windows? I would guess you encounter exactly the same problem.
For other internal behaviour, we already have cases where they do matter. My recent example is Office 2013 online installer (which has sandbox-like code that among others merges some special dirs, sometimes containing DLLs, into file system). It's obvious that we need NtOpenFile-alike in loader to open a file in loader and all changes that it required was to pass SYNCHRONIZE flag. For all I know native may use different APIs there, but all is fine using this call as long as we pass an argument that is sane and makes sense anyway.
That said, I fully agree that we don't want to follow Windows too closely, but I don't think we're at this point yet. Also, my variant of the patch (others could be reworked to follow that as well) exposes even less internal behaviour than existing code, because we need to explicitly mark parts that would go through exported thunks.
Here is my proposal for fixing the situation:
Short term we use one of variants of patches for syscall thunks. This will give an immediate fix for at lest some (most common?) Wine configs.
If the issues with Ldr* functions can be solved I'm fine with that. However, I wouldn't necessarily recommend the JIT approach if we have only one supported config. Lets assume we would use winebuild statically generated thunks (like my patch based on Erichs work), the additional logic for JIT could be implemented at any time later.
Long term (as in we do it now, but we will benefit way later due to latency of Chrome release and the time needed for embedders to pick it), we do changes in Chrome. My thinking is that we could make runtime matching of all known thunks in Chrome instead of allowing only one variant in given config (this would fix Wine with all Windows versions set as well as wow64). For 64-bit we could make the check less strict (allowing jump in place of mov eax,52h) or invent other, Wine-specific, solution.
Thanks, Jacek
On 01/21/16 16:04, Sebastian Lackner wrote:
On 21.01.2016 15:43, Jacek Caban wrote:
On 01/19/16 17:06, Alexandre Julliard wrote:
Jacek Caban jacek@codeweavers.com writes:
All such solutions require changes in both Chrome and Wine, which makes it tricky to decide on a solution. What's your opinion?
My concern is not just about hooking the system calls, it's what happens once they are hooked. Sebastian said that hooks inside the Ldr* functions don't work right. I'm worried that supporting this properly would require that our internal code paths follow the Windows ones closely, which is neither feasible nor desirable.
This could of course also be changed in Chrome so that it could deal with the way Wine's ntdll is implemented, but that would probably be more intrusive.
It's hard to discuss what Sebastian reports without understanding the problem. For what I know, his variant of patches exposes way more hookable calls than it should. It's not a secret that on Windows Nt* functions will not call any other exported functions. This is an implementation detail that I don't consider too internal to follow. The fact that those hooks may be called recursively with Sebastian's patches makes me think that it's the most likely reason of the problems.
Could you please point out where you think my patchset doesn't match Windows behavior / leads to recursive calls which should not occur?
I looked deeper in your patchset and the one occur that I thought was there was my mistake.
Or, have you already tried improving your patchset, to implement forwarding of Ldr* and Rtl* functions through the thunks, like it is supposed to work on Windows? I would guess you encounter exactly the same problem.
Yes, I tried that. See the attached patch (note that SYSCALL macro has an opposite meaning comparing to your patches) on top of my patchset. I couldn't reproduce the problem with Steam. It still works for me 100% of time.
I tried to compare it with your version, but, as I wrote you, it didn't work for me at all. I assume I'm missing something.
For other internal behaviour, we already have cases where they do matter. My recent example is Office 2013 online installer (which has sandbox-like code that among others merges some special dirs, sometimes containing DLLs, into file system). It's obvious that we need NtOpenFile-alike in loader to open a file in loader and all changes that it required was to pass SYNCHRONIZE flag. For all I know native may use different APIs there, but all is fine using this call as long as we pass an argument that is sane and makes sense anyway.
That said, I fully agree that we don't want to follow Windows too closely, but I don't think we're at this point yet. Also, my variant of the patch (others could be reworked to follow that as well) exposes even less internal behaviour than existing code, because we need to explicitly mark parts that would go through exported thunks.
Here is my proposal for fixing the situation:
Short term we use one of variants of patches for syscall thunks. This will give an immediate fix for at lest some (most common?) Wine configs.
If the issues with Ldr* functions can be solved I'm fine with that. However, I wouldn't necessarily recommend the JIT approach if we have only one supported config. Lets assume we would use winebuild statically generated thunks (like my patch based on Erichs work), the additional logic for JIT could be implemented at any time later.
Sure, in that case we won't need the flexibility of JIT, so static thunks should be fine. I still think we could avoid text relocations, but that's a different topic.
Thanks, Jacek
On 01/21/16 16:58, Jacek Caban wrote:
On 01/21/16 16:04, Sebastian Lackner wrote:
On 21.01.2016 15:43, Jacek Caban wrote:
On 01/19/16 17:06, Alexandre Julliard wrote:
Jacek Caban jacek@codeweavers.com writes:
All such solutions require changes in both Chrome and Wine, which makes it tricky to decide on a solution. What's your opinion?
My concern is not just about hooking the system calls, it's what happens once they are hooked. Sebastian said that hooks inside the Ldr* functions don't work right. I'm worried that supporting this properly would require that our internal code paths follow the Windows ones closely, which is neither feasible nor desirable.
This could of course also be changed in Chrome so that it could deal with the way Wine's ntdll is implemented, but that would probably be more intrusive.
It's hard to discuss what Sebastian reports without understanding the problem. For what I know, his variant of patches exposes way more hookable calls than it should. It's not a secret that on Windows Nt* functions will not call any other exported functions. This is an implementation detail that I don't consider too internal to follow. The fact that those hooks may be called recursively with Sebastian's patches makes me think that it's the most likely reason of the problems.
Could you please point out where you think my patchset doesn't match Windows behavior / leads to recursive calls which should not occur?
I looked deeper in your patchset and the one occur that I thought was there was my mistake.
Actually no, it was a valid example. Here is a stack that has two syscalls:
NtOpenFile -> FILE_CreateFile -> nt_to_unix_file_name_attr -> RtlAllocateHeap -> HEAP_FindFreeBlock -> HEAP_CreateSubHeap -> NtAllocateVirtualMemory
Cheers, Jacek
On 21.01.2016 17:22, Jacek Caban wrote:
On 01/21/16 16:58, Jacek Caban wrote:
On 01/21/16 16:04, Sebastian Lackner wrote:
On 21.01.2016 15:43, Jacek Caban wrote:
On 01/19/16 17:06, Alexandre Julliard wrote:
Jacek Caban jacek@codeweavers.com writes:
All such solutions require changes in both Chrome and Wine, which makes it tricky to decide on a solution. What's your opinion?
My concern is not just about hooking the system calls, it's what happens once they are hooked. Sebastian said that hooks inside the Ldr* functions don't work right. I'm worried that supporting this properly would require that our internal code paths follow the Windows ones closely, which is neither feasible nor desirable.
This could of course also be changed in Chrome so that it could deal with the way Wine's ntdll is implemented, but that would probably be more intrusive.
It's hard to discuss what Sebastian reports without understanding the problem. For what I know, his variant of patches exposes way more hookable calls than it should. It's not a secret that on Windows Nt* functions will not call any other exported functions. This is an implementation detail that I don't consider too internal to follow. The fact that those hooks may be called recursively with Sebastian's patches makes me think that it's the most likely reason of the problems.
Could you please point out where you think my patchset doesn't match Windows behavior / leads to recursive calls which should not occur?
I looked deeper in your patchset and the one occur that I thought was there was my mistake.
Actually no, it was a valid example. Here is a stack that has two syscalls:
NtOpenFile -> FILE_CreateFile -> nt_to_unix_file_name_attr -> RtlAllocateHeap -> HEAP_FindFreeBlock -> HEAP_CreateSubHeap -> NtAllocateVirtualMemory
Cheers, Jacek
In this case it shouldn't matter (as far as I know NtOpenFile isn't intercepted by the Chromium Sandbox), however when thinking more carefully about it, heap functions could indeed be problematic. What we theoretically need is two sets of them, user mode calls are supposed to go through NtAllocateVirtualMemory, but kernel mode calls not. I'll do some more testing myself, so far I haven't found out which thunks exactly introduce the randomness in the Chromium sandboxing code.
Could we maybe generate more complex thunks which automatically keep track of nested Nt* calls (for example, by counting somewhere in the TEB structure)? Its pretty complex to get this right without any separation in ntdll or help of the compiler.
Best regards, Sebastian
On Jan 21, 2016 9:41 AM, "Sebastian Lackner" sebastian@fds-team.de wrote:
... In this case it shouldn't matter (as far as I know NtOpenFile isn't
intercepted
by the Chromium Sandbox), however when thinking more carefully about it,
heap
functions could indeed be problematic. What we theoretically need is two
sets of
them, user mode calls are supposed to go through NtAllocateVirtualMemory,
but
kernel mode calls not. I'll do some more testing myself, so far I haven't
found out
which thunks exactly introduce the randomness in the Chromium sandboxing
code.
...
Maybe this is being overly simplistic, but we do have both Zw* and Nt* entry points. It could make sense to use Zw* internally and route all the external calls through the thunks (Nt*).
Best, Erich
On 22.01.2016 03:13, Erich E. Hoover wrote:
On Jan 21, 2016 9:41 AM, "Sebastian Lackner" sebastian@fds-team.de wrote:
... In this case it shouldn't matter (as far as I know NtOpenFile isn't
intercepted
by the Chromium Sandbox), however when thinking more carefully about it,
heap
functions could indeed be problematic. What we theoretically need is two
sets of
them, user mode calls are supposed to go through NtAllocateVirtualMemory,
but
kernel mode calls not. I'll do some more testing myself, so far I haven't
found out
which thunks exactly introduce the randomness in the Chromium sandboxing
code.
...
Maybe this is being overly simplistic, but we do have both Zw* and Nt* entry points. It could make sense to use Zw* internally and route all the external calls through the thunks (Nt*).
Best, Erich
I already thought about that, but on Windows both Nt* and Zw* exports point to the same function. Sooner or later we might find an application which relies on this. Of course we could still use Zw* names internally only, but like any other solution, its a huge amount of effort to check all calls manually. Especially things like heap management functions are used both in "user mode" and "kernel mode".
On 01/19/16 13:03, Sebastian Lackner wrote:
- Last but not least, one of the reasons why I haven't submitted my own approach so far: The whole idea only works reliable as long as we do NOT forward Ldr* functions to syscall thunks. If we do, like it should work on Windows, and like it is most likely required by other apps, there are race conditions all over the place and Steam webstore only works with a probability of about 50% or less. I haven't figured out what causes it yet.
How can I reproduce it? I tried your patches with Steam, but they didn't work. My patches that use syscall thunks only for NtOpenFile (and that's needed for Office) work 100% time. This means that we have a version that calls too many and another with too few calls through those thunks, so it should be easy to bisect, which exact calls cause problems. I used your ntdll-Syscall_Wrappers patches from staging tree on top of clean Wine. Am I missing something?
(BTW, my first guess without any debugging would be that with your patches, we call memory management functions inside Nt* calls, which definitely shouldn't go through thunks).
Thanks, Jacek