On 22.12.2015 12:22, Paul Gofman wrote:
Fixes loader crash when compiled using Intel compiler (icc 16.0.1 20151021).
Signed-off-by: Paul Gofman gofmanp@gmail.com
loader/preloader.c | 26 +++++++++++++++++--------- 1 file changed, 17 insertions(+), 9 deletions(-)
Its not my decision, but I don't think its a good idea to add all the (unnecessary) ifs just to make this code compatible with ICC. I assume the compiler doesn't handle inline assembly correctly, but this would mean that there are several more affected places. It would be nice to have a more general solution to fix/workaround this issue. If this is not possible, I would at least suggest to add a workaround to the functions directly, instead of adding checks to all places where they are called.
On 12/22/2015 05:33 PM, Sebastian Lackner wrote:
Its not my decision, but I don't think its a good idea to add all the (unnecessary) ifs just to make this code compatible with ICC. I assume the compiler doesn't handle inline assembly correctly, but this would mean that there are several more affected places. It would be nice to have a more general solution to fix/workaround this issue. If this is not possible, I would at least suggest to add a workaround to the functions directly, instead of adding checks to all places where they are called.
I think the cause of problem is is ICC aggressive optimization which is sometimes buggy under specific conditions. In this case ICC seems to ignore the side effect of asm syscall. I think it does not mean that it handles all of inline assembly incorrectly. This problem is hard to reproduce: I can't get the same problem using the same wld_mmap and wld_mprotect functions in a small separate test case, so this likely happens under some specific conditions only. Yes, surely there might be other places like (or not quite like) that which will misbehave the same way which I did not come across yet. But I do not know any general way to track and fix them all at once. Maybe except for disabling (some) optimization which actually makes use of ICC non sensible. I was hoping that adding a few checks like that won't hurt the code (I understand though that the checks may be paranoid or redundant apart from ICC issue). If this is not the case then maybe some other solution/workaround for this ICC issue can be found. Do you think it is better to move return value check directly into wld_mmap and wld_mprotect instead of checking their status on call? I thought that checking return status rather than building in such an effect into system function wrapper is a common approach. Or maybe I could add wrapper inline functions with error checking?
On 22.12.2015 16:31, Paul Gofman wrote:
On 12/22/2015 05:33 PM, Sebastian Lackner wrote:
Its not my decision, but I don't think its a good idea to add all the (unnecessary) ifs just to make this code compatible with ICC. I assume the compiler doesn't handle inline assembly correctly, but this would mean that there are several more affected places. It would be nice to have a more general solution to fix/workaround this issue. If this is not possible, I would at least suggest to add a workaround to the functions directly, instead of adding checks to all places where they are called.
I think the cause of problem is is ICC aggressive optimization which
is sometimes buggy under specific conditions. In this case ICC seems to ignore the side effect of asm syscall. I think it does not mean that it handles all of inline assembly incorrectly. This problem is hard to reproduce: I can't get the same problem using the same wld_mmap and wld_mprotect functions in a small separate test case, so this likely happens under some specific conditions only. Yes, surely there might be other places like (or not quite like) that which will misbehave the same way which I did not come across yet. But I do not know any general way to track and fix them all at once. Maybe except for disabling (some) optimization which actually makes use of ICC non sensible.
Trying different optimization flags is one possibility. Alternatively, you could also try to remove static/inline or adding __volatile__ to the assembler code.
I was hoping that adding a few checks like that won't hurt the code
(I understand though that the checks may be paranoid or redundant apart from ICC issue). If this is not the case then maybe some other solution/workaround for this ICC issue can be found. Do you think it is better to move return value check directly into wld_mmap and wld_mprotect instead of checking their status on call? I thought that checking return status rather than building in such an effect into system function wrapper is a common approach. Or maybe I could add wrapper inline functions with error checking?
There are probably more ways to workaround the issue besides if-checks. If possible, something else (which keeps the return value) would definitely be preferred. A couple of ideas which might help (not sure if they work):
- Replace SYSCALL_RET() macro with an (inline?) function. - Replace inline assembly with a global ASM function (without C wrapper around). See for example how its done for _start, or for the x86_64 functions. If you need help with the assembler code, feel free to ask. - Pass address of ret to the assembler code, and let assembler code write the result. The compiler is probably no longer able to optimize away the function content then.
On 12/22/2015 06:48 PM, Sebastian Lackner wrote:
Trying different optimization flags is one possibility. Alternatively, you could also try to remove static/inline or adding __volatile__ to the assembler code.
__volatile__ is already there. There is also a "memory" modifier in wld_mmap, and it is missing in wld_mprotect (docs suggest it is the direct hint for the compiler not to optimize out asm code). But unfortunately adding it to wld_mprotect does not solve the issue. I will play more with it and with the other options you suggest.
Paul Gofman gofmanp@gmail.com writes:
Do you think it is better to move return value check directly into
wld_mmap and wld_mprotect instead of checking their status on call? I thought that checking return status rather than building in such an effect into system function wrapper is a common approach. Or maybe I could add wrapper inline functions with error checking?
It would be better to fix the compiler, or disable the broken optimization.
On 12/22/2015 06:59 PM, Alexandre Julliard wrote:
It would be better to fix the compiler, or disable the broken optimization.
Compiler is not open source, I really can do nothing with it, nor I see any practical use for me to deal with ICC compilation with major optimizations disabled. Could you please advise if there could be any use if I explore the issue a bit more and try to get a more accurate workaround or I'd better give up if it is of no interest anyway?
Paul Gofman gofmanp@gmail.com writes:
On 12/22/2015 06:59 PM, Alexandre Julliard wrote:
It would be better to fix the compiler, or disable the broken optimization.
Compiler is not open source, I really can do nothing with it, nor I see any practical use for me to deal with ICC compilation with major optimizations disabled. Could you please advise if there could be any use if I explore the issue a bit more and try to get a more accurate workaround or I'd better give up if it is of no interest anyway?
If there's really something missing in the Wine code, sure, it should be fixed. But if the code is correct, tweaking it to work around optimizer bugs is a waste of time.
On 12/22/2015 07:22 PM, Alexandre Julliard wrote:
If there's really something missing in the Wine code, sure, it should be fixed. But if the code is correct, tweaking it to work around optimizer bugs is a waste of time.
Thank you, I understand the approach. I will look into the issue more and come back with it if I find some straightforward way without tweaking (like universally "optimization safe" __asm__ or relevant compiler flag).
I figured out what is really happening there after reviewing assembly listings. The problem in compilation is triggered by the following (correct) logic in preloader.c/map_so_lib function: ... if( header->e_type == ET_DYN ) //line 806 { ... goto postmap; } ... while (c < &loadcmds[nloadcmds]) { ... postmap: ... ++c; } ... The problem seen in assembly listing is that ICC stores &loadcmds[nloadcmds] in the stack variable in the beginning of the loop, and this initialization is skipped along with the beginning of the loop on 'goto postmap'. My initial attempt to fix it looks like random tweak (the overall function register & vars allocation structure changed and the problem did not fire). Now I can workaround the problem in a definite way by using some pre-initialized variable instead of &loadcmds[nloadcmds] in loop comparison (this fixes the issue). So it is no incorrect code in Wine here (nor any optimization hints for this I could imagine). goto jumping inside the loop body is correct (though unusual), but it triggers the problem with ICC. I could think of changing this code one of two ways: 1. The simplest one (add a variable instead of &loadcmds[nloadcmds]), which is actually a tweak (while a better defined one than my initial attempt); 2. Redesign this logic and avoid jumping inside loop body (this is probably a better way)
Would a patch like this (presumably way 2) be relevant?
On 12/22/2015 07:22 PM, Alexandre Julliard wrote:
If there's really something missing in the Wine code, sure, it should be fixed. But if the code is correct, tweaking it to work around optimizer bugs is a waste of time.