We pushed the flags but kept them set. Far Cry sets the NT flag, which causes a later iretd instruction to raise a GP fault exception.
This fixes a regression from e341d1f695311725752c287057f6c6ab60fdf2a3.
Wine-Bug: https://bugs.winehq.org/show_bug.cgi?id=50793
Signed-off-by: Rémi Bernon <rbernon@codeweavers.com>
---
 tools/winebuild/import.c | 2 ++
 1 file changed, 2 insertions(+)
diff --git a/tools/winebuild/import.c b/tools/winebuild/import.c
index b745f16db42..ad773e2a0d9 100644
--- a/tools/winebuild/import.c
+++ b/tools/winebuild/import.c
@@ -1447,6 +1447,8 @@ static void output_syscall_dispatcher( int count, const char *variant )
     output( "\tmovl %%esi,-0x04(%%ebp)\n" );
     output_cfi( ".cfi_rel_offset %%esi,-0x04\n" );
     output( "\tpushfl\n" );
+    output( "\tpushl $0x202\n" );
+    output( "\tpopfl\n" );
     output( "\tmovw %%gs,-0x1a(%%ebp)\n" );
     output( "\tmovw %%fs,-0x1c(%%ebp)\n" );
     output( "\tmovw %%es,-0x1e(%%ebp)\n" );
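For reference, a breakdown of the 0x202 constant used above (the macro names are illustrative, not from the Wine tree):

#define EFLAGS_FIXED 0x0002  /* bit 1: reserved, always reads as 1 */
#define EFLAGS_IF    0x0200  /* bit 9: interrupt enable */
#define EFLAGS_NT    0x4000  /* bit 14: nested task, must not reach iretd set */

/* value loaded by the new pushl/popfl pair: IF kept, NT (and DF/TF/AC/...) cleared */
#define SYSCALL_EFLAGS (EFLAGS_FIXED | EFLAGS_IF)  /* == 0x202 */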
Hi Rémi,
On 6/1/21 10:43 AM, Rémi Bernon wrote:
We pushed the flags but kept them set. Far Cry sets the NT flag, which causes a later iretd instruction to raise a GP fault exception.
This fixes a regression from e341d1f695311725752c287057f6c6ab60fdf2a3.
iret is responsible for a fair chunk of the syscall dispatcher overhead. I plan to submit patches optimizing the syscall dispatcher to not do an iret in the usual code path and to use signal_restore_full_cpu_context only if the context was modified during the syscall. I don't have a 32-bit version implemented yet, but the idea would be the same. I think those optimizations should fix the regression without adjusting the flags on each syscall. signal_restore_full_cpu_context would still have the same problem, but maybe then we could do the flags adjustment only there? Or are there other reasons to do the adjustment (in which case I would need to take that into account in my patches as well)?
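Roughly, the shape I have in mind is something like this (a sketch of the idea only, not actual Wine code; every name here except signal_restore_full_cpu_context is a placeholder, and even that call is simplified):

/* Sketch only: skip iret on the common path, fall back to a full
 * context restore when the syscall touched the context. */
extern unsigned int call_syscall_func( unsigned int id, void **args ); /* placeholder */
extern int  syscall_context_modified( void );                          /* placeholder */
extern void signal_restore_full_cpu_context( void );                   /* simplified */

static unsigned int dispatch_syscall( unsigned int id, void **args )
{
    unsigned int status = call_syscall_func( id, args );

    if (syscall_context_modified())           /* e.g. the syscall changed the CONTEXT */
        signal_restore_full_cpu_context();    /* slow path: full restore, ends in iret */

    return status;                            /* fast path: plain ret, no iret at all */
}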
Thanks,
Jacek
On 6/2/21 5:35 PM, Jacek Caban wrote:
Hi Rémi,
On 6/1/21 10:43 AM, Rémi Bernon wrote:
We pushed the flags but kept them set. Far Cry sets the NT flag, which causes a later iretd instruction to raise a GP fault exception.
This fixes a regression from e341d1f695311725752c287057f6c6ab60fdf2a3.
iret is responsible for a fair chunk of the syscall dispatcher overhead. I plan to submit patches optimizing the syscall dispatcher to not do an iret in the usual code path and to use signal_restore_full_cpu_context only if the context was modified during the syscall. I don't have a 32-bit version implemented yet, but the idea would be the same. I think those optimizations should fix the regression without adjusting the flags on each syscall. signal_restore_full_cpu_context would still have the same problem, but maybe then we could do the flags adjustment only there? Or are there other reasons to do the adjustment (in which case I would need to take that into account in my patches as well)?
Thanks,
Jacek
I don't know for sure, but if some other code inside the syscall does an iretd with the NT flag still set, it may cause a crash again.
This was actually an issue a long time ago, as [1] suggests, and it has been fixed in the Linux kernel [2]. As the Linux kernel now clears the NT flag on syscall entry, it doesn't seem completely crazy to clear it on NT syscall entry too (although probably not useful if you intend to replace the iretd).
[1] https://bugs.winehq.org/show_bug.cgi?id=33275
[2] https://lkml.org/lkml/2014/10/27/825
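For illustration, the application doesn't need to do anything exotic to get into that state: any 32-bit code that round-trips EFLAGS through the stack can set NT from user mode. A hypothetical snippet (not taken from Far Cry, build with gcc -m32):

#include <stdio.h>

/* Hypothetical illustration: set the NT flag (EFLAGS bit 14) from user
 * mode by round-tripping EFLAGS through a register. */
static void set_nt_flag(void)
{
    unsigned int eflags;

    __asm__ volatile ("pushfl; popl %0" : "=r" (eflags));
    eflags |= 0x4000;                                     /* bit 14: NT */
    __asm__ volatile ("pushl %0; popfl" : : "r" (eflags) : "cc");
}

int main(void)
{
    set_nt_flag();
    /* With NT left set, the dispatcher's iretd would previously raise
     * the GP fault described above. */
    printf("NT flag set in user mode\n");
    return 0;
}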
On 6/2/21 5:45 PM, Rémi Bernon wrote:
On 6/2/21 5:35 PM, Jacek Caban wrote:
Hi Rémi,
On 6/1/21 10:43 AM, Rémi Bernon wrote:
We pushed the flags but kept them set. Far Cry sets the NT flag, which causes a later iretd instruction to raise a GP fault exception.
This fixes a regression from e341d1f695311725752c287057f6c6ab60fdf2a3.
iret is responsible for a fair chunk of the syscall dispatcher overhead. I plan to submit patches optimizing the syscall dispatcher to not do an iret in the usual code path and to use signal_restore_full_cpu_context only if the context was modified during the syscall. I don't have a 32-bit version implemented yet, but the idea would be the same. I think those optimizations should fix the regression without adjusting the flags on each syscall. signal_restore_full_cpu_context would still have the same problem, but maybe then we could do the flags adjustment only there? Or are there other reasons to do the adjustment (in which case I would need to take that into account in my patches as well)?
Thanks,
Jacek
I don't know for sure, but if some other code inside the syscall does an iretd with the NT flag still set, it may cause a crash again.
This was actually an issue a long time ago, as [1] suggests, and it has been fixed in the Linux kernel [2]. As the Linux kernel now clears the NT flag on syscall entry, it doesn't seem completely crazy to clear it on NT syscall entry too (although probably not useful if you intend to replace the iretd).
[1] https://bugs.winehq.org/show_bug.cgi?id=33275
[2] https://lkml.org/lkml/2014/10/27/825
Also FWIW, on my desktop (with a simple NtQPC benchmark, with perf sampling) I don't see any overhead caused by iret; most of the syscall overhead comes from xsavec64 / xrstor64. It may be hardware dependent.
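For what it's worth, the benchmark was roughly of this shape (a sketch, not the exact code I ran):

/* Rough sketch: hammer NtQueryPerformanceCounter in a loop, report
 * calls per second, and look at the run with perf to see where the
 * syscall time goes. */
#include <stdio.h>
#include <windows.h>

typedef LONG (WINAPI *NtQueryPerformanceCounter_t)( LARGE_INTEGER *counter, LARGE_INTEGER *frequency );

int main(void)
{
    NtQueryPerformanceCounter_t pNtQPC = (NtQueryPerformanceCounter_t)
        GetProcAddress( GetModuleHandleA( "ntdll.dll" ), "NtQueryPerformanceCounter" );
    LARGE_INTEGER counter, freq, start, end;
    unsigned int i, iterations = 10000000;

    QueryPerformanceFrequency( &freq );
    QueryPerformanceCounter( &start );
    for (i = 0; i < iterations; i++) pNtQPC( &counter, NULL );
    QueryPerformanceCounter( &end );

    printf( "%.0f syscalls/s\n",
            iterations * (double)freq.QuadPart / (double)(end.QuadPart - start.QuadPart) );
    return 0;
}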
On 6/2/21 10:00 PM, Rémi Bernon wrote:
On 6/2/21 5:45 PM, Rémi Bernon wrote:
On 6/2/21 5:35 PM, Jacek Caban wrote:
Hi Rémi,
On 6/1/21 10:43 AM, Rémi Bernon wrote:
We pushed the flags but kept them set. Far Cry sets the NT flag, which causes a later iretd instruction to raise a GP fault exception.
This fixes a regression from e341d1f695311725752c287057f6c6ab60fdf2a3.
iret is responsible for a fair chunk of the syscall dispatcher overhead. I plan to submit patches optimizing the syscall dispatcher to not do an iret in the usual code path and to use signal_restore_full_cpu_context only if the context was modified during the syscall. I don't have a 32-bit version implemented yet, but the idea would be the same. I think those optimizations should fix the regression without adjusting the flags on each syscall. signal_restore_full_cpu_context would still have the same problem, but maybe then we could do the flags adjustment only there? Or are there other reasons to do the adjustment (in which case I would need to take that into account in my patches as well)?
Thanks,
Jacek
I don't know for sure, but if some other code inside the syscall does an iretd with the NT flag still set, it may cause a crash again.
This was actually an issue a long time ago, as [1] suggests, and it has been fixed in the Linux kernel [2]. As the Linux kernel now clears the NT flag on syscall entry, it doesn't seem completely crazy to clear it on NT syscall entry too (although probably not useful if you intend to replace the iretd).
[1] https://bugs.winehq.org/show_bug.cgi?id=33275
[2] https://lkml.org/lkml/2014/10/27/825
Also FWIW, on my desktop (with a simple NtQPC benchmark, with perf sampling) I don't see any overhead caused by iret; most of the syscall overhead comes from xsavec64 / xrstor64. It may be hardware dependent.
Never mind, I can see it (although the FPU state save/restore is still much bigger).