https://bugs.winehq.org/show_bug.cgi?id=49852
Bug ID: 49852 Summary: Performance regression in "msvcrt: Use correct code page in _write when outputing to console." Product: Wine Version: unspecified Hardware: x86-64 OS: Linux Status: UNCONFIRMED Severity: normal Priority: P2 Component: msvcrt Assignee: wine-bugs@winehq.org Reporter: martin@martin.st Distribution: ---
Created attachment 68190 --> https://bugs.winehq.org/attachment.cgi?id=68190 Test sample
Since 081d25d4b989c732d301b786311c034132c8d7a3, "msvcrt: Use correct code page in _write when outputing to console.", a test app (that prints a lot to stdout) had an around 10-20x slowdown.
Before, "time wine64 ihateeh-x86_64.exe > /dev/null 2>&1" ran in around 1 second, after that commit it takes 15 seconds. (At some later point, it has regressed further to 25 seconds. The corresponding testcase on aarch64 used to run in 3 seconds, after the commit it takes 60, and with current master takes around 120 seconds.)
https://bugs.winehq.org/show_bug.cgi?id=49852
Martin Storsjö martin@martin.st changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |piotr@codeweavers.com
https://bugs.winehq.org/show_bug.cgi?id=49852
Austin English austinenglish@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Regression SHA1| |081d25d4b989c732d301b786311 | |c034132c8d7a3 Version|unspecified |5.14 Keywords| |download, performance, | |regression
https://bugs.winehq.org/show_bug.cgi?id=49852
Martin Storsjö martin@martin.st changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |jacek@codeweavers.com
--- Comment #1 from Martin Storsjö martin@martin.st --- Further bisecting shows that the runtime doubled further in ab6056d6774310162ef268ca6af5e3ad2e556a4e, "kernel32: Use IOCTL_CONDRV_WRITE_CONSOLE in WriteConsoleW when possible.".
I.e., originally the attached test app ran in around 1 second, in 081d25d4b9 (msvcrt: Use correct code page) it regressed to 13 seconds (in my recent tests) and further on to 24 seconds in ab6056d677 (kernel32: Use IOCTL_CONDRV_WRITE_CONSOLE).
Not sure what to say about these changes vs performance, as I presume the changes are done for correctness sake - but it's a bit surprising to wait for that long, for a test app that earlier ran in a second or two.
https://bugs.winehq.org/show_bug.cgi?id=49852
--- Comment #2 from Jacek Caban jacek@codeweavers.com --- Looking at kernel32 side of it, I can see a lot of one character writes. That's from fputc() msvcrt calls. Maybe it should be buffered on msvcrt side, I didn't look at that.
Current console code implements bare console support in kernel32 itself and it essentially just writes directly to host std output fd. It's fast, but it also has a number of limitations. Originally motivated by migration to PE files, I'm currently working on changing it to be more similar to Windows, where those calls are forwarded to a separated process that manages the console, processes write and forwards to host console, when appropriate. With the additional IPC, some performance drop is unfortunately not avoidable.
For me, your test runs with current time in 15s on Linux with current git.
With some patches that I plan to send soon that will change bare console to use mentioned ioctl, it runs in 42s.
I hacked ioctl handling a bit so that client doesn't wait for conhost to complete (so we can't propagate error from conhost). With the hack it runs in 9s.
Avoiding using tty code path reduces that time to 2.5s (something like wine ihateeh-x86_64.exe |cat)
The application takes about 14s to run on Windows for me, so our performance is not really that bad in comparison.
https://bugs.winehq.org/show_bug.cgi?id=49852
--- Comment #3 from Martin Storsjö martin@martin.st --- Thanks for having a look!
(In reply to Jacek Caban from comment #2)
Avoiding using tty code path reduces that time to 2.5s (something like wine ihateeh-x86_64.exe |cat)
Oh, that's interesting. So pipeing stdout in that case makes it a non-tty, avoiding the slow code, but redirecting stdout to /dev/null still considers it a tty?
The application takes about 14s to run on Windows for me, so our performance is not really that bad in comparison.
Fair enough - I guess the testcase is rather extreme in the amount of output it produces. (The only sensible way to run it is by piping the output to /dev/null anyway.)
I guess one way to avoid it altogether would to have a look if I could cut some of the amount of IO done in the testcase, while keeping the functionality...
https://bugs.winehq.org/show_bug.cgi?id=49852
--- Comment #4 from Jacek Caban jacek@codeweavers.com --- (In reply to Martin Storsjö from comment #3)
(In reply to Jacek Caban from comment #2)
Avoiding using tty code path reduces that time to 2.5s (something like wine ihateeh-x86_64.exe |cat)
Oh, that's interesting. So pipeing stdout in that case makes it a non-tty, avoiding the slow code, but redirecting stdout to /dev/null still considers it a tty?
Yeah, it's msvcrt that considers /dev/null to be a tty because we report FILE_TYPE_CHAR type. We should probably find a better solution.
https://bugs.winehq.org/show_bug.cgi?id=49852
Piotr Caban piotr.caban@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |piotr.caban@gmail.com
--- Comment #5 from Piotr Caban piotr.caban@gmail.com --- I've sent a patch that fixes console detection code in _write. It uses VerifyConsoleIoHandle function that is slow. It speeds up the execution when output is redirected to /dev/null and slows down when output goes to console. In order to improve execution time VerifyConsoleIoHandle will need to be improved.
Old code was faster but incorrect. I'm not sure if we should consider it as regression taking into account we're on par with native speed.
Another workaround you can use: wine64 ihateeh-x86_64.exe |cat &>/dev/null
https://bugs.winehq.org/show_bug.cgi?id=49852
--- Comment #6 from Rafał Mużyło galtgendo@o2.pl --- wrt. those differences between redirecting to /dev/null and a pipe, I'd like to mention again (previously mentioned either in chat or a bug I no longer recall) a tangentially related problem:
I ran into a Unity game that crashed due to address space exhaustion. While it was in the process of crashing, it was producing error messages containing mojibake. Yet if it was redirected via tee, those errors strings were correctly printed. So, this might be another part that needs checking.
https://bugs.winehq.org/show_bug.cgi?id=49852
--- Comment #7 from Martin Storsjö martin@martin.st --- Thanks for the fixes and workarounds! The msvcrt fix lowers the runtime on aarch64 from 120 seconds to 55.
The workaround for piping to cat does work nicely as well, but one practical concern there is that it makes harder to check for the exit code of the wine process (unless using more elaborate constructs with bash and $PIPESTATUS).
In any case, it turned out to be trivial to strip out unnecessary printouts from the testcase, reducing the number of lines printed from 77k to around 2k, making it run in a sensible amount of time.
Thanks for your time and efforts!
https://bugs.winehq.org/show_bug.cgi?id=49852
Fabian Maurer dark.shadow4@web.de changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |dark.shadow4@web.de
https://bugs.winehq.org/show_bug.cgi?id=49852
Alexandre Julliard julliard@winehq.org changed:
What |Removed |Added ---------------------------------------------------------------------------- Resolution|--- |FIXED Status|UNCONFIRMED |RESOLVED
--- Comment #8 from Alexandre Julliard julliard@winehq.org --- (In reply to Piotr Caban from comment #5)
Old code was faster but incorrect. I'm not sure if we should consider it as regression taking into account we're on par with native speed.
That's good enough then, marking fixed.
https://bugs.winehq.org/show_bug.cgi?id=49852
Alexandre Julliard julliard@winehq.org changed:
What |Removed |Added ---------------------------------------------------------------------------- Status|RESOLVED |CLOSED
--- Comment #9 from Alexandre Julliard julliard@winehq.org --- Closing bugs fixed in 9.0-rc5.