https://bugs.winehq.org/show_bug.cgi?id=49663
Bug ID: 49663 Summary: Performance regression in TrackMania Nations Forever Product: Wine Version: 5.14 Hardware: x86-64 OS: Linux Status: UNCONFIRMED Severity: normal Priority: P2 Component: msvcrt Assignee: wine-bugs@winehq.org Reporter: phatboi1337@gmail.com Regression SHA1: f02193738c88bea8fefb3f8d2e79c5e9941f6b5a Distribution: ArchLinux
Hi there!
As of f02193738c88bea8fefb3f8d2e79c5e9941f6b5a, performance in TrackMania Nations Forever has dropped from ~250 fps to ~72 fps on the A01-Race map.
The issue only seems to be present when DXVK is being used. WineD3D performance seems to be unchanged, running at ~120 fps.
The game is available for free, either from Steam or the official website (http://trackmaniaforever.com/nations/).
https://bugs.winehq.org/show_bug.cgi?id=49663
Gijs Vermeulen gijsvrm@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |piotr.caban@gmail.com Keywords| |regression
https://bugs.winehq.org/show_bug.cgi?id=49663
Fabian Maurer dark.shadow4@web.de changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |dark.shadow4@web.de
--- Comment #1 from Fabian Maurer dark.shadow4@web.de --- I wouldn't be too surprised, since the standard library memmove is very optimized (uses AVX), and the new implementation is not.
https://bugs.winehq.org/show_bug.cgi?id=49663
Austin English austinenglish@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- URL| |http://trackmaniaforever.co | |m/nations/ Keywords| |download, performance
https://bugs.winehq.org/show_bug.cgi?id=49663
--- Comment #2 from Fabian Maurer dark.shadow4@web.de --- Created attachment 67932 --> https://bugs.winehq.org/attachment.cgi?id=67932 Patch for better memcpy/memmove
Could you please retest with the patch I attached? Make sure to compile with at least -O2.
https://bugs.winehq.org/show_bug.cgi?id=49663
Zebediah Figura z.figura12@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |z.figura12@gmail.com
--- Comment #3 from Zebediah Figura z.figura12@gmail.com --- That patch can't work as-is; we need memcpy to act like memmove.
https://bugs.winehq.org/show_bug.cgi?id=49663
--- Comment #4 from Fabian Maurer dark.shadow4@web.de --- (In reply to Zebediah Figura from comment #3)
That patch can't work as-is; we need memcpy to act like memmove.
Why is that?
https://bugs.winehq.org/show_bug.cgi?id=49663
--- Comment #5 from Fabian Maurer dark.shadow4@web.de --- Created attachment 67933 --> https://bugs.winehq.org/attachment.cgi?id=67933 New patch to improve performance
@OP Would you mind testing both patches? This one should be a bit slower, though I'd like to see your results.
https://bugs.winehq.org/show_bug.cgi?id=49663
--- Comment #6 from Zebediah Figura z.figura12@gmail.com --- (In reply to Fabian Maurer from comment #4)
(In reply to Zebediah Figura from comment #3)
That patch can't work as-is; we need memcpy to act like memmove.
Why is that?
Applications depend on it. See this comment from msvcrt:string:
https://source.winehq.org/git/wine.git/blob/HEAD:/dlls/msvcrt/tests/string.c...
https://bugs.winehq.org/show_bug.cgi?id=49663
--- Comment #7 from Fabian Maurer dark.shadow4@web.de --- (In reply to Zebediah Figura from comment #6)
Applications depend on it. See this comment from msvcrt:string:
Figured, classic rule violations again. Anyways, for now the simplified patch should be good enough, if not, then we could still extend memmove to call musl memcpy if no overlap is present.
https://bugs.winehq.org/show_bug.cgi?id=49663
--- Comment #8 from phatboi1337@gmail.com --- I've tested with both patches and both of them seem to perform the same, ~223 fps on the same spot I got my initial numbers from.
https://bugs.winehq.org/show_bug.cgi?id=49663
--- Comment #9 from Zebediah Figura z.figura12@gmail.com --- It might be worth testing memmove from glibc and other libraries as well, on the chance that they offer better performance.
https://bugs.winehq.org/show_bug.cgi?id=49663
--- Comment #10 from Fabian Maurer dark.shadow4@web.de --- The glibc implementation offers about up to 50% more speed, but it's all written in assembly to take advantage of AVX/SSE, combined with architecture checks to choose the best path at runtime. I didn't want to replicate all that into Wine, when musl is fast enough. What's your thought on that?
https://bugs.winehq.org/show_bug.cgi?id=49663
--- Comment #11 from Zebediah Figura z.figura12@gmail.com --- (In reply to Fabian Maurer from comment #10)
The glibc implementation offers about up to 50% more speed, but it's all written in assembly to take advantage of AVX/SSE, combined with architecture checks to choose the best path at runtime. I didn't want to replicate all that into Wine, when musl is fast enough. What's your thought on that?
While the reporter's numbers are not necessarily statistically significant on their own, generally 20 fps is an improvement worth having. Personally, I think when it comes to core CRT functions like memmove, it really is worth having as optimized an implementation as possible. I don't get the impression that platform-specific assembly would be much of a problem.
https://bugs.winehq.org/show_bug.cgi?id=49663
--- Comment #12 from Fabian Maurer dark.shadow4@web.de --- Agreed, I just wasn't sure whether this was accepted. Would we link against assembly files (like in glibc) or convert it to inline assembly? I could work on that, although I admit I don't have too much experience with SSE/AVX.
https://bugs.winehq.org/show_bug.cgi?id=49663
--- Comment #13 from Zebediah Figura z.figura12@gmail.com --- (In reply to Fabian Maurer from comment #12)
Agreed, I just wasn't sure whether this was accepted. Would we link against assembly files (like in glibc) or convert it to inline assembly? I could work on that, although I admit I don't have too much experience with SSE/AVX.
You'd probably want to ask Piotr (and maybe Alexandre). Given the amount of assembly code a separate file seems ideal to me, but it may require some changes to the build system.
https://bugs.winehq.org/show_bug.cgi?id=49663
Stefan Dösinger stefan@codeweavers.com changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |stefan@codeweavers.com
--- Comment #14 from Stefan Dösinger stefan@codeweavers.com ---
Anyways, for now the simplified patch should be good enough, if not, then we could still extend memmove to call musl memcpy if no overlap is present.
Note that it isn't as easy as comparing pointers + range. Various mmap shenanigans might be going on that make memory areas that look separate overlap.
https://bugs.winehq.org/show_bug.cgi?id=49663
--- Comment #15 from Fabian Maurer dark.shadow4@web.de ---
Note that it isn't as easy as comparing pointers + range. Various mmap shenanigans might be going on that make memory areas that look separate overlap.
The question is, does msvcrt handle this? musl for example, does a simple range check.
https://bugs.winehq.org/show_bug.cgi?id=49663
Anya maniikarabera@protonmail.ch changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |maniikarabera@protonmail.ch
https://bugs.winehq.org/show_bug.cgi?id=49663
Julian Rüger jr98@gmx.net changed:
What |Removed |Added ---------------------------------------------------------------------------- CC| |jr98@gmx.net
https://bugs.winehq.org/show_bug.cgi?id=49663
--- Comment #16 from Piotr Caban piotr.caban@gmail.com --- It should be fixed by 38c490496000c5852f14e9c022868c5107d9ff03. Please retest.
https://bugs.winehq.org/show_bug.cgi?id=49663
François Gouget fgouget@codeweavers.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Keywords| |patch CC| |fgouget@codeweavers.com
https://bugs.winehq.org/show_bug.cgi?id=49663
phatboi1337@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED
--- Comment #17 from phatboi1337@gmail.com --- (In reply to Piotr Caban from comment #16)
It should be fixed by 38c490496000c5852f14e9c022868c5107d9ff03. Please retest.
Sorry for not reporting back for such a long time. The issue is fixed, thank you!
https://bugs.winehq.org/show_bug.cgi?id=49663
Gijs Vermeulen gijsvrm@gmail.com changed:
What |Removed |Added ---------------------------------------------------------------------------- Fixed by SHA1| |38c490496000c5852f14e9c0228 | |68c5107d9ff03
https://bugs.winehq.org/show_bug.cgi?id=49663
Alexandre Julliard julliard@winehq.org changed:
What |Removed |Added ---------------------------------------------------------------------------- Status|RESOLVED |CLOSED
--- Comment #18 from Alexandre Julliard julliard@winehq.org --- Closing bugs fixed in 6.10.
https://bugs.winehq.org/show_bug.cgi?id=49663
Michael Stefaniuc mstefani@winehq.org changed:
What |Removed |Added ---------------------------------------------------------------------------- Target Milestone|--- |6.0.x
https://bugs.winehq.org/show_bug.cgi?id=49663
Michael Stefaniuc mstefani@winehq.org changed:
What |Removed |Added ---------------------------------------------------------------------------- Target Milestone|6.0.x |---
--- Comment #19 from Michael Stefaniuc mstefani@winehq.org --- Removing the 6.0.x milestone from bug fixes included in 6.0.2.