https://bugs.winehq.org/show_bug.cgi?id=53947
Bug ID: 53947
Summary: Daily Chthonicle is extremely slow and laggy
Product: Wine
Version: 7.20
Hardware: x86-64
OS: Linux
Status: NEW
Severity: normal
Priority: P2
Component: gdiplus
Assignee: wine-bugs@winehq.org
Reporter: dark.shadow4@web.de
Distribution: ---
The slowdown is especially noticeable when you run the game without winecfg's "virtual desktop" on a 4k screen.
Installing native gdiplus via "winetricks gdiplus" works around the issue.
Fabian Maurer dark.shadow4@web.de changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                URL|                            |https://web.archive.org/web/20221118194033/https://sjc10.dl.dbolical.com/dl/2016/09/13/Daily_Chthonicle_Editors_Edition_Demo.zip?st=YJcdXvXvzfToFv_15RJu-g==&e=1668803722
           Keywords|                            |download
Fabian Maurer dark.shadow4@web.de changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |performance
Bartosz gang65@poczta.onet.pl changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |gang65@poczta.onet.pl

--- Comment #1 from Bartosz gang65@poczta.onet.pl ---
Does "winetricks gdiplus" make it faster?
--- Comment #2 from Bartosz gang65@poczta.onet.pl ---
Please attach terminal output.
--- Comment #3 from Fabian Maurer dark.shadow4@web.de ---
> Does "winetricks gdiplus" make it faster?
Yes, as I said.
> Please attach terminal output.
Nothing to see there. Which WINEDEBUG channels do you want?
--- Comment #4 from Bartosz gang65@poczta.onet.pl ---
Please attach at least the part of the log where the sluggishness is visible, captured with:
WINEDEBUG=+timestamp,+gdiplus
--- Comment #5 from Fabian Maurer dark.shadow4@web.de ---
Created attachment 73612
  --> https://bugs.winehq.org/attachment.cgi?id=73612
Log
Wouldn't it be easier to test yourself? It happens right at the start of the demo I linked, when it tries to draw the intro.
But here you go, added a log.
Julian Rüger jr98@gmx.net changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jr98@gmx.net
florian.will@gmail.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |florian.will@gmail.com

--- Comment #6 from florian.will@gmail.com ---
It looks like the intro spends a lot of time in GdipDrawImagePointsRect() to resize 514 x 696 source images to the full window size, something like 3693 x 2100 in your case, and closer to full HD in my case.
I experimented with some ugly optimizations [1] a while ago for another application that relies heavily on image resizing, and while my changes speed that intro up, it's still not great (28ms / frame for 1814 x 1053 vs. 56ms on git master, using Ryzen 5 1600 – I suspect it's supposed to run at 60 FPS?). I suspected that my changes would introduce lots of bugs for use cases that go through the same code, so I gave up on that attempt.
[1] https://github.com/w-flo/wine/commits/zusi_opt , "Fast path for resampling unrotated bitmaps" is probably the relevant commit
I wonder, does native gdiplus use SIMD to speed this up? Or does it simply use a more efficient non-SIMD algorithm for image upscaling? I'm surprised it apparently works well enough on 4K for you.
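
For illustration only, here is a minimal sketch of the general idea behind a fast path for unrotated resampling. It is not the code from the zusi_opt branch and not Wine code. The function name scale_nearest_unrotated is hypothetical, and the sampling convention (top-left based, nearest neighbour) is a simplifying assumption that may not match GDI+ output pixel for pixel. The point is that when the destination is an axis-aligned rectangle, the source column for each destination column can be computed once instead of once per pixel.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not Wine code: nearest-neighbour scaling of a 32-bit
 * image for the unrotated (axis-aligned) case. Returns 0 on success, -1 if
 * the column table cannot be allocated. */
static int scale_nearest_unrotated(const uint32_t *src, int src_w, int src_h,
                                   uint32_t *dst, int dst_w, int dst_h)
{
    int *col_map;
    int x, y;

    assert(src && dst && src_w > 0 && src_h > 0 && dst_w > 0 && dst_h > 0);

    col_map = malloc(sizeof(*col_map) * (size_t)dst_w);
    if (!col_map) return -1;

    /* One integer division per destination column, not per pixel. */
    for (x = 0; x < dst_w; x++)
        col_map[x] = (int)(((int64_t)x * src_w) / dst_w);

    for (y = 0; y < dst_h; y++)
    {
        /* One integer division per destination row selects the source row. */
        int src_y = (int)(((int64_t)y * src_h) / dst_h);
        const uint32_t *src_row = src + (size_t)src_y * (size_t)src_w;
        uint32_t *dst_row = dst + (size_t)y * (size_t)dst_w;

        /* The inner loop is a table lookup and a copy. */
        for (x = 0; x < dst_w; x++)
            dst_row[x] = src_row[col_map[x]];
    }

    free(col_map);
    return 0;
}
```

A caller would pass tightly packed pixel buffers (stride equal to width). Rotated or sheared destinations, other interpolation modes, and tiling would still need a general path.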
--- Comment #7 from Bartosz gang65@poczta.onet.pl ---
@florian Could you please take a look at this merge request, which improves the performance of GdipDrawImagePointsRect?
Link: https://gitlab.winehq.org/wine/wine/-/merge_requests/2864
--- Comment #8 from Bartosz gang65@poczta.onet.pl ---
I ran oprofile with wine-8.20; here are the results:
CPU: Intel Haswell microarchitecture, speed 3700 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
samples  %        image name   app name  symbol name
254487   17.1461  gdiplus.dll  wine      _resample_bitmap_pixel
170679   11.4995  no-vmlinux   wine      /no-vmlinux
160177   10.7919  win32u.so    wine      /opt/wine-devel/lib/wine/i386-unix/win32u.so
117085    7.8886  gdiplus.dll  wine      _GdipBitmapSetPixel@16
103532    6.9755  gdiplus.dll  wine      _GdipDrawImagePointsRect@48
87948     5.9255  anon (tgid:6993 range:0x7ae70000-0x7b172fff)  wine  anon (tgid:6993 range:0x7ae70000-0x7b172fff)
55544     3.7423  gdiplus.dll  wine      _sample_bitmap_pixel
55214     3.7200  gdiplus.dll  wine      _alpha_blend_pixels_hrgn
--- Comment #9 from Bartosz gang65@poczta.onet.pl ---
After building Wine from source, I noticed that floorf is called from the ucrtbase.dll library, which is very slow. Full oprofile output:
Counted cpu_clk_unhalted events () with a unit mask of 0x00 (Core cycles when at least one thread on the physical core is not in halt state) count 100000
samples  %        image name       app name          symbol name
895705   33.4761  ucrtbase.dll.so  wine              floor
351683   13.1438  no-vmlinux       wine              /no-vmlinux
330995   12.3706  gdiplus.dll.so   wine              resample_bitmap_pixel
300093   11.2157  win32u.so        wine              blend_rects_8888
272645   10.1898  gdiplus.dll.so   wine              GdipDrawImagePointsRect
137421    5.1360  gdiplus.dll.so   wine              sample_bitmap_pixel
112476    4.2037  gdiplus.dll.so   wine              convert_32bppARGB_to_32bppPARGB
35455     1.3251  no-vmlinux       wineserver        /no-vmlinux
35223     1.3164  no-vmlinux       wine64-preloader  /no-vmlinux
30252     1.1306  libc.so.6        wine              /usr/lib/i386-linux-gnu/libc.so.6
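
For illustration only: one common way to avoid a library call to floor/floorf in a per-pixel loop is a cast-based floor. The sketch below is an assumption about what such a replacement could look like. It is not taken from the merge request or from gdiplus, and the name fast_floor_to_int is hypothetical. It is only valid for finite inputs that fit in an int, which pixel coordinates normally do.

```c
#include <assert.h>

/* Hypothetical replacement for floorf() in a hot loop, not Wine code.
 * Precondition: v is finite and within the range of int. */
static inline int fast_floor_to_int(float v)
{
    int i;

    /* NaN fails both comparisons, so it is rejected here as well. */
    assert(v > -2147483648.0f && v < 2147483648.0f);

    i = (int)v;                  /* the cast truncates toward zero */
    return i - (v < (float)i);   /* step down by one for negative non-integers */
}
```

For example, fast_floor_to_int(1.7f) yields 1 and fast_floor_to_int(-0.5f) yields -1. Building with -DNDEBUG removes the range check from the hot path.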
--- Comment #10 from Fabian Maurer dark.shadow4@web.de ---
Created attachment 75435
  --> https://bugs.winehq.org/attachment.cgi?id=75435
Test program

I created a little program for easier testing. For some reason, though, your patch actually slows it down, which is a bit weird.
--- Comment #11 from Fabian Maurer dark.shadow4@web.de ---
For me (Arch Linux) there is a massive performance regression between wine-8.8 and wine-8.9: 8.9 is about 3 times slower.
I split this off into bug 55899.
--- Comment #12 from Bartosz gang65@poczta.onet.pl ---
Created attachment 75672
  --> https://bugs.winehq.org/attachment.cgi?id=75672
Screenshot where slowness is visible during scrolling

The regression was fixed with commit: https://gitlab.winehq.org/wine/wine/-/commit/4b458775bb8c9492ac859cfd167c5f5...
@Fabian Could you please describe where the slowness appears for you? I noticed it on the title screen, when the summary page appears and scrolls.
--- Comment #13 from Fabian Maurer dark.shadow4@web.de ---
Especially at the beginning, yes. I usually measure the time (approximately) from starting the game until it begins scrolling up.
--- Comment #14 from Fabian Maurer dark.shadow4@web.de ---
Created attachment 75675
  --> https://bugs.winehq.org/attachment.cgi?id=75675
Profiling data (Inverted call stack)

I did some profiling on my -O2 wine using perf (bfd version). Results attached.
Two functions eat most of the time:
- blend_rects_8888 calling into blend_argb (inlined)
- GdipDrawImagePointsRect calling into resample_bitmap_pixel (InterpolationModeNearestNeighbor) calling into sample_bitmap_pixel.
No idea how either could be meaningfully improved.
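
For readers unfamiliar with the first hot spot, here is a minimal sketch of what a per-pixel 8888 blend loop typically does. It is not the win32u implementation of blend_rects_8888 or blend_argb. The function names are hypothetical, and it assumes premultiplied ARGB source pixels and plain "source over" compositing. The early-outs for fully opaque and fully transparent pixels are one generic way such loops are usually cheapened. Whether they would help in this game is untested.

```c
#include <stdint.h>

/* Hypothetical sketch, not Wine code: "source over" blend of one
 * premultiplied 32-bit ARGB pixel onto another. */
static uint32_t blend_pargb_over(uint32_t dst, uint32_t src)
{
    uint32_t inv_a = 255 - (src >> 24);   /* 255 * (1 - source alpha) */
    uint32_t out = 0;
    int shift;

    /* Process the four 8-bit channels (B, G, R, A) one at a time. */
    for (shift = 0; shift < 32; shift += 8)
    {
        uint32_t s = (src >> shift) & 0xff;
        uint32_t d = (dst >> shift) & 0xff;
        uint32_t c = s + (d * inv_a + 127) / 255;   /* rounded d * (1 - alpha) */

        if (c > 255) c = 255;   /* guards against non-premultiplied input */
        out |= c << shift;
    }
    return out;
}

/* Hypothetical sketch: blend one row, skipping the arithmetic where possible. */
static void blend_row_pargb(uint32_t *dst, const uint32_t *src, int width)
{
    int x;

    for (x = 0; x < width; x++)
    {
        uint32_t a = src[x] >> 24;

        if (a == 0xff) dst[x] = src[x];                           /* opaque: plain copy */
        else if (a) dst[x] = blend_pargb_over(dst[x], src[x]);    /* partial: blend */
        /* fully transparent: destination stays as it is */
    }
}
```

Even in this simplified form, every partially transparent pixel costs four multiplications and four divisions, which is why the loop shows up in profiles once the window covers a 4k screen.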
--- Comment #15 from Bartosz gang65@poczta.onet.pl ---
Thanks Fabian. This information is extremely useful. Could you please describe (or link to a description of) how I can profile it that way?
--- Comment #16 from Fabian Maurer dark.shadow4@web.de ---
You need to compile perf 6.6 (or greater) yourself from https://github.com/torvalds/linux/tree/master/tools/perf
Like in https://aur.archlinux.org/packages/perf-bfd

Then you need wine with debug symbols (e.g. self built).
Run "perf record -g /home/fabian/Programming/Wine/wine/loader/wine DailyChthonicle.exe"
After a while, ALT-F4 the program.
In the same folder, run "perf script report gecko"
Then tick the "Invert call stack" checkbox.
--- Comment #17 from florian.will@gmail.com --- (In reply to Fabian Maurer from comment #10)
> I created a little program for easier testing.
Thanks for the little benchmark program, good to have some numbers!
I just rebased my patches at https://github.com/w-flo/wine/commits/zusi_opt on top of latest winehq git. The patchset improves the benchmark result from
Success, time per run: 87.500000 ms [git master]
to
Success, time per run: 23.500000 ms [zusi_opt]
for me. After "winetricks gdiplus" I get 54.5 ms, so the patchset seems to make this particular benchmark run faster than when using native gdiplus.
The improvement is mostly thanks to "WIP: gdiplus: Fast path for resampling unrotated bitmaps" (which needs the two preceding refactoring commits for resample_bitmap() and apply_tiling()).
I don't really like that patch though. It seems really convoluted, and I'm not convinced it works correctly in general (it seems to work fine for my use case, ZusiDisplay, as well as for this benchmark program).
I wonder if a similar or better improvement could be achieved using less code / easier to read code.
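
The "time per run" figures above come from the attached test program, whose source is not reproduced in this thread. As a rough sketch of how such a figure can be measured, the following assumes a C11 toolchain (for timespec_get) and a caller-supplied workload. The names bench_fn, time_per_run_ms and count_up are hypothetical and are not taken from the attachment.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical workload signature: one benchmark iteration. */
typedef void (*bench_fn)(void *ctx);

/* Hypothetical helper: average wall-clock time of one call to fn, in ms. */
static double time_per_run_ms(bench_fn fn, void *ctx, int runs)
{
    struct timespec start, end;
    int i;

    assert(fn && runs > 0);

    timespec_get(&start, TIME_UTC);
    for (i = 0; i < runs; i++) fn(ctx);
    timespec_get(&end, TIME_UTC);

    /* Convert the elapsed seconds and nanoseconds to milliseconds. */
    return ((double)(end.tv_sec - start.tv_sec) * 1000.0 +
            (double)(end.tv_nsec - start.tv_nsec) / 1e6) / runs;
}

/* Placeholder workload: counts calls. A real benchmark would draw a scaled
 * image here, for example through GdipDrawImagePointsRect. */
static void count_up(void *ctx)
{
    ++*(int *)ctx;
}
```

Printing the result with printf("Success, time per run: %f ms\n", ms) gives output in the same shape as the lines quoted above. Averaging over many runs matters here because a single frame takes only tens of milliseconds.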
--- Comment #18 from Fabian Maurer dark.shadow4@web.de ---
Created attachment 76157
  --> https://bugs.winehq.org/attachment.cgi?id=76157
Profiling data with your patches

Attaching profiling data as comparison.
For my test program I get
- Vanilla Wine: 324ms
- With native gdiplus: 16ms
- Your fork: 14ms
The game itself is about 2 times faster, which is nice. Although I kinda fear that it makes SIMD optimization harder, since now there are two paths to rewrite.
Esme Povirk madewokherd@gmail.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |madewokherd@gmail.com