[Bug 53947] New: Daily Chthonicle is extremely slow and laggy
https://bugs.winehq.org/show_bug.cgi?id=53947

Bug ID: 53947
Summary: Daily Chthonicle is extremely slow and laggy
Product: Wine
Version: 7.20
Hardware: x86-64
OS: Linux
Status: NEW
Severity: normal
Priority: P2
Component: gdiplus
Assignee: wine-bugs(a)winehq.org
Reporter: dark.shadow4(a)web.de
Distribution: ---

Especially notable when you run it without winecfg's "virtual desktop" on a 4k screen.
winetricks "gdiplus" works around the issue.

--
Do not reply to this email, post in Bugzilla using the above URL to reply.
You are receiving this mail because:
You are watching all bug changes.
https://bugs.winehq.org/show_bug.cgi?id=53947

Fabian Maurer <dark.shadow4(a)web.de> changed:

What     | Removed | Added
---------+---------+------
URL      |         | https://web.archive.org/web/20221118194033/https://sjc10.dl.dbolical.com/dl/2016/09/13/Daily_Chthonicle_Editors_Edition_Demo.zip?st=YJcdXvXvzfToFv_15RJu-g==&e=1668803722
Keywords |         | download
https://bugs.winehq.org/show_bug.cgi?id=53947

Fabian Maurer <dark.shadow4(a)web.de> changed:

What     | Removed | Added
---------+---------+------------
Keywords |         | performance
https://bugs.winehq.org/show_bug.cgi?id=53947

Bartosz <gang65(a)poczta.onet.pl> changed:

What | Removed | Added
-----+---------+----------------------
CC   |         | gang65(a)poczta.onet.pl

--- Comment #1 from Bartosz <gang65(a)poczta.onet.pl> ---
Does "winetricks gdiplus" make it faster?
https://bugs.winehq.org/show_bug.cgi?id=53947

--- Comment #2 from Bartosz <gang65(a)poczta.onet.pl> ---
Please attach terminal output.
https://bugs.winehq.org/show_bug.cgi?id=53947

--- Comment #3 from Fabian Maurer <dark.shadow4(a)web.de> ---
> Does "winetricks gdiplus" make it faster?
Yes, as I said.

> Please attach terminal output.
Nothing to see there. Which WINEDEBUG channels do you want?
https://bugs.winehq.org/show_bug.cgi?id=53947

--- Comment #4 from Bartosz <gang65(a)poczta.onet.pl> ---
Please attach at least the part of the log where the sluggishness is visible, captured with:
WINEDEBUG=+timestamp,+gdiplus
https://bugs.winehq.org/show_bug.cgi?id=53947

--- Comment #5 from Fabian Maurer <dark.shadow4(a)web.de> ---
Created attachment 73612
  --> https://bugs.winehq.org/attachment.cgi?id=73612
Log

Wouldn't it be easier to test yourself? It happens right at the start of the demo I linked, when it tries to draw the intro.
But here you go, added a log.
https://bugs.winehq.org/show_bug.cgi?id=53947

Julian Rüger <jr98(a)gmx.net> changed:

What | Removed | Added
-----+---------+-------------
CC   |         | jr98(a)gmx.net
https://bugs.winehq.org/show_bug.cgi?id=53947

florian.will(a)gmail.com changed:

What | Removed | Added
-----+---------+-----------------------
CC   |         | florian.will(a)gmail.com

--- Comment #6 from florian.will(a)gmail.com ---
It looks like the intro spends a lot of time in GdipDrawImagePointsRect() resizing 514 x 696 source images to the full window size, something like 3693 x 2100 in your case, and closer to full HD in my case.

I experimented with some ugly optimizations [1] a while ago for another application that relies heavily on image resizing, and while my changes speed that intro up, it's still not great (28 ms / frame for 1814 x 1053 vs. 56 ms on git master, using a Ryzen 5 1600; I suspect it's supposed to run at 60 FPS?). I suspected that my changes would introduce lots of bugs for use cases that go through the same code, so I gave up on that attempt.

[1] https://github.com/w-flo/wine/commits/zusi_opt , "Fast path for resampling unrotated bitmaps" is probably the relevant commit

I wonder, does native gdiplus use SIMD to speed this up? Or do they simply use a more efficient non-SIMD algorithm for image upscaling? I'm surprised it apparently works well enough on 4K for you.
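For illustration only, here is a minimal C sketch of what a fast path for unrotated nearest-neighbour scaling can look like. It is not the code from the zusi_opt branch or from Wine: the function name scale_nearest_argb, the 16.16 fixed-point stepping, and the top-left (rather than pixel-centre) sampling are all assumptions made for this sketch. The idea is that for an axis-aligned destination rectangle, the source row can be picked once per destination row and the source column advanced by a constant step, rather than running a general coordinate transform for every destination pixel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not Wine code: nearest-neighbour scaling of a
 * 32-bit ARGB bitmap for the unrotated, axis-aligned case. */
static void scale_nearest_argb(const uint32_t *src, int src_w, int src_h,
                               uint32_t *dst, int dst_w, int dst_h)
{
    assert(src_w > 0 && src_h > 0 && dst_w > 0 && dst_h > 0);

    /* Source pixels advanced per destination pixel, in 16.16 fixed point. */
    uint64_t step_x = ((uint64_t)src_w << 16) / (uint64_t)dst_w;
    uint64_t step_y = ((uint64_t)src_h << 16) / (uint64_t)dst_h;

    for (int y = 0; y < dst_h; y++)
    {
        /* The source row is resolved once per destination row. */
        size_t sy = (size_t)(((uint64_t)y * step_y) >> 16);
        const uint32_t *src_row = src + sy * (size_t)src_w;
        uint32_t *dst_row = dst + (size_t)y * (size_t)dst_w;
        uint64_t fx = 0;

        for (int x = 0; x < dst_w; x++)
        {
            /* Integer shift replaces a per-pixel floor() call. */
            dst_row[x] = src_row[fx >> 16];
            fx += step_x;
        }
    }
}
```

A real fast path would also have to handle clipping, source rectangles, tiling modes and the other interpolation modes, which is where the complexity the comment worries about comes from. For reference, 60 FPS leaves about 16.7 ms per frame, so the 28 ms figure quoted above would still miss that budget.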
https://bugs.winehq.org/show_bug.cgi?id=53947

--- Comment #7 from Bartosz <gang65(a)poczta.onet.pl> ---
@florian Could you please take a look at the merge request that improves the performance of GdipDrawImagePointsRect?
Link: https://gitlab.winehq.org/wine/wine/-/merge_requests/2864
https://bugs.winehq.org/show_bug.cgi?id=53947

--- Comment #8 from Bartosz <gang65(a)poczta.onet.pl> ---
I have run oprofile with wine-8.20, and here are the results:

CPU: Intel Haswell microarchitecture, speed 3700 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000

samples  %        image name   app name  symbol name
254487   17.1461  gdiplus.dll  wine      _resample_bitmap_pixel
170679   11.4995  no-vmlinux   wine      /no-vmlinux
160177   10.7919  win32u.so    wine      /opt/wine-devel/lib/wine/i386-unix/win32u.so
117085    7.8886  gdiplus.dll  wine      _GdipBitmapSetPixel@16
103532    6.9755  gdiplus.dll  wine      _GdipDrawImagePointsRect@48
 87948    5.9255  anon (tgid:6993 range:0x7ae70000-0x7b172fff)  wine  anon (tgid:6993 range:0x7ae70000-0x7b172fff)
 55544    3.7423  gdiplus.dll  wine      _sample_bitmap_pixel
 55214    3.7200  gdiplus.dll  wine      _alpha_blend_pixels_hrgn
https://bugs.winehq.org/show_bug.cgi?id=53947

--- Comment #9 from Bartosz <gang65(a)poczta.onet.pl> ---
After building Wine from source, I noticed that floorf is called from the ucrtbase.dll library, which is very slow.

Full oprofile output:

Counted cpu_clk_unhalted events () with a unit mask of 0x00 (Core cycles when at least one thread on the physical core is not in halt state) count 100000

samples  %        image name       app name          symbol name
895705   33.4761  ucrtbase.dll.so  wine              floor
351683   13.1438  no-vmlinux       wine              /no-vmlinux
330995   12.3706  gdiplus.dll.so   wine              resample_bitmap_pixel
300093   11.2157  win32u.so        wine              blend_rects_8888
272645   10.1898  gdiplus.dll.so   wine              GdipDrawImagePointsRect
137421    5.1360  gdiplus.dll.so   wine              sample_bitmap_pixel
112476    4.2037  gdiplus.dll.so   wine              convert_32bppARGB_to_32bppPARGB
 35455    1.3251  no-vmlinux       wineserver        /no-vmlinux
 35223    1.3164  no-vmlinux       wine64-preloader  /no-vmlinux
 30252    1.1306  libc.so.6        wine              /usr/lib/i386-linux-gnu/libc.so.6
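As a rough illustration of why the floor call matters, the following C sketch shows one common way to floor a float to an int without calling into another library. The name floor_to_int is hypothetical and this is not necessarily what the merge request does. It assumes the input is finite and fits in an int, which is plausible for pixel coordinates but not true in general.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper, not Wine code: floor() for floats that are known to
 * be finite and within int range, small enough to be inlined. */
static inline int floor_to_int(float x)
{
    int i;

    /* Assumption of this sketch: no NaN, no infinities, no overflow. */
    assert(x >= (float)INT_MIN && x < (float)INT_MAX);

    i = (int)x;                 /* conversion truncates toward zero */
    return i - (x < (float)i);  /* step down for negative non-integers */
}
```

For example, floor_to_int(-1.5f) truncates to -1 and then steps down to -2, while floor_to_int(1.5f) stays at 1. Whether this is actually faster than the library call in a given build would need to be measured.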
https://bugs.winehq.org/show_bug.cgi?id=53947

--- Comment #10 from Fabian Maurer <dark.shadow4(a)web.de> ---
Created attachment 75435
  --> https://bugs.winehq.org/attachment.cgi?id=75435
Test program

I created a little program for easier testing.
For some reason your patch actually slows it down though, it's a bit weird.
https://bugs.winehq.org/show_bug.cgi?id=53947

--- Comment #11 from Fabian Maurer <dark.shadow4(a)web.de> ---
For me (Arch Linux) there is a massive performance regression between wine-8.9 and wine-8.8. 8.9 is about 3 times slower. I split this off into bug 55899.
https://bugs.winehq.org/show_bug.cgi?id=53947

--- Comment #12 from Bartosz <gang65(a)poczta.onet.pl> ---
Created attachment 75672
  --> https://bugs.winehq.org/attachment.cgi?id=75672
Screenshot where slowness is visible during scrolling

The regression was fixed with commit:
https://gitlab.winehq.org/wine/wine/-/commit/4b458775bb8c9492ac859cfd167c5f5...

@Fabian Could you please describe where the slowness appears for you?
I have noticed it on the title screen, when the summary page appears and scrolls.
https://bugs.winehq.org/show_bug.cgi?id=53947

--- Comment #13 from Fabian Maurer <dark.shadow4(a)web.de> ---
Especially at the beginning, yes. I usually measure the time (approximately) from starting the game until it begins scrolling up.
https://bugs.winehq.org/show_bug.cgi?id=53947

--- Comment #14 from Fabian Maurer <dark.shadow4(a)web.de> ---
Created attachment 75675
  --> https://bugs.winehq.org/attachment.cgi?id=75675
Profiling data (Inverted call stack)

I did some profiling on my -O2 wine using perf (bfd version). Results attached.
Two functions eat most of the time:
- blend_rects_8888 calling into blend_argb (inlined)
- GdipDrawImagePointsRect calling into resample_bitmap_pixel (InterpolationModeNearestNeighbor) calling into sample_bitmap_pixel

No idea how either could be meaningfully improved.
https://bugs.winehq.org/show_bug.cgi?id=53947

--- Comment #15 from Bartosz <gang65(a)poczta.onet.pl> ---
Thanks Fabian. This information is extremely useful.
Could you please describe (or link to) how I could profile it that way?
https://bugs.winehq.org/show_bug.cgi?id=53947

--- Comment #16 from Fabian Maurer <dark.shadow4(a)web.de> ---
You need to compile perf 6.6 (or greater) yourself from
https://github.com/torvalds/linux/tree/master/tools/perf
Like in https://aur.archlinux.org/packages/perf-bfd

Then you need wine with debug symbols (e.g. self built).
Run "perf record -g /home/fabian/Programming/Wine/wine/loader/wine DailyChthonicle.exe"
After a while, ALT-F4 the program.
In the same folder, run "perf script report gecko"
Then tick the "Invert call stack" checkbox.
https://bugs.winehq.org/show_bug.cgi?id=53947

--- Comment #17 from florian.will(a)gmail.com ---
(In reply to Fabian Maurer from comment #10)
> I created a little program for easier testing.

Thanks for the little benchmark program, good to have some numbers! I just rebased my patches at https://github.com/w-flo/wine/commits/zusi_opt on top of latest winehq git. The patchset improves the benchmark result from

  Success, time per run: 87.500000 ms [git master]

to

  Success, time per run: 23.500000 ms [zusi_opt]

for me. After "winetricks gdiplus" I get 54.5 ms, so the patchset seems to make this particular benchmark run faster than when using native gdiplus.

The improvement is mostly thanks to "WIP: gdiplus: Fast path for resampling unrotated bitmaps" (which needs the two preceding refactoring commits for resample_bitmap() and apply_tiling()). I don't really like that patch though, it seems really convoluted and I'm not convinced it works correctly in general (it seems to work fine for my use-case ZusiDisplay, as well as this benchmark program). I wonder if a similar or better improvement could be achieved using less code / easier to read code.
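The attached test program is not reproduced in this thread, so how it arrives at "time per run" is not known here. As a hedged sketch only, a figure of that kind can be produced by timing a fixed number of runs and dividing by the count. The names time_per_run_ms and count_calls are hypothetical, and the sketch uses the standard C clock() function, which measures CPU time rather than wall-clock time.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical harness, not the attached test program: average CPU time
 * in milliseconds over `runs` calls of `workload`. */
static double time_per_run_ms(void (*workload)(void *), void *ctx, int runs)
{
    clock_t start, end;
    int i;

    assert(workload != NULL && runs > 0);

    start = clock();
    for (i = 0; i < runs; i++)
        workload(ctx);
    end = clock();

    return (double)(end - start) * 1000.0 / (double)CLOCKS_PER_SEC / (double)runs;
}

/* Placeholder workload that only counts how often it was called; a real
 * benchmark would draw a scaled image here instead. */
static void count_calls(void *ctx)
{
    ++*(int *)ctx;
}
```

With the numbers quoted above, 87.5 ms versus 23.5 ms per run is a speedup of roughly 3.7x on that machine.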
https://bugs.winehq.org/show_bug.cgi?id=53947

--- Comment #18 from Fabian Maurer <dark.shadow4(a)web.de> ---
Created attachment 76157
  --> https://bugs.winehq.org/attachment.cgi?id=76157
Profiling data with your patches

Attaching profiling data as comparison.

For my test program I get
- Vanilla Wine: 324ms
- With native gdiplus: 16ms
- Your fork: 14ms

The game itself is about 2 times faster, which is nice.
Although I kinda fear that it makes SIMD optimization harder, since now there's two paths to rewrite.
https://bugs.winehq.org/show_bug.cgi?id=53947

Esme Povirk <madewokherd(a)gmail.com> changed:

What | Removed | Added
-----+---------+----------------------
CC   |         | madewokherd(a)gmail.com