It can be substantially faster in some cases. Notepad++ (with several plugins) start up goes down from around ~1.03 sec to ~0.61 sec on my machine (with wineprefix active).
9b7669592d6f8b40976b571b70f8543777d35167 is what introduced this performance regression. I don't know why exactly, but it made my Notepad++ noticeably slower when launched from a file manager (so it's not just at wineprefix startup overhead), which is very annoying.
From: Gabriel Ivăncescu gabrielopcode@gmail.com
It can be substantially faster in some cases. Notepad++ (with several plugins) start up goes down from around ~1.03 sec to ~0.61 sec on my machine (with wineprefix active).
Signed-off-by: Gabriel Ivăncescu gabrielopcode@gmail.com --- dlls/win32u/ntuser_private.h | 1 + dlls/win32u/winstation.c | 14 ++++++++++++-- 2 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/dlls/win32u/ntuser_private.h b/dlls/win32u/ntuser_private.h index 3b6cab5bdc9..0cffa8b2208 100644 --- a/dlls/win32u/ntuser_private.h +++ b/dlls/win32u/ntuser_private.h @@ -127,6 +127,7 @@ struct user_thread_info UINT spy_indent; /* Current spy indent */ BOOL clipping_cursor; /* thread is currently clipping */ DWORD clipping_reset; /* time when clipping was last reset */ + signed char cached_is_virtual_desktop; /* ternary value: zero = unknown, negative = FALSE, positive = TRUE */ };
C_ASSERT( sizeof(struct user_thread_info) <= sizeof(((TEB *)0)->Win32ClientInfo) ); diff --git a/dlls/win32u/winstation.c b/dlls/win32u/winstation.c index b187b246941..0882b79e1ed 100644 --- a/dlls/win32u/winstation.c +++ b/dlls/win32u/winstation.c @@ -42,12 +42,21 @@ WINE_DECLARE_DEBUG_CHANNEL(win);
BOOL is_virtual_desktop(void) { - HANDLE desktop = NtUserGetThreadDesktop( GetCurrentThreadId() ); + struct user_thread_info *thread_info = get_user_thread_info(); USEROBJECTFLAGS flags = {0}; + HANDLE desktop; DWORD len; + BOOL ret; + + if (thread_info->cached_is_virtual_desktop) + return thread_info->cached_is_virtual_desktop > 0;
+ desktop = NtUserGetThreadDesktop( GetCurrentThreadId() ); if (!NtUserGetObjectInformation( desktop, UOI_FLAGS, &flags, sizeof(flags), &len )) return FALSE; - return !!(flags.dwFlags & DF_WINE_CREATE_DESKTOP); + ret = !!(flags.dwFlags & DF_WINE_CREATE_DESKTOP); + + thread_info->cached_is_virtual_desktop = ret ? 1 : -1; + return ret; }
/*********************************************************************** @@ -261,6 +270,7 @@ BOOL WINAPI NtUserSetThreadDesktop( HDESK handle ) { struct user_thread_info *thread_info = get_user_thread_info(); struct user_key_state_info *key_state_info = thread_info->key_state; + thread_info->cached_is_virtual_desktop = 0; thread_info->client_info.top_window = 0; thread_info->client_info.msg_window = 0; if (key_state_info) key_state_info->time = 0;
I really like the idea of making startup faster but I would rather fix it at the source rather than adding another layer of caching. Virtual desktop and explorer initialization is already enough of a mess as it is.
Fwiw while looking at https://gitlab.winehq.org/wine/wine/-/merge_requests/5057 I believe I found some things that cause unnecessary refreshes of the display device cache during virtual desktop activation. The https://gitlab.winehq.org/wine/wine/-/blob/master/dlls/win32u/winstation.c#L... line specifically should probably be changed to FALSE (force).
On Mon Feb 12 18:51:02 2024 +0000, Rémi Bernon wrote:
I really like the idea of making startup faster but I would rather fix it at the source rather than adding another layer of caching. Virtual desktop and explorer initialization is already enough of a mess as it is. Fwiw while looking at https://gitlab.winehq.org/wine/wine/-/merge_requests/5057 I believe I found some things that cause unnecessary refreshes of the display device cache during virtual desktop activation. The https://gitlab.winehq.org/wine/wine/-/blob/master/dlls/win32u/winstation.c#L... line specifically should probably be changed to FALSE (force).
Unfortunately that doesn't seem to help much. I admit my measurement is not very precise (I just launch notepad++ in an active prefix to avoid prefix initialization, and use `time` and then just press CTRL+C when window shows up), so I'm basically taking anything within ~0.1sec as margin of error.
With this MR though, it's very noticeable, since it basically goes down by like half a second.
Of course, I didn't even time it when I looked for the regression, because it was "obvious" it was way slower. It's *that* noticeable and annoying for me. :/
I don't know why there's such a big difference, I'm guessing too many wineserver round trips?
On Mon Feb 12 18:51:02 2024 +0000, Gabriel Ivăncescu wrote:
Unfortunately that doesn't seem to help much. I admit my measurement is not very precise (I just launch notepad++ in an active prefix to avoid prefix initialization, and use `time` and then just press CTRL+C when window shows up), so I'm basically taking anything within ~0.1sec as margin of error. With this MR though, it's very noticeable, since it basically goes down by like half a second. Of course, I didn't even time it when I looked for the regression, because it was "obvious" it was way slower. It's *that* noticeable and annoying for me. :/ I don't know why there's such a big difference, I'm guessing too many wineserver round trips?
Hopefully it's not about wineserver calls. Are you using it with virtual desktop mode enabled or disabled?
On Mon Feb 12 18:55:26 2024 +0000, Rémi Bernon wrote:
Hopefully it's not about wineserver calls. Are you using it with virtual desktop mode enabled or disabled?
No virtual desktop, just normal managed windows and with winex11.
On Mon Feb 12 18:57:05 2024 +0000, Gabriel Ivăncescu wrote:
No virtual desktop, just normal managed windows and with winex11.
Using https://gitlab.winehq.org/rbernon/wine/-/commit/671d98a6e2f7d75040ebff605723... to get prefix boot time relative timestamps, and a trace in `X11DRV_MapNotify`, I don't see any meaningful difference between 9.2 and a hacked version where `is_virtual_desktop` directly returns FALSE. For me in both cases it takes ~1.2s for the notepad window to appear.
On Tue Feb 13 13:46:53 2024 +0000, Rémi Bernon wrote:
Using https://gitlab.winehq.org/rbernon/wine/-/commit/671d98a6e2f7d75040ebff605723... to get prefix boot time relative timestamps, and a trace in `X11DRV_MapNotify`, I don't see any meaningful difference between 9.2 and a hacked version where `is_virtual_desktop` directly returns FALSE. For me in both cases it takes ~1.2s for the notepad window to appear.
1.2 sec seems a bit on the high side. Are you sure you're running it on an **active** prefix? That is, have a simple app open in the background.
I'm asking because I don't know see how boot time relative timestamps would help in this scenario.
On Tue Feb 13 16:15:21 2024 +0000, Gabriel Ivăncescu wrote:
1.2 sec seems a bit on the high side. Are you sure you're running it on an **active** prefix? That is, have a simple app open in the background. I'm asking because I don't know see how boot time relative timestamps would help in this scenario.
I used your patch and added traces to `init_cpu_info` in ntdll and in `X11DRV_MapNotify`, then subtracted them. My results are:
Upstream: 1070 With return FALSE hack: 615
Again, with an app sitting in the background before launching Notepad++, so prefix initialization is avoided (I mean, it's not a contrived case, I actually do launch notepad++ from other apps, which is why it's annoying me).
On Tue Feb 13 16:21:22 2024 +0000, Gabriel Ivăncescu wrote:
I used your patch and added traces to `init_cpu_info` in ntdll and in `X11DRV_MapNotify`, then subtracted them. My results are: Upstream: 1070 With return FALSE hack: 615 Again, with an app sitting in the background before launching Notepad++, so prefix initialization is avoided (I mean, it's not a contrived case, I actually do launch notepad++ from other apps, which is why it's annoying me).
Here's some more information and measurements. You guess correctly first time that it's *mostly* the display cache update that is causing it, though it's still weird for me.
The main culprit are the two calls in `update_display_cache`. BUT, the weird thing is that, despite the first one being unconditionally called, it's actually the second one that has a much worse impact.
``` Hacking only the first is_virtual_desktop() to FALSE: 988 Hacking only the second is_virtual_desktop() to FALSE: 783 (?!?) Hacking both of them to FALSE: 735 Hacking *ALL* is_virtual_desktop to FALSE: 615 ```
It makes little sense to me why the second is_virtual_desktop (the one on the `if (ret && is_virtual_desktop())` line) has so much more impact.
But it also shows that it's these 2 calls that together make up the bulk of the perf impact.
Any ideas?