This is the continuation of https://gitlab.winehq.org/wine/wine/-/merge_requests/7317.
There are a lot of smaller commits (some even no-op renames), because there are some inconsistencies in the codebase currently of what an NT thread priority vs a base thread priority should be (or even constant names being wrong), and I hope that this clears that up a bit.
I tried making the commits as atomic as possible. I hope this is fine... There are still a few tiny details missing, like boosting and `KeSetPriorityThread` and friends, which are either stubs or not correctly working as well and have been moved to the next part.
I wrote a pretty extensive test for all this too, since pretty much none of how this works is documented (or even misrepresented quite often). For testing, thread priority boosting has been disabled in order to make the behavior on Windows not flaky and dynamic. In fact, on Windows, there is some thread priority boost decay going on after the message has been processed and other mechanisms.
The `last` field of `get_thread_info` reply was moved to a flag in order to make space for the thread base priority, since it is already at the limit of 64 bytes.
--
v9: ntdll/tests: Add tests for process and thread priority.
server, ntdll: Fetch both process priority and base_priority.
server: Add process base priority helper.
kernelbase: Use ProcessPriorityClass info class in GetPriorityClass.
kernelbase: Use correct priority in GetThreadPriority.
server, ntdll: Fetch both thread priority and base_priority.
server, ntdll: Move last thread information to get_thread_info flags.
ntdll: Implement ThreadPriority class in NtSetInformationThread.
server: Implement setting thread priority directly.
server: Add set_thread_priority helper.
server: Rename thread priority to base_priority.
server: Rename base_priority to effective_priority in apply_thread_priority.
server: Infer process priority class in set_thread_priority.
https://gitlab.winehq.org/wine/wine/-/merge_requests/7516
x86_64 Windows and macOS both use `%gs` to access thread-specific data (Windows TEB, macOS TSD). To date, Wine has worked around this conflict by filling the most important TEB fields (`0x30`/`Self`, `0x58`/`ThreadLocalStorage`) in the macOS TSD structure (Apple reserved the fields for our use). This was sufficient for most Windows apps.
CrossOver's Wine had an additional hack to handle `0x60`/`ProcessEnvironmentBlock`, and binary patches for certain CEF binaries which directly accessed `0x8`/`StackBase`. Additionally, Apple's libd3dshared could activate a special mode in Rosetta 2 where code executing in certain regions would use the Windows TEB when accessing `%gs`.
Now that the PE separation is complete, GSBASE can be swapped when entering/exiting PE code. This is done in the syscall dispatcher, unix-call dispatcher, and for user-mode callbacks. GSBASE also needs to be set to the macOS TSD when entering signal handlers (in `init_handler()`), and then restored to the Windows TEB when exiting (in `leave_handler()`).
Some changes to the syscall dispatcher were needed to ensure that the TEB is not accessed through `%gs` while on the kernel stack (since a SIGUSR1 while on the kernel stack will result in GSBASE being set to the TSD).
---
I've tested this successfully on macOS 15 (Apple Silicon and Intel) and macOS 10.13 with several apps and games, including the `cefclient.exe` CEF sample.
Encouragingly, in some simple tests I didn't see a noticeable performance regression from this MR.
There are drawbacks though:
- libraries which jump directly from PE code into Unix code (expecting that %gs is always pointing to the macOS TSD) will crash. Notable examples are D3DMetal and DXMT. These will need to be changed to use Unix calls.
- If Windows code uses the `syscall` instruction directly, the stack pointer likely needs to be valid (which is probably not true on Windows). This is due to the syscall dispatcher saving registers onto the user stack and having to call `_thread_set_tsd_base`. I can't say I've ever seen direct syscalls done with an invalid `%rsp`, but it seems like something anticheat code might do.
---
macOS does not have a public API for setting GSBASE, but the private `_thread_set_tsd_base()` works and was added in macOS 10.12.
`_thread_set_tsd_base()` is a small thunk that sets `%esi`, `%eax`, and does the `syscall`: https://github.com/apple-oss-distributions/xnu/blob/8d741a5de7ff4191bf97d57….
The syscall instruction itself clobbers `%rcx` and `%r11`.
I've tried to save as few registers as possible when calling `_thread_set_tsd_base()`, but there may be room for improvement there.
---
I also tested an alternate implementation strategy for this which took advantage of the expanded "full" thread state which is passed to signal handlers when a process has set a user LDT. The full thread state includes GSBASE, so GSBASE is set back to whatever is in the sigcontext on return (like every other field in the context). This would avoid needing to explicitly reset GSBASE in `leave_handler()`.
This strategy was simpler, but I'm not using it for 2 reasons:
- the "full" thread state is only available starting with macOS 10.15, and we still support 10.13.
- more crucially, Rosetta 2 doesn't seem to correctly implement the GS.base field of the full thread state. It's set to 0 on entry, and isn't read on exit.
--
v6: ntdll: Remove x86_64 Mac-specific TEB access workarounds that are no longer needed.
ntdll: On macOS x86_64, swap GSBASE between the TEB and macOS TSD when entering/leaving PE code.
ntdll: Set %rsp to be inside syscall_frame before accessing %gs in x86_64 syscall dispatcher.
https://gitlab.winehq.org/wine/wine/-/merge_requests/6866
Microsoft Edge worked if windows version is set to win8.1 (or win7) but
now crashes in current wine due to this unimplemented function.
This patch makes it work again in win8.1 mode;
(It's still broken in win10 mode, see https://bugs.winehq.org/show_bug.cgi?id=56378 for more info)
--
v15: win32u: Add stub for NtUserSetAdditionalForegroundBoostProcesses.
https://gitlab.winehq.org/wine/wine/-/merge_requests/7607
Microsoft Edge worked if windows version is set to win8.1 (or win7) but
now crashes in current wine due to this unimplemented function.
This patch makes it work again in win8.1 mode;
(It's still broken in win10 mode, see https://bugs.winehq.org/show_bug.cgi?id=56378 for more info)
--
v14: v2: handle handles inside array and rename variable.
win32u: Add stub for NtUserSetAdditionalForegroundBoostProcesses.
https://gitlab.winehq.org/wine/wine/-/merge_requests/7607
In CopyFileEx, and DeleteFile functions, by default, the file name
and path are limited to MAX_PATH characters. To extend this limit
to 32,767 wide characters, need prepend "\\\\?\\" to the path.
--
v4: kernelbase: Limit the maximum path length for DeleteFile.
kernelbase: Fix DeleteFileA doesn't support long path.
kernelbase: Limit the maximum path length for filesystem.
ntdll: Load the value of longPathAware from manifest.
kernel32/tests: Add tests for maximum path length limitation.
https://gitlab.winehq.org/wine/wine/-/merge_requests/7540
In case of empty block reader we currently return uninitialized pointer
that is accessed later.
Signed-off-by: Nikolay Sivov <nsivov(a)codeweavers.com>
--
v2: windowscodecs/tests: Add some tests for the App0 reader.
https://gitlab.winehq.org/wine/wine/-/merge_requests/7653
This MR is similar to MR !7563 but targets the MFT audio decoders rather than the video decoders.
--
v2: mf/tests: Add timestamp tests for wma audio decoder.
mf/tests: Add timestamp tests for aac audio decoder.
https://gitlab.winehq.org/wine/wine/-/merge_requests/7649