This introduces a faster implementation of signal and wait operations on NT events, semaphores, and mutexes, which improves performance to native levels for a wide variety of games and other applications.
The goal here is similar to the long-standing out-of-tree "esync" and "fsync" patch sets, but without the flaws that make those patch sets not upstreamable.
The Linux "ntsync" driver is not currently released. It has been accepted into the trunk Linux tree for 6.14, so barring any extraordinary circumstances, the API is frozen and it will be released in its current form in about 2 months. Since it has passed all relevant reviewers on the kernel side, and the API is all but released, it seems there is no reason any more not to submit the Wine side to match.
Some important notes:
* This patch set does *not* include any way to disable ntsync support, since that kind of configuration often seems to be dispreferred where not necessary. In essence, ntsync should just work everywhere.
Probably the easiest way to effectively disable ntsync, for the purposes of testing, is to chmod the /dev/ntsync device to prevent its being opened. Regardless, a Wine switch to disable ntsync can be added simply enough. Note that it should probably not take the form of a registry key, however, since it needs to be easily accessible from the server itself.
* It is, generally speaking, not possible for only some objects, or some processes, to have backing ntsync objects, while others use the old server path. The esync/fsync patch sets explicitly protected against this by making sure every process had a consistent view of whether esync was enabled. This is not provided here, since no switch is provided to toggle ntsync, and it should not be possible to get into such an inconsistent state without gross misconfiguration.
* Similarly, no diagnostic messages are provided to note that ntsync is in use, or not in use. These messages are part of esync/fsync, as well as part of ntsync "testing" trees unofficially distributed. However, if ntsync is working correctly, no message should be necessary.
The basic structure is:
* Each type of server object which can be waited on by the client (including events, semaphores, mutexes, but also other types such as processes, files) must store an "inproc_sync" object.
This "inproc_sync" object is a full server object (note that this differs from esync/fsync). A vector and server request is introduced to retrieve an NT handle to this object from an arbitrary NT handle.
Since the actual ntsync objects are simply distinct file descriptions, we then call get_handle_fd from the client to retrieve an fd to the object, and then perform ioctls on it.
* Objects signaled by the server (processes, files, etc) perform ntsync ioctls on that object. The backing object in all such cases is simply an event.
* Signal and wait operations on the client side attempt to defer to an "inproc_*" function, falling back to the server implementation if it returns STATUS_NOT_IMPLEMENTED. This mirrors how in-process synchronization objects (critical sections, SRW locks, etc) used to be implemented—attempting to use an architecture-specific "fast_*" function and falling back if it returned STATUS_NOT_IMPLEMENTED.
* The inproc_sync handles, once retrieved, are cached per-process. This caching takes a similar form to the fd cache. It does not reuse the same infrastructure, however.
The primary reason for this is that the fd cache is designed to fit within a 64-bit value and uses 64-bit atomic operations to ensure consistency. However, we need to store more than 64 bits of information. [We also need to modify them after caching, in order to correctly implement handle closing—see below.]
The secondary reason is that retrieving the ntsync fd from the inproc_sync handle itself uses the fd cache.
* In order to keep the Linux driver simple, it does not implement access flags (EVENT_MODIFY_STATE etc.) Instead, the flags are cached locally and validated there. This too mirrors the fd cache. Note that this means that a malicious process can now modify objects it should not be able modify—which is less true than it is with wineserver—but this is no different from the way other objects (notably fds) are handled, and would require manual syscalls.
* In order to achieve correct behaviour related to closing objects while they are used, this patch set essentially relies on refcounting. This is broadly true of the server as well, but because we need to avoid server calls when performing object operations, significantly more care must be taken.
In particular, because waits need to be interruptable by signals and then be restarted, we need the backing ntsync object to remain valid until all users are done with it. On a process level, this is achieved by letting multiple processes own handles to the underlying inproc_sync server object.
On a thread level, multiple simultaneous calls need to refcount the process's local handle. This refcount is stored in the sync object cache. When it reaches zero, the cache is cleared.
Punting this behaviour to the Linux driver would have introduced a great deal more complexity, which is best kept in userspace and out of the kernel.
* The cache is, as such, treated as a cache. The penultimate commit, which introduces client support but does not yet cache the objects, effectively illustrates this by never actually caching anything, and retrieving a new NT handle and fd every time.
* Certain waits, on internal handles (such as async, startup_info, completion), are delegated to the server even when ntsync is used. Those server objects do not create an underlying ntsync object.
Will esync patches from staging be still usable after this MR is merged in case when Wine will be running on the kernel that doesn't have `/dev/ntsync` yet? They won't be useful after kernel with it will get released, but in the interim there would still be a need for that (and also for cases of those who have older kernels installed).
Aida Jonikienė (@DodoGTA) commented about dlls/ntdll/unix/sync.c:
{ union select_op select_op; UINT i, flags = SELECT_INTERRUPTIBLE;
unsigned int ret;
if (!count || count > MAXIMUM_WAIT_OBJECTS) return STATUS_INVALID_PARAMETER_1;
if (TRACE_ON(sync))
{
TRACE( "wait_any %u, alertable %u, handles {%p", wait_any, alertable, handles[0] );
for (i = 1; i < count; i++) TRACE( ", %p", handles[i] );
TRACE( "}, timeout %s\n", debugstr_timeout(timeout) );
}
if ((ret = inproc_wait( count, handles, wait_any, alertable, timeout )) != STATUS_NOT_IMPLEMENTED)
This is going to fail with no server fallback with completion waits (because `completion_wait_ops` has `no_get_inproc_sync`)
That failure can have realistic consequences (like GTA 3 crashing because of an `quartz` assert caused by this function failing)
So can there be an extra `STATUS_OBJECT_TYPE_MISMATCH` check here (or actually implementing completion wait support if that's even possible)?
On Thu Jan 30 03:43:56 2025 +0000, Shmerl wrote:
Will esync patches from staging be still usable after this MR is merged in case when Wine will be running on the kernel that doesn't have `/dev/ntsync` yet? They won't be useful after kernel with it will get released, but in the interim there would still be a need for that (and also for cases of those who have older kernels installed).
I was able to compile ntsync with esync/fsync together after a bit of patching (and both fsync and ntsync at least work) so I guess that's possible (I doubt Zeb is going to keep esync in staging after this thing gets merged though)
On Thu Jan 30 03:46:59 2025 +0000, Aida Jonikienė wrote:
I was able to compile ntsync with esync/fsync together after a bit of patching (and both fsync and ntsync at least work) so I guess that's possible (I doubt Zeb is going to keep esync in staging after this thing gets merged though)
From practical standpoint for end users, if esync staging patches will be dropped together with merging this, can this MR be synced with the 6.14 kernel release? This will allow users easier transition from esync to ntsync without leaving a gap that will require manual patches wrangling and running unreleased kernel.
It's a style issue, but `linux` appears a lot in the names of functions, server calls, variables, etc. In theory the ntsync driver could be ported to macOS or FreeBSD, and it would be confusing to be using the "linux" sync functions there.
More broadly, maybe it would be clearer to use `ntsync_` instead of `linux_` or `inproc_` (i.e. `ntsync_signal_and_wait()` instead of `inproc_signal_and_wait()`).
On Thu Jan 30 11:09:57 2025 +0000, Brendan Shanks wrote:
It's a style issue, but `linux` appears a lot in the names of functions, server calls, variables, etc. In theory the ntsync driver could be ported to macOS or FreeBSD, and it would be confusing to be using the "linux" sync functions there. More broadly, maybe it would be clearer to use `ntsync_` instead of `linux_` or `inproc_` (i.e. `ntsync_signal_and_wait()` instead of `inproc_signal_and_wait()`).
I'm inclined to think that non-ntsync code will be eventually phased out to minimize bitrotting code, instead introducing a "dummy" ntsync backend (for older kernels and other OSes) in tandem with the "Linux" ntsync backend. In that perspective, `ntsync_` seems more fit if such code structuring is ever to take place.
On Thu Jan 30 11:10:53 2025 +0000, Jinoh Kang wrote:
I'm inclined to think that non-ntsync code will be eventually phased out to minimize bitrotting code, instead introducing a "dumb" ntsync backend (for older kernels and other OSes) in tandem with the "Linux" ntsync backend (from this MR). In that perspective, `ntsync_` seems more fit if such code structuring is ever to take place.
I doubt non-ntsync support will be dropped, while wine is mainly used on Linux, it also has some uses on non Linux/ntsync systems.
On Thu Jan 30 04:01:03 2025 +0000, Shmerl wrote:
From practical standpoint for end users, if esync staging patches will be dropped together with merging this, can this MR be synced with the 6.14 kernel release? This will allow users easier transition from esync to ntsync without leaving a gap that will require manual patches wrangling and running unreleased kernel.
@shmerl you can always revert patches if you use your own branch while waiting for esync/fsync patchsets to be updated on top of ntsync.
I think it's better to not disrupt wine developers because of unofficial patchsets too much.
On Thu Jan 30 03:39:37 2025 +0000, Aida Jonikienė wrote:
This is going to fail with no server fallback with completion waits (because `completion_wait_ops` has `no_get_inproc_sync`) That failure can have realistic consequences (like GTA 3 crashing because of an `quartz` assert caused by this function failing) So can there be an extra `STATUS_OBJECT_TYPE_MISMATCH` check here (or actually implementing completion wait support if that's even possible)?
Waits on completion_wait_ops objects are only used internally in NtRemoveIoCompletion[Ex], handles to them are not exposed to application and they are not waited on together with any other objects. What can appear here is completion port object itself. Implementing ntsync object for completion_wait object is probably quite not trivial, they have special scheduling rules and action on satisfied wait currently.
On Thu Jan 30 16:04:52 2025 +0000, Paul Gofman wrote:
Waits on completion_wait_ops objects are only used internally in NtRemoveIoCompletion[Ex], handles to them are not exposed to application and they are not waited on together with any other objects. What can appear here is completion port object itself. Implementing ntsync object for completion_wait object is probably quite not trivial, they have special scheduling rules and action on satisfied wait currently.
Like other internal objects they go through server_wait_for_object().
On Thu Jan 30 15:46:20 2025 +0000, Loïc Rebmeister wrote:
I doubt non-ntsync support will be dropped, while wine is mainly used on Linux, it also has some uses on non Linux/ntsync systems.
My original thought was that if we got support for another OS it would likely have subtle differences that would need a new patch set to be written. I don't know anymore if that's realistically true, but the other problem is that "ntsync", in the context of Wine, is a lot less disambiguating than it is in the context of the kernel.
From private conversation with Alexandre I believe he felt similarly cool towards using "ntsync" as a prefix in Wine. "inproc" came from that same conversation. Regardless I'd be fine with renaming anything if it'd help, but I'd like to be sure since it would be quite some effort.
On Thu Jan 30 16:42:22 2025 +0000, Elizabeth Figura wrote:
Like other internal objects they go through server_wait_for_object().
I mean that the `NtWaitForMultipleObjects()` call will always return `STATUS_OBJECT_TYPE_MISMATCH` with a completion port object under ntsync because of the `no_get_inproc_sync` handler being used (the path goes like: `object->ops->get_fast_sync()` -> `get_linux_sync_obj()` -> (ntdll.so) `get_fast_sync_obj()` -> `fast_wait()`)
The failure of `NtWaitForMultipleObjects()` gets passed to `NtRemoveIoCompletion()` (which is the function Paul mentioned above) which causes that function to return early and not call the `get_thread_completion()` server function that fills up the `value` (later `*overlapped`) parameter with data
`GetQueuedCompletionStatus()` call now returns from that completion function with a failure and a very likely NULL `*overlapped` value (that `*overlapped` value gets passed to `CONTAINING_RECORD()` macro inside quartz's `io_thread()` function inside filesource.c which causes an integer underflow that gets accidentally picked up by the `assert()` nearby which prevents the error path from even being called)
So one way of preventing this issue could be to just return `STATUS_NOT_SUPPORTED` for the `get_fast_sync()` handler in `completion_wait_ops` (the other way could be what I've suggested above)
The relevant `quartz` function could maybe also set a sane value for `req` (so that the error path gets actually taken)
I mean that the `NtWaitForMultipleObjects()` call will always return `STATUS_OBJECT_TYPE_MISMATCH` with a completion port object under ntsync because of the `no_get_inproc_sync` handler being used (the path goes like: `object->ops->get_fast_sync()` -> `get_linux_sync_obj()` -> (ntdll.so) `get_fast_sync_obj()` -> `fast_wait()`)
It should never be possible for NtWaitForMultipleObjects() to be called with a completion_wait object. NtRemoveIoCompletion() uses server_wait_for_object(), and the wait handle is never exposed outside of that function.
On Thu Jan 30 21:29:04 2025 +0000, Elizabeth Figura wrote:
I mean that the `NtWaitForMultipleObjects()` call will always return
`STATUS_OBJECT_TYPE_MISMATCH` with a completion port object under ntsync because of the `no_get_inproc_sync` handler being used (the path goes like: `object->ops->get_fast_sync()` -> `get_linux_sync_obj()` -> (ntdll.so) `get_fast_sync_obj()` -> `fast_wait()`) It should never be possible for NtWaitForMultipleObjects() to be called with a completion_wait object. NtRemoveIoCompletion() uses server_wait_for_object(), and the wait handle is never exposed outside of that function.
The exact same code path seems to affect the Amazon Games app (it just closes when trying to open it with no obvious error messages)
The exact same code path seems to affect the Amazon Games app (it just closes when trying to open it with no obvious error messages)
How is NtWaitForMultipleObjects() being called with a completion handle? Can you please attach a log with +sync,+server,+pid?
Can you please attach a log with +sync,+server,+pid?
Here's a relevant snippet of my log (which should be good enough to reveal the issue): [gta3_ntsync_assert_snippet.log](/uploads/08e3614916cdccc931ab45107fe5e361/gta3_ntsync_assert_snippet.log)
On Tue Feb 4 08:42:06 2025 +0000, Aida Jonikienė wrote:
Can you please attach a log with +sync,+server,+pid?
Here's a relevant snippet of my log (which should be good enough to reveal the issue): [gta3_ntsync_assert_snippet.log](/uploads/08e3614916cdccc931ab45107fe5e361/gta3_ntsync_assert_snippet.log)
That log is from wine-staging. Can you reproduce the same issue with upstream wine with only !7726 applied?
On Tue Feb 4 10:42:38 2025 +0000, Jinoh Kang wrote:
That log is from wine-staging. Can you reproduce the same issue with upstream wine with only !7726 applied?
``` 0020:0130:err:sync:get_fast_sync_obj get_linux_sync_obj server ret 0xc0000024 0020:0130:err:sync:fast_wait get_fast_sync_obj ret 0xc0000024 ```
That's also not even this patch set.
On Tue Feb 4 19:03:36 2025 +0000, Elizabeth Figura wrote:
0020:0130:err:sync:get_fast_sync_obj get_linux_sync_obj server ret 0xc0000024 0020:0130:err:sync:fast_wait get_fast_sync_obj ret 0xc0000024
That's also not even this patch set.
I just realized this patchset (as well as `ntsync7` from the secret Wine repo) changes the `NtWaitForSingleObject()` call to `server_wait_for_object()` in the `NtRemoveIoCompletion()` function (so the problematic `fast_wait()` call is avoided)
There might be some apps messing with some NT internals which might re-trigger that issue (but that's basically an impossibility I guess)
That's also not even this patch set.
I added some debug prints there to explain my problem better (but this is also the older `ntsync5` patchset which has different behavior as I've realized above)