This introduces a faster implementation of signal and wait operations on NT events, semaphores, and mutexes, which improves performance to native levels for a wide variety of games and other applications.
The goal here is similar to the long-standing out-of-tree "esync" and "fsync" patch sets, but without the flaws that make those patch sets not upstreamable.
The Linux "ntsync" driver is not currently released. It has been accepted into the trunk Linux tree for 6.14, so barring any extraordinary circumstances, the API is frozen and it will be released in its current form in about 2 months. Since it has passed all relevant reviewers on the kernel side, and the API is all but released, it seems there is no reason any more not to submit the Wine side to match.
Some important notes:
* This patch set does *not* include any way to disable ntsync support, since that kind of configuration often seems to be dispreferred where not necessary. In essence, ntsync should just work everywhere.
Probably the easiest way to effectively disable ntsync, for the purposes of testing, is to chmod the /dev/ntsync device to prevent its being opened. Regardless, a Wine switch to disable ntsync can be added simply enough. Note that it should probably not take the form of a registry key, however, since it needs to be easily accessible from the server itself.
* It is, generally speaking, not possible for only some objects, or some processes, to have backing ntsync objects, while others use the old server path. The esync/fsync patch sets explicitly protected against this by making sure every process had a consistent view of whether esync was enabled. This is not provided here, since no switch is provided to toggle ntsync, and it should not be possible to get into such an inconsistent state without gross misconfiguration.
* Similarly, no diagnostic messages are provided to note that ntsync is in use, or not in use. These messages are part of esync/fsync, as well as part of ntsync "testing" trees unofficially distributed. However, if ntsync is working correctly, no message should be necessary.
The basic structure is:
* Each type of server object which can be waited on by the client (including events, semaphores, mutexes, but also other types such as processes, files) must store an "inproc_sync" object.
This "inproc_sync" object is a full server object (note that this differs from esync/fsync). A vector and server request is introduced to retrieve an NT handle to this object from an arbitrary NT handle.
Since the actual ntsync objects are simply distinct file descriptions, we then call get_handle_fd from the client to retrieve an fd to the object, and then perform ioctls on it.
* Objects signaled by the server (processes, files, etc) perform ntsync ioctls on that object. The backing object in all such cases is simply an event.
* Signal and wait operations on the client side attempt to defer to an "inproc_*" function, falling back to the server implementation if it returns STATUS_NOT_IMPLEMENTED. This mirrors how in-process synchronization objects (critical sections, SRW locks, etc) used to be implemented—attempting to use an architecture-specific "fast_*" function and falling back if it returned STATUS_NOT_IMPLEMENTED.
* The inproc_sync handles, once retrieved, are cached per-process. This caching takes a similar form to the fd cache. It does not reuse the same infrastructure, however.
The primary reason for this is that the fd cache is designed to fit within a 64-bit value and uses 64-bit atomic operations to ensure consistency. However, we need to store more than 64 bits of information. [We also need to modify them after caching, in order to correctly implement handle closing—see below.]
The secondary reason is that retrieving the ntsync fd from the inproc_sync handle itself uses the fd cache.
* In order to keep the Linux driver simple, it does not implement access flags (EVENT_MODIFY_STATE etc.) Instead, the flags are cached locally and validated there. This too mirrors the fd cache. Note that this means that a malicious process can now modify objects it should not be able modify—which is less true than it is with wineserver—but this is no different from the way other objects (notably fds) are handled, and would require manual syscalls.
* In order to achieve correct behaviour related to closing objects while they are used, this patch set essentially relies on refcounting. This is broadly true of the server as well, but because we need to avoid server calls when performing object operations, significantly more care must be taken.
In particular, because waits need to be interruptable by signals and then be restarted, we need the backing ntsync object to remain valid until all users are done with it. On a process level, this is achieved by letting multiple processes own handles to the underlying inproc_sync server object.
On a thread level, multiple simultaneous calls need to refcount the process's local handle. This refcount is stored in the sync object cache. When it reaches zero, the cache is cleared.
Punting this behaviour to the Linux driver would have introduced a great deal more complexity, which is best kept in userspace and out of the kernel.
* The cache is, as such, treated as a cache. The penultimate commit, which introduces client support but does not yet cache the objects, effectively illustrates this by never actually caching anything, and retrieving a new NT handle and fd every time.
* Certain waits, on internal handles (such as async, startup_info, completion), are delegated to the server even when ntsync is used. Those server objects do not create an underlying ntsync object.
-- v3: ntdll: Cache in-process synchronization objects. ntdll: Use server_wait_for_object() when waiting on only the queue object. ntdll: Use in-process synchronization objects. ntdll: Introduce a helper to wait on an internal server handle. server: Allow creating an event object for client-side user APC signaling. server: Introduce select_inproc_queue and unselect_inproc_queue requests. server: Add a request to retrieve the in-process synchronization object from a handle. server: Create in-process synchronization objects for fd-based objects. server: Create in-process synchronization objects for timers. server: Create in-process synchronization objects for threads. server: Create in-process synchronization objects for message queues. server: Create in-process synchronization objects for jobs. server: Create in-process synchronization objects for processes. server: Create in-process synchronization objects for keyed events. server: Create in-process synchronization objects for device managers. server: Create in-process synchronization objects for debug objects. server: Create in-process synchronization objects for console servers. server: Create in-process synchronization objects for consoles. server: Create in-process synchronization objects for completion ports. server: Create in-process synchronization objects for mutexes. server: Create in-process synchronization objects for semaphores. server: Create in-process synchronization objects for events. server: Add an object operation to retrieve an in-process synchronization object. ntdll: Retrieve and cache an ntsync device in wait calls. ntdll: Add stub functions for in-process synchronization. ntdll: Add some traces to synchronization methods.
This merge request has too many patches to be relayed via email. Please visit the URL below to see the contents of the merge request. https://gitlab.winehq.org/wine/wine/-/merge_requests/7226
On Thu Feb 6 12:19:19 2025 +0000, Rémi Bernon wrote:
ioctl returns an int, not a NTSTATUS, it's also not very much used, what about getting rid of it (here and elsewhere below):
if (ioctl( obj, NTSYNC_IOC_SEM_RELEASE, &count ) < 0)
No idea why I wrote it that way; changed in v3.
On Thu Feb 6 12:19:20 2025 +0000, Rémi Bernon wrote:
As you've changed the `enum inproc_sync_type` definition, INPROC_SYNC_AUTO_EVENT is now 0, this will break this check I think? Maybe add a specific INPROC_SYNC_UNKNOWN = 0 enum value?
Right, I should have known I had it that way for a reason...
On Mon Feb 10 21:43:53 2025 +0000, Elizabeth Figura wrote:
That's a valid concern, but I'm more than a little hesitant to *not* lazily initialize just because of that. I'd rather just add a comment explaining why we lazily create and that the state is identical to what it is at creation time.
I don't think we should preemptively guard against issues like that at the cost of making the code so much difficult to follow. If wineserver / kernel state usages are exclusive, it should IMO be explicit in the code. If we still need and can make things lazier later then we can consider it, but maybe it's not even going to be needed. If it really is, it should probably be hidden in the inproc sync logic (like only create the fd lazily).
On Tue Feb 11 09:16:46 2025 +0000, Rémi Bernon wrote:
I don't think we should preventively guard against issues like that at the cost of making the code so much difficult to follow. If wineserver / kernel state usages are exclusive, it should IMO be explicit in the code (not through comments). If we still need and can make things lazier later then we can consider it, but maybe it's not even going to be needed. If it really is, it should probably be hidden in the inproc sync logic (like only create the fd lazily).
But it's not preemptive, there are multiple applications that open huge quantities of handles that they never use.
Moving the lazy initialization to the inproc_sync object itself would alleviate it partially, but we'd still be allocating tons of memory in wineserver that we don't need. Lazily allocating seems like a worthwhile tradeoff, when we can explain the arrangement in comments.
Did something break during a rebase? This doesn't appear to work for me (on 6.14-rc2, with CONFIG_NTSYNC=y). `/dev/ntsync` exists but `lsof` doesn't find any processes that have it open even while wine is running. This MR also has almost no debug logs so it's hard to say what's going wrong.
edit:
`device` is -1 from `get_linux_sync_device`
you might need to check the permission of /dev/ntsync first. might be you did not has permission to read that device.
On Thu Feb 13 09:09:26 2025 +0000, Chunhao Hung wrote:
you might need to check the permission of /dev/ntsync first. might be you did not has permission to read that device.
Indeed, but why does `/dev/ntsync` require sudo to read? I just used chmod to give my user permission for it. Is this an issue with the kernel module?
On Thu Feb 13 09:09:26 2025 +0000, llyyr wrote:
Indeed, but why does `/dev/ntsync` require sudo to read? I just used chmod to give my user permission for it. Is this an issue with the kernel module?
usually you can change it by udev rule, it just default only root could read due to device usually is relative to hardware. This is mine udev rule ``` KERNEL=="ntsync", GROUP="input", MODE="0666" ```
On Thu Feb 13 09:17:42 2025 +0000, Chunhao Hung wrote:
usually you can change it by udev rule, it just default only root could read due to device usually is relative to hardware. This is mine udev rule
KERNEL=="ntsync", GROUP="input", MODE="0666"
Is it possible to fix in the kernel module? I'm sure it's going to be quite some time till distros provide such udev rule OOTB, so the functional simply won't be working, potentially leading to a stream of bugreports and unhappy users.
On Thu Feb 13 09:22:20 2025 +0000, Konstantin Kharlamov wrote:
Is it possible to fix in the kernel module? I'm sure it's going to be quite some time till distros provide such udev rule OOTB, so the functional simply won't be working, potentially leading to a stream of bugreports and unhappy users.
I agree, I think it'd be best to fix the kernel module while it's still in rc stages.
On Thu Feb 13 11:00:56 2025 +0000, llyyr wrote:
I agree, I think it'd be best to fix the kernel module while it's still in rc stages.
If ntsync isn't available then wine should continue working as before; I don't see why there would be a "stream of bug reports and unhappy users".
That said, I don't think there's a reason not to create the device with permissive bits by default. I think I only didn't because I was having trouble figuring out how to actually do that from the kernel.
If ntsync isn't available then wine should continue working as before; I don't see why there would be a "stream of bug reports and unhappy users".
Because similarly to @llyyr here people would be checking out on whether ntsync in use, and since it isn't they would be creating reports about this.
If ntsync isn't available then wine should continue working as before
"Before" is working esync/fsync, and those patches probably won't be carried once ntsync is merged. So the "before" is actually server side sync. I understand wine doesn't need to care about this, but it'd still be a good idea to create /dev/ntsync with more permissive bits by default.
On Thu Feb 13 18:31:12 2025 +0000, llyyr wrote:
If ntsync isn't available then wine should continue working as before
"Before" right now is working esync/fsync, and those patches probably won't be carried once ntsync is merged. So the "before" after this MR is actually server side sync. I understand wine doesn't need to care about this, but it'd still be a good idea to create /dev/ntsync with more permissive bits by default.
I'd say especially if Wine is the primary and de-facto the only user of /dev/ntsync at this point, it makes sense to give ntsync more relaxed permissions by default. Is it even possible on the kernel side or kernel delegates this to udev?
On Thu Feb 13 18:43:38 2025 +0000, Shmerl wrote:
I'd say especially if Wine is the primary and de-facto the only user of /dev/ntsync at this point, it makes sense to give ntsync more relaxed permissions by default. Is it even possible on the kernel side or kernel delegates this to udev?
You do it in the kernel driver using the `module_param` macro.
That said, I don't think there's a reason not to create the device with permissive bits by default. I think I only didn't because I was having trouble figuring out how to actually do that from the kernel.
I'd guess setting .mode in struct misc_device should do that? Unless something is overriding it in userspace by default on device file creation of course.
On Tue Feb 11 21:15:14 2025 +0000, Elizabeth Figura wrote:
But it's not preemptive, there are multiple applications that open huge quantities of handles that they never use. Moving the lazy initialization to the inproc_sync object itself would alleviate it partially, but we'd still be allocating tons of memory in wineserver that we don't need. Lazily allocating seems like a worthwhile tradeoff, when we can explain the arrangement in comments.
If the problem is the amount of memory being allocated maybe we should probably consider whether this can be done using a more lightweight and specific mechanism?
It doesn't seem that the inproc_sync objects and fds are really useful as standalone objects from wineserver side. Except for a couple of specific calls which use ioctl to signal/reset the fds they are not used for anything else?
I understand they are there because it's then convenient for handle allocation and sending fds to the client side, but perhaps we should consider doing that differently? If you could just keep the ntsync fds and the inproc type around, and send the fds to the client side through `send_client_fd`/`receive_fd` and without going through NT handles that would reduce the memory usage by a lot.
I think that could also be interesting to release pressure from the fd cache that you are here also using for waitable objects. And by the way, having two different caches being used at the same time also seems wrong to me. The inproc subsystem should be self-sufficient and use its own cache to cache the fds it needs to cache.
Rémi Bernon (@rbernon) commented about server/inproc_sync.c:
file_set_error();
return NULL;
- }
- if (!(device = alloc_object( &linux_device_ops )))
- {
close( unix_fd );
return NULL;
- }
- if (!(device->fd = create_anonymous_fd( &inproc_sync_fd_ops, unix_fd, &device->obj, 0 )))
- {
release_object( device );
return NULL;
- }
- allow_fd_caching( device->fd );
I don't see any reason to use the fd cache for this, it's going to be retrieved only once and already going to be cached separately.
Rémi Bernon (@rbernon) commented about dlls/ntdll/unix/sync.c:
/* someone beat us to it */
if (needs_close) close( fd );
NtClose( handle );
}
/* otherwise don't close the device */
}
else
{
InterlockedCompareExchange( &device, -1, -2 );
NtClose( handle );
}
}
else
{
InterlockedCompareExchange( &device, -1, -2 );
}
The comment above still applies and I don't think we should be leaking the handle or using the fd cache her. Then we don't need to keep or leak the handle and this could still thus be simplified to:
```suggestion:-21+0 if (ret || server_get_unix_fd( handle, 0, &fd, &needs_close, NULL, NULL )) fd = -1; if (InterlockedCompareExchange( &device, fd, -2 ) != -2 && fd != -1) close( fd ); /* someone beat us to it */ if (!ret) NtClose( handle ); ```
Optionally with an `assert( fd == -1 || needs_close );` if you think it's worth being careful but I don't think it's necessary.
On Fri Feb 14 16:04:50 2025 +0000, Rémi Bernon wrote:
I don't see any reason to use the fd cache for this, it's going to be retrieved only once and already going to be cached separately.
I added it because the fd *is* cacheable, so it's more correct from a certain point of view, but indeed it doesn't really matter whether we take advantage of the cache or not.
If the problem is the amount of memory being allocated maybe we should probably consider whether this can be done using a more lightweight and specific mechanism?
I wouldn't necessarily say that memory allocation is a problem in general, it's just a couple of particularly ill-behaved applications I'm worried about. As you say, I'm trying to accommodate some known bad cases while using the most straightforward approach to the rest.
It doesn't seem that the inproc_sync objects and fds are really useful as standalone objects from wineserver side. Except for a couple of specific calls which use ioctl to signal/reset the fds they are not used for anything else?
I understand they are there because it's then convenient for handle allocation and sending fds to the client side, but perhaps we should consider doing that differently? If you could just keep the ntsync fds and the inproc type around, and send the fds to the client side through `send_client_fd`/`receive_fd` and without going through NT handles that would reduce the memory usage by a lot.
We could demote the objects from being full server objects, and instead use send_client_fd() and receive_fd() directly. That wasn't done partly for historical reasons (in an earlier revision, I may have mentioned, the objects weren't fds but rather held an index which the server managed—so we really needed this design) and partly because that would make this code one out of one place which uses receive_fd() externally. Also, see the points raised above and below in this comment.
Since that amounts to a significant rewrite of the patch set I'd kind of appreciate confirmation from Alexandre that we want to do this before I go about it.
I think that could also be interesting to release pressure from the fd cache that you are here also using for waitable objects. And by the way, having two different caches being used at the same time also seems wrong to me. The inproc subsystem should be self-sufficient and use its own cache to cache the fds it needs to cache.
I don't really see why two caches is wrong? Note also that the fd cache uses a flat array indexed by handle value, so simply caching a handle elsewhere instead of the fd cache actually *increases* memory usage. If we were to remove the handles altogether, that would potentially help (though it's questionable by how much).
On Fri Feb 14 16:04:50 2025 +0000, Rémi Bernon wrote:
The comment above still applies and I don't think we should be leaking the handle or using the fd cache her. Then we don't need to keep or leak the handle and this could still thus be simplified to:
if (ret || server_get_unix_fd( handle, 0, &fd, &needs_close, NULL, NULL )) fd = -1; if (InterlockedCompareExchange( &device, fd, -2 ) != -2 && fd != -1) close( fd ); /* someone beat us to it */ if (!ret) NtClose( handle );
Optionally with an `assert( fd == -1 || needs_close );` if you think it's worth being careful but I don't think it's necessary.
I don't see why this is an improvement? There's not any problem with "leaking" an object that lasts the entire duration of the process (we do it elsewhere), and if there was this would still mean we're "leaking" the fd. This just makes the code more unidiomatic and adds reliance on implementation details.
Sorry for the bother, you may already know this, but commit https://gitlab.winehq.org/wine/wine/-/commit/3c29dc1eb86cbe96697193cf6799373... breaks NTSync. It does not create descriptors to ``/dev/ntsync`` by processes other than wineserver. With ``WINEDEBUG=+server`` following messages appear despite correct permissions to the ``/dev/ntsync`` file: ``` 0024: get_linux_sync_device( ) 0024: get_linux_sync_device( ) = ACCESS_DENIED { handle=0000 } 002c: get_linux_sync_device( ) 002c: get_linux_sync_device( ) = ACCESS_DENIED { handle=0000 } 003c: get_linux_sync_device( ) ```
After revert the specified commit, everything works. Thanks!
On Sun Feb 16 15:31:16 2025 +0000, Vasiliy Stelmachenok wrote:
Sorry for the bother, you may already know this, but commit https://gitlab.winehq.org/wine/wine/-/commit/3c29dc1eb86cbe96697193cf6799373... breaks NTSync. It does not create descriptors to ``/dev/ntsync`` by processes other than wineserver. With ``WINEDEBUG=+server`` following messages appear despite correct permissions to the ``/dev/ntsync`` file:
0024: get_linux_sync_device( ) 0024: get_linux_sync_device( ) = ACCESS_DENIED { handle=0000 } 002c: get_linux_sync_device( ) 002c: get_linux_sync_device( ) = ACCESS_DENIED { handle=0000 } 003c: get_linux_sync_device( )
After revert the specified commit, everything works. Thanks!
Does replacing `alloc_handle()` call with `alloc_handle_no_access_check()` in `get_linux_sync_device()` help with the issue?
On Sun Feb 16 15:31:16 2025 +0000, Aida Jonikienė wrote:
Does replacing `alloc_handle()` call with `alloc_handle_no_access_check()` in `get_linux_sync_device()` help with the issue?
Replacing `alloc_handle()` with `alloc_handle_no_access_check()` only in `get_linux_sync_device()` resulted in errors, but if replace all 3 occurrences of `alloc_handle()` in patch, NTSync actually works again. Thanks!
On Sun Feb 16 16:36:04 2025 +0000, Vasiliy Stelmachenok wrote:
Replacing `alloc_handle()` with `alloc_handle_no_access_check()` only in `get_linux_sync_device()` resulted in errors, but if replace all 3 occurrences of `alloc_handle()` in patch, NTSync actually works again. Thanks!
It's not required in `get_inproc_alert_event`, because it has a non-zero access mask already (`SYNCHRONIZE`).