On 6/29/21 5:04 PM, Rémi Bernon wrote:
On 6/29/21 7:31 PM, Zebediah Figura (she/her) wrote:
On 6/29/21 12:25 PM, Rémi Bernon wrote:
On 6/29/21 7:17 PM, Zebediah Figura (she/her) wrote:
On 6/29/21 2:21 AM, Rémi Bernon wrote:
Signed-off-by: Rémi Bernon rbernon@codeweavers.com
dlls/winebus.sys/main.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/dlls/winebus.sys/main.c b/dlls/winebus.sys/main.c index 897be25ed51..f3e63e87aa2 100644 --- a/dlls/winebus.sys/main.c +++ b/dlls/winebus.sys/main.c @@ -374,10 +374,6 @@ void bus_unlink_hid_device(DEVICE_OBJECT *device) EnterCriticalSection(&device_list_cs); list_remove(&pnp_device->entry); LeaveCriticalSection(&device_list_cs);
- EnterCriticalSection(&ext->cs); - ext->removed = TRUE; - LeaveCriticalSection(&ext->cs); } void bus_remove_hid_device(DEVICE_OBJECT *device) @@ -700,11 +696,21 @@ static NTSTATUS pdo_pnp_dispatch(DEVICE_OBJECT *device, IRP *irp) status = STATUS_SUCCESS; break; + case IRP_MN_SURPRISE_REMOVAL: + EnterCriticalSection(&ext->cs); + remove_pending_irps(device); + ext->removed = TRUE; + LeaveCriticalSection(&ext->cs); + break;
This is fine, I guess, but it makes the lock in remove_pending_irps() redundant.
case IRP_MN_REMOVE_DEVICE: { struct pnp_device *pnp_device = ext->pnp_device; + EnterCriticalSection(&ext->cs); remove_pending_irps(device); + ext->removed = TRUE; + LeaveCriticalSection(&ext->cs); ext->vtbl->free_device(device);
I believe that ntoskrnl should guarantee we won't receive any IRPs after IRP_MN_REMOVE_DEVICE, which makes this hunk redundant.
Although we do, I don't think there's guarantees that IRP_MN_SURPRISE_REMOVAL is always sent first.
According to [1] the sequence could be IRP_MN_START_DEVICE -> IRP_MN_REMOVE_DEVICE -but probably no IRP could be queued in between- or IRP_MN_QUERY_REMOVE_DEVICE -> IRP_MN_REMOVE_DEVICE -which we don't support.
[1] https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/understandi...
Right, but either way we shouldn't get any IRPs *after* IRP_MN_REMOVE_DEVICE. So nothing should be checking ext->removed, or trying to access the CS.
I don't see exactly what prevents this, but alright. FWIW the CS is currently acquired in IRP_MN_REMOVE_DEVICE, even before these patches.
The only way I can see that this is indeed unnecessary is if IRP_MN_REMOVE_DEVICE is special and only sent when all the handles to the device have been closed and all IRPs have been processed (either completed or marked pending).
(And if there's no such guarantees I can easily see how this could go wrong if IRP_MN_REMOVE_DEVICE completes before another IRP finished its processing)
As far as I can tell from the IRP_MN_REMOVE_DEVICE call sites, we just send it when the device is removed from the list, and it doesn't look obvious to me that there's such guarantees.
My understanding is that this *is* guaranteed, if only by convention, although on rereading all the documentation I can find I'm less sure.
According to [2] the children always get remove IRPs first, and we have tests for this and everything. This implies (though of course I can't find it stated outright) that by the time we get IRP_MN_REMOVE_DEVICE nothing should be sending us a new IRP (no user space handles, no child devices...) But in theory any kernel component could find our device and send it IRPs...
I suspect that ultimately the kernel doesn't stop you from sending IRPs on a removed device if you're still holding a reference to the device object. In theory I could even see components actually doing that, although I can't imagine why they would.
On the other hand, you're supposed to clean up *all* your allocations during IRP_MN_REMOVE_DEVICE. But that excludes the device extension, and in theory you could put things there...
In the case of winebus specifically, it only makes sense for winehid to be sending us IRPs (on behalf of hidclass), and hidclass shouldn't send any IRPs once it's shut down. See also my comment about hidclass's expectations in [3]. In fact, if we have any pending IRPs in IRP_MN_REMOVE_DEVICE, something's already gone wrong.
We may want to take this approach generally, i.e. assume that IRPs should be complete by IRP_MN_REMOVE_DEVICE and that no new ones should be queued, and start complaining very vocally if any native (or builtin) drivers violate this. And if that situation does occur we can start examining how native drivers deal with it.
From my years of working with DirectShow this pattern becomes familiar: design a complex interface involving several layers of callbacks, and then completely underspecify what functions are valid to call when and from which context, and how things should be synchronized. You'd think that the kernel would be more careful about this, but no.
[2] https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/handling-an...
[3] https://www.winehq.org/pipermail/wine-devel/2021-June/189142.html