On 6/21/21 12:29 PM, Rémi Bernon wrote:
On 6/18/21 11:06 PM, Zebediah Figura (she/her) wrote:
On 6/18/21 1:49 PM, Rémi Bernon wrote:
On 6/18/21 5:59 PM, Zebediah Figura (she/her) wrote:
On 6/18/21 7:06 AM, Rémi Bernon wrote:
So that mini driver gets notified first and has a chance to cancel pending IRPs.
Which minidriver?
Isn't it the responsibility of the child to terminate all pending IRPs (and disallow further ones) on removal? See also [0], [1], [2], which say that queued requests should be completed in (both, for some reason) IRP_MN_SURPRISE_REMOVAL and IRP_MN_REMOVE_DEVICE.
I think this deserves tests.
[0] https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/removing-a-...
[1] https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/handling-an...
[2] https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/removing-a-...
I really don't have the big picture yet, so I meant the one implemented in driver_hid.c. If it doesn't cancel the IRP it has queued on device removal, the test never completes the device removal on Windows.
On Wine, driver_hid never receives the IRP_MN_SURPRISE_REMOVAL if we don't call it before removing the "children" in ntoskrnl.exe. I don't really understand who are the children there, but driver_hid apparently isn't.
Allow me to explain in detail. Apologies if any of this is old news to you. If nothing else I'll adapt it to a wiki page later ;-)
TLC PDO winehid.sys + hidclass.sys HID FDO
HID PDO winebus.sys root FDO
root PDO ntoskrnl.exe [PNP manager]
In the case of the HID test driver it's a little different:
TLC PDO driver_hid.dll + hidclass.sys root/HID FDO
root/HID PDO ntoskrnl.exe [PNP manager]
A root PNP device is the one passed to AddDevice for a root PNP driver. Usually it's intended to enumerate children; here we kind of cheat and make the root PnP device the same device as the HID device. It's a bit weird but it works.
The TLC (top-level collection) PDO is managed entirely within hidclass.sys; the minidriver never interacts with it. In theory there's supposed to be one or more of these; we don't handle the "or more" case yet.
The application only ever interacts with the TLC PDO. It's actually impossible to open the HID device stack (IIRC it fails with STATUS_UNSUCCESSFUL). In response to read/write/hidd ioctls, hidclass creates new irps and sends them to the HID device stack, i.e. the "HID FDO/HID PDO" pair in the above diagrams.
[Note that IRPs are always sent to the top of a stack first, so in this case the minidriver has a chance to handle it. In a real HID driver the minidriver would always handle it itself, rather than sending it down to the PDO. You can mentally replace winehid with hidusb, and winebus with (say) usbccgp. Obviously usbccgp doesn't know anything about HID. Our architecture is a little weird, though, for various reasons.]
There is usually no TLC FDO, but in theory there could be, and in practice I believe mouse and keyboard drivers attach to the TLC PDO.
On reflection, I think I may have confused myself, so this patch actually makes more sense than it initially did. It'd still be nice to have tests, though, and the commit message should ideally be more specific.
Thanks, this is definitely not old news, although it's still quite blurry.
From [1] I understand that the hid minidriver lives at the HID FDO level, which means that (minidriver, hidclass.sys) act as the bus FDO driver here.
Yep, that's correct.
Reading [2] or [3] doesn't make it completely explicit if the children should receive the removal IRP first, but from what I understand it is how it's generally supposed to work.
So [4] actually explicitly says this. It's also something I gleaned from trying to run some driver, I think it may have even been the toaster sample.
I added some quick tests here, and they seem to confirm it.
With that understanding I now believe it's correct to call the IRP on the children of the device first.
The problem I'm facing here comes from the way hidclass.sys PDO handles its removal. It waits for the reading thread to complete, which may be waiting on IRPs queued in the HID FDO.
We could either:
- call the surprise removal first on the device stack, then call the
normal removal on the stack, giving a chance for the HID FDO to cancel its queued IRPs. or,
- combine the waits between halt_event and the pending IRP in the HID
PDO device thread, so that the thread can terminate even with a pending IRP.
So it turns out that, as far as I can tell, the intended solution is mostly (1). Device removal is extremely complicated in Windows, and I've already gotten it wrong multiple times, but after some extensive reading and testing here's what I've found:
* When a device is reported missing via IoInvalidateDeviceRelations, IRP_MN_SURPRISE_REMOVAL is sent. And it turns out that this patch is sort of correct: it is indeed sent to the parent before IRP_MN_REMOVE_DEVICE is sent to the child. However, children are still supposed to receive IRP_MN_SURPRISE_REMOVAL before their parents.
It's a bit tricky to actually test this, though. You need a function driver, and that made my tests a lot more difficult (it wreaks havoc with WM_DEVICECHANGE for some reason). So I'm willing to retract my comment demanding more tests.
* And it is also correct that a lower (parent) device is supposed to complete any IRP under its jurisdiction during IRP_MN_SURPRISE_REMOVAL. [2] prescribes this, and the toaster sample bears it out.
* When the parent device is removed via SetupDiRemoveDevice, it does not receive IRP_MN_SURPRISE_REMOVAL. Instead, it receives IRP_MN_QUERY_REMOVE. This is sent recursively, just like IRP_MN_SURPRISE_REMOVAL, and if successful IRP_MN_REMOVE_DEVICE is sent.
* Microsoft doesn't document that you're supposed to complete IRPs in IRP_MN_QUERY_REMOVE. But a quick test shows that hidclass depends on it, and all of the few samples in the Windows 7 WDK that I checked seem to do so as well. I'll have to check if the native third-party HID driver I have does as well...
Unrelated, but I'm not sure to understand why the HID PDO has to re-create IRPs every time it needs to call the minidriver (except for the device thread of course), wouldn't it be possible to just use the normal IRP propagation down the stack?
The tricky thing about it is that it's not the same stack. There's a driver stack, and then there's *the* driver stack, and unfortunately I don't even think that Microsoft uses different terminology to refer to either.
Generally speaking, an IRP can only be sent down a single FDO->(filter)->PDO chain. But the PDO might need to make some query about the driver stack below it, e.g. as a HID device might need to make a USB request.
[1] https://docs.microsoft.com/en-us/windows-hardware/drivers/hid/hid-architectu...
[2] https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/handling-an...
[3] https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/removing-a-...
[4] https://docs.microsoft.com/en-us/windows-hardware/drivers/kernel/handling-an...