Is there something wrong with this? FWIW, my intention is to avoid doing mapping in the drivers as much as possible, especially host-to-win32 rect mapping, because I want to introduce non-bijective mappings for virtual display settings.
I think it is easier and cleaner to keep all the mappings in win32u. The drivers would work in "raw" coordinates (i.e. raw monitor DPI, using the host/physical display mode), while win32u would work in "virtual" coordinates (i.e. effective monitor DPI or window DPI, using the client/current display mode). win32u would map rects from virtual to raw before calling the drivers, and from raw to virtual when called from the drivers (mostly only in NtUserSetRawWindowPos).
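To make the split concrete, here is a rough sketch of the kind of mapping I have in mind. Everything in it is hypothetical and only for illustration: `struct rect`, `struct monitor_mapping`, `map_coord`, `map_rect_virt_to_raw` and `map_rect_raw_to_virt` are made-up names, not the actual win32u types or helpers, and it assumes a single monitor with a simple linear scale between its virtual and raw rects.

```c
/* Hypothetical stand-ins for illustration, not Wine's actual types. */
struct rect { int left, top, right, bottom; };

struct monitor_mapping
{
    struct rect virt; /* monitor rect in virtual (client display mode) coordinates */
    struct rect raw;  /* monitor rect in raw (host display mode) coordinates */
};

/* Map one coordinate from one range to another, rounding to nearest. */
static int map_coord( int value, int from_origin, int from_size, int to_origin, int to_size )
{
    long long num = (long long)(value - from_origin) * to_size;
    if (num >= 0) num += from_size / 2;
    else num -= from_size / 2;
    return to_origin + (int)(num / from_size);
}

static struct rect map_rect( struct rect r, const struct rect *from, const struct rect *to )
{
    struct rect out;
    out.left   = map_coord( r.left, from->left, from->right - from->left, to->left, to->right - to->left );
    out.right  = map_coord( r.right, from->left, from->right - from->left, to->left, to->right - to->left );
    out.top    = map_coord( r.top, from->top, from->bottom - from->top, to->top, to->bottom - to->top );
    out.bottom = map_coord( r.bottom, from->top, from->bottom - from->top, to->top, to->bottom - to->top );
    return out;
}

/* What win32u would do before calling into a driver. */
static struct rect map_rect_virt_to_raw( const struct monitor_mapping *m, struct rect r )
{
    return map_rect( r, &m->virt, &m->raw );
}

/* What win32u would do when called back from a driver. */
static struct rect map_rect_raw_to_virt( const struct monitor_mapping *m, struct rect r )
{
    return map_rect( r, &m->raw, &m->virt );
}
```

With a 1280x720 virtual mode on a 2560x1440 host mode, a virtual rect (10,20)-(110,220) maps to raw (20,40)-(220,440). Going the other way, raw x=21 and raw x=22 both map to virtual x=11, so the round trip is lossy, which is why I would rather have a single place doing the conversions instead of each driver mapping back and forth.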
For mouse input, wineserver would be informed of the virtual/raw DPI mapping transformations as well. Ultimately, that could perhaps be done by moving the monitor rects fully into wineserver, which could then probably be solely in charge of doing all the mappings.