Is the above in the direction you are envisioning?
More or less, though I haven't thought about it much. IMO win32u should be the authority and only tell drivers what to do (create/destroy a monitor window, map a monitor window or a portion of the host screen to a given virtual screen area, etc).
Basically moving toward a custom compositing engine, because we'll likely need that at some point (think implementing DirectComposition [1] for instance).
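To make the split of responsibilities a bit more concrete, here is a rough sketch of what "win32u decides, the driver executes" could look like. Everything in it is made up for illustration: `struct monitor_driver_ops`, `apply_monitor_layout` and the toy `fake_driver` are hypothetical names, not existing Wine interfaces, and a real design would certainly differ.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical; none of them exist in Wine. */
struct vrect { int left, top, right, bottom; };

/* Operations a driver would expose. win32u decides when to call them. */
struct monitor_driver_ops
{
    int  (*create_monitor_window)(void *driver, int monitor_id);
    void (*destroy_monitor_window)(void *driver, int monitor_id);
    /* Present the given virtual screen area in the monitor's host window. */
    int  (*map_monitor_window)(void *driver, int monitor_id, const struct vrect *area);
};

/* Toy driver that only records what it was told to do. */
struct fake_driver { int created, mapped; struct vrect last_area; };

static int fake_create(void *driver, int monitor_id)
{
    struct fake_driver *drv = driver;
    (void)monitor_id;
    drv->created++;
    return 0;
}

static void fake_destroy(void *driver, int monitor_id)
{
    struct fake_driver *drv = driver;
    (void)monitor_id;
    drv->created--;
}

static int fake_map(void *driver, int monitor_id, const struct vrect *area)
{
    struct fake_driver *drv = driver;
    (void)monitor_id;
    drv->mapped++;
    drv->last_area = *area;
    return 0;
}

static const struct monitor_driver_ops fake_ops = { fake_create, fake_destroy, fake_map };

/* win32u side: owns the layout and pushes it to the driver.
 * Returns 0 on success, -1 after rolling back on failure. */
static int apply_monitor_layout(const struct monitor_driver_ops *ops, void *driver,
                                const struct vrect *areas, int count)
{
    int i;

    assert(ops && areas && count >= 0);
    for (i = 0; i < count; i++)
    {
        if (ops->create_monitor_window(driver, i)) break;
        if (ops->map_monitor_window(driver, i, &areas[i]))
        {
            ops->destroy_monitor_window(driver, i);
            break;
        }
    }
    if (i == count) return 0;
    /* Undo the monitor windows that were fully set up before the failure. */
    while (i-- > 0) ops->destroy_monitor_window(driver, i);
    return -1;
}
```

The point of the sketch is only that the driver holds no layout policy of its own: it creates, maps and destroys host windows when asked, and the virtual screen areas come entirely from the caller.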
However, this is a very long road, and to begin with there are already some issues that need to be solved in winex11 and other drivers. None of winemac, winewayland, or wineandroid supports virtual desktop mode at the moment.
On the other hand, in winex11, the virtual desktop mode is still very strongly rooted in its core, and the driver expects a unique desktop window per process (the `root_window`), using it pretty much everywhere.
Drivers also probably expect a unique host surface per HWND, which would need to be broken up somehow, as the desktop window would then need to span multiple monitors.
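As a rough illustration of what "breaking up" a single surface could mean, the sketch below splits one window rectangle into a piece per monitor it overlaps. It is only an assumption about one possible approach: `struct area`, `intersect_area` and `split_window_by_monitor` are hypothetical names, not existing driver code, and it ignores scaling, rotation and overlapping monitors.

```c
#include <stddef.h>

/* Hypothetical helper types; not taken from any Wine driver. */
struct area { int left, top, right, bottom; };

/* Stores the overlap of a and b in dst; returns 1 if it is non-empty. */
static int intersect_area(struct area *dst, const struct area *a, const struct area *b)
{
    dst->left   = a->left   > b->left   ? a->left   : b->left;
    dst->top    = a->top    > b->top    ? a->top    : b->top;
    dst->right  = a->right  < b->right  ? a->right  : b->right;
    dst->bottom = a->bottom < b->bottom ? a->bottom : b->bottom;
    return dst->left < dst->right && dst->top < dst->bottom;
}

/* Splits `window` into one piece per overlapping monitor.
 * `parts` and `part_monitor` must have room for `monitor_count` entries.
 * Returns the number of pieces written. */
static size_t split_window_by_monitor(const struct area *window,
                                      const struct area *monitors, size_t monitor_count,
                                      struct area *parts, size_t *part_monitor)
{
    size_t i, count = 0;
    struct area piece;

    for (i = 0; i < monitor_count; i++)
    {
        if (!intersect_area(&piece, window, &monitors[i])) continue;
        parts[count] = piece;
        part_monitor[count] = i;
        count++;
    }
    return count;
}
```

With two side-by-side 1920x1080 monitors, a desktop window covering the whole virtual screen would yield two pieces, one per monitor, each of which could then be backed by its own host surface.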
[1] https://learn.microsoft.com/en-us/windows/win32/directcomp/directcomposition...
Unfortunately, implementing a (rooted) virtual desktop mode is not possible with the current set of Wayland protocols.
Ugh. Didn't I read somewhere that you had some plan to implement some kind of cross-process rendering? How would that even work then?