I think the need for a weak ref is a sign that the locking pattern has gotten out of hand. Adding more complexity to it isn't going to do any good IMO.
I've created https://gitlab.winehq.org/wine/wine/-/merge_requests/6323 as an alternative solution that I think would be better.
I've moved the client surfaces out of the wayland_surface and into the wayland_win_data, decoupling the surfaces used for rendering from the wayland_surface, whose purpose is windowing logic. (The wayland_surface is still used as a rendering target for GDI drawing, but I think that's an implementation detail we can ignore. We could use some kind of client surface for it too, except that it needs to draw to non-client areas.)
Then I also got rid of the wayland_surface locks entirely, doing anything that needed them under the win_data lock. I don't think there was anything time consuming there, so it's probably fine to hold the global lock for longer. This also allows us to access any toplevel win data while holding the win data for a given window.
Currently I'm only using that to position any rendering surfaces over the windowing surface, including those of any child window that has an accelerated client surface. I think you can then do the same kind of thing to access any owner or parent toplevel win data, to implement wl_subsurface and attach a given wayland_surface (and its client surfaces) to another wayland_surface, for windowing purposes.
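To illustrate the locking idea, here is a minimal, standalone sketch of the pattern, not the code from the MR. The struct layouts and the names `get_win_data`, `release_win_data`, `find_win_data`, `add_win_data` and `reposition_client_surface` are hypothetical simplifications, and an `int` stands in for `HWND`. It assumes a single global mutex that protects every win data entry, so a toplevel's entry can be read while the entry for a child window is held, with no per-surface lock involved.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
struct client_surface
{
    int x, y;                      /* position relative to the toplevel surface */
};

struct win_data
{
    int hwnd;                      /* window identifier (stand-in for HWND) */
    int toplevel;                  /* hwnd of the toplevel ancestor */
    int x, y;                      /* client area origin, in screen coordinates */
    struct client_surface *client; /* rendering surface, owned by the win data */
};

/* One global lock protects every win_data entry; there is no per-surface lock. */
static pthread_mutex_t win_data_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct win_data win_datas[16];
static size_t win_data_count;

/* Look up an entry. The caller must already hold win_data_mutex. */
static struct win_data *find_win_data(int hwnd)
{
    size_t i;

    for (i = 0; i < win_data_count; i++)
        if (win_datas[i].hwnd == hwnd) return &win_datas[i];
    return NULL;
}

/* Register a window. Returns 1 on success, 0 if the table is full. */
int add_win_data(int hwnd, int toplevel, int x, int y, struct client_surface *client)
{
    int ret = 0;

    pthread_mutex_lock(&win_data_mutex);
    if (win_data_count < sizeof(win_datas) / sizeof(win_datas[0]))
    {
        struct win_data data = {hwnd, toplevel, x, y, client};
        win_datas[win_data_count++] = data;
        ret = 1;
    }
    pthread_mutex_unlock(&win_data_mutex);
    return ret;
}

/* On success, returns the entry with win_data_mutex held. */
struct win_data *get_win_data(int hwnd)
{
    struct win_data *data;

    pthread_mutex_lock(&win_data_mutex);
    if ((data = find_win_data(hwnd))) return data;
    pthread_mutex_unlock(&win_data_mutex);
    return NULL;
}

/* Drop the lock taken by a successful get_win_data(). */
void release_win_data(struct win_data *data)
{
    assert(data);
    pthread_mutex_unlock(&win_data_mutex);
}

/* Position a window's client surface over its toplevel's surface.
 * Because the lock is global, the toplevel entry can be read while
 * the child's entry is held. Returns 1 if the surface was positioned. */
int reposition_client_surface(int hwnd)
{
    struct win_data *data, *toplevel;
    int ret = 0;

    if (!(data = get_win_data(hwnd))) return 0;
    if (data->client && (toplevel = find_win_data(data->toplevel)))
    {
        data->client->x = data->x - toplevel->x;
        data->client->y = data->y - toplevel->y;
        ret = 1;
    }
    release_win_data(data);
    return ret;
}
```

The trade-off is the one mentioned above: everything serializes on one lock, which should be fine as long as nothing slow runs while it is held. The same lookup-under-the-global-lock approach would presumably work for reaching an owner or parent entry too.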