On 25.04.22 01:31, Zebediah Figura wrote:
On 4/24/22 21:18, Derek Lesho wrote:
Hi All,
In the wake of the new WOW64 implementation (recent explanation [1]), there has been discussion in informal channels about how we are going to handle pointers to mapped graphics resource memory that we receive from the graphics API, since such a pointer may fall outside of the 32-bit address space.
Over time, a few creative solutions have been proposed and discussed, with a common theme being that we need changes in either the kernel or the graphics drivers to do this properly. As we already know the requirements for a solution to this problem, I think it would be responsible to hash this out now and then work with the relevant project maintainers early, so as to avoid blocking work on the Wine side for too long and possibly to allow more users to test the new path sooner.
Thank you for starting this conversation! I agree with all of these points. WoW64 emulation is still a long way off, if it'll even happen by default on platforms other than Mac, but nevertheless this is something we should look into supporting sooner rather than later.
It would probably be good to start a dri-devel/mesa-dev thread to discuss this as well.
Agreed, I just filed a feature request at the Vulkan-Docs repo so that we can also hear the opinions of those working on non-mesa drivers like NV.
https://github.com/KhronosGroup/Vulkan-Docs/issues/1832
- Work with Khronos to introduce extensions into the relevant APIs enabling us to tell drivers where in the address space we want resources mapped.
Pro: Wouldn't require going behind the driver's back, resulting in a more hardened solution. (Out there, but what if a creative driver returns a mapping without read or write permission and handles accesses through a page fault handler?)
Cons: The extension would have to be implemented by each individual vendor for every relevant API. This would implicitly drop support for those with cards whose graphics drivers are no longer being updated.
- Hook the driver's mmap call when we invoke the memory-mapping function, overriding the address to something in the 32-bit address space.
Pro: Unlike the other solutions, this wouldn't require any changes to other projects, and shares the advantage of the first solution.
Cons: Susceptible to breakage if the driver uses its own mapping mechanism separate from mmap. (Custom IOCTL, CPU driver returning something from the heap)
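To make the mmap-hooking idea concrete, here is a minimal sketch of what the replacement function might do once the hook is in place. How the driver's call actually gets redirected (LD_PRELOAD interposition, symbol patching, something else) is not covered. The names mmap_below_4g, LOW_LIMIT, LOW_HINT_START and LOW_HINT_STEP are made up for illustration, and a 64-bit host is assumed. MAP_32BIT is the x86-64 Linux flag documented in mmap(2); on other platforms the sketch falls back to trying address hints and verifying the result.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>

/* Hypothetical constants: upper bound for 32-bit-visible mappings and
 * the range of hint addresses tried when MAP_32BIT is unavailable. */
#define LOW_LIMIT      ((uintptr_t)1 << 32)
#define LOW_HINT_START ((uintptr_t)0x10000000)
#define LOW_HINT_STEP  ((uintptr_t)0x01000000)

/* Same signature as mmap(); returns a mapping that ends at or below
 * 4 GiB, or MAP_FAILED. */
void *mmap_below_4g(void *addr, size_t len, int prot, int flags,
                    int fd, off_t off)
{
    /* Leave calls alone if the caller already chose an address. */
    if (addr != NULL || (flags & MAP_FIXED))
        return mmap(addr, len, prot, flags, fd, off);

#ifdef MAP_32BIT
    {
        /* x86-64 Linux: ask the kernel for the low 2 GiB directly. */
        void *p = mmap(NULL, len, prot, flags | MAP_32BIT, fd, off);
        if (p != MAP_FAILED)
            return p;
    }
#endif

    /* Fallback: a hint is only advisory, so check where we landed. */
    for (uintptr_t hint = LOW_HINT_START;
         hint < LOW_LIMIT && len <= LOW_LIMIT - hint;
         hint += LOW_HINT_STEP) {
        void *q = mmap((void *)hint, len, prot, flags, fd, off);
        if (q == MAP_FAILED)
            return MAP_FAILED;
        if ((uintptr_t)q <= LOW_LIMIT - len)
            return q;
        munmap(q, len);
    }
    return MAP_FAILED;
}
```

As the "Cons" above says, none of this helps if the driver never goes through mmap for the mapping in question.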
Here's a few other ideas / considerations I think are worth mentioning:
- Reserve the entire address space above 2G (or 3G with the appropriate image flags). This is essentially what we already do for 32-bit programs. I'm not sure if reserving 2**48 bytes of memory will run into problems, though? Has this been tried?
- Linux has a personality(2) switch ADDR_LIMIT_32BIT. The documentation is terse, so I'm not fully sure what this does, but it might be sufficient to ensure that new mappings are placed under 2 GB, while not breaking old mappings? And presumably it's also toggleable. It's not exactly ideal (we'd like to be able to set a 3 GB or 4 GB limit instead if the binary allows), but it's potentially already usable.
- We can emulate mappings for everything except coherent memory by manually implementing mapping functions with a separate sysmem location. We can implement persistent mappings this way, too, by copying on a flush, but unfortunately we can't expose GL_ARB_buffer_storage without coherent mappings.
[Fortunately d3d doesn't require coherent memory or ARB_buffer_storage, and the Vulkan backend doesn't require coherent memory for map acceleration. The GL backend currently does, but could be made not to. We'd have to add a private extension to use ARB_buffer_storage while not actually marking any maps as coherent. Of course, d3d isn't the only user of GL or Vulkan, and unfortunately ARB_buffer_storage is core in 4.4, so I'm sure there are GL applications out there that rely on it...]
I think we can actually emulate coherent memory as well, by tracking resource bindings and manually flushing on draws. That's a little painful, though.
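A rough sketch of the shadow-copy scheme, including the flush-on-draw variant for emulated coherence, might look like the following. Everything here is hypothetical: struct shadow_buffer and the function names are invented for the example, plain malloc stands in for an allocator that would have to return memory inside the 32-bit address space, and only the CPU-to-GPU direction is handled (making GPU writes visible to the CPU would need a copy in the other direction).

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a buffer object. */
struct shadow_buffer {
    unsigned char *gpu;    /* driver mapping, possibly above 4 GiB */
    unsigned char *shadow; /* sysmem copy handed to the application */
    size_t size;
    int bound;             /* non-zero while bound for drawing */
};

/* Create the sysmem copy and seed it with the current contents. */
int shadow_buffer_init(struct shadow_buffer *b, void *gpu_mapping, size_t size)
{
    b->gpu = gpu_mapping;
    b->size = size;
    b->bound = 0;
    b->shadow = malloc(size); /* real code: allocate below 4 GiB */
    if (b->shadow == NULL)
        return -1;
    memcpy(b->shadow, b->gpu, size);
    return 0;
}

/* "Map": the application only ever sees the shadow pointer. */
void *shadow_buffer_map(struct shadow_buffer *b)
{
    return b->shadow;
}

/* Explicit flush: copy a (clamped) range back to the real mapping. */
void shadow_buffer_flush(struct shadow_buffer *b, size_t offset, size_t length)
{
    if (offset > b->size)
        return;
    if (length > b->size - offset)
        length = b->size - offset;
    memcpy(b->gpu + offset, b->shadow + offset, length);
}

/* Emulated coherence: before a draw, flush every bound buffer in full. */
void flush_bound_before_draw(struct shadow_buffer **buffers, size_t count)
{
    for (size_t i = 0; i < count; i++)
        if (buffers[i]->bound)
            shadow_buffer_flush(buffers[i], 0, buffers[i]->size);
}
```

Flushing whole buffers on every draw is the simplest possible policy; tracking dirty ranges would cut the copying, at the cost of the bookkeeping described above as "a little painful".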
- Crazy idea: On Linux, parse /proc/self/maps to allow remapping non-anonymous pages. Combined with mremap(2) or manual emulation, this allows mapping everything except for shared anonymous pages [and I can't imagine that a GPU driver would use those, especially given that the only way to make use of the SHARED flag is fork(2)].
Would this still work if the driver closed the FD after mmap-ing it?
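For the discovery half of the /proc/self/maps idea, a minimal parser for one line of that file could look like this, assuming the field layout documented in proc(5) (address range, permissions, offset, device, inode, optional pathname). struct map_entry, parse_maps_line and is_file_backed are hypothetical names, and the sketch says nothing about whether the mapping can actually be recreated at a new address afterwards.

```c
#include <stdio.h>
#include <string.h>

/* One parsed line of /proc/<pid>/maps (hypothetical helper type). */
struct map_entry {
    unsigned long start, end;
    unsigned long long offset;
    unsigned long inode;
    char perms[5];
    char path[256];
};

/* Parse a single maps line. Returns 0 on success, -1 on malformed input. */
int parse_maps_line(const char *line, struct map_entry *e)
{
    unsigned int dev_major, dev_minor;
    int consumed = -1;

    e->path[0] = '\0';
    if (sscanf(line, "%lx-%lx %4s %llx %x:%x %lu %n",
               &e->start, &e->end, e->perms, &e->offset,
               &dev_major, &dev_minor, &e->inode, &consumed) < 7)
        return -1;

    /* Whatever follows the inode (minus the newline) is the pathname. */
    if (consumed > 0) {
        size_t n = strcspn(line + consumed, "\n");
        if (n >= sizeof e->path)
            n = sizeof e->path - 1;
        memcpy(e->path, line + consumed, n);
        e->path[n] = '\0';
    }
    return 0;
}

/* Mappings with no backing file report inode 0. */
int is_file_backed(const struct map_entry *e)
{
    return e->inode != 0;
}
```

The entry gives the backing path, offset and permissions of a mapping, which is the information a manual re-map would start from; it does not hand back the file descriptor the driver originally used.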
ἔρρωσθε, Zeb