On 2020-08-19 18:21, Paul Gofman wrote:
On 8/19/20 18:59, Ken Thomases wrote:
On Aug 19, 2020, at 10:06 AM, Paul Gofman pgofman@codeweavers.com wrote:
On 8/19/20 17:56, Gabriel Ivăncescu wrote:
Sorry for the late reply, but I was a bit interested in this patch (which doesn't seem to have gone anywhere yet?).
I think this should also fix bug 38668.
I am frankly not quite sure how to move forward with this. I've got a comment from Rémi who rightfully considered the change to big for a one time change. While I do agree with that in principle, I don't see the way how to simplify the change without introducing a more complicated and ugly state in between (I have wrote some more details in the previous e-mail).
I also received a comment from Alexandre in the chat that he is sceptical about the idea because it is getting too complicated.
For now I don't see any other substantially different and good way to solve the bunch of related bugs. Besides the referenced bugs, it is also known to affect Mortal Kombat 11 and Star Wars: Battlefron II for now.
So I am not sure how to proceed from here. Should we just leave these bugs, or is there some direction of how to do it simpler which I am missing, or should I maybe try harder to structure the code in some more clear way?
I haven't paid really close attention. Is this just about doing bottom-up allocation?
This is the motivation, yeah, to force allocation order (whether TOP_DOWN is requested or bottom up if not), as a bunch of games assumes the memory pointers they get from VirtualAlloc() (without _TOP_DOWN flag of course) to be in some limited range. Some (like MK11) actually want the pointer to be within pre-Win8 user space area limit (44 bits), some other (like AION) do assume they get the pointers from lower space, e. g., AION was fine with pointers in ~16GB range. But it is not clear how to force even 44 bit limit in a simple way (which won't completely solve the issue anyway), as simply reducing the user_space_limit leads to allocating of TBs of virtual memory in reserved space before mmap() starts returning pointers from the range we need.
For the non-MAP_FIXED case, if you supply a non-zero address to mmap(), don't most systems take that as a hint and try to allocate there?
Yes, they do. We never try to allocate the space where Wine has allocated it previously, but mmap's called from host libraries are out of our control.
If they can't, do they then search from there for the closest available free region?
Linux doesn't, just allocates the same way as no hint was given (bottom up).
There is /proc/sys/vm/legacy_va_layout which, when turned on, will trigger the use of legacy VM allocator in kernel which was bottom up. That could potentially solve the problem with the games (while _TOP_DOWN allocations won't be de-facto supported that is probably less of an issue). But I would beware switching the system to some many years old unused allocator, and also that requires root privileges.
macOS behaves like that. Not sure about other platforms. (Then again, macOS has the Mach APIs which allow enumerating the free/allocated regions pretty directly. See the Mac-specific implementation of reserve_area() in libs/wine/mmap.c.)
-Ken
There is already the code in Wine which allocates the memory within non-reserved areas when zero_bits parameter is given to NtAllocateVirtualMemory() (and some apps are known to depend on this parameter to work correctly). This code gets an idea which areas are supposed to be free by surfing through the views, and then does search the found free area with the same function used in my patches. That has a performance impact as there is typically much more views than gaps between then (now when allocations are 64k aligned) and no quick way to find the free areas (FWIW, Linux kernel uses just the view tree when searches for a free VM block, but it maintains the info in view which holds the max block size to the side of the view which allows for binary search). But this has limited performance impact because zero_bits is used rarely, switching all the allocations to work this way has prohibitive performance impact.
There is also the code in Wine which allocates space within reserved areas by using free block list (maintained for reserved areas only).
My patch combines these two type of allocations by extending free block tracking for all views allocation, and handling whether the space should be allocated in reserved area or checked for conflicts with native mappings in alloc_area_in_reserved_or_between_callback(). The last patch also "lazily" marks natively mapped area as free which eliminates performance impact previously noticed without this.
I was thinking that we could do something like in the attached patches, where if the mapping in reserved areas failed in bottom up alloc, we reserve more memory and try again until it succeeds.
I think it's simpler and the existing code already handles the reservation with bisecting steps nicely. Wouldn't that work?
Of course this will cause some more memory to be definitively reserved for Windows, but that will then also be used as needed by following allocations.
Alternatively we could release the over-reserved memory every time, but that also means we will have to reserve again on next allocations.