On 8/6/20 18:42, Rémi Bernon wrote:
> I can understand what this is doing (extend the free ranges tracking over the whole address space, and merge all the code paths together), but it's a big change all at once.
Yes, that's the case. In a tiny bit more detail, the logic is:
1. iterate over the free areas (skipping those that are too small right away);
2. within a free area, enumerate the reserved areas and try to allocate during that enumeration, or right after it if there is space at the edges.
In the majority of cases step 2 succeeds on the first attempt. A rough sketch of this lookup order follows below.
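Roughly, in illustrative C (none of these names are the actual virtual.c internals, the lists are assumed to hold sorted, non-overlapping [base, end) ranges, and the exact probing order inside step 2 is simplified):

#include <stddef.h>
#include <stdint.h>

#define GRANULARITY 0x10000  /* the 64k view alignment discussed below */
#define MIN(a, b) ((a) < (b) ? (a) : (b))
#define MAX(a, b) ((a) > (b) ? (a) : (b))

struct range { uintptr_t base, end; const struct range *next; };

static uintptr_t align_up( uintptr_t addr )
{
    return (addr + GRANULARITY - 1) & ~(uintptr_t)(GRANULARITY - 1);
}

/* Try to place a view of the given size inside [base, end). */
static uintptr_t place_view( uintptr_t base, uintptr_t end, size_t size )
{
    uintptr_t start = align_up( base );
    if (start < base || start > end || end - start < size) return 0;
    return start;
}

static uintptr_t find_space( const struct range *free_list,
                             const struct range *reserved, size_t size )
{
    const struct range *f, *r;
    uintptr_t addr, gap;

    for (f = free_list; f; f = f->next)
    {
        if (f->end - f->base < size) continue;  /* step 1: too small, skip */
        gap = f->base;
        for (r = reserved; r; r = r->next)      /* step 2 */
        {
            if (r->end <= f->base || r->base >= f->end) continue;
            /* inside the reserved piece overlapping this free area */
            if ((addr = place_view( MAX( r->base, f->base ),
                                    MIN( r->end, f->end ), size ))) return addr;
            /* the gap at the edge, before this reserved piece */
            if ((addr = place_view( gap, MIN( r->base, f->end ), size ))) return addr;
            gap = MIN( r->end, f->end );
        }
        /* trailing space after the last reserved piece */
        if ((addr = place_view( gap, f->end, size ))) return addr;
    }
    return 0;
}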
I thought of doing that in parts (by leaving the reserved areas as they are), but it was getting more complicated and ugly than what is in this version. We would need to maintain a separate free area list for the free areas outside of reserved areas, which was getting a bit tricky for some corner cases and looked very weird overall. If we kept a single list, we would need to prevent the free list logic from joining free areas across the boundary between reserved and "normal" space. That would just move the complications there and result in longer code overall, while probably not making the free list management any nicer, and it is not needed long term.
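To illustrate the complication with a single shared list: the coalescing would need a guard along these lines (hypothetical names, nothing like this exists in the current code):

struct free_range
{
    char *base, *end;
    int   in_reserved;  /* does the range lie inside a reserved area? */
};

/* Adjacent free ranges may only be merged when both sit on the same
 * side of a reserved area boundary. */
static int can_coalesce( const struct free_range *a, const struct free_range *b )
{
    return a->end == b->base && a->in_reserved == b->in_reserved;
}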
> The free ranges were only tracked within the reserved areas mostly because it was only useful there, but also because the mmap smaller alignment was causing a lot of fragmentation in an initial implementation. Now that we align all views and ranges to 64k I don't think it would fragment that much anymore, so it could probably be done separately. And I think it would still be interesting to gather some statistics to see how many ranges and holes we usually have, just to check that it's not going crazy.
I did gather such statistics over some games, and I think I still have a log recorded which I used as the data source for the test case I made to reproduce some real-life allocation patterns (preserving which allocations came from separate threads) when testing the performance. I will need some time to gather that once again and come up with verified figures, but from what I can tell offhand:
- The number of views varies greatly between games, from a few thousand up to hitting the default Linux mmap limit, with values roughly in the ~10000-20000 range seen often;
- The number of free ranges is not large; I doubt I ever saw more than a hundred. With the forced 64k alignment, even a lot of allocations do not produce many free ranges. To drive this number up, the application would have to use a really weird allocation pattern: doing explicit VM allocs of a small size, freeing a lot of them in between, and then allocating bigger chunks so the existing free blocks do not fit (sketched below).
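For illustration, the pattern I mean is roughly the following (a minimal Win32 sketch of my own, not taken from any particular game):

#include <windows.h>

#define COUNT 64

int main(void)
{
    void *blocks[COUNT];
    int i;

    /* many small explicit VM allocations, one 64k granule each */
    for (i = 0; i < COUNT; i++)
        blocks[i] = VirtualAlloc( NULL, 0x10000, MEM_RESERVE | MEM_COMMIT,
                                  PAGE_READWRITE );

    /* free every other one: typically leaves isolated 64k holes */
    for (i = 0; i < COUNT; i += 2)
        VirtualFree( blocks[i], 0, MEM_RELEASE );

    /* bigger allocations cannot fit into the 64k holes, so the holes
     * stay on the free list and the range count keeps growing */
    for (i = 0; i < COUNT; i += 2)
        blocks[i] = VirtualAlloc( NULL, 0x100000, MEM_RESERVE | MEM_COMMIT,
                                  PAGE_READWRITE );
    return 0;
}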
> About the rest, wouldn't it be possible to just reserve some more pages with reserve_area where we expect some free memory to be, and, if we are out of reserved space, then use the existing code to allocate the views within the newly reserved space? Of course it would possibly cause more memory to be reserved for Wine and less to be available to the system; I'm not sure if we can mitigate that somehow.
As a game-specific hack, sure, this can work, but it looks a bit problematic to me as a general solution. First of all, I am unsure how to sensibly choose this parameter in a universal way: we would need to reserve as much memory as the application is ever going to allocate. E.g., when I was testing this with 64-bit AION, it was OK with addresses within ~16GB (apparently it was never planning to use more RAM), but returning higher pointers was crashing it. I guess this applies to every affected application: it just expects memory pointers to be in a certain range. The range may differ greatly (e.g., MK11 is fine with the Windows 7 64-bit address space limit), but to guarantee that range with reserved areas we would have to always reserve as much RAM as the application is using.

Besides, there are pointers handed to the application which are obtained from native libraries, like OpenGL / Vulkan mappings, audio buffers etc. On Windows those pointers also fall into the expected ranges. If we reserve low memory for Wine allocations, native pointers will always fall outside of it. I don't know whether that breaks any existing application, but it could.
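For a made-up illustration of the kind of range assumption I mean (this is not AION's actual code, just the common pointer-packing pattern):

#include <stdint.h>

/* The application assumes allocations never go above 16GB and reuses
 * the high pointer bits for its own tags; hypothetical code. */
#define APP_ADDRESS_LIMIT (UINT64_C(1) << 34)  /* ~16GB */
#define TAG_SHIFT 34

static uint64_t pack( void *ptr, uint64_t tag )
{
    return (uint64_t)(uintptr_t)ptr | (tag << TAG_SHIFT);
}

static void *unpack( uint64_t packed )
{
    /* drops the high bits: fine while the allocator stays below
     * APP_ADDRESS_LIMIT, garbage as soon as it does not */
    return (void *)(uintptr_t)(packed & (APP_ADDRESS_LIMIT - 1));
}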
Overall, do you think that maintaining that allocation "duality", within reserved areas and without, makes anything more straightforward, given that we can have a solution which avoids it and is acceptable performance-wise? I had a preliminary solution like that before your free lists for reserved areas were upstreamed; IMO (apart from my free lists having a very skeletal implementation) it looked much more cumbersome and introduced a lot of code which should supposedly go away long term.
BTW, do we need those reserved areas at all for anything besides ensuring that the core system libraries get to the right place and that we get some low memory for things like zero_mask allocations (and for the latter case this space can be taken away)? I suspect not. Maybe once we switch to ordered allocations we can just remove the reserved areas once those DLLs are loaded, and thus simplify the allocations?