Apex Legends game periodically (every 30 seconds) calls this function
with up to 22k virtual addresses. All but 1 of them is valid. Due to
amount of queries addresses, and cost of seek+read, this causes this
function to take up to about 50ms. So framerate drops from ~150 FPS to
20FPS for about a second.
As far as I can see, returning 0 entries from this function, still makes
Apex Legend work.
But keep code correct, and optimise it by:
1. Opening pagemap file once, and never closing it. This eliminates
repeated fopen/fseek/fread/fclose sequences, which helps even in queries
with small amount of virtual addresses.
2. Using pread, instead of seek+read.
3. Only performing pagemap read when the address is valid.
Future work might recognize continues pages in the query, and perform a
batch read of multiple pagemap entries, instead one page at a time, but
for now it is not necassary.
This change get_working_set_ex peek wall clock runtime from 57ms to
0.29ms.
Tested on Linux, but similar change done for the BSD part.
`Signed-off-by: Witold Baryluk <witold.baryluk(a)gmail.com>`
--
v6: ntdll: Keep pagemap file open after first use of NtQueryVirtualMemory(MemoryWorkingSetExInformation)
ntdll: Use pread in NtQueryVirtualMemory(MemoryWorkingSetExInformation)
https://gitlab.winehq.org/wine/wine/-/merge_requests/852
Apex Legends game periodically (every 30 seconds) calls this function
with up to 22k virtual addresses. All but 1 of them is valid. Due to
amount of queries addresses, and cost of seek+read, this causes this
function to take up to about 50ms. So framerate drops from ~150 FPS to
20FPS for about a second.
As far as I can see, returning 0 entries from this function, still makes
Apex Legend work.
But keep code correct, and optimise it by:
1. Opening pagemap file once, and never closing it. This eliminates
repeated fopen/fseek/fread/fclose sequences, which helps even in queries
with small amount of virtual addresses.
2. Using pread, instead of seek+read.
3. Only performing pagemap read when the address is valid.
Future work might recognize continues pages in the query, and perform a
batch read of multiple pagemap entries, instead one page at a time, but
for now it is not necassary.
This change get_working_set_ex peek wall clock runtime from 57ms to
0.29ms.
Tested on Linux, but similar change done for the BSD part.
`Signed-off-by: Witold Baryluk <witold.baryluk(a)gmail.com>`
--
v5: ntdll: Keep pagemap file open after first use of NtQueryVirtualMemory(MemoryWorkingSetExInformation)
https://gitlab.winehq.org/wine/wine/-/merge_requests/852
Apex Legends game periodically (every 30 seconds) calls this function
with up to 22k virtual addresses. All but 1 of them is valid. Due to
amount of queries addresses, and cost of seek+read, this causes this
function to take up to about 50ms. So framerate drops from ~150 FPS to
20FPS for about a second.
As far as I can see, returning 0 entries from this function, still makes
Apex Legend work.
But keep code correct, and optimise it by:
1. Opening pagemap file once, and never closing it. This eliminates
repeated fopen/fseek/fread/fclose sequences, which helps even in queries
with small amount of virtual addresses.
2. Using pread, instead of seek+read.
3. Only performing pagemap read when the address is valid.
Future work might recognize continues pages in the query, and perform a
batch read of multiple pagemap entries, instead one page at a time, but
for now it is not necassary.
This change get_working_set_ex peek wall clock runtime from 57ms to
0.29ms.
Tested on Linux, but similar change done for the BSD part.
`Signed-off-by: Witold Baryluk <witold.baryluk(a)gmail.com>`
--
v4: ntdll: Keep pagemap file open after first use of NtQueryVirtualMemory(MemoryWorkingSetExInformation)
ntdll: Use pread in NtQueryVirtualMemory(MemoryWorkingSetExInformation)
ntdll: Do not use hardcoded page shift in NtQueryVirtualMemory(MemoryWorkingSetExInformation)
ntdll: Speed up NtQueryVirtualMemory(MemoryWorkingSetExInformation) by conditional page check
https://gitlab.winehq.org/wine/wine/-/merge_requests/852
Apex Legends game periodically (every 30 seconds) calls this function
with up to 22k virtual addresses. All but 1 of them is valid. Due to
amount of queries addresses, and cost of seek+read, this causes this
function to take up to about 50ms. So framerate drops from ~150 FPS to
20FPS for about a second.
As far as I can see, returning 0 entries from this function, still makes
Apex Legend work.
But keep code correct, and optimise it by:
1. Opening pagemap file once, and never closing it. This eliminates
repeated fopen/fseek/fread/fclose sequences, which helps even in queries
with small amount of virtual addresses.
2. Using pread, instead of seek+read.
3. Only performing pagemap read when the address is valid.
Future work might recognize continues pages in the query, and perform a
batch read of multiple pagemap entries, instead one page at a time, but
for now it is not necassary.
This change get_working_set_ex peek wall clock runtime from 57ms to
0.29ms.
Tested on Linux, but similar change done for the BSD part.
Signed-off-by: Witold Baryluk's avatarWitold Baryluk <witold.baryluk(a)gmail.com>
--
v3: ntdll: Speed up NtQueryVirtualMemory(MemoryWorkingSetExInformation) by conditional pread.
https://gitlab.winehq.org/wine/wine/-/merge_requests/852
Changes since prior MR:
- Check if HWND is valid after sending `WM_GETOBJECT` and getting a 0 return value.
- Block on thread calling `UiaNodeFromHandle()` rather than the marshaling thread.
- Squash `IWineUiaNode::get_prop_val` commit into commit where it is actually used.
--
v2: uiautomationcore: Add tests for UiaNodeFromHandle.
uiautomationcore: Create UI Automation client thread.
uiautomationcore: Implement UiaNodeFromHandle.
uiautomationcore: Shutdown provider thread when all returned nodes are released.
uiautomationcore: Increment module reference count when starting provider thread.
uiautomationcore: Implement UiaReturnRawElementProvider.
https://gitlab.winehq.org/wine/wine/-/merge_requests/846
Changes since prior MR:
- Check if HWND is valid after sending `WM_GETOBJECT` and getting a 0 return value.
- Block on thread calling `UiaNodeFromHandle()` rather than the marshaling thread.
- Squash `IWineUiaNode::get_prop_val` commit into commit where it is actually used.
--
https://gitlab.winehq.org/wine/wine/-/merge_requests/846