I've played around with dbghelp performance. My test case was breaking at an unknown symbol (break gaga) while WoW was loaded in the debugger (wine winedbg WoW.exe). The time was stopped by hand; memory usage was measured with ps -AF, looking at the RSS column.
Test                       Time(s)  Memory Usage(MB)
current git                    4.5                54
pool_heap.patch                4.5                63
process_heap.patch             4.5               126
insert_first.patch             4.5                54
current git, r300              115               146
pool_heap.patch, r300           17               119
process_heap.patch, r300        17               260
insert_first.patch, r300        27               167
insert_first is the patch from Eric Pouech. r300 means with the debug version of Mesa's r300_dri.so, which has a total compilation unit size of around 9.2M (compared to Wine's second biggest, user32, at 1.1M).
Conclusions:
- current git wins with small debug files (<2M or so), pool_heap wins with bigger files. insert_first and process_heap are out.
- small pools have less memory overhead than small heaps.
- big pools have more memory overhead than big heaps.
- big pools are a lot slower than big heaps.
IMO the best results would come from removing the pools (as in process_heap) and freeing unused memory manually, in the reverse order it was allocated. But at first glance that looks like quite a bit of work, and I'm not sure it's worth the result. I think the best approach would be to add some destroy functions in storage.c that free the allocated vector, sparse_array and hash_table memory, and then gradually replace pool_alloc calls with HeapAlloc/HeapFree pairs.
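A minimal sketch of such a destroy function, assuming a simplified vector layout (the field names buckets and num_buckets are illustrative, and standard malloc/free stand in for Wine's HeapAlloc/HeapFree):

```c
#include <stdlib.h>

/* Hypothetical simplified vector: an array of fixed-size buckets,
 * loosely mirroring dbghelp's storage.c layout (field names are
 * assumptions, not the real struct). */
struct vector
{
    void   **buckets;      /* array of bucket pointers */
    unsigned num_buckets;  /* buckets currently allocated */
};

/* Free every bucket, then the bucket array itself: the reverse
 * of the order in which the memory was obtained. */
static void vector_destroy(struct vector *v)
{
    unsigned i;
    for (i = 0; i < v->num_buckets; i++)
        free(v->buckets[i]);
    free(v->buckets);
    v->buckets = NULL;
    v->num_buckets = 0;
}
```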
Markus
Markus Amsler wrote:
I've played around with dbghelp performance. My test case was breaking at an unknown symbol (break gaga) while WoW was loaded in the debugger (wine winedbg WoW.exe). The time was stopped by hand; memory usage was measured with ps -AF, looking at the RSS column.

Test                       Time(s)  Memory Usage(MB)
current git                    4.5                54
pool_heap.patch                4.5                63
process_heap.patch             4.5               126
insert_first.patch             4.5                54
current git, r300              115               146
pool_heap.patch, r300           17               119
process_heap.patch, r300        17               260
insert_first.patch, r300        27               167

insert_first is the patch from Eric Pouech. r300 means with the debug version of Mesa's r300_dri.so, which has a total compilation unit size of around 9.2M (compared to Wine's second biggest, user32, at 1.1M).
Conclusions:
- current git wins with small debug files (<2M or so), pool_heap wins
with bigger files. insert_first, process_heap are out.
- small pools have less memory overhead than small heaps
- big pools have more memory overhead than big heaps.
- big pools are a lot slower than big heaps.
thanks for the tests & timings!
you're also missing a couple of elements:
- for the memory overhead, in the first case you consider 50 MB (roughly) spread over 10 or 20 modules, while in your r300 case the impact (and memory difference) is on a single module
- the time to unload a module hasn't been measured (it's needed less often than loading a module)
what's also strange is that pool_heap gets lower memory consumption than the process_heap one, which is rather not a natural result... I wonder if some data has been swapped out and isn't accounted for in RSS
A+
Eric Pouech wrote:
Markus Amsler wrote:
I've played around with dbghelp performance. My test case was breaking at an unknown symbol (break gaga) while WoW was loaded in the debugger (wine winedbg WoW.exe). The time was stopped by hand; memory usage was measured with ps -AF, looking at the RSS column.

Test                       Time(s)  Memory Usage(MB)
current git                    4.5                54
pool_heap.patch                4.5                63
process_heap.patch             4.5               126
insert_first.patch             4.5                54
current git, r300              115               146
pool_heap.patch, r300           17               119
process_heap.patch, r300        17               260
insert_first.patch, r300        27               167

insert_first is the patch from Eric Pouech. r300 means with the debug version of Mesa's r300_dri.so, which has a total compilation unit size of around 9.2M (compared to Wine's second biggest, user32, at 1.1M).
Conclusions:
- current git wins with small debug files (<2M or so), pool_heap wins
with bigger files. insert_first, process_heap are out.
- small pools have less memory overhead than small heaps
- big pools have more memory overhead than big heaps.
- big pools are a lot slower than big heaps.
thanks for the tests & timings!
you're also missing a couple of elements:
- for the memory overhead, in the first case you consider 50 MB (roughly) spread over 10 or 20 modules, while in your r300 case the impact (and memory difference) is on a single module
I'm not sure what your point is.
- the time to unload a module hasn't been measured (it's needed less often than loading a module)
Unloading is more or less instant in all cases.
what's also strange is that pool_heap gets lower memory consumption than the process_heap one, which is rather not a natural result... I wonder if some data has been swapped out and isn't accounted for in RSS
The process_heap is the one I sent to wine-patches, which never frees any memory. I've also tested an improved process_heap, which stores the allocated memory pointers in an array and frees them afterwards. No luck: it's slower and uses more memory than pool_heap.
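The improved process_heap bookkeeping described above could look roughly like this; the struct and function names are hypothetical, and standard malloc/free stand in for HeapAlloc/HeapFree:

```c
#include <stdlib.h>

/* Sketch of the "improved process_heap" idea: every allocation is
 * remembered in a growable array so a module can release all of its
 * memory at unload time. Names are illustrative, not from storage.c. */
struct alloc_list
{
    void   **ptrs;
    unsigned count, capacity;
};

static void *tracked_alloc(struct alloc_list *list, size_t size)
{
    void *p = malloc(size);
    if (!p) return NULL;
    if (list->count == list->capacity)
    {
        unsigned new_cap = list->capacity ? list->capacity * 2 : 16;
        void   **tmp = realloc(list->ptrs, new_cap * sizeof(void *));
        if (!tmp) { free(p); return NULL; }
        list->ptrs = tmp;
        list->capacity = new_cap;
    }
    list->ptrs[list->count++] = p;
    return p;
}

/* Release everything that was handed out, then the tracking array. */
static void tracked_free_all(struct alloc_list *list)
{
    unsigned i;
    for (i = 0; i < list->count; i++) free(list->ptrs[i]);
    free(list->ptrs);
    list->ptrs = NULL;
    list->count = list->capacity = 0;
}
```

The extra pointer array is one plausible reason this variant costs more memory than the pools in Markus's measurements.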
So I don't see a simple solution that only affects storage.c, is equal to or better than the current code, and is significantly faster with big debug files. Any ideas?
Markus
Markus Amsler wrote:
Eric Pouech wrote:
Markus Amsler wrote:
I've played around with dbghelp performance. My test case was breaking at an unknown symbol (break gaga) while WoW was loaded in the debugger (wine winedbg WoW.exe). The time was stopped by hand; memory usage was measured with ps -AF, looking at the RSS column.

Test                       Time(s)  Memory Usage(MB)
current git                    4.5                54
pool_heap.patch                4.5                63
process_heap.patch             4.5               126
insert_first.patch             4.5                54
current git, r300              115               146
pool_heap.patch, r300           17               119
process_heap.patch, r300        17               260
insert_first.patch, r300        27               167

insert_first is the patch from Eric Pouech. r300 means with the debug version of Mesa's r300_dri.so, which has a total compilation unit size of around 9.2M (compared to Wine's second biggest, user32, at 1.1M).
Conclusions:
- current git wins with small debug files (<2M or so), pool_heap
wins with bigger files. insert_first, process_heap are out.
- small pools have less memory overhead than small heaps
- big pools have more memory overhead than big heaps.
- big pools are a lot slower than big heaps.
thanks for the tests & timings!
you're also missing a couple of elements:
- for the memory overhead, in the first case you consider 50 MB (roughly) spread over 10 or 20 modules, while in your r300 case the impact (and memory difference) is on a single module
I'm not sure what your point is.
- the time to unload a module hasn't been measured (it's needed less often than loading a module)
Unloading is more or less instant in all cases.
what's also strange is that pool_heap gets lower memory consumption than the process_heap one, which is rather not a natural result... I wonder if some data has been swapped out and isn't accounted for in RSS
The process_heap is the one I sent to wine-patches, which never frees any memory. I've also tested an improved process_heap, which stores the allocated memory pointers in an array and frees them afterwards. No luck: it's slower and uses more memory than pool_heap.
So I don't see a simple solution that only affects storage.c, is equal to or better than the current code, and is significantly faster with big debug files. Any ideas?
Markus
Hi Markus, does the slightly modified version of pool_heap improve your performance? It shouldn't change the numbers for large files (or only a bit), but should reduce memory consumption for small pools (by 1 to 2M depending on your configuration).
A+
Eric Pouech wrote:
Markus Amsler wrote:
Eric Pouech wrote:
Markus Amsler wrote:
I've played around with dbghelp performance. My test case was breaking at an unknown symbol (break gaga) while WoW was loaded in the debugger (wine winedbg WoW.exe). The time was stopped by hand; memory usage was measured with ps -AF, looking at the RSS column.

Test                       Time(s)  Memory Usage(MB)
current git                    4.5                54
pool_heap.patch                4.5                63
process_heap.patch             4.5               126
insert_first.patch             4.5                54
current git, r300              115               146
pool_heap.patch, r300           17               119
process_heap.patch, r300        17               260
insert_first.patch, r300        27               167

insert_first is the patch from Eric Pouech. r300 means with the debug version of Mesa's r300_dri.so, which has a total compilation unit size of around 9.2M (compared to Wine's second biggest, user32, at 1.1M).
Hi Markus, does the slightly modified version of pool_heap improve your performance? It shouldn't change the numbers for large files (or only a bit), but should reduce memory consumption for small pools (by 1 to 2M depending on your configuration).
A+
No, performance is exactly the same as pool_heap :( . I analyzed why your original insert_first version was slower and more memory hungry than pool_heap. It turned out pool_realloc is the problem, not pool_alloc. First, there's a memory leak: if the memory is moved, the old block is not freed. Second, pool_realloc is O(n); that's the reason for the speed hits. Directly using heap functions for reallocs solves both problems (but it looks too hackish to get committed; perhaps you have a better idea).
Here are the results for pool_realloc on top of insert_first:
pool_realloc              4.5s   54M
pool_realloc, r300         17s  104M
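The leak described above can be illustrated with a simplified realloc done by hand; this is only a sketch (malloc/free stand in for the heap functions, and all the pool bookkeeping is omitted):

```c
#include <stdlib.h>
#include <string.h>

/* Simplified grow-only realloc that fixes the leak: when the block
 * has to move, the old allocation is freed instead of being abandoned
 * inside the pool. */
static void *fixed_realloc(void *old, size_t old_size, size_t new_size)
{
    void *new_ptr;

    if (new_size <= old_size) return old;   /* still fits in place */
    new_ptr = malloc(new_size);
    if (!new_ptr) return NULL;
    if (old)
    {
        memcpy(new_ptr, old, old_size);
        free(old);                          /* this free was missing */
    }
    return new_ptr;
}
```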
The next problem is vector_iter_[up|down], because vector_position is O(n). Explicitly storing the current iter position speeds r300 up to 8s (from the original 115s)! But I'm not sure how to implement it cleanly: directly use for() instead of vector_iter_*(), use an iterator, ...
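One possible shape for such an iterator, sketched over a toy flat vector (the real dbghelp vector is an array of buckets; all names here are illustrative):

```c
#include <stddef.h>

/* Toy vector (a plain array here) just to show the iterator shape. */
struct vector
{
    int     *data;
    unsigned count;
};

/* Iterator that carries its current index: advancing is O(1) instead
 * of recomputing the position from the element pointer each step, the
 * way an O(n) vector_position() would. */
struct vector_iter
{
    const struct vector *v;
    unsigned             pos;   /* cached current position */
};

static void vector_iter_init(struct vector_iter *it, const struct vector *v)
{
    it->v = v;
    it->pos = 0;
}

/* Returns the next element, or NULL when the end is reached. */
static int *vector_iter_up(struct vector_iter *it)
{
    if (it->pos >= it->v->count) return NULL;
    return &it->v->data[it->pos++];
}
```

A plain for(i = 0; i < vector_length(); i++) loop would achieve the same O(1) stepping, just with the index managed at each call site instead.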
Markus
Markus Amsler wrote:
Eric Pouech wrote:
Markus Amsler wrote:
Eric Pouech wrote:
Markus Amsler wrote:
I've played around with dbghelp performance. My test case was breaking at an unknown symbol (break gaga) while WoW was loaded in the debugger (wine winedbg WoW.exe). The time was stopped by hand; memory usage was measured with ps -AF, looking at the RSS column.

Test                       Time(s)  Memory Usage(MB)
current git                    4.5                54
pool_heap.patch                4.5                63
process_heap.patch             4.5               126
insert_first.patch             4.5                54
current git, r300              115               146
pool_heap.patch, r300           17               119
process_heap.patch, r300        17               260
insert_first.patch, r300        27               167

insert_first is the patch from Eric Pouech. r300 means with the debug version of Mesa's r300_dri.so, which has a total compilation unit size of around 9.2M (compared to Wine's second biggest, user32, at 1.1M).
Hi Markus, does the slightly modified version of pool_heap improve your performance? It shouldn't change the numbers for large files (or only a bit), but should reduce memory consumption for small pools (by 1 to 2M depending on your configuration).
A+
No, performance is exactly the same as pool_heap :( .
even for memory consumption ???
I analyzed why your original insert_first version was slower and more memory hungry than pool_heap. It turned out pool_realloc is the problem, not pool_alloc. First, there's a memory leak: if the memory is moved, the old block is not freed. Second, pool_realloc is O(n); that's the reason for the speed hits. Directly using heap functions for reallocs solves both problems (but it looks too hackish to get committed; perhaps you have a better idea).
We could try not to realloc the array of arrays but rather use a tree of arrays, which should solve most of the issues, but that would make the code more complicated. Another way is to double the bucket size each time we need to grow (instead of adding one bucket).
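The bucket-doubling idea could be sketched like this, with standard realloc and illustrative names; doubling means n insertions trigger only O(log n) reallocations instead of one per added bucket:

```c
#include <stdlib.h>

/* Grow the array of bucket pointers geometrically: double its capacity
 * until it can hold `needed` buckets. Returns 1 on success, 0 on
 * allocation failure. (Function and parameter names are hypothetical.) */
static int ensure_bucket_capacity(void ***buckets, unsigned *capacity,
                                  unsigned needed)
{
    unsigned new_cap = *capacity ? *capacity : 4;
    void   **tmp;

    while (new_cap < needed) new_cap *= 2;
    if (new_cap == *capacity) return 1;     /* already big enough */
    tmp = realloc(*buckets, new_cap * sizeof(void *));
    if (!tmp) return 0;
    *buckets  = tmp;
    *capacity = new_cap;
    return 1;
}
```

The trade-off is up to 2x slack in the bucket array, which is usually small next to the buckets themselves.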
Here are the results for pool_realloc on top of insert_first:
pool_realloc              4.5s   54M
pool_realloc, r300         17s  104M

The next problem is vector_iter_[up|down], because vector_position is O(n). Explicitly storing the current iter position speeds r300 up to 8s (from the original 115s)! But I'm not sure how to implement it cleanly: directly use for() instead of vector_iter_*(), use an iterator, ...
likely use an iterator which keeps track of the current position (as we do for the hash tables)
A+
Eric Pouech wrote:
Markus Amsler wrote:
No, performance is exactly the same as pool_heap :( .
even for memory consumption ???
Yes, it looks like HeapCreate has a default size of 64k.
I analyzed why your original insert_first version was slower and more memory hungry than pool_heap. It turned out pool_realloc is the problem, not pool_alloc. First, there's a memory leak: if the memory is moved, the old block is not freed. Second, pool_realloc is O(n); that's the reason for the speed hits. Directly using heap functions for reallocs solves both problems (but it looks too hackish to get committed; perhaps you have a better idea).
We could try not to realloc the array of arrays but rather use a tree of arrays, which should solve most of the issues, but that would make the code more complicated. Another way is to double the bucket size each time we need to grow (instead of adding one bucket).
I'll have a look at doubling the bucket size.
Here are the results for pool_realloc on top of insert_first:
pool_realloc              4.5s   54M
pool_realloc, r300         17s  104M

The next problem is vector_iter_[up|down], because vector_position is O(n). Explicitly storing the current iter position speeds r300 up to 8s (from the original 115s)! But I'm not sure how to implement it cleanly: directly use for() instead of vector_iter_*(), use an iterator, ...
likely use an iterator which keeps track of the current position (as we do for the hash tables)
An iterator for a vector looks a bit like overkill; I was in favor of for(i=0; i<vector_length(); i++). Either solution will add some code on the caller side.
Markus