The original message was apparently blocked by the mailing list rules/filter. Now I am subscribed to wine-devel, sorry if you get the message twice.
Andrea
---------- Forwarded message ---------- From: Andrea Canciani ranma42@gmail.com Date: Fri, Jun 23, 2017 at 9:29 AM Subject: Improvements to the heap allocator To: wine-devel@winehq.org
Yesterday I opened a bug report about a performance issue in the wine heap allocator: https://bugs.winehq.org/show_bug.cgi?id=43224 Since I was also asking for some suggestions about what the preferred fix would be, Sebastian Lackner pointed to this mailing list.
The bug report contains the source code for a program that has an allocation pattern that is very bad (for the current wine allocator).
This appears to be a known problem, but I was unable to find any test/benchmark that shows such degenerate behavior. So far none of the attempts at fixing it seem to have landed. Some possible approaches are: 1. ensure that the freelist sizes cover all (alignment rounded) sizes up to a certain value; 2. when looking up for a free block, start from the freelist whose smallest element is at least as big as the desired block; 3. implement more complex free block management.
#1 would fix the issue for some allocations sizes, but for bigger allocations the problem would stay the same.
#2 would fix the performance problem at the cost of increased memory usage: for some sizes it would be possible that after a block is allocated and freed, the following allocation for that size does not re-use that memory (if the free block is stored in a freelist together with smaller blocks).
#3 would increase the complexity of the code; to keep it under control, it might be possible to re-use the red-black tree implementation already available in wine. To avoid regressing performance, the smallest blocks might still be handled with simple freelists (fixed as in #1), while larger blocks are managed in a tree.
Some of the current/previous attempts at tackling this issue are: - https://github.com/wine-compholio/wine-staging/blob/ master/patches/ntdll-Heap_FreeLists/0001-ntdll-Improve- heap-allocation-performance-by-using-m.patch (implements #1) - https://github.com/laino/wine-patches (IIUC, a combination of #1 and #3, in which the code tries to keep the freelist length under control, but AFAICT there is no guaranteed bound)
I would like to try and work on fixing this, but there are some decision to make that might depend on constraints I am not aware of. Would the solution I propose (#3, rbtree based, with some tuning for small allocation sizes) be acceptable? Are there any benchmarks that a new implementation should strive to improve or at least avoid regressing? Suggestions and alternative solutions are welcome :)
Sorry for the long message, thank you for your patience Andrea