The original message was apparently blocked by the mailing list rules/filter.

Now I am subscribed to wine-devel, sorry if you get the message twice.

Andrea

---------- Forwarded message ----------
From: Andrea Canciani <ranma42@gmail.com>
Date: Fri, Jun 23, 2017 at 9:29 AM
Subject: Improvements to the heap allocator
To: wine-devel@winehq.org

Yesterday I opened a bug report about a performance issue in the wine heap allocator: https://bugs.winehq.org/show_bug.cgi?id=43224

Since I was also asking for some suggestions about what the preferred fix would be, Sebastian Lackner pointed to this mailing list.

The bug report contains the source code for a program that has an allocation pattern that is very bad (for the current wine allocator).

This appears to be a known problem, but I was unable to find any test/benchmark that shows such degenerate behavior.

So far none of the attempts at fixing it seem to have landed. Some possible approaches are:

1. ensure that the freelist sizes cover all (alignment rounded) sizes up to a certain value;

2. when looking up for a free block, start from the freelist whose smallest element is at least as big as the desired block;

3. implement more complex free block management.

#1 would fix the issue for some allocations sizes, but for bigger allocations the problem would stay the same.

#2 would fix the performance problem at the cost of increased memory usage: for some sizes it would be possible that after a block is allocated and freed, the following allocation for that size does not re-use that memory (if the free block is stored in a freelist together with smaller blocks).

#3 would increase the complexity of the code; to keep it under control, it might be possible to re-use the red-black tree implementation already available in wine. To avoid regressing performance, the smallest blocks might still be handled with simple freelists (fixed as in #1), while larger blocks are managed in a tree.

Some of the current/previous attempts at tackling this issue are:

- https://github.com/wine-compholio/wine-staging/blob/master/patches/ntdll-Heap_FreeLists/0001-ntdll-Improve-heap-allocation-performance-by-using-m.patch (implements #1)
- https://github.com/laino/wine-patches (IIUC, a combination of #1 and #3, in which the code tries to keep the freelist length under control, but AFAICT there is no guaranteed bound)

I would like to try and work on fixing this, but there are some decision to make that might depend on constraints I am not aware of.

Would the solution I propose (#3, rbtree based, with some tuning for small allocation sizes) be acceptable?
Are there any benchmarks that a new implementation should strive to improve or at least avoid regressing?

Suggestions and alternative solutions are welcome :)

Sorry for the long message, thank you for your patience

Andrea