I'm not yet convinced by the rebalancing algorithm. In particular, `rebalance_threshold` can theoretically grow arbitrarily large, yet however large it becomes, you still transfer no more than `REBALANCE_SIZE` objects per operation. Also, if you transfer objects in packets of `REBALANCE_SIZE` anyway, you could allocate them directly in packets of `REBALANCE_SIZE` and store in the rebalance just one pointer per packet instead of one pointer per object, so you spend less time in the critical `memcpy()` calls.
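
To make the second point concrete, here is a rough sketch of what I have in mind, not a drop-in replacement. Everything except `REBALANCE_SIZE` is hypothetical: the value `64`, `OBJ_SIZE`, `struct packet`, `struct packet_stack`, and both transfer functions are names I made up for illustration, and I am assuming the rebalance currently holds a flat array of per-object pointers. `transfer_objects()` models the per-object scheme (one `memcpy()` of up to `REBALANCE_SIZE` pointers), while `transfer_packet()` moves the same number of objects by copying a single pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define REBALANCE_SIZE 64 /* assumed value, for illustration only */
#define OBJ_SIZE 32       /* hypothetical object size in bytes */

/* Hypothetical packet: REBALANCE_SIZE objects allocated as one block. */
struct packet {
    unsigned char objs[REBALANCE_SIZE][OBJ_SIZE];
};

/* Hypothetical rebalance storage holding one pointer per packet. */
struct packet_stack {
    struct packet **slots;
    size_t count;
    size_t capacity;
};

static int packet_stack_init(struct packet_stack *s, size_t capacity)
{
    assert(capacity > 0);
    s->slots = malloc(capacity * sizeof *s->slots);
    s->count = 0;
    s->capacity = s->slots ? capacity : 0;
    return s->slots ? 0 : -1;
}

static void packet_stack_destroy(struct packet_stack *s)
{
    free(s->slots);
    s->slots = NULL;
    s->count = s->capacity = 0;
}

/* Per-packet scheme: moves REBALANCE_SIZE objects by copying ONE pointer.
 * Returns 0 on success, -1 if `from` is empty or `to` is full. */
static int transfer_packet(struct packet_stack *from, struct packet_stack *to)
{
    if (from->count == 0 || to->count == to->capacity)
        return -1;
    to->slots[to->count++] = from->slots[--from->count];
    return 0;
}

/* Per-object scheme, for comparison: copies up to REBALANCE_SIZE pointers
 * from the top of `from` onto the top of `to`. Returns the number moved. */
static size_t transfer_objects(void **from, size_t *from_count,
                               void **to, size_t *to_count,
                               size_t to_capacity)
{
    size_t n = *from_count < REBALANCE_SIZE ? *from_count : REBALANCE_SIZE;
    if (n > to_capacity - *to_count)
        n = to_capacity - *to_count;
    memcpy(to + *to_count, from + (*from_count - n), n * sizeof *from);
    *from_count -= n;
    *to_count += n;
    return n;
}
```

The trade-off is that a partially used packet cannot be transferred as a unit, so the per-thread side would need to keep at most one open packet locally. Whether that fits the current design is something you can judge better than I can.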