I use a self-balancing binary search tree (currently it is an AVL tree but can be substituted with another one). I noticed that there are distinct periods when only certain operation is performed: large deletion or insertion batches are executed rarely while most of the time it is immutable search tree. Will there be any perfomance gain if i postpone rebalancing to the end of the batch?
Batch operations with balancing trees
844 Views Asked by Andrey Godyaev At
2
There are 2 best solutions below
0
On
The asymptotic complexity of say AVL Tree is O(log(N)). But the real computations is obviously decided at runtime.
Delaying rebalancing can make it look fast, but there are risks involved. The avg time to insert is O(logN) only when the tree is balanced. If the batch inserts made the tree highly skewed, we could be getting O(N) insertions time.
Anyway the insert would take O(logN), so rebalancing wouldn't have any much impact as it will also be O(logN).
To get the actual benefits, you need consider batch size of insertion, and search during the insertion/deletion and value of keys which are getting inserted.
For batch insertions, insert the new items into a separate AVL tree, and then merge the two trees as described in http://www.geeksforgeeks.org/merge-two-balanced-binary-search-trees/. Merging the trees is an O(m+n) operation, where m and n are the sizes of the two trees. This approach should be much faster when the number of new items is small compared to the number of items in the existing tree.
For batch deletions, mark the items in the AVL tree as deleted. Then, do an inorder traversal of the tree, building a new AVL tree from the non-deleted nodes. See http://www.geeksforgeeks.org/sorted-array-to-balanced-bst/ for an example. Building a new tree from a sorted list of nodes is an O(n) operation.