AVL tree
This article needs additional citations for verification. (July 2016) (Learn how and when to remove this template message) 
AVL tree  

Type  tree  
Invented  1962  
Invented by  Georgy AdelsonVelsky and Evgenii Landis  
Time complexity in big O notation  

In computer science, an AVL tree (named after inventors AdelsonVelsky and Landis) is a selfbalancing binary search tree. It was the first such data structure to be invented.^{[2]} In an AVL tree, the heights of the two child subtrees of any node differ by at most one; if at any time they differ by more than one, rebalancing is done to restore this property. Lookup, insertion, and deletion all take O(log n) time in both the average and worst cases, where is the number of nodes in the tree prior to the operation. Insertions and deletions may require the tree to be rebalanced by one or more tree rotations.
The AVL tree is named after its two Soviet inventors, Georgy AdelsonVelsky and Evgenii Landis, who published it in their 1962 paper "An algorithm for the organization of information".^{[3]}
AVL trees are often compared with red–black trees because both support the same set of operations and take time for the basic operations. For lookupintensive applications, AVL trees are faster than red–black trees because they are more strictly balanced.^{[4]} Similar to red–black trees, AVL trees are heightbalanced. Both are, in general, neither weightbalanced nor balanced for any ^{[5]} that is, sibling nodes can have hugely differing numbers of descendants.
Contents
Definition[edit]
Balance factor[edit]
In a binary tree the balance factor of a node N is defined to be the height difference
 BalanceFactor(N) := Height(RightSubtree(N)) – Height(LeftSubtree(N)) ^{[6]}
of its two child subtrees. A binary tree is defined to be an AVL tree if the invariant
 BalanceFactor(N) ∈ {–1,0,+1}^{[7]}
holds for every node N in the tree.
A node N with BalanceFactor(N) < 0 is called "leftheavy", one with BalanceFactor(N) > 0 is called "rightheavy", and one with BalanceFactor(N) = 0 is sometimes simply called "balanced".
 Remark
In what follows, because there is a onetoone correspondence between nodes and the subtrees rooted by them, the name of an object is sometimes used to refer to the node and sometimes used to refer to the subtree.
Properties[edit]
Balance factors can be kept uptodate by knowing the previous balance factors and the change in height – it is not necessary to know the absolute height. For holding the AVL balance information in the traditional way, two bits per node are sufficient. However, later research showed if the AVL tree is implemented as a rank balanced tree with delta ranks allowed of 1 or 2 – with meaning "when going upward there is an additional increment in height of one or two", this can be done with one bit.
The height of an AVL tree with nodes lies in the interval:^{[8]}
with the golden ratio φ := (1+√5) ⁄_{2} ≈ 1.618, c := ^{1}⁄ log_{2} φ ≈ 1.44, and b := ^{c}⁄_{2} log_{2} 5 – 2 ≈ –0.328. This is because an AVL tree of height h contains at least F_{h+2} – 1 nodes where {F_{h}} is the Fibonacci sequence with the seed values F_{1} = 1, F_{2} = 1.
Operations[edit]
Readonly operations of an AVL tree involve carrying out the same actions as would be carried out on an unbalanced binary search tree, but modifications have to observe and restore the height balance of the subtrees.
Searching[edit]
This section needs additional citations for verification. (July 2016) (Learn how and when to remove this template message) 
Searching for a specific key in an AVL tree can be done the same way as that of a normal unbalanced binary search tree. In order for search to work effectively it has to employ a comparison function which establishes a total order (or at least a total preorder) on the set of keys. The number of comparisons required for successful search is limited by the height h and for unsuccessful search is very close to h, so both are in O(log n).
Traversal[edit]
This section needs additional citations for verification. (July 2016) (Learn how and when to remove this template message) 
Once a node has been found in an AVL tree, the next or previous node can be accessed in amortized constant time. Some instances of exploring these "nearby" nodes require traversing up to h ∝ log(n) links (particularly when navigating from the rightmost leaf of the root’s left subtree to the root or from the root to the leftmost leaf of the root’s right subtree; in the AVL tree of figure 1, moving from node P to the next but one node Q takes 3 steps). However, exploring all n nodes of the tree in this manner would visit each link exactly twice: one downward visit to enter the subtree rooted by that node, another visit upward to leave that node’s subtree after having explored it. And since there are n−1 links in any tree, the amortized cost is 2×(n−1)/n, or approximately 2.
Insert[edit]
This section has multiple issues. Please help improve it or discuss these issues on the talk page. (Learn how and when to remove these template messages)
(Learn how and when to remove this template message)

When inserting an element into an AVL tree, you initially follow the same process as inserting into a Binary Search Tree. More explicitly: In case a preceding search has not been successful the search routine returns the tree itself with indication EMPTY and the new node is inserted as root. Or, if the tree has not been empty the search routine returns a node and a direction (left or right) where the returned node does not have a child. Then the node to be inserted is made child of the returned node at the returned direction.
After this insertion it is necessary to check each of the node’s ancestors for consistency with the invariants of AVL trees: this is called "retracing". This is achieved by considering the balance factor of each node.^{[9]}^{[10]}
Since with a single insertion the height of an AVL subtree cannot increase by more than one, the temporary balance factor of a node after an insertion will be in the range [–2,+2]. For each node checked, if the temporary balance factor remains in the range from –1 to +1 then only an update of the balance factor and no rotation is necessary. However, if the temporary balance factor becomes less than –1 or greater than +1, the subtree rooted at this node is AVL unbalanced, and a rotation is needed. It may be pointed out explicitly that with insertion as the code below shows, the adequate rotation immediately perfectly rebalances the tree.
In figure 1, by inserting the new node Z as a child of node X the height of that subtree Z increases from 0 to 1.
 Invariant of the retracing loop for an insertion
The height of the subtree rooted by Z has increased by 1. It is already in AVL shape.
In order to update the balance factors of all nodes, first observe that all nodes requiring correction lie from child to parent along the path of the inserted leaf. If the above procedure is applied to nodes along this path, starting from the leaf, then every node in the tree will again have a balance factor of −1, 0, or 1.
The retracing can stop if the balance factor becomes 0 implying that the height of that subtree remains unchanged.
If the balance factor becomes ±1 then the height of the subtree increases by one and the retracing needs to continue.
If the balance factor temporarily becomes ±2, this has to be repaired by an appropriate rotation after which the subtree has the same height as before (and its root the balance factor 0).
The time required is O(log n) for lookup, plus a maximum of O(log n) retracing levels (O(1) on average) on the way back to the root, so the operation can be completed in O(log n) time.
Delete[edit]
The preliminary steps for deleting a node are described in section Binary search tree#Deletion. There, the effective deletion of the subject node or the replacement node decreases the height of the corresponding child tree either from 1 to 0 or from 2 to 1, if that node had a child.
Starting at this subtree, it is necessary to check each of the ancestors for consistency with the invariants of AVL trees. This is called "retracing".
Since with a single deletion the height of an AVL subtree cannot decrease by more than one, the temporary balance factor of a node will be in the range from −2 to +2. If the balance factor remains in the range from −1 to +1 it can be adjusted in accord with the AVL rules. If it becomes ±2 then the subtree is unbalanced and needs to be rotated. (Unlike insertion where a rotation always balances the tree, after delete, there may be BF(Z) ≠ 0 (see fig.s 4 and 5), so that after the appropriate single or double rotation the height of the rebalanced subtree decreases by one meaning that the tree has to be rebalanced again on the next higher level.) The various cases of rotations are described in section Rebalancing.
 Invariant of the retracing loop for a deletion
The height of the subtree rooted by N has decreased by 1. It is already in AVL shape.
The retracing can stop if the balance factor becomes ±1 (it must have been 0) meaning that the height of that subtree remains unchanged.
If the balance factor becomes 0 (it must have been ±1) then the height of the subtree decreases by one and the retracing needs to continue.
If the balance factor temporarily becomes ±2, this has to be repaired by an appropriate rotation. It depends on the balance factor of the sibling Z (the higher child tree in fig. 4) whether the height of the subtree decreases by one –and the retracing needs to continue– or does not change (if Z has the balance factor 0) and the whole tree is in AVLshape.
The time required is O(log n) for lookup, plus a maximum of O(log n) retracing levels (O(1) on average) on the way back to the root, so the operation can be completed in O(log n) time.
Set operations and bulk operations[edit]
In addition to the singleelement insert, delete and lookup operations, several set operations have been defined on AVL trees: union, intersection and set difference. Then fast bulk operations on insertions or deletions can be implemented based on these set functions. These set operations rely on two helper operations, Split and Join. With the new operations, the implementation of AVL trees can be more efficient and highlyparallelizable.^{[11]}
 Join: The function Join is on two AVL trees t_{1} and t_{2} and a key k will return a tree containing all elements in t_{1}, t_{2} as well as k. It requires k to be greater than all keys in t_{1} and smaller than all keys in t_{2}. If the two trees differ by height at most one, Join simply create a new node with left subtree t_{1}, root k and right subtree t_{2}. Otherwise, suppose that t_{1} is higher than t_{2} for more than one (the other case is symmetric). Join follows the right spine of t_{1} until a node c which is balanced with t_{2}. At this point a new node with left child c, root k and right child t_{2} is created to replace c. The new node satisfies the AVL invariant, and its height is one greater than c. The increase in height can increase the height of its ancestors, possibly invalidating the AVL invariant of those nodes. This can be fixed either with a double rotation if invalid at the parent or a single left rotation if invalid higher in the tree, in both cases restoring the height for any further ancestor nodes. Join will therefore require at most two rotations. The cost of this function is the difference of the heights between the two input trees.
 Split: To split an AVL tree into two smaller trees, those smaller than key x, and those larger than key x, first draw a path from the root by inserting x into the AVL. After this insertion, all values less than x will be found on the left of the path, and all values greater than x will be found on the right. By applying Join, all the subtrees on the left side are merged bottomup using keys on the path as intermediate nodes from bottom to top to form the left tree, and the right part is asymmetric. The cost of Split is , order of the height of the tree.
The join algorithm is as follows:
function joinRightAVL(T_{L}, k, T_{R}) (l,k',c)=expose(T_{L}) if (h(c)<h(T_{R})+1) T'=Node(c,k,T_{R}) if (h(T')<=h(l)+1) then return Node(l,k',T') else return rotateLeft(Node(l,k'rotateRight(T'))) else T'=joinRightAVL(c,k,T_{R}) T=Node(l,k',T') if (h(T_{L})>h(T_{R})+1) return T else return rotateLeft(T) function joinLeftAVL(T_{L}, k, T_{R}) /* symmetric to joinRightAVL */ function join(T_{L}, k, T_{R}) if (h(T_{L})>h(T_{R})+1) return joinRightAVL(T_{L}, k, T_{R}) if (h(T_{R})>h(T_{L})+1) return joinLeftAVL(T_{L}, k, T_{R}) return Node(T_{L},k,T_{R})
Here of a node the height of . expose(v)=(l,k,r) means to extract a tree node 's left child , the key of the node , and the right child . Node(l,k,r) means to create a node of left child , key , and right child .
The split algorithm is as follows:
function split(T,k) if (T=nil) return (nil,false,nil) (L,(m,c),R)=expose(T) if (k=m) return (L,true,R) if (k<m) (L',b,R')=split(L,k) return (L',b,join(R',m,R)) if (k>m) (L',b,R')=split(R,k) return (join(L,m,L'),b,R))
The union of two AVLs t_{1} and t_{2} representing sets A and B, is an AVL t that represents A ∪ B. The following recursive function computes this union:
function union(t_{1}, t_{2}): if t_{1} = nil: return t_{2} if t_{2} = nil: return t_{1} t_{<}, t_{>} ← split t_{2} on t_{1}.root return join(t_{1}.root,union(left(t_{1}), t_{<}),union(right(t_{1}), t_{>}))
Here, Split is presumed to return two trees: one holding the keys less its input key, one holding the greater keys. (The algorithm is nondestructive, but an inplace destructive version exists as well.)
The algorithm for intersection or difference is similar, but requires the Join2 helper routine that is the same as Join but without the middle key. Based on the new functions for union, intersection or difference, either one key or multiple keys can be inserted to or deleted from the AVL tree. Since Split calls Join but does not deal with the balancing criteria of AVL trees directly, such an implementation is usually called the "joinbased" implementation.
The complexity of each of union, intersection and difference is for AVLs of sizes and . More importantly, since the recursive calls to union, intersection or difference are independent of each other, they can be executed in parallel with a parallel depth .^{[11]} When , the joinbased implementation has the same computational DAG as singleelement insertion and deletion.
Rebalancing[edit]
If during a modifying operation (e.g. insert, delete) a (temporary) height difference of more than one arises between two child subtrees, the parent subtree has to be "rebalanced". The given repair tools are the socalled tree rotations, because they move the keys only "vertically", so that the ("horizontal") inorder sequence of the keys is fully preserved (which is essential for a binarysearch tree).^{[9]}^{[10]}
Let X be the node that has a (temporary) balance factor of −2 or +2. Its left or right subtree was modified. Let Z be the higher child. Note that Z is in AVL shape by induction hypothesis.
In case of insertion this insertion has happened to one of Z's children in a way that Z's height has increased. In case of deletion this deletion has happened to the sibling t_{1} of Z in a way so that t_{1}'s height being already lower has decreased. (In that case Z's balance factor may be 0.)
There are four situations that might arise. We will describe them as Dir1 Dir2, where Dir1 comes from the set { left, right } and Dir2 as a balance factor comes from the set { leftheavy = −1, balanced = 0, rightheavy = +1 }.^{[12]}
Situation Dir1 Dir2 denotes:
 Z is a Dir1 child of its parent and
 Z is Dir2heavy if Dir2 != Dir1
 Z is not (−Dir2)heavy if Dir2 == Dir1
i.e.
Right Right  => Z is a right child of its parent X and  Z is not leftheavy  (i.e. BalanceFactor(Z) ≥ 0)  (see figure 4)  
Left Left  => Z is a left child of its parent X and  Z is not rightheavy  (i.e. BalanceFactor(Z) ≤ 0)  
Right Left  => Z is a right child of its parent X and  Z is leftheavy  (i.e. BalanceFactor(Z) = −1)  (see figure 5)  
Left Right  => Z is a left child of its parent X and  Z is rightheavy  (i.e. BalanceFactor(Z) = +1) 
The balance violation of case Dir1 == Dir2 is repaired by a simple rotation rotate_(−Dir1) (rotate_Left in figure 4 resp. its mirror rotate_Right).
The case Dir1 != Dir2 is repaired by a double rotation rotate_(−Dir2)(−Dir1) == rotate_Dir1Dir2 (rotate_RightLeft in figure 5 resp. its mirror rotate_LeftRight).
The cost of a rotation, both simple and double, is constant.
Simple rotation[edit]
Figure 4 shows a Right Right situation. In its upper half, node X has two child trees with a balance factor of +2. Moreover, the inner child t_{23} of Z (i.e., left child when Z is right child resp. right child when Z is left child) is not higher than its sibling t_{4}. This can happen by a height increase of subtree t_{4} or by a height decrease of subtree t_{1}. In the latter case, also the pale situation where t_{23} has the same height as t_{4} may occur.
The result of the left rotation is shown in the lower half of the figure. Three links (thick edges in figure 4) and two balance factors are to be updated.
As the figure shows, before an insertion, the leaf layer was at level h+1, temporarily at level h+2 and after the rotation again at level h+1. In case of a deletion, the leaf layer was at level h+2, where it is again, when t_{23} and t_{4} were of same height. Otherwise the leaf layer reaches level h+1, so that the height of the rotated tree decreases.
 Code snippet of a simple left rotation
Input:  X = root of subtree to be rotated left 
Z = right child of X, Z is rightheavy  
with height == Height(LeftSubtree(X))+2  
Result:  new root of rebalanced subtree 
Double rotation[edit]
Figure 5 shows a Right Left situation. In its upper third, node X has two child trees with a balance factor of +2. But unlike figure 4, the inner child Y of Z is higher than its sibling t_{4}. This can happen by the insertion of Y itself or a height increase of one of its subtrees t_{2} or t_{3} (with the consequence that they are of different height) or by a height decrease of subtree t_{1}. In the latter case, it may also occur that t_{2} and t_{3} are of same height.
The result of the first, the right, rotation is shown in the middle third of the figure. (With respect to the balance factors, this rotation is not of the same kind as the other AVL single rotations, because the height difference between Y and t_{4} is only 1.) The result of the final left rotation is shown in the lower third of the figure. Five links (thick edges in figure 5) and three balance factors are to be updated.
As the figure shows, before an insertion, the leaf layer was at level h+1, temporarily at level h+2 and after the double rotation again at level h+1. In case of a deletion, the leaf layer was at level h+2 and after the double rotation it is at level h+1, so that the height of the rotated tree decreases.
 Code snippet of a rightleft double rotation
Input:  X = root of subtree to be rotated 
Z = its right child, leftheavy  
with height == Height(LeftSubtree(X))+2  
Result:  new root of rebalanced subtree 
Comparison to other structures[edit]
Both AVL trees and red–black (RB) trees are selfbalancing binary search trees and they are related mathematically. Indeed, every AVL tree can be colored red–black,^{[13]} but there are RB trees which are not AVL balanced. For maintaining the AVL resp. RB tree's invariants, rotations play an important role. In the worst case, even without rotations, AVL or RB insertions or deletions require O(log n) inspections and/or updates to AVL balance factors resp. RB colors. RB insertions and deletions and AVL insertions require from zero to three tailrecursive rotations and run in amortized O(1) time,^{[14]}^{[15]} thus equally constant on average. AVL deletions requiring O(log n) rotations in the worst case are also O(1) on average. RB trees require storing one bit of information (the color) in each node, while AVL trees mostly use two bits for the balance factor, although, when stored at the children, one bit with meaning «lower than sibling» suffices. The bigger difference between the two data structures is their height limit.
For a tree of size n ≥ 1
 an AVL tree’s height is at most
 where the golden ratio, and .
 an RB tree’s height is at most
 .^{[16]}
AVL trees are more rigidly balanced than RB trees with an asymptotic relation ^{AVL}⁄_{RB}≈0.720 of the maximal heights. For insertions and deletions, Ben Pfaff shows in 79 measurements a relation of ^{AVL}⁄_{RB} between 0.677 and 1.077 with median ≈0.947 and geometric mean ≈0.910.^{[17]}
See also[edit]
 Trees
 Tree rotation
 WAVL tree
 Red–black tree
 Splay tree
 Scapegoat tree
 Btree
 Ttree
 List of data structures
References[edit]
 ^ ^{a} ^{b} ^{c} ^{d} ^{e} ^{f} Eric Alexander. "AVL Trees".
 ^ Robert Sedgewick, Algorithms, AddisonWesley, 1983, ISBN 0201066726, page 199, chapter 15: Balanced Trees.
 ^ Georgy AdelsonVelsky, G.; Evgenii Landis (1962). "An algorithm for the organization of information". Proceedings of the USSR Academy of Sciences (in Russian). 146: 263–266. English translation by Myron J. Ricci in Soviet Mathematics  Doklady, 3:1259–1263, 1962.
 ^ Pfaff, Ben (June 2004). "Performance Analysis of BSTs in System Software" (PDF). Stanford University.
 ^ AVL trees are not weightbalanced? (meaning: AVL trees are not μbalanced?)
Thereby: A Binary Tree is called balanced, with , if for every node , the inequality  ^ Knuth, Donald E. (2000). Sorting and searching (2. ed., 6. printing, newly updated and rev. ed.). Boston [u.a.]: AddisonWesley. p. 459. ISBN 0201896850.
 ^ Rajinikanth. "AVL Tree :: Data Structures". btechsmartclass.com. Retrieved 20180309.
 ^ Knuth, Donald E. (2000). Sorting and searching (2. ed., 6. printing, newly updated and rev. ed.). Boston [u.a.]: AddisonWesley. p. 460. ISBN 0201896850.
 ^ ^{a} ^{b} Knuth, Donald E. (2000). Sorting and searching (2. ed., 6. printing, newly updated and rev. ed.). Boston [u.a.]: AddisonWesley. pp. 458–481. ISBN 0201896850.
 ^ ^{a} ^{b} Pfaff, Ben (2004). An Introduction to Binary Search Trees and Balanced Trees. Free Software Foundation, Inc. pp. 107–138.
 ^ ^{a} ^{b} Blelloch, Guy E.; Ferizovic, Daniel; Sun, Yihan (2016), "Just join for parallel ordered sets", Symposium on Parallel Algorithms and Architectures, ACM, pp. 253–264, doi:10.1145/2935764.2935768, ISBN 9781450342100.
 ^ Thereby, rotations with case Balanced do not occur with insertions.
 ^ Paul E. Black (20150413). "AVL tree". Dictionary of Algorithms and Data Structures. National Institute of Standards and Technology. Retrieved 20160702.
 ^ Mehlhorn & Sanders 2008, pp. 165, 158
 ^ Dinesh P. Mehta, Sartaj Sahni (Ed.) Handbook of Data Structures and Applications 10.4.2
 ^ Red–black tree#Proof of asymptotic bounds
 ^ Ben Pfaff: Performance Analysis of BSTs in System Software. Stanford University 2004.
Further reading[edit]
 Donald Knuth. The Art of Computer Programming, Volume 3: Sorting and Searching, Third Edition. AddisonWesley, 1997. ISBN 0201896850. Pages 458–475 of section 6.2.3: Balanced Trees.
External links[edit]
The Wikibook Algorithm Implementation has a page on the topic of: AVL tree 
Wikimedia Commons has media related to AVLtrees. 
 This article incorporates public domain material from the NIST document: Black, Paul E. "AVL Tree". Dictionary of Algorithms and Data Structures.