Binary Search Tree (BST)
Updated:
⚠️ This post was created when I was in high school and is no longer maintained.
1. Binary search recap
algorithm BinarySearch(A, key, start, end):
if start < end then
mid := floor((start + end) / 2)
if key == A[mid] then
return mid
if key < A[mid] then
BinarySearch(A, key, start, mid1)
if key > A[mid] then
BinarySearch(A, key, mid+1, end)
=> Binary search takes Θ(log~2~n) time
2. Intuition
2.1 Components of a BST
 Two values: `key` and `satellite data`
 Three pointers: `left`, `right`, `parent` ![ccf68e7ad840ffd79b66e50d142fa114.png](http://hongyu.nl/_resources//25d79aa449f64240b8843edce1b9400b.png)
2.2 BST property
Tips: Be careful that a BST is NOT an almost complete binary tree!!!
3. BST operations
3.1 Searching
TREESEARCH: returns a pointer to a node with key k if one exists; otherwise, it returns NIL. The procedure begins its search at the root and traces a simple path downward in the tree. For each node x it encounters, it compares the key k with x.key
{ @x: starting node
 @k: target key
}
 recursive version 
algorithm TreeSearch(x, k)
if x == NIL or x.key == k then
return x
if k < x.key then
TreeSearch(x.left, k)
else
TreeSearch(x.right, k)
 iterative version 
algorithm TreeSearch(x, k):
While x != NIL and x.key != k do
if k < x.key then
x = x.left
else
x = x.right
return x
Tip: in TreeSearch()
, the sequence of nodes encountered forms a simple path downward from the root.
3.2 Minimum and Maximum:
We can always find an element in a binary search tree whose key is a minimum by following left child pointers from the root until we encounter a NIL —— CRLS
 we assume that the input node is nonNIL 
algorithm TreeMinimum(x):
While x.left != NIL do
x = x.left
return x
algorithm TreeMaximum(x):
While x.right != NIL do
x = x.right
return x
3.3 Successor and Predecessor
The structure of a binary search tree allows us to determine the successor/predecessor of a node without ever comparing keys. —— CRLS
 Successor:
If all keys are distinct, the successor of a node x is the node with the smallest key greater than
x.key
In the above diagram, the inorder successor of 8 is 10, the inorder successor of 10 is 12 and inorder successor of 14 is 20.
 Predecessor:
If all keys are distinct, the predecessor of a node x is the node with the greatest key smaller than
x.key
algorithm TreeSuccessor(x):
 there are two cases 
{ 1. node x has the right subtree:
 find the smallest node in its right subtree
}
if x.right != NIL then
return TreeMinimum(x.right)
{ 2. node x doesn't have right subtree:
 find the *lowest* ancestor of x whose left child
 is x.
}
y := x.parent
While y != NIL and x == y.right do
> (once x is no longer the right child of y, x > y)
x = y
y = y.parent
algorithm TreePredecessor(x):
 the opposite 
Final notes:
 Recall that every node is its ancestor.
 The worstcase relies on the height of the tree (height matters).
3.4 Insertion
BST insertion is nothing but fnding a correct leaf place (NIL
) for the newly inserted node:
{ Assume keys are distinct and n is the new node that need inserting
 with z.key = v and z.left == z.right == NIL
}
algorithm TreeInsert(T, n):
trailing := NIL > y (in the book) is the trailing pointer that represnts the parent of x
target := T.root
While target != NIL do > Newly inserted node is always a leaf so we need to find a proper NIL for it
trailing := target
if n.key < target.key then
target = target.left
else
target = target.right
n.parent := traling > this is where the trailing pointer comes into play (set the parent pointer of the new node)
if trailing == NIL then > empty tree
T.root = n
else if n.key < trailing.key then
trailing.left = n
else
trailing.right = n
[Question]: why we need trailing pointer?
… This NIL occupies the position where we wish to place the input item n. We need the trailing pointer y because by the time we find the NIL where n belongs, the search has proceeded one step beyond the node that needs to be changed. —— CLRS
By saying “one step beyond”, it means we cannot use a NIL
node to access its parent node that is needed to set n.parent
(like a singlely linkedlist). In other words, we use the trailing pointer to tack the parent when performing searching.
3.5 Deletion
There are 3 cases:
Sum up:
(say z
is the node that is wanted to be deleted)
If z
has:
 one child / single child: replace the deleted node by NIL or its child directly; > situation: (a), (b)
 two children:
 first, we need to find
z
’s successor nodey
which must be in the right subtree of z; (tips:y
has no left child < PBC see my book notes)  elevate
y
’s child to takey
’s position  repace
z
byy
and givey
allz
’s children
 first, we need to find
This eventually boils down to find the successor of the removed node and transplant it.
{ TRANSPLANT replaces one subtree as a child of its
 parent with another subtree. When TRANSPLANT replaces
 the subtree rooted at node 'old' with the subtree
 rooted at node 'new', node 'old'’s parent becomes node
 'new'’s parent, and 'old'’s parent ends up having
 'new' as its appropriate child.
}
algorithm Transplant(T, old, new):
 'old' is the root 
if old.parent == NIL then > Why not old == T.root?
T.root = new
 set parent node 
else if old == old.parent.left then
old.parent.left = new
else
old.parent.right = new
 set parent pointer 
if new != NIL then
new.parent = old.parent

algorithm TreeDelete(T, z):
 case (a), (b): single child 
if z.left == NIL then
Transplant(z, z.right)
else if z.right == NIL then
Transplant(z, z.left) > replace directly (assume all subtrees satisfy BST property)
else
 case (c), (d): two children 
successor := TreeMinimum(z.right) > y
if successor.parent != z then
 case d (prepare successor) 
Transplant(T, successor, successor.right)
 update right child 
successor.right = z.right
successor.right.parent = successor
 case c, d (replace z by its successor) 
Transplant(T, z, successor)
 update left child 
successor.left = z.left
successor.left.parent = successor
Tips: the most timeconsuming procedure lies in the TreeMinimum()
which may take Θ(logn) time.
4. Time Complexity

All
TreeSearch() TreeMinimum() TreeMaximum() TreeSuccessor() TreePredeseccor()
takeO(h)
time on a binary search tree of height h. 
For
TreeDelete()
,Tranplant()
takes constant time onlyTreeMinimun()
takesO(h)
time, so bothTreeInsert()
andTreeDelete()
takeO(h)
time on a binary search tree of height h.
5. Intermezzo: How many BT with n nodes exist?
first Catalan numbers: 1; 1; 2; 5; 14; 42; 132; 429; 1430; 4862; 16796; 58786; 208012; 742900; 2674440; 9694845; 35357670; 129644790; 477638700; 1767263190; 6564120420; 24466267020; 91482563640; 343059613650; 1289904147324; 4861946401452; …
Leave a comment