CSC 103 Lecture Notes Week 7
More Kinds of Trees -- Heaps, B-Trees, Red-Black Trees
- 
Overview
- 
In these notes we examine some additional types and applications of tree data
structures.
- 
Heaps are a type of tree that can be used to represent a priority
queue data structure, which is a very useful variant of a queue.
- 
B-Trees are a form of n-ary search tree, typically used to hold large
databases on external storage devices such as disks.
- 
Red-black trees are a popular alternative to AVL trees for maintaining
height-balancing in a binary search tree.
 
 
- 
Heaps as priority queues
- 
A priority queue data abstraction provides three basic operations:
insert, deleteMin, and findMin.
- 
insert is like the enqueue operation in a queue, but with a
priority number.
- 
deleteMin is a priority-based dequeue operation.
- 
findMin is a priority-based first operation.
 
- 
In what follows, we will use a tree data structure as the concrete data
representation of a priority queue.
- 
Specifically, we will use a form of balanced binary tree called a
heap.
- 
This heap-based representation of a priority queue provides the following
performance for the basic priority queue operations:
- 
O(log N) for insert
- 
O(log N) for deleteMin
- 
O(1) for findMin
 where N is the number of elements in the priority queue.
 
- 
Heap structure property
- 
A binary tree that is completely filled, except possibly the last row, which
is filled from left to right.
- 
This is called a complete binary tree (see Figure 6.2).
- 
For height h, it contains between 2^h and 2^(h+1) - 1
nodes.
- 
This means the height is O(log N), for N = the number of nodes.
 
- 
A very nice property of heaps is that they can be represented directly in an
array, without using a pointer-based structure.
- 
Figure 6.3 shows the array-based representation of Figure 6.2.
- 
For any element at position i (with the root stored at position 1):
- 
the left child is at position 2i
- 
the right child is the element after the left, at position 2i + 1
- 
the parent is at position floor(i/2).
 
- 
These facts mean that tree traversal is very simple and efficient.
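These index formulas can be sketched directly in Python (a minimal illustration, not the book's Figure 6.3 code; the key values are just an example heap):

```python
# 1-based array representation of a heap; slot 0 is unused so the
# index formulas below hold exactly.  The keys are an example heap.
heap = [None, 13, 21, 16, 24, 31, 19, 68, 65, 26, 32]

def left(i):   return 2 * i        # left child of the node at position i
def right(i):  return 2 * i + 1    # right child
def parent(i): return i // 2       # parent, i.e. floor(i/2)

# The parent of the node at position 5 (key 31) is at position 2 (key 21).
assert heap[parent(5)] == 21
```

No pointers are stored anywhere; the parent/child relationships exist only in the arithmetic.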
 
- 
A max size estimate must be provided whenever a new array-based heap object is
constructed.
- 
This is not typically a big problem.
- 
Plus, we can resize if necessary.
 
- 
Figure 6.4 is a class skeleton.
- 
Heap order property
- 
The primary goal for a heap is to find the minimum-value element quickly.
- 
Given this, it makes sense to store the smallest element at the root, and
maintain this property recursively throughout the tree.
- 
Hence, the heap order property is:
For every node X, the key value of the parent of X is <=
the key of X, except for the parentless root of the tree.
 
- 
By this property, the min value in any tree is always at the root, which means
findMin runs in O(1) time.
- 
Figure 6.5 illustrates two complete binary trees, one a heap (on the left),
the other not a heap (on the right).
 
 
 
- 
Applications of priority queues.
- 
They're used a lot in operating systems, where jobs are put on a queue, but
given a priority in terms of how soon they should be run.
- 
A heap is also used as the basis of an O(N log N) sorting algorithm, known not
coincidentally as heapsort.
 
 
- 
Implementing the basic heap operations
- 
insert
- 
To insert a node X into a heap, we create a hole at the next available
location.
- 
Since we must maintain the complete tree structure property, this hole must be
at the next available spot along the frontier of the tree.
- 
If X can be placed in the hole without violating the order property,
we do it and we're done.
- 
Otherwise, we slide the hole-node's parent into the hole, thus bubbling the
hole up towards the root.
- 
We continue this until X can be placed in the location of the hole
without violating the heap order property.
- 
Figures 6.6 and 6.7 illustrate.
- 
This process is called percolate up, whereby an element to be inserted is
percolated up towards the root until it finds its proper place in heap order.
- 
The code is given in Figure 6.8.
- 
Note that the algorithm outlined above does not swap elements, but just moves
the hole, so as to avoid wasting the extra time for unneeded assignment
statements.
- 
An array-based trace is on the back of page 190.
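The percolate-up insert can be sketched as follows (a minimal Python sketch in the spirit of Figure 6.8, not the book's code; the class name and key values are illustrative):

```python
class BinaryHeap:
    """Min-heap stored in a 1-based array; slot 0 is unused."""

    def __init__(self):
        self.array = [None]

    def insert(self, x):
        self.array.append(None)          # create the hole at the next frontier spot
        hole = len(self.array) - 1
        # Slide each too-large parent down into the hole, bubbling the
        # hole up toward the root; no swaps, only single moves.
        while hole > 1 and self.array[hole // 2] > x:
            self.array[hole] = self.array[hole // 2]
            hole //= 2
        self.array[hole] = x             # x finally lands in the hole

h = BinaryHeap()
for key in [13, 21, 16, 24, 31, 19, 68, 65, 26, 32, 14]:
    h.insert(key)
# The final 14 percolates up past 31 and 21 but stops below the root 13.
```

Note how the loop condition checks both the order property and whether the hole has reached the root, so no sentinel value is needed.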
- 
deleteMin
- 
Heap delete is done in a manner similar to insert.
- 
First we find the minimum, which is guaranteed to be at the root.
- 
When the minimum is deleted, a hole is created at the root.
- 
In order to maintain the heap structure property, we must remove the last leaf,
X, and fill the hole with an appropriate value.
- 
If X can be placed at the root, we're done.
- 
If not, we slide the smaller of the hole's children into the hole, and repeat
the process until X can be properly placed.
- 
This is a percolate down process, analogous to the percolate
up of insert.
- 
Figures 6.9 through 6.11 illustrate.
- 
Figure 6.12 is the code.
- 
An array-based trace is on the back of page 192.
- 
Running times:
- 
O(log n) worst case.
- 
O(log n) average case, for equally likely keys.
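The percolate-down deleteMin can be sketched as follows (a minimal Python sketch in the spirit of Figure 6.12, not the book's code; the constructor exploits the fact that a sorted array already satisfies heap order):

```python
class BinaryHeap:
    """Min-heap in a 1-based array; slot 0 is unused."""

    def __init__(self, keys=()):
        self.array = [None] + sorted(keys)   # a sorted array is already a heap

    def delete_min(self):
        minimum = self.array[1]              # the min is guaranteed at the root
        last = self.array.pop()              # remove the last leaf, X
        if len(self.array) > 1:
            self._percolate_down(1, last)    # refill the hole at the root with X
        return minimum

    def _percolate_down(self, hole, x):
        n = len(self.array) - 1
        while 2 * hole <= n:                 # while the hole has a child
            child = 2 * hole                 # left child
            if child < n and self.array[child + 1] < self.array[child]:
                child += 1                   # pick the smaller of the two children
            if self.array[child] < x:
                self.array[hole] = self.array[child]   # slide child into the hole
                hole = child
            else:
                break
        self.array[hole] = x
```

Repeated delete_min calls yield the keys in sorted order, which is exactly the idea behind heapsort mentioned earlier.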
 
 
 
 
- 
Other heap operations.
- 
A heap by itself is good at finding the minimum, but otherwise bad at finding
other elements, since it maintains no other ordering information.
- 
If we want to be able to get at the ith element, we can use some additional
data structure, such as a hash table that stores the position of a given key
element (see Figure on back of 192).
- 
If we assume we have some indexing structure such as this, whereby we can find
the ith element, we can provide the following operations, all of which run in
O(log N) time.
- 
decreaseKey(position, amount)
- 
This lowers the value of the key at the given position by the given amount.
- 
If the change violates the order property, it can be fixed by percolating up.
 
- 
increaseKey(position, amount)
- 
This increases the value of the key at the given position by the given amount.
- 
If the change violates the order property, it can be fixed by percolating down.
 
- 
delete(position)
- 
Removes the element at the given position.
- 
Performed by first doing decreaseKey(position, maxInt) followed by
deleteMin().
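The decreaseKey-based delete can be sketched as follows (a minimal Python sketch on a bare 1-based heap array; the function names are illustrative, not the book's, and infinity stands in for maxInt as the amount):

```python
import math

def decrease_key(a, pos, amount):
    x = a[pos] - amount
    while pos > 1 and a[pos // 2] > x:   # fix the order property by percolating up
        a[pos] = a[pos // 2]
        pos //= 2
    a[pos] = x

def _percolate_down(a, hole, x):
    n = len(a) - 1
    while 2 * hole <= n:
        child = 2 * hole
        if child < n and a[child + 1] < a[child]:
            child += 1                   # smaller of the two children
        if a[child] < x:
            a[hole] = a[child]
            hole = child
        else:
            break
    a[hole] = x

def delete_min(a):
    minimum = a[1]
    last = a.pop()
    if len(a) > 1:
        _percolate_down(a, 1, last)
    return minimum

def delete(a, pos):
    decrease_key(a, pos, math.inf)       # key drops to -infinity, rises to root
    delete_min(a)                        # ...and is removed as the minimum
```

Decreasing by an unbounded amount forces the victim to the root, where the ordinary deleteMin machinery finishes the job.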
 
 
- 
A particularly interesting operation is buildHeap, which takes
N input items (say in an array) and builds a heap.
- 
The strategy is:
- 
Place the N items in any order into the heap array, maintaining the
structure property but not initially the order property.
- 
Percolate down each node in the top half of the tree (the nonleaf nodes),
working from the bottom up, using the algorithm given in Figure 6.14.
 
- 
With careful analysis, this can be shown to run in O(N) time.
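The buildHeap strategy can be sketched as follows (a minimal Python sketch of the Figure 6.14 idea, not the book's code):

```python
def build_heap(keys):
    a = [None] + list(keys)              # structure property holds; order may not
    n = len(a) - 1
    for i in range(n // 2, 0, -1):       # only positions 1..n/2 have children
        _percolate_down(a, i, n)         # fix order bottom-up
    return a

def _percolate_down(a, hole, n):
    x = a[hole]
    while 2 * hole <= n:
        child = 2 * hole
        if child < n and a[child + 1] < a[child]:
            child += 1                   # smaller of the two children
        if a[child] < x:
            a[hole] = a[child]
            hole = child
        else:
            break
    a[hole] = x
```

Since most nodes sit near the bottom and percolate only a short distance, the total work sums to O(N) rather than the naive O(N log N) bound.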
 
 
 
- 
The selection problem.
- 
A widely-used operation on collections is findKth, which finds the
kth smallest (or largest) element in a collection.
- 
A quick-and-dirty algorithm to do this is to do a simple sort, in
O(N^2) time, and then access the kth element in O(1) time,
for a total running time of O(N^2).
- 
An algorithm that uses buildHeap can do this operation in O(N log N)
time, as follows:
- 
Read the N elements into an array.
- 
Apply buildHeap.
- 
Perform k deleteMin operations.
- 
The kth item deleted is the one we're looking for.
 
- 
Running time:
- 
Worst case for buildHeap is O(N).
- 
Worst case for deleteMin is O(log N).
- 
Since there are k deleteMins, we get a total running time of
O(N + k log N).
- 
If k = O(N / log N), buildHeap dominates the running time,
and the total is O(N).
- 
For larger k, the running time maxes out at O(N log N).
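The heap-based selection algorithm can be sketched using Python's built-in binary-heap module, heapq, whose heapify is an O(N) buildHeap (the function name is illustrative):

```python
import heapq

def find_kth_smallest(items, k):
    a = list(items)                      # read the N elements into an array
    heapq.heapify(a)                     # buildHeap: O(N)
    for _ in range(k - 1):               # k - 1 deleteMins: O(k log N)
        heapq.heappop(a)
    return heapq.heappop(a)              # the kth item deleted
```

For example, the 3rd smallest of [7, 2, 9, 4, 1] is 4.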
 
 
 
- 
B-Trees -- from book.
 
- 
Red-black trees -- from book.