CSC 103 Lecture Notes Week 7
More Kinds of Trees -- Heaps, B-Trees, Red-Black Trees
-
Overview
-
In these notes we examine some additional types and applications of tree data
structures.
-
Heaps are a type of tree that can be used to represent a priority
queue data structure, which is a very useful variant of a queue.
-
B-Trees are a form of n-ary search tree, typically used to hold large
databases on external storage devices such as disks.
-
Red-black trees are a popular alternative to AVL trees for maintaining
height-balancing in a binary search tree.
-
Heaps as priority queues
-
A priority queue data abstraction provides three basic operations:
insert, deleteMin, and findMin.
-
insert is like the enqueue operation in a queue, but with a
priority number.
-
deleteMin is a priority-based dequeue operation.
-
findMin is a priority-based first operation.
-
In what follows, we will use a tree data structure as the concrete data
representation of a priority queue.
-
Specifically, we will use a form of balanced binary tree called a
heap.
-
This heap-based representation of a priority queue provides the following
performance for the basic priority queue operations:
-
O(log N) for insert
-
O(log N) for deleteMin
-
O(1) for findMin
where N is the number of elements in the priority queue.
-
Heap structure property
-
A binary tree that is completely filled, except possibly for the last row,
which is filled from left to right.
-
This is called a complete binary tree (see Figure 6.2).
-
For height h, it contains between 2^h and 2^(h+1) - 1
nodes.
-
This means the height is O(log N), for N = the number of nodes.
-
A very nice property of heaps is that they can be represented directly in an
array, without using a pointer-based structure.
-
Figure 6.3 shows the array-based representation of Figure 6.2.
-
For any element at position i:
-
the left child is at position 2i
-
the right child is at position 2i + 1, immediately after the left child
-
the parent is at position ⌊i/2⌋ (integer division).
-
These facts mean that tree traversal is very simple and efficient.
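As a sketch, the index arithmetic above can be written directly; this assumes the 1-based array layout of Figure 6.3 (index 0 unused, root at position 1), and the helper names are ours, not from the text:

```python
# Index arithmetic for a 1-based array heap (index 0 unused, root at 1).

def left(i):
    return 2 * i          # left child of the node at position i

def right(i):
    return 2 * i + 1      # right child, immediately after the left

def parent(i):
    return i // 2         # floor(i/2), via integer division
```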
-
A max size estimate must be provided whenever a new array-based heap object is
constructed
-
This is not typically a big problem.
-
Plus, we can resize if necessary.
-
Figure 6.4 is a class skeleton.
-
Heap order property
-
The primary goal for a heap is to find the minimum-value element quickly.
-
Given this, it makes sense to store the smallest element at the root, and
maintain this property recursively throughout the tree.
-
Hence, the heap order property is:
For every node X, the key value of the parent of X is <=
the key of X, except for the parentless root of the tree.
-
By this property, the min value in any tree is always at the root, which means
findMin runs in O(1) time.
-
Figure 6.5 illustrates two complete binary trees, one a heap (on the left),
the other not a heap (on the right).
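A one-line check of the order property makes the Figure 6.5 distinction concrete. This helper is a sketch of our own (again assuming the 1-based array layout, with index 0 unused):

```python
def is_min_heap(heap):
    """Return True if the 1-based array heap satisfies the heap order
    property: every non-root node's key is >= its parent's key."""
    return all(heap[i // 2] <= heap[i] for i in range(2, len(heap)))
```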
-
Applications of priority queues.
-
They're used a lot in operating systems, where jobs are put on a queue, but
given a priority in terms of how soon they should be run.
-
A heap is also used as the basis of an O(N log N) sorting algorithm, known not
coincidentally as heapsort.
-
Implementing the basic heap operations
-
insert
-
To insert a node X into a heap, we create a hole at the next available
location.
-
Since we must maintain the complete tree structure property, this hole must be
at the next available spot along the frontier of the tree.
-
If X can be placed in the hole without violating the order property,
we do it and we're done.
-
Otherwise, we slide the hole-node's parent into the hole, thus bubbling the
hole up towards the root.
-
We continue this until X can be placed in the location of the hole
without violating the heap order property.
-
Figures 6.6 and 6.7 illustrate.
-
This process is called percolate up, whereby an element to be inserted is
percolated up towards the root until it finds its proper place in heap order.
-
The code is given in Figure 6.8.
-
Note that the algorithm outlined above does not swap elements, but just moves
the hole, so as to avoid wasting the extra time of unneeded assignment
statements.
-
An array-based trace is on the back of page 190.
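The percolate-up insertion described above can be sketched as a stand-alone function on a 1-based Python list, in the spirit of Figure 6.8 (the function name is hypothetical; note that only the hole moves, elements are never swapped):

```python
def heap_insert(heap, x):
    """Insert x into a 1-based min-heap array (heap[0] unused),
    percolating a hole up from the next frontier position."""
    heap.append(x)                       # hole at the next available spot
    hole = len(heap) - 1
    while hole > 1 and heap[hole // 2] > x:
        heap[hole] = heap[hole // 2]     # slide the parent into the hole
        hole //= 2                       # the hole bubbles up
    heap[hole] = x                       # x finally fills the hole
```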
-
deleteMin
-
Heap delete is done in a manner similar to insert.
-
First we find the minimum, which is guaranteed to be at the root.
-
When the minimum is deleted, a hole is created at the root.
-
In order to maintain the heap structure property, we must remove the last leaf,
X, and fill the hole with an appropriate value.
-
If X can be placed at the root, we're done.
-
If not, we slide the smaller of the hole's children into the hole, and repeat
the process until X can be properly placed.
-
This is a percolate down process, analogous to the percolate
up of insert.
-
Figures 6.9 through 6.11 illustrate.
-
Figure 6.12 is the code.
-
An array-based trace is on the back of page 192.
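A matching percolate-down sketch of deleteMin, again as a hypothetical stand-alone function in the spirit of Figure 6.12:

```python
def heap_delete_min(heap):
    """Remove and return the minimum of a 1-based min-heap array,
    percolating the root hole down, filled by the last leaf's value."""
    minimum = heap[1]
    last = heap.pop()            # remove the last leaf X
    n = len(heap) - 1            # elements remaining
    if n == 0:
        return minimum           # X was the minimum itself
    hole = 1
    while 2 * hole <= n:
        child = 2 * hole
        if child < n and heap[child + 1] < heap[child]:
            child += 1           # pick the smaller of the two children
        if heap[child] < last:
            heap[hole] = heap[child]   # slide the smaller child up
            hole = child
        else:
            break
    heap[hole] = last            # X fills the final hole
    return minimum
```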
-
Running times:
-
O(log N) worst case.
-
O(log N) average case, for equally likely keys.
-
Other heap operations.
-
A heap by itself is good at finding the minimum, but otherwise bad at finding
other elements, since it maintains no other ordering information.
-
If we want to be able to get at the ith element, we can use some additional
data structure, such as a hash table that stores the position of a given key
element (see Figure on back of 192).
-
If we assume we have some indexing structure such as this, whereby we can find
the ith element, we can provide the following operations, all of which run in
O(log N) time.
-
decreaseKey(position, amount)
-
This lowers the value of the key at the given position by the given amount.
-
If the change violates the order property, it can be fixed by percolating up.
-
increaseKey(position, amount)
-
This increases the value of the key at the given position by the given amount.
-
If the change violates the order property, it can be fixed by percolating down.
-
delete(position)
-
Removes the element at the given position.
-
Performed by first doing decreaseKey(position, maxInt) followed by
deleteMin().
-
A particularly interesting operation is buildHeap, which takes
N input items (say in an array) and builds a heap.
-
The strategy is:
-
Place the N items in any order into the heap array, maintaining the
structure property but not initially the order property.
-
Percolate down each non-leaf node, from position N/2 down to the root, using
the algorithm given in Figure 6.14.
-
With careful analysis, this can be shown to run in O(N) time.
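The two-step strategy above can be sketched as follows; this is an illustrative version (names are ours), reusing the same 1-based layout and percolate-down idea:

```python
def build_heap(items):
    """buildHeap sketch: place the items in arbitrary order, then
    percolate down every non-leaf node (positions N//2 down to 1).
    Despite the loop of percolations, the total work is O(N)."""
    heap = [None] + list(items)          # 1-based layout, heap[0] unused
    n = len(heap) - 1

    def percolate_down(hole):
        tmp = heap[hole]
        while 2 * hole <= n:
            child = 2 * hole
            if child < n and heap[child + 1] < heap[child]:
                child += 1               # smaller child
            if heap[child] < tmp:
                heap[hole] = heap[child]
                hole = child
            else:
                break
        heap[hole] = tmp

    for i in range(n // 2, 0, -1):       # deepest parent first
        percolate_down(i)
    return heap
```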
-
The selection problem.
-
A widely-used operation on collections is findKth, which finds the
kth smallest (or largest) element in collection.
-
A quick-and-dirty algorithm to do this is to do a simple sort, in
O(N^2) time, and then access the kth element in O(1) time,
for a total running time of O(N^2).
-
An algorithm that uses buildHeap can do this operation in O(N log N)
time, as follows:
-
Read the N elements into an array.
-
Apply buildHeap.
-
Perform k deleteMin operations.
-
The kth item deleted is the one we're looking for.
-
Running time:
-
Worst case for buildHeap is O(N).
-
Worst case for deleteMin is O(log N).
-
Since there are k deleteMins, we get a total running time of
O(N + k log N).
-
If k = O(N / log N), buildHeap dominates the running time,
and the total is O(N).
-
For larger k, the running time maxes out at O(N log N).
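The heap-based selection algorithm can be sketched with Python's standard-library heapq module (which uses a 0-based array heap internally, but the same buildHeap and percolation ideas); the function name is ours:

```python
import heapq

def find_kth_smallest(items, k):
    """Selection in O(N + k log N): build a heap in O(N),
    then perform k deleteMin operations at O(log N) each."""
    heap = list(items)
    heapq.heapify(heap)              # buildHeap: O(N)
    for _ in range(k - 1):
        heapq.heappop(heap)          # the first k-1 deleteMins
    return heapq.heappop(heap)       # the kth item deleted
```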
-
B-Trees -- from book.
-
Red-black trees -- from book.