Binary Search Tree lab 2

CPE 103

Lookup the definition of tree height in Weiss.
For each exercise below, draw on paper the Binary Search Tree that would result after inserting the letters in the order shown.  Calculate the height of each tree.

1) N T W D F Z A

2) F W Z T N D A

3) W Z T N F D A


Add a findHeight() method to the Binary Search Tree from the previous lab.  Implement it using a recursive helper method. 

Create an application named BSTdriver that uses your BST. Read from standard input regular English text containing one word per line. Convert the word to upper case and add each new word to the BST.   Print the height of the tree that was created. List the words in alphabetical order.

Test your application using the data from the exercises above to be sure it works correctly.

Now perform an experiment, recording the procedure and observations.

Instrument your application with a timer.  Do empirical studies of how long it takes to build a tree from a text file containing "Huckleberry Finn" (alpha-only) by Mark Twain.

It is recommended that you use input redirection.
E.g.,
       java BSTdriver < huckfinncleanwords.txt

  1. What is the height of the huckfinncleanwords.txt tree?
  2. Use the Unix command for counting words in a file, wc, to discover the size of the data file. Consult the man pages for help. Calculate the performance of your driver in words per second.
  3. Use the Unix sort command to create a new file that has the huck finn data in alphabetical order. 
  4. Run the driver again on the alphabetized file.  What is its height?  How long does it take to create? 
  5. Summarize your observations in tabular format:
    File
    # words
    height
    time
    words/sec
    huckfinncleanwords.txt
    ?
    ?
    ?
    ?
    sortedhuckwords.txt ?
    ?
    ?
    ?
  6. Check your result against the instructor's solution.

  7. Now for comparison, run the same experiment except using a Map instead of a binary search tree.  You may use this driver.
    File
    # words
    time
    words/sec
    huckfinncleanwords.txt
    ?
    ?
    ?
    sortedhuckwords.txt ?
    ?
    ?
    Check your result against the instructor's solution.

  8. Make sure each author's name is included in the class header and submit a zip file of your completed source code to PolyLearn.

Hint: The running time for sortedhuckwords.txt should be less than ten seconds.  If not, you probably relied on the binary search tree to throw DuplicateItemException, which is very expensive, performance-wise.   Think of an alternate solution that doesn't catch any exceptions.
Hint: Stack overflow? Running time too long?  Don't squelch any exceptions.
Check the precondition on insertBinaryNode.