CSc 103 Programming Project
Famous Names
The goal of this project is to write a Java implementation of a
generic Binary Search Tree and write a client application that uses it.
Overview - Frequency Analysis of Famous Names
Using an online database, the instructor has created a file of nearly
34,000 names of famous people. Of course you are probably curious
what the most common first names are.
We want a program that can read a text file of names and analyze it to
determine the
frequency
of occurrence of first names. Assume that there is only one name
per line and that the person's first name starts at the beginning of
the line. Disregard differences between
lower
and upper case letters. Valid names contain only
alphabetic characters and have more than one letter. (Blanks and
punctuation are not considered as
letters). The output report should show the number of times each
name
was found in the text, ordered by frequency, then by upper case name
(alphabetically). Each line contains the name, the count, and the
percent of the total number of names (to 2 decimal places).
For example, an analysis of this data file
would yield:
BOB 3 20.00
MARY 3 20.00
TOM 3 20.00
ED 2 13.33
JILL 1 6.67
MARK 1 6.67
PRINCE 1 6.67
REVEREND 1 6.67
Note that even though "REVEREND" is a title, because it appears at the
start of the line we will treat it as a first name.
The application should obtain the input text from standard input.
(Do not print any prompt to the user).
Software Design
Click on a class in the diagram to view the javadocs.

You are to implement all the classes in the above design, except for FamousNameApp,
BinaryNode
and Natural.
BinarySearchTree This is a generic version
of StringSearchTree that we developed in lab.
NameCount, an ADT for a name count.
NameCountComparator, implements java.util.Comparator<NameCount>,
to enable sorting on frequency.
FamousName, an ADT for a special kind of string that
represents a famous name.
Document, process a stream of english text to count
the famous names.
NameFrequencyReport, generate the finished report.
Here are the .class files for the instructor components in a
JAR formatted file:
FamousNameComponents.jar
FamousNameComponents.jar (Ver 2)
For your convenience,
skeletons.zip
contains the skeletons for the files you need to implement.
Testing
You must write your own JUnit tests.
Your
components will be tested with instructor reference
tests. The reference tests are JUnit tests of your classes (shown
in green in the class diagram above). There will be a performance
test: the 34,000 names from the online
database. There's a time limit of 20 seconds.
On this project your solution must conform to the class coding
standard. Your unit tests
must demonstrate "statement" coverage (they cause every
statement in your program to be executed.)
Handing in Your Source
Electronically
- Submit a zip file containing your source code and unit tests to
Web-CAT. You do not need to submit FamousNameComponents.jar.
- When Web-CAT assigns a 100% score to your work, you should
finalize
your Time Log and submit it to the instructor at the next class
meeting.