Programming Assignment P2

Audio Reduce Algorithm


Problem Description

Audio Reduce is a method for encoding an english word based on the way the word sounds rather than how it is spelled. The advantage of Audio Reduce is its ability to group names by sound rather than the exact spelling. Surnames that sound the same, but are spelled differently, like SMITH and SMYTH, have the same Audio Reduction.  This algorithm is useful in applications where a name search may have a key that doesn’t exactly match the target name.

Take, for example, given the name Paine, we desire to match any names in our database that “sound like” Paine, such as Pain, Pane, Payn, Payne, etc. All of these variants will produce the same Audio Reduction, in this case, 15.  The Audio Reduce algorithm was developed to increase the chances of  finding a surname in a database even though it may have been recorded under a different spelling than the one that is being used as a search key.

The Algorithm

The Audio Reduce algorithm encodes a word by performing four transformations (or “reductions”) in the following order:

1. Replace each character in the word with a numeral using the following rules:

A, E, I, O, U, Y -> 0
B, F, P, V -> 1
C, G, J, K, Q, S, X, Z -> 2
D, T -> 3
H, W -> 4
M, N -> 5
L -> 6
R -> 7


2. For any non-zero numeral that's followed by 4, eliminate the 4. For example, 2034 would be reduced to 203.

3. For any repeated sequences of non-zero numerals, replace the repeats by a single instance. E.g., 105503 would be reduced to 10503.

4. Finally, eliminate all zero's.

5. Words which consist of only vowels such as “Yee” should be coded as “0”.

Examples

Step Jackson Sherrell Washington
1
2022205 24077066 4024052305
2
2022205 2077066 402052305
3
20205 20706 402052305
4
225 276 425235

Step
Payne Pfister Ashcraft
1
10050 1102307 02427013
2
10050 1102307 0227013
3
10050 102307 027013
4
15
1237
2713

Step Sacks Lee Callahan
1
20222 600
20660405
2
20222 600
20660405
3
202
600
2060405
4
22
6
2645


Implementation Note

It is not required to implement the algorithm in the same manner as it is described above. It is possible to create alternate formulations so that it isn't necessary to carry out four separate passes over each word.

Problem Requirements

Write a reusable module that implements the Audio Reduction algorithm. The module receives one parameter as input, a String to be reduced, and returns a String containing the reduction. There is no output to standard output.

You must follow this Java class definition:

public class AudioReduction 
{
  /**
     Encode a String using the Audio Reduce Algorithm.
     @param word the string to be encoded
     @return String the encoding, as a string of digits.
     @pre word contains only characters A-Z (case insensitive).
     @pre word.length() >= 1
     */
  public static String reduce (String word)
}


Unit Testing

You must submit execution output that demonstrates that your program can produce the correct results for the nine names in the Examples above.  This output can be created in one of two ways:

Perform any additional tests you want to convince yourself that your solution is correct.

Acceptance Testing

In addition, your program must pass the instructor's acceptance test.  Once you are satisfied that your program is correct and is passing your unit tests, create a new time log entry.  Enter "Test" for the phase and in the comment field enter "Acceptance Testing".

Submit your source code using the Web-CAT grader. On this project the grader will not run your own unit tests.  Web-CAT will report checkstyle errors in red, but they will not count in the correctness/testing score. You will be able to get a Final score of 10.0 even if there are coding standard violations.  

If Web-CAT reports any errors, tally them in a new section of the defect tally without an inject phase, with a removal phase of "ACCEPTANCE TEST".  Don't add them to Injected Phase or total on the Summary form.

When Web-CAT assigns a 100% score to your work, you should finalize your work according to the assignment directions. 




4/22/10 Fixed typo in "Washington" example.