Why study data structures?
Programs consist of two things: data and algorithms.
The algorithms describe how the data is to be transformed.
We study data structures because adding structure to our data can make
the algorithms much simpler, easier to maintain, and often faster.
Here's a very simple example:
/**
 * SodaNoDataStructure keeps track of inventory of sodas.
 * A list of transactions is entered with a soda brand
 * and an amount. The inventory is updated by the amount.
 */
public class SodaNoDataStructure
{
    public static void main(String[] args)
    {
        int coke = 0;
        int jolt = 0;
        int pepsi = 0;
        int sprite = 0;
        java.util.Scanner console = new java.util.Scanner(System.in);
        System.out.println("Enter transactions (soda and amount):\n");
        // Read all the transactions
        while (console.hasNext())
        {
            String soda = console.next().toUpperCase();
            int amount = console.nextInt();
            // The first letter of the brand selects the variable to update
            char response = soda.charAt(0);
            if (response == 'C')
                coke += amount;
            if (response == 'P')
                pepsi += amount;
            if (response == 'J')
                jolt += amount;
            if (response == 'S')
                sprite += amount;
        }
        System.out.println("Final inventory.\n");
        System.out.println("coke " + coke);
        System.out.println("jolt " + jolt);
        System.out.println("pepsi " + pepsi);
        System.out.println("sprite " + sprite);
    }
}
The program keeps track of the inventory on hand of four different kinds of soft drinks.
Each transaction is entered on a separate line with a positive number for additions
to the inventory or a negative number for deductions from the inventory.
The final inventory is reported at the end.
Enter transactions (soda and amount):
coke 40
jolt 20
sprite 50
coke -10
Final inventory.
coke 30
jolt 20
pepsi 0
sprite 50
In this simple version each inventory amount is stored in a separately named variable.
There's no data structure, only individual data items.
The program works correctly, but determining the appropriate variable to update
requires a chain of decisions that explicitly checks every possibility.
There's nothing bad about that in itself, but the approach doesn't scale up easily.
If we want to enhance the program to accept a fifth variety of soda, like "Monster",
we have to modify three different parts of the code:
- add a new variable
- add another if statement
- add another print statement to the report
Each modification introduces the possibility of making a mistake.
In addition, certain kinds of computations become quite difficult to express succinctly.
For example, how would we find the average inventory?
    (coke + jolt + pepsi + sprite) / 4
When a fifth soft drink is added, one must remember to change the denominator
as well as add the new item to the numerator.
What if we want to add five more soft drinks, or ten? At some point, the code just
gets too unwieldy to manage. A real program that has hundreds or thousands of data
items would be so long as to be prohibitively expensive to write.
It's just not practical to store every data item in its own separately named variable.
Consider this alternative solution that uses a data structure, in this case an array of integers.
public class SodaWithArrays
{
    public static void main(String[] args)
    {
        // Soda inventory, indexed by the character code of the first letter
        // (256 slots cover first letters with character codes below 256)
        int[] inventory = new int[256];
        // One code per soda; position 0 is an unused blank
        String sodacodes = " CPJS";
        java.util.Scanner console = new java.util.Scanner(System.in);
        System.out.println("Enter transactions (soda and amount):\n");
        // Read all the transactions
        while (console.hasNext())
        {
            String soda = console.next().toUpperCase();
            int amount = console.nextInt();
            char response = soda.charAt(0);
            // The letter itself is the array index, so no decisions are needed
            inventory[response] += amount;
        }
        System.out.println("Final inventory.\n");
        for (int index = 1; index < sodacodes.length(); index++)
        {
            char soda = sodacodes.charAt(index);
            System.out.print(soda + " ");
            System.out.println(inventory[soda]);
        }
    }
}
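Each inventory amount is saved in an element of the array, indexed by the first letter of the soda's name.
The program accepts the same input and computes the same totals, although the report
now labels each soda by its one-letter code.
However, upgrading the program to include a fifth type of soda is trivial.
All that is needed is to add a single letter to the sodacodes string.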
String sodacodes = " CPJSM";
NOTHING ELSE has to change. Importantly, we make no changes to
the algorithm logic, which greatly reduces the likelihood that we'll
introduce an error.
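The average-inventory question raised earlier also becomes easy once the amounts live in an array. The following is a minimal sketch, not part of the original program: the class name SodaAverage and the method average are hypothetical, and it assumes the same layout as SodaWithArrays (an int array indexed by first letter, and a sodacodes string whose position 0 is an unused blank).

```java
class SodaAverage
{
    /** Returns the mean inventory over the sodas listed in sodacodes. */
    static double average(int[] inventory, String sodacodes)
    {
        int total = 0;
        int count = 0;
        // Position 0 of sodacodes is a placeholder blank, so start at 1
        for (int index = 1; index < sodacodes.length(); index++)
        {
            total += inventory[sodacodes.charAt(index)];
            count++;
        }
        // Guard against an empty code list; the cast avoids integer division
        return count == 0 ? 0.0 : (double) total / count;
    }

    public static void main(String[] args)
    {
        int[] inventory = new int[256];
        inventory['C'] = 30;
        inventory['J'] = 20;
        inventory['S'] = 50;
        System.out.println(average(inventory, " CPJS"));   // prints 25.0
    }
}
```

Both the numerator and the denominator are derived from sodacodes, so adding "M" to that string updates the average with no further edits.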
Consider another example. The following program segment represents a checkerboard as a string.
public class CheckerboardString
{
    // 64 characters: 8 rows of 8 squares, stored row by row
    String board = "B B B B  B B B BB B B B                  W W W WW W W W  W W W W";

    public boolean isEmpty(int row, int column)
    {
        // Convert the two-dimensional position into a position in the string
        int index = row * 8 + column;
        char square = board.charAt(index);
        return square == ' ';
    }
}
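The string is not easy to verify visually.
The algorithm to determine whether a square is empty must first convert the row and
column into a single index, which hides the layout of the board and depends on
knowing that each row is 8 squares wide.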
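One further risk of the flat string, offered here as an illustration rather than as a claim from the original text: a column number that is out of range does not fail, it silently lands in the next row. The sketch below uses a hypothetical 2-row, 4-column board; the names FlatBoardPitfall, WIDTH, and BOARD are assumptions made for the example.

```java
class FlatBoardPitfall
{
    // Hypothetical 2-row, 4-column board stored row by row in one string
    static final int WIDTH = 4;
    static final String BOARD = "B B " + "    ";

    static boolean isEmpty(int row, int column)
    {
        return BOARD.charAt(row * WIDTH + column) == ' ';
    }

    public static void main(String[] args)
    {
        System.out.println(isEmpty(0, 1));   // true: row 0, column 1 is blank
        // Column 4 does not exist, yet the lookup wraps into row 1, column 0
        System.out.println(isEmpty(0, 4));   // true, with no error reported
    }
}
```

With a two-dimensional array of the same shape, the second lookup would throw an ArrayIndexOutOfBoundsException instead of returning an answer for the wrong square.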
public class CheckerboardArray
{
    // 8 rows of 8 squares, matching the row * 8 + column layout of the string version
    char[][] board =
        {{'B',' ','B',' ','B',' ','B',' '},
         {' ','B',' ','B',' ','B',' ','B'},
         {'B',' ','B',' ','B',' ','B',' '},
         {' ',' ',' ',' ',' ',' ',' ',' '},
         {' ',' ',' ',' ',' ',' ',' ',' '},
         {' ','W',' ','W',' ','W',' ','W'},
         {'W',' ','W',' ','W',' ','W',' '},
         {' ','W',' ','W',' ','W',' ','W'}};

    public boolean isEmpty(int row, int column)
    {
        return board[row][column] == ' ';
    }
}
When represented as a two-dimensional array, the data structure is more complex,
but the algorithm is simple and obvious.
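Other board queries stay just as direct with the two-dimensional array. As a small sketch under stated assumptions (the class CheckerboardCount and the method countPieces are hypothetical and do not appear in the original, and the 3-by-4 board is only a stand-in), counting the pieces of one color is a pair of nested loops:

```java
class CheckerboardCount
{
    /** Counts the squares on the board that hold the given piece. */
    static int countPieces(char[][] board, char piece)
    {
        int count = 0;
        for (char[] row : board)
            for (char square : row)
                if (square == piece)
                    count++;
        return count;
    }

    public static void main(String[] args)
    {
        // Small stand-in board: 3 rows of 4 squares
        char[][] board = {
            {'B',' ','B',' '},
            {' ',' ',' ',' '},
            {' ','W',' ','W'}};
        System.out.println(countPieces(board, 'B'));   // prints 2
    }
}
```

The loops never mention the board's width or height, so the same method works unchanged on an 8-by-8 board.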
Key Points
- In general, the more sophisticated the data structure, the simpler the algorithm.
- Simple algorithms are less expensive to develop.
- There is less code to read and comprehend.
- The logic is simpler and modifications are less likely to introduce errors.
- It's usually much easier to repair defects, make modifications, or add enhancements.