Duplication




Introduction

Carrot Top - Just one of him is annoying.
Duplication can become annoying and unmanageable, just like Carrot Top. Computer technology changes constantly, and if a product has several different implementations, maintenance becomes a nightmare.

Concepts of Duplication

Duplication can also be avoided in code by:

Imposed Duplication

Programmers often feel that duplication is imposed on them and cannot be avoided. In most cases this is not true. Documentation in code is a common place where duplication occurs needlessly.

A good program should speak for itself.   A brief, high-level overview at the beginning of your module is all that is needed to describe its purpose.

Bad programs require a lot of comments. If low-level comments are placed within code, duplication is inevitable, because each comment restates what the code already says. When a line of code is changed, its accompanying comment should also be changed, but this step is often overlooked. The stale comment wastes time and energy later, when someone tries to work out your code's purpose from comments that no longer match it.
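
As a rough illustration of how a low-level comment duplicates the code and then drifts out of date, consider the sketch below. The class Invoice, its method names, and the 5% and 7% tax rates are all invented for this example and are not taken from any real system.

```java
// High-level overview: computes invoice totals including sales tax.
// (Hypothetical example; the class, names, and rates are assumptions.)
class Invoice {
    static final double SALES_TAX_RATE = 0.07;

    // Duplicated knowledge: the comment below was written when the rate
    // was 5%, and nobody updated it when the code changed to 7%.
    // "Multiply the subtotal by 1.05 to add tax."
    static double totalWithStaleComment(double subtotal) {
        return subtotal * 1.07;
    }

    // No low-level comment is needed here: the names say what the code
    // does, and the rate lives in exactly one place.
    static double totalWithTax(double subtotal) {
        return subtotal * (1.0 + SALES_TAX_RATE);
    }
}
```

Both methods compute the same total, but only the second one can be read without consulting a comment that may be wrong.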

Inadvertent Duplication

If you don't think a design through thoroughly, you can duplicate code without even knowing it. Duplication can hide in the simplest designs.

Example
Suppose you have a class called Line that contains three variables: start, end, and length.
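
One way to read this example: length is fully determined by start and end, so storing it as a third variable records the same fact twice, and the two copies can disagree if an endpoint changes without length being updated. A minimal sketch of a version that avoids this follows. The Point helper, the double coordinates, and the method name length() are assumptions made for illustration.

```java
// Hypothetical helper type; the original example does not say how
// start and end are represented.
class Point {
    final double x;
    final double y;

    Point(double x, double y) {
        this.x = x;
        this.y = y;
    }
}

class Line {
    Point start;
    Point end;

    Line(Point start, Point end) {
        this.start = start;
        this.end = end;
    }

    // length is calculated from start and end instead of being stored,
    // so it can never fall out of step with them.
    double length() {
        return Math.hypot(end.x - start.x, end.y - start.y);
    }
}
```

If profiling ever showed that recomputing the length was too slow, a cached value could be added behind the same length() method without changing any caller.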

Impatient Duplication

Software Maintenance Nightmare
Situation
A software company makes a printing and book-binding machine. Twenty different businesses own this product. Each business's machine runs a different version of the software.

When the software company realizes that its software exists in twenty different versions, it asks each business whether the software can be standardized. Each business is skeptical. Their own printing and binding machine is functioning properly and has been for years. They do not see any reason why the software should be tampered with. Their mentality is, "If it ain't broke, don't fix it."

Then a necessary upgrade to the printing and binding machine surfaces. The machine prints envelopes, and a zip code has changed. Since each machine has different software on it, twenty different, specialized fixes are needed, one for each machine. Not only is this a waste of time and money, but it could have been avoided years before.

How could this have happened?
Each machine may have started out with the same code. The code was copied, pasted, and edited for use in all twenty machines. Whenever a code change was needed, the company had to fetch the source code for that particular machine and make the change there. Over time, these changes caused the code on each machine to diverge into its own version.

How could this have been avoided?
Instead of the "copy and paste" method, which falls under impatient duplication, actual reuse should have been applied, meaning composition and inheritance. If time had been taken to use these techniques, time and money would have been saved in the long run.
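
The sketch below shows one way composition could have kept the zip code logic in a single place. It is only an illustration under stated assumptions: the class names ZipCodeDirectory and BindingMachine, the envelope sizes, and the town and zip values are all made up, and the real machine software is not described in enough detail to reproduce.

```java
import java.util.HashMap;
import java.util.Map;

// Shared code, kept in one place for all machines (hypothetical name).
class ZipCodeDirectory {
    private final Map<String, String> zipByTown = new HashMap<>();

    void setZip(String town, String zip) {
        zipByTown.put(town, zip);
    }

    String zipFor(String town) {
        return zipByTown.get(town);
    }
}

// One customer's machine. Only what is specific to the customer (here, a
// made-up envelope size) lives in this class. The zip code logic is
// reused through composition rather than pasted in.
class BindingMachine {
    private final String envelopeSize;
    private final ZipCodeDirectory zipCodes;

    BindingMachine(String envelopeSize, ZipCodeDirectory zipCodes) {
        this.envelopeSize = envelopeSize;
        this.zipCodes = zipCodes;
    }

    String addressLine(String town) {
        return town + " " + zipCodes.zipFor(town) + " (" + envelopeSize + ")";
    }
}
```

With this arrangement, a changed zip code is corrected once in ZipCodeDirectory and every machine that uses it picks up the fix.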

Avoid Magic Numbers

What is a magic number?
A magic number is a numeric literal embedded directly in code. Depending on the language you are using, this may include even small values such as 0, 1, or 2.

Why are magic numbers bad?
How do you get rid of magic numbers?
Define the literal using a descriptive name.   This name is then used in place of the number throughout your program.

These names give your code several advantages. Examples:
  1. Eliminate Several Instances of Literal Number

    public static final int CURRENT_AGE = 23;


    Every time your age changes, this is the only place where it has to be changed. This is easier than changing every instance of the numeric literal each time you have a birthday.

  2. Make the Code More Readable
    • Wrong way:
      boolean apartmentsAvailable = (tenants < 200);
      Written this way, 200 is an ambiguous number.

    • Correct way:
      public static final int NUMBER_OF_APARTMENTS = 200; ...
      boolean apartmentsAvailable = (tenants < NUMBER_OF_APARTMENTS);

      Now we can easily see that the tenants variable must be less than the number of apartments. Also, if the apartment complex expands, the number of apartments can easily be changed.

  3. Make Sure the Name is Descriptive
    • Wrong way:
      for(int i=0; i < 100; i++) {
          ...
      }
      Here 100 is an ambiguous number.

    • Still wrong:
      public static final int X = 100;

      for(int i=0; i < X; i++) {
          ...
      }
      This time X is chosen to represent 100, so we can use X in every instance of 100.   The problem is, we still don't know what X represents since the name is not descriptive.

    • Correct way:
      public static final int NUM_OF_RECORDS = 100;

      for(int i=0; i < NUM_OF_RECORDS; i++) {
          ...
      }
      Finally, we can see that the counter must be less than the number of records. This also gives us some insight into what the loop contains. It most likely will manipulate each record in some way.
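
To make the first advantage concrete, here is a small self-contained sketch in which one named constant replaces a literal that would otherwise appear in two places. The class ApartmentComplex and the vacancies method are hypothetical additions, and the sketch assumes one tenant per apartment, as the apartment example above does.

```java
class ApartmentComplex {
    // The single place to edit if the complex expands.
    static final int NUMBER_OF_APARTMENTS = 200;

    static boolean apartmentsAvailable(int tenants) {
        return tenants < NUMBER_OF_APARTMENTS;
    }

    // Second use of the same value. With a bare 200 here, an expansion
    // would require finding and editing both methods.
    static int vacancies(int tenants) {
        return NUMBER_OF_APARTMENTS - tenants;
    }
}
```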


Reuse vs. Duplication

Reuse and duplication each have their advantages and drawbacks.   Reuse is illustrated using three examples: inheritance, composition, and interfaces.   Duplication is represented by the copy and paste practice.

The main issues considered in making the choice between these four options are ownership, trust, and time.

Copy and Paste

Copying and pasting a class creates two separate copies of that class. You then have total ownership of both. This can be a good thing in some cases, since only changes you make will affect your code. Over time, however, it becomes unmanageable, because every change has to be made in all copied versions of the code. Ownership links in with the trust issue: if you have full ownership of the code, you know you can trust it, since you have full control over it. As far as time is concerned, you save a lot of time immediately but lose time in the long run, because each change is multiplied by the number of copies of the code you have floating around.

In the example, the Student class has been copied and pasted to make the Grad Student class. Duplicating the methods saves time up front.
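
A minimal sketch of what this copy and paste might look like in Java, and how the copies drift apart, is given below. The fields and methods are invented for illustration, since the original classes are only described in a diagram.

```java
// Hypothetical classes; the fields and methods are assumptions.
class Student {
    private final String name;

    Student(String name) {
        this.name = name;
    }

    // Fixed after the copy was made: stray whitespace is now trimmed.
    String displayName() {
        return name.trim();
    }
}

// Pasted from an earlier version of Student, then extended with a thesis
// title. The trim() fix above never reached this copy.
class GradStudent {
    private final String name;
    private final String thesisTitle;

    GradStudent(String name, String thesisTitle) {
        this.name = name;
        this.thesisTitle = thesisTitle;
    }

    String displayName() {
        return name;
    }

    String thesisTitle() {
        return thesisTitle;
    }
}
```

The two displayName methods now behave differently for the same input, which is exactly the kind of maintenance cost that the time saved up front has to be weighed against.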

Inheritance

A class using inheritance acquires the methods of the class it inherits from. You do not have full ownership of the inheriting class, since it consists of the parent class plus its own details. This "is-a" relationship makes the inheriting class vulnerable to any changes made to the parent class. This is good in the sense that changes propagate down the inheritance line automatically, but a change may not be what you want, or may happen unexpectedly. This is where trust comes in: your class must trust every class above it in the inheritance tree. Time may not be saved up front, but in the long run less time will be spent on changes, since a change to the parent class applies to all classes beneath it.

In the example, Grad Student is a Student plus its own details. Anything that is added or changed in the Student class also applies to the inheriting Grad Student class. A downside is that this makes the classes tightly coupled.
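
The "is-a" relationship can be sketched in Java as follows. The fields and the describe and thesisTitle methods are assumptions chosen to keep the example short.

```java
// Hypothetical parent class; its members are assumptions.
class Student {
    private final String name;

    Student(String name) {
        this.name = name;
    }

    String describe() {
        return "Student " + name;
    }
}

// GradStudent "is-a" Student plus its own details. It inherits describe(),
// so any later change to Student.describe() changes GradStudent as well.
class GradStudent extends Student {
    private final String thesisTitle;

    GradStudent(String name, String thesisTitle) {
        super(name);
        this.thesisTitle = thesisTitle;
    }

    String thesisTitle() {
        return thesisTitle;
    }
}
```

Nothing in GradStudent mentions describe(), yet every GradStudent has it. That is both the convenience and the coupling described above.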

Composition

A class using composition can use another class's methods. You have more ownership of your class than with inheritance, because the relationship is "has-a" rather than "is-a". Your class "has-a" class that it can borrow methods from. You still need to trust the class whose methods you use, because a change in them can still affect you, but you are not as reliant on that class as you are with inheritance. In terms of time, you get the advantages of reuse in the long run, but will spend a bit more time up front.

In the example, Grad Student can use all of Student's methods without relying on it as much as with inheritance. The classes are less tightly coupled than with inheritance. You are still liable to be affected by changes made to the classes your class is composed of.
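
A rough Java sketch of the "has-a" version follows. As before, the member names are hypothetical and chosen only to illustrate delegation.

```java
// Hypothetical class; its members are assumptions.
class Student {
    private final String name;

    Student(String name) {
        this.name = name;
    }

    String name() {
        return name;
    }

    String describe() {
        return "Student " + name;
    }
}

// GradStudent "has-a" Student and decides which of its methods to borrow.
class GradStudent {
    private final Student student;
    private final String thesisTitle;

    GradStudent(String name, String thesisTitle) {
        this.student = new Student(name);
        this.thesisTitle = thesisTitle;
    }

    // Delegation: the work is done by the contained Student.
    String name() {
        return student.name();
    }

    // GradStudent owns this method, but still depends on Student.describe().
    String describe() {
        return "Grad " + student.describe() + ", thesis: " + thesisTitle;
    }
}
```

A new method added to Student does not automatically appear on GradStudent, which is the extra ownership. A change to Student.describe() still shows up in GradStudent.describe(), which is the remaining need for trust.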

Interfaces

An interface defines a set of methods but does not implement them. This essentially calls for better specifications for your code, and in Java the compiler checks each class that implements the interface to make sure it provides every method the interface declares. Interfaces give you ownership of your code. Since you have ownership, there is less need to worry about trust. If your interface has been properly defined, you should have no trouble trusting it. Plus, Java will not compile a concrete class that claims to implement the interface but leaves out one of its methods. This takes more time up front but saves you time later.

In the example, both Student and Grad Student use the Student interface. They both must implement the methods described in the Student interface. This may be a sort of duplication, but it is kept under control by the interface. It allows programmers who have never talked to each other to have their programs work together and follow a similar design. Their systems can use Student and Grad Student almost interchangeably through the interface. If ownership is more important to you, this might be a good technique to use.
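
The sketch below shows the idea in Java. Because a class and an interface cannot share a name in the same package, the interface is called StudentInterface here. That name, the Registrar class, the methods, and the credit figures are all assumptions made for illustration.

```java
// Hypothetical interface: declares methods without implementing them.
interface StudentInterface {
    String name();
    int creditsRequired();
}

class Student implements StudentInterface {
    private final String name;

    Student(String name) {
        this.name = name;
    }

    public String name() {
        return name;
    }

    // Made-up figure, for illustration only.
    public int creditsRequired() {
        return 120;
    }
}

class GradStudent implements StudentInterface {
    private final String name;

    GradStudent(String name) {
        this.name = name;
    }

    public String name() {
        return name;
    }

    // Made-up figure, for illustration only.
    public int creditsRequired() {
        return 30;
    }
}

// Code written against the interface works with either class.
class Registrar {
    static String summary(StudentInterface s) {
        return s.name() + " needs " + s.creditsRequired() + " credits";
    }
}
```

Each class owns its own implementation, and deleting creditsRequired() from either one would stop that class from compiling.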
A PDF file contains the UML diagrams for all four reuse and duplication examples.

Class Exercise
