CPE 103 Programming Project

URL Cache Simulation


The purpose of this assignment is to write a program that simulates the URL cache a web browser maintains as a local store of frequently accessed web pages. By keeping a local cache of web pages, the browser can avoid unnecessary and expensive network page retrievals.

When the cache receives a request for a web page, it first looks in its local storage (which we will simulate with a Map) to see if the cache already has a copy of the page. If so, it returns that copy; otherwise, it requests the URL from the internet and caches the page that is retrieved.
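The lookup described above can be sketched roughly as follows. This is only an illustration, not the required Cache class: the class name LookupSketch and the fetchFromNetwork() helper are hypothetical stand-ins (the real project fetches pages through the instructor-provided components), and the sketch ignores the cache size limit entirely.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: a cache lookup with no size limit. */
class LookupSketch {
    private final Map<String, String> store = new HashMap<String, String>();
    private int requestCount = 0;
    private int fetchCount = 0;

    /** Hypothetical stand-in for an expensive network retrieval. */
    private String fetchFromNetwork(String url) {
        fetchCount++;
        return "contents of " + url;
    }

    /** Returns the cached copy if present; otherwise fetches and caches it. */
    String request(String url) {
        requestCount++;
        String page = store.get(url);
        if (page == null) {
            page = fetchFromNetwork(url);
            store.put(url, page);
        }
        return page;
    }

    int getRequestCount() { return requestCount; }
    int getFetchCount() { return fetchCount; }

    public static void main(String[] args) {
        LookupSketch cache = new LookupSketch();
        cache.request("http://www.calpoly.edu");
        cache.request("http://www.calpoly.edu"); // served from the map, no second fetch
        System.out.println("requests=" + cache.getRequestCount()
                + " fetches=" + cache.getFetchCount()); // prints requests=2 fetches=1
    }
}
```

The point of the sketch is that the request count and the fetch count diverge as soon as a URL is requested a second time.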

The input is a list of URLs (sample). The output is a report of how often each page was requested (sample), with a cache size of 5.

Class Diagram

[Figure: class diagram]

There are four classes you must implement in this application:

Here are the .class files for the instructor components (URLCacheApp and FakeURL) in a JAR file: FakeURL.jar
DO NOT UNZIP THIS FILE. Add it as a custom library to your BlueJ project. It will be considered cheating if you unzip the jar file, submit the jar file to Web-CAT, or submit the unzipped classes with your project.

Implementation Constraints

You must implement the Cache using a HashMap.

Here are the skeletons for the source code files.

Testing

Web-CAT will run your own unit tests and expects 100% statement coverage. Your components will also be tested with instructor reference tests, which are JUnit tests of your classes. There will be a performance test of over 10,000 items to make sure previously fetched pages are returned from the cache and don't cause redundant fetches from the internet. The time limit is 10 seconds.

Here's a simple test case:
    public void testSimpleReport() throws java.io.IOException
    {
        try
        {
            String dataIn = "http://www.calpoly.edu";
            StringReader rdr = new StringReader(dataIn);
            StringWriter sw = new StringWriter();
            CacheSimulator sim = new CacheSimulator(10);
            sim.simulate(rdr);
            sim.makeReport(sw);
            Scanner scan = new Scanner(sw.toString()); // allowed in Java 7
            assertEquals("Wrong count", 1, scan.nextInt());
        }
        catch (java.util.NoSuchElementException ex)
        {
            fail("Not enough lines in report");
        }
    }

The trickiest test case is when the cache fills up.  Here is a simple test to demonstrate the desired behavior for a simple set of data.
    
    public void testCacheLimit()
    {
        // Create a cache with room for 3 items
        Cache aCache = new Cache(3);
        
        final String url1 = "http://primes.utm.edu/lists/small/1000.txt";
        final String url2 = "http://history.eserver.org/gettysburg-address.txt";
        final String url3 = "http://users.csc.calpoly.edu/~jdalbey/103/Demo/CalcDirection.java";
        final String url4 = "http://www.census.gov/tiger/tms/gazetteer/zips.txt";
        
        // Request these items twice
        for (int loop=1; loop<=2;loop++)
        {
            aCache.requestPage(url1);
            aCache.requestPage(url2);
        }
        // Request this item once (it's the low frequency item)
        aCache.requestPage(url3);
        
        aCache.requestPage(url4);  // should force removal of url3
        
        // should increase request count but not fetch count
        aCache.requestPage(url1);
        aCache.requestPage(url1);  

        assertEquals(8,aCache.getRequestCount());
        assertEquals(4,aCache.getUrlFetchCount());
        ArrayList<WebPage> list = aCache.getPageList();
        assertEquals(3,list.size());    
        assertEquals(url4,list.get(list.size()-1).getUrl());    
        assertEquals(1,list.get(list.size()-1).getFrequency());
    }
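
The test above suggests that when the cache is full, the page with the lowest request frequency is the one removed. That reading is an inference from the test comments; the javadocs in the skeletons are the authority on the actual rule, including how ties are broken. Under that assumption, picking the entry to remove from a map of request counts might look like this sketch (the class and method names are hypothetical):

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: choose the least frequently requested key. */
class EvictionSketch {
    /** Returns the key with the smallest count, or null if the map is empty. */
    static String leastFrequent(Map<String, Integer> counts) {
        String victim = null;
        int lowest = Integer.MAX_VALUE;
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            // Ties keep whichever key was seen first; HashMap order is unspecified.
            if (victim == null || entry.getValue() < lowest) {
                lowest = entry.getValue();
                victim = entry.getKey();
            }
        }
        return victim;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        counts.put("url1", 2);
        counts.put("url2", 2);
        counts.put("url3", 1);
        String victim = leastFrequent(counts);
        counts.remove(victim);   // make room for the new page
        counts.put("url4", 1);
        System.out.println("evicted " + victim + ", size " + counts.size());
        // prints: evicted url3, size 3
    }
}
```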



Handing in Your Source Electronically
GRADING

Your score on Web-CAT contributes to your project score as follows:

60%: Results from Running Your Tests (100) and Estimate of Problem Coverage (100). Both of these must be 100 or your project score is zero.
20%: Code Coverage from Your Tests (0-100)
10%: Style/Coding (0-10)
10%: Design/Readability (inspection by grader)




FAQ

Q: I'm getting this error from my requestPage() method: "unreported Exception ... must be caught or declared to be thrown."
A: Check the javadocs for requestPage(). It is not declared to throw any exceptions, so you must catch any checked exceptions inside the method.
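
As a rough illustration of that answer, a method with no throws clause has to wrap any call that declares a checked exception in try/catch. The fetch() method below is a hypothetical stand-in, not the FakeURL API, and returning null on failure is an assumption; check the javadocs for what requestPage() should actually do in that case.

```java
import java.io.IOException;

/** Illustrative only: catching a checked exception instead of declaring it. */
class CatchSketch {
    /** Hypothetical stand-in for a call that declares IOException. */
    static String fetch(String url) throws IOException {
        if (url.isEmpty()) {
            throw new IOException("cannot fetch an empty URL");
        }
        return "contents of " + url;
    }

    /** No throws clause here, so the IOException must be handled inside. */
    static String requestPage(String url) {
        try {
            return fetch(url);
        } catch (IOException ex) {
            return null; // assumption: the failure value is up to the real spec
        }
    }

    public static void main(String[] args) {
        System.out.println(requestPage("http://www.calpoly.edu"));
        System.out.println(requestPage("")); // prints null
    }
}
```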

Q: Web-CAT is saying "No line found". But my scanner seems to be working okay.
A: That error could be from the instructor test. I call your CacheReport with a file, so it should send the output to a file, and then I try to read that file to verify the output. So something might be wrong with your output format. Some students have found the problem was that they were using the low-level write() method. I suggest wrapping the Writer in a PrintWriter so you can use higher-level methods like println().
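
A minimal sketch of the PrintWriter suggestion follows. The helper name writeLine() and the "count, space, URL" line layout are assumptions made for the example; use the report format given in the project specification.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.io.Writer;

/** Illustrative only: wrapping a Writer so println() is available. */
class ReportLineSketch {
    /** Hypothetical helper that writes one report line to any Writer. */
    static void writeLine(Writer out, int count, String url) {
        PrintWriter pw = new PrintWriter(out);
        pw.println(count + " " + url); // println() adds the line terminator
        pw.flush(); // push buffered text through to the underlying Writer
    }

    public static void main(String[] args) {
        StringWriter sw = new StringWriter();
        writeLine(sw, 1, "http://www.calpoly.edu");
        System.out.print(sw.toString()); // prints: 1 http://www.calpoly.edu
    }
}
```

Flushing matters when the underlying Writer is a file: text left in the buffer never reaches the file, and a Scanner reading it afterwards can report "No line found".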

Q: What do you mean by "column 1?"
A: Sorry this was so confusing; I simply mean the first column of the line.

Q: How can I force an IOException to be thrown, so I can get 100% code coverage?
A: There is now a "secret" URL, http://forceioexception, that, when provided as input, will cause FakeURL to throw an IOException.