CPE 103 Programming Project
URL Cache Simulation
The purpose of this assignment is to write a program that simulates a URL cache that a web browser maintains to keep a local store of frequently accessed web pages. By keeping a local cache of web pages, the browser can avoid unnecessary and expensive network page retrievals.
When the cache receives a request for a web page, it first looks in its local storage (which we will simulate with a Map) to see if the cache already has a copy of the page. If so, it returns that copy; otherwise, it issues a request for the URL from the internet and caches the value that is retrieved.
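The lookup described above can be sketched roughly as follows. This is only an illustration, not the required Cache class: LookupSketch, request(), and the fetcher function that stands in for the network are all hypothetical names, and pages are simplified to plain Strings.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration class; it is not one of the four required classes.
class LookupSketch {
    // Local storage, simulated with a Map from URL to page contents.
    private final Map<String, String> store = new HashMap<>();
    // Assumption: a function stands in for the real network fetch.
    private final Function<String, String> fetcher;
    private int fetchCount = 0;

    LookupSketch(Function<String, String> fetcher) {
        this.fetcher = fetcher;
    }

    String request(String url) {
        String page = store.get(url);
        if (page == null) {
            // Miss: fetch from the "internet" and keep the result.
            page = fetcher.apply(url);
            fetchCount++;
            store.put(url, page);
        }
        // Hit or freshly cached: either way, return the local copy.
        return page;
    }

    int getFetchCount() {
        return fetchCount;
    }

    public static void main(String[] args) {
        LookupSketch cache = new LookupSketch(url -> "contents of " + url);
        cache.request("http://www.calpoly.edu");
        cache.request("http://www.calpoly.edu"); // served from the Map
        System.out.println(cache.getFetchCount()); // prints 1
    }
}
```

The second request for the same URL never reaches the fetcher, which is the behavior the performance test is meant to confirm.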
The input is a list of URLs (sample).
The output is a report of how often each page was requested (sample), with a
cache size of 5.
Class Diagram
There are four classes you must implement in this application:
- Cache represents the browser URL cache. Implement this using HashMap.
- WebPage represents a page fetched from the Internet.
- WebPageComparator provides a comparator for sorting WebPages.
- CacheSimulator runs the simulation from a set of input URLs and creates a report.
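This handout does not state which order WebPageComparator should impose. As one possibility, the sketch below assumes pages are sorted by descending request frequency, with the URL as a tie-breaker; that assumption is at least consistent with testCacheLimit, which expects the once-requested page at the end of the list. PageStub and FrequencyOrderSketch are hypothetical stand-ins, not the skeleton classes; only the getUrl() and getFrequency() accessor names are taken from the test code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for WebPage with just the two accessors used here.
class PageStub {
    private final String url;
    private final int frequency;

    PageStub(String url, int frequency) {
        this.url = url;
        this.frequency = frequency;
    }

    String getUrl() {
        return url;
    }

    int getFrequency() {
        return frequency;
    }
}

// Assumed ordering: most requested first, ties broken alphabetically by URL.
class FrequencyOrderSketch implements Comparator<PageStub> {
    @Override
    public int compare(PageStub a, PageStub b) {
        // Arguments are swapped so that higher frequencies sort first.
        int byFrequency = Integer.compare(b.getFrequency(), a.getFrequency());
        if (byFrequency != 0) {
            return byFrequency;
        }
        return a.getUrl().compareTo(b.getUrl());
    }

    public static void main(String[] args) {
        List<PageStub> pages = new ArrayList<>();
        pages.add(new PageStub("http://b.example", 1));
        pages.add(new PageStub("http://a.example", 3));
        pages.sort(new FrequencyOrderSketch());
        System.out.println(pages.get(0).getUrl()); // prints http://a.example
    }
}
```

Check the skeleton javadocs for the ordering actually required before relying on this one.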
Here are the .class files for the instructor components (URLCacheApp and FakeURL) in a JAR file:
FakeURL.jar
DO NOT UNZIP THIS FILE. Add it as a custom library to your BlueJ project. It will be considered cheating if you unzip the jar file, submit the jar file to Web-CAT, or submit the unzipped classes with your project.
Implementation Constraints
You must implement the Cache using a HashMap.
Here are the skeletons for the source code files.
Testing
Web-CAT will run your own unit tests and expect 100% statement coverage.
Your components will also be tested with instructor reference tests. The reference tests are JUnit tests of your classes.
There will be a performance test of over 10000 items to make sure previously fetched pages are returned from the cache and don't cause redundant fetches from the internet. There's a time limit of 10 seconds.
Here's a simple test case:
public void testSimpleReport() throws java.io.IOException
{
    try
    {
        // A single URL is the entire input
        String dataIn = "http://www.calpoly.edu";
        StringReader rdr = new StringReader(dataIn);
        StringWriter sw = new StringWriter();
        CacheSimulator sim = new CacheSimulator(10);
        sim.simulate(rdr);
        sim.makeReport(sw);
        // The report should begin with the request count for that URL
        Scanner scan = new Scanner(sw.toString()); // allowed in Java 7
        assertEquals("Wrong count", 1, scan.nextInt());
    }
    catch (java.util.NoSuchElementException ex)
    {
        fail("Not enough lines in report");
    }
}
The trickiest test case is when the cache fills up. Here is a simple test to demonstrate the desired behavior for a simple set of data.
public void testCacheLimit()
{
    // Create a cache with room for 3 items
    Cache aCache = new Cache(3);
    final String url1 = "http://primes.utm.edu/lists/small/1000.txt";
    final String url2 = "http://history.eserver.org/gettysburg-address.txt";
    final String url3 = "http://users.csc.calpoly.edu/~jdalbey/103/Demo/CalcDirection.java";
    final String url4 = "http://www.census.gov/tiger/tms/gazetteer/zips.txt";
    // Request these items twice
    for (int loop = 1; loop <= 2; loop++)
    {
        aCache.requestPage(url1);
        aCache.requestPage(url2);
    }
    // Request this item once (it's the low frequency item)
    aCache.requestPage(url3);
    aCache.requestPage(url4); // should force removal of url3
    // should increase request count but not fetch count
    aCache.requestPage(url1);
    aCache.requestPage(url1);
    assertEquals(8, aCache.getRequestCount());
    assertEquals(4, aCache.getUrlFetchCount());
    ArrayList<WebPage> list = aCache.getPageList();
    assertEquals(3, list.size());
    assertEquals(url4, list.get(list.size() - 1).getUrl());
    assertEquals(1, list.get(list.size() - 1).getFrequency());
}
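The test above implies that when a full cache receives a new URL, the least frequently requested entry is removed to make room. A minimal sketch of that idea follows, tracking only request counts per URL rather than real pages. EvictionSketch and its method names are hypothetical, the capacity is assumed to be at least 1, and how ties between equally infrequent entries are broken is left unspecified here (HashMap iteration order decides).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of least-frequently-used eviction; not the Cache skeleton.
class EvictionSketch {
    private final int capacity;
    // Maps each cached URL to the number of times it has been requested.
    private final Map<String, Integer> frequency = new HashMap<>();
    private int requestCount = 0;
    private int fetchCount = 0;

    EvictionSketch(int capacity) {
        this.capacity = capacity;
    }

    void request(String url) {
        requestCount++;
        Integer seen = frequency.get(url);
        if (seen != null) {
            // Hit: count the request, no fetch needed.
            frequency.put(url, seen + 1);
            return;
        }
        // Miss: this would be a network fetch in the real cache.
        fetchCount++;
        if (frequency.size() >= capacity) {
            evictLeastFrequent();
        }
        frequency.put(url, 1);
    }

    // Linear scan for the entry with the smallest request count.
    private void evictLeastFrequent() {
        String victim = null;
        int lowest = Integer.MAX_VALUE;
        for (Map.Entry<String, Integer> entry : frequency.entrySet()) {
            if (entry.getValue() < lowest) {
                lowest = entry.getValue();
                victim = entry.getKey();
            }
        }
        if (victim != null) {
            frequency.remove(victim);
        }
    }

    boolean contains(String url) {
        return frequency.containsKey(url);
    }

    int size() {
        return frequency.size();
    }

    int getRequestCount() {
        return requestCount;
    }

    int getFetchCount() {
        return fetchCount;
    }
}
```

Replaying the request sequence from testCacheLimit against this sketch yields 8 requests, 4 fetches, and 3 cached entries, with url3 evicted and url4 present.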
Handing in Your Source Electronically
- Submit a zip file containing your source code and unit tests to Web-CAT. You do not need to submit FakeURL.jar.
- There is a limit of 20 submissions.
GRADING
Your score on Web-CAT contributes to your project score as follows:
Results from Running Your Tests (100) and Estimate of Problem Coverage (100). Both of these must be 100 or your project score is zero. | 60%
Code Coverage from Your Tests (0-100) | 20%
Style/Coding (0-10) | 10%
Design/Readability (inspection by grader) | 10%
FAQ
Q: I'm getting this error from my requestPage() method: "unreported Exception ... must be caught or declared to be thrown."
A: Check the javadocs for requestPage(). It doesn't throw any exceptions, so you should catch them inside the method.
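As a rough illustration of that answer, the sketch below shows a method with no throws clause handling a checked IOException internally. CatchSketch, fetch(), and requestOrNull() are hypothetical names; fetch() only imitates a failing network call and is not the FakeURL API, and returning null on failure is an assumption made for the example, not the behavior the skeleton requires.

```java
import java.io.IOException;

// Hypothetical illustration; not part of the assignment skeleton.
class CatchSketch {
    // Stand-in for a network call that may fail with a checked exception.
    static String fetch(String url) throws IOException {
        if (url.equals("http://forceioexception")) {
            throw new IOException("simulated failure for " + url);
        }
        return "contents of " + url;
    }

    // No throws clause, so the checked exception must be handled here.
    static String requestOrNull(String url) {
        try {
            return fetch(url);
        } catch (IOException ex) {
            // Assumption: signal failure with null; a real design may differ.
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(requestOrNull("http://www.calpoly.edu"));
        System.out.println(requestOrNull("http://forceioexception")); // prints null
    }
}
```

Removing the try/catch from requestOrNull() reproduces the compiler error quoted in the question.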
Q: Web-CAT is saying "No line found". But my scanner seems to be working okay.
A: That error could be from the instructor test. I call your CacheReport with a file, so it should send the output to a file, and then I try to read that file to verify the output. So something might be wrong with your output format. Some students have found the problem was that they were using the low-level write() method. I suggest wrapping the Writer in a PrintWriter so you can use higher-level methods like println().
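A small sketch of the PrintWriter suggestion follows. One way the low-level method can go wrong is that Writer.write(int) writes the single character with that code rather than the number's decimal digits, and nothing adds line terminators for you. ReportWriterSketch and writeCounts() are hypothetical names, and the one-count-per-line layout is only an assumed format for illustration, not the required report format.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Scanner;

// Hypothetical illustration; the real report format is defined by the assignment.
class ReportWriterSketch {
    // Wraps any Writer so that println() can be used for line-oriented output.
    static void writeCounts(Writer out, int[] counts) {
        PrintWriter printer = new PrintWriter(out);
        for (int count : counts) {
            printer.println(count); // writes the digits followed by a line break
        }
        // Flush so the text reaches the underlying Writer before it is read.
        printer.flush();
    }

    public static void main(String[] args) {
        StringWriter sw = new StringWriter();
        writeCounts(sw, new int[] {1, 42});
        Scanner scan = new Scanner(sw.toString());
        System.out.println(scan.nextInt()); // prints 1
        System.out.println(scan.nextInt()); // prints 42
    }
}
```

Because the method accepts a plain Writer, the same code works whether the caller passes a StringWriter in a unit test or a FileWriter for a real file.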
Q: What do you mean by "column 1"?
A: Sorry this was so confusing; I simply mean the first column of the line.
Q: How can I force an IOException to be thrown, so I can get 100% code coverage?
A: There is now a "secret" URL, http://forceioexception, that, when provided as input, will cause FakeURL to throw an IOException.