Instructor: Alexander Dekhtyar, dekhtyar@calpoly.edu, 14-210
Office Hours:
| Day | Time | Who | Where |
|---|---|---|---|
| Wednesday | 8:30am - 10:00am | Alex | 14-215 |
| Thursday | 1:10pm - 2:00pm | Alex | 14-215 |
| Friday | 8:30am - 10:00am | Alex | 14-215 |
Additional appointments: send email.
| Syllabus | Postscript |
| Lab 1, Part 1 | Due: January 13 | JSON Generation | Postscript | Lab Data | [January 10, 2017] | |
| Lab 1, Part 2 | Due: January 20 | JSON Generation | Postscript | Lab Data | [January 13, 2017] | |
| Lab 2 | Due: January 25 | MongoDB find() queries | Postscript | [January 23, 2017] | ||
| Lab 4 | Due: February 1 | MongoDB Aggregate Pipelines | Postscript | [January 27, 2017] | ||
| Lab 5 | Due: February 10 | MongoDB application | Postscript | Lab Info | [February 4, 2017] | |
| Lab 6 | Due: February 17 | Getting Hadoop to Work | Postscript | Lab Info | [February 15, 2017] | |
| Lab 7 | Due: February 24 | Simple Hadoop Programs | Postscript | Lab Info | [February 17, 2017] | |
| Lab 8 | Due: March 3 | Medium-difficulty Hadoop Programs | Postscript | [February 27, 2017] | ||
| Lab 9 | Due: March 19 | Real Data Hadoop Programs | Postscript | Lab Info | [March 5, 2017] |
| January 30, February 1 | mongo-queries.txt | st collection | prof collection |
| org.apache.hadoop Version 2.7 javadocs | API | |
| Bash local variable settings | bashrc-commands.txt | Paste into the bottom of your .bashrc file |
| MapReduce (Hadoop v. 2.7) tutorial | HTML |
| Hadoop program template | template.java | |
| Our first Hadoop program | switchMR.java | |
| Data file for switchMR.java | data | |
| Input Format Tests | ||
|---|---|---|
| TextInputFormat test | FITest.java | |
| KeyValueTextInputFormat test | KeyValueTest.java | |
| FixedRecordInputFormat test | FixedRecordTest.java | |
| NLineInputFormat test | NLTest.java | |
| Multiple chained MapReduce jobs | filter.java | words (input file) |
| Multiple Input Files/Multiple Mappers | multiInMR.java | users.in, messages.in (input files) |
| Use of JSON | ||
| Using JSON objects | JsonJob.java | json.in, simple.json (input files) |
| Multiline JSON | MultilineJsonJob.java | test.json (input file) |
| Multiline JSON Input Format | json-mapreduce-1.0.jar | |
| Advanced Hadoop Features | ||
| Finding Max | FindMax.java | numbers.txt |
| Map-Side Join with Distributed Cache | dCacheDemo.java | users.in, messages.in (input files) |
| Combiner Test: graph scan with no Combiner | TwitterTest.java | |
| Combiner Test: graph scan with Combiner | CombinerTest.java |
| Lecture 1 | What's in this class? | Postscript | [January 4, 2016] | |
| Lecture 2 | Motivating Examples | Postscript | [January 4, 2016] | |
| Lecture 3 | Maps, Dictionaries, Key-Value Pairs | Postscript | [January 12, 2016] | |
| Lecture 3-1 | JSON | Postscript | [January 10, 2017] | |
| Lecture 4 | MongoDB Basics | Postscript | [January 18, 2016] | |
| Lecture 5 | MongoDB Java Connectivity | Postscript | [January 28, 2016] | |
| Lecture 6 | MongoDB Aggregation Pipeline | Postscript | [January 27, 2017] | |
| Lecture 7 | MongoDB Aggregation Pipeline: Part 2 | Postscript | [February 3, 2017] | |
| Lecture 8 | Overview of Distributed Systems | Postscript | [February 4, 2017] | |
| Lecture 9 | MapReduce | Postscript | [January 28, 2016] | |
| Lecture 10 | Hadoop on CSLVM cluster | Postscript | [February 17, 2016] | |
| Lecture 11 | HDFS commands primer | Postscript | [February 13, 2017] | |
| Lecture 12 | Hadoop Input Data Formats | Postscript | [February 21, 2017] | |
| Lecture 14 | Matrix Multiplication in MapReduce | Postscript | [March 5, 2017] | |
| Lecture 15 | MapReduce for Top K Problem | Postscript | [March 10, 2017] |
| JSON home page | json.org |
| JSON specification | ECMA-404: The JSON Data Interchange Format (PDF) |
| org.json Javadocs | Javadoc |