Instructor: Alexander Dekhtyar, dekhtyar@calpoly.edu, 14-210
Office Hours:
| Day | Time | Who | Where |
|---|---|---|---|
| Monday | 9:10am - 10:00am | Alex | 14-212 |
| Tuesday | 9:10am - 10:00am | Alex | 14-212 |
| Wednesday | 9:10am - 11:00am | Alex | 14-212 |
Additional appointments: send email.
| DATA 301 Survey | 
| Syllabus | Postscript | ||
| Dennis Sun's Textbook | Github | ||
| Jupyter Labs Server | https://dev2.csc.calpoly.edu:5000/ | 
| Lab 1 | Due: April 4 | Python quiz | Instructions | Python Notebook | [March 29, 2022] | |
| Lab 2 | Due: April 7 | Python Data Frames | Python Notebook | [April 7, 2022] | ||
| Lab 3 | Due: April 12 | Work with Data Frame columns | Python Notebook | [April 7, 2022] | ||
| Lab 4 | Due: April 14 | Visualizations and Variable Transformations | Python Notebook 1, Python Notebook 2 | [April 12, 2022] |
| Lab 5 | Due: April 19 | Visualizations and Variable Transformations | Python Notebook 1, Python Notebook 2 | [April 14, 2022] |
| Lab 6 | Due: April 21 | Data Cubes and Contingency Tables | Python Notebook 1, Python Notebook 2 | [April 19, 2022] |
| Lab 7 | Due: April 26 | Analysis of Numeric Variables | Python Notebook 1, Python Notebook 2 | [April 26, 2022] |
| Mini Project 1 | Due: May | Data Preparation and Warehousing | Postscript | Superstore.csv | [April 26, 2022] | |
| Lab 8 | Due: April 28 | Analysis of Categorical Variables | Python Notebook | [April 28, 2022] | ||
| Lab 9 | Due: May 3 | Linear Regression | Python Notebook | [April 28, 2022] | ||
| Lab 10 | Due: May 5 | K-Nearest Neighbors | Python Notebook | [May 5, 2022] | ||
| Lab 11 | Due: May 10 | Evaluation of Regression Models | Python Notebook 1, Python Notebook 2 | [May 5, 2022] |
| Lab 12 | Due: May 17 | Hyperparameter Tuning and Model Selection | Python Notebook 1 | [May 12, 2022] | ||
| Lab 13 | Due: May 19 | Classification | Python Notebook 1, Python Notebook 2 | [May 19, 2022] |
| Lab 14 | Due: May 26 | Clustering | Python Notebook 1 | [May 19, 2022] | ||
| Lab 15 | Due: May 28 | Vectorization, TF-IDF | Python Notebook 1, Python Notebook 2 | [May 23, 2022] |
| Lab 16 | Due: May 31 | Visualization | Python Notebook 1, Python Notebook 2, Python Notebook 3 | [May 26, 2022] |
| Mini Project 2 | Due: June | Data Science in action | Postscript | [May 26, 2022] | 
| DoW | Date | Lecture | Notebooks | Lab | Materials | Other | 
|---|---|---|---|---|---|---|
| Tuesday | March 29 | Syllabus | | Lab quiz (Lab 1) | Lecture 1, Lecture 2 | |
| Tuesday | April 5 | Data Science Process | Chapter 1.1, titanic.csv, tips.csv | Complete Chapter 1.1 exercises | Lecture 1, Lecture 2 | Original Chapter 1.1 notebook |
| Thursday | April 7 | Data Acquisition | Chapter 1.2, titanic.csv, tips.csv | Complete Chapter 1.2 exercises | Lecture 3 | Original Chapter 1.2 notebook |
| Tuesday | April 12 | Simple Visualization; Data Manipulation/Feature Engineering | Chapter 1.3, Chapter 1.4, titanic.csv, tips.csv, AmesHousing.txt | Complete Chapter 1.3 and Chapter 1.4 exercises | Tabular Data | Original Chapter 1.3 notebook, Chapter 1.4 notebook |
| Thursday | April 14 | Grouping and Aggregation | Chapter 2.1, Chapter 2.2, titanic.csv, tips.csv | Complete Chapter 2.1 and Chapter 2.2 exercises | | Original Chapter 2.1 and Chapter 2.2 notebooks |
| Tuesday | April 19 | Data Cubes and Analysis of Variables | Chapter 2.3, Chapter 3.2, titanic.csv, tips.csv | Complete Chapter 2.3 and Chapter 3.2 | | Original Chapter 2.3 notebook, Chapter 3.2 notebook |
| Thursday | April 21 | Analysis of Numeric Variables | Chapter 3.1, Chapter 4.1, titanic.csv, tips.csv, reds.csv | Complete Chapter 3.1 and Chapter 4.1 exercises | | Original Chapter 3.1 and Chapter 4.1 notebooks |
| Tuesday | April 26 | Analysis of Categorical Variables | Chapter 4.2, titanic.csv, tips.csv | Complete Chapter 4.2; Mini-project 1 | | Original Chapter 4.1 notebook |
| Thursday | April 28 | Linear Regression | Chapter 5.1, AmesHousing.txt, reds.csv | Complete Chapter 5.1 exercises | | |
| Tuesday | May 3 | KNN Regression | Chapter 5.2, AmesHousing.txt, tips.csv | Complete Chapter 5.2 exercises | | |
| Thursday | May 5 | Evaluation of Regression Methods | Chapter 5.3, Chapter 5.4, AmesHousing.txt, tips.csv | Complete Chapter 5.3 and Chapter 5.4 exercises | | |
| Tuesday | May 10 | Lab Test | | | | |
| Thursday | May 12 | Hyperparameter Tuning, Model Selection, Ensembles | Chapter 5.5, Chapter 5.5, AmesHousing.txt | Complete Chapter 5.5 | | |
| Tuesday | May 17 | KNN Classification | Chapter 6.1, Chapter 6.1, reds.csv, whites.csv, titanic.csv | Complete Chapter 6.1 and Chapter 6.2 exercises | Lecture Notes | |
| Thursday | May 19 | Classification: KNN Classifier | Chapter 7.1, iris.csv, two_moons.csv, satellite.csv, reds.csv, whites.csv, titanic.csv | Complete Chapter 7.1 exercises | Lecture Notes | |
| Tuesday | May 23 | Work with Text Collections, Vectorization | Chapter 13.1, Chapter 13.2, SMSSpamCollection.txt, profiles.csv, greeneggsandham.txt | Complete Chapter 13.1 and Chapter 13.2 exercises | Lecture Notes | |
| Thursday | May 26 | Visualization of Data | Chapter 10.1, Chapter 10.2, Chapter 10.3, titanic.csv, AmesHousing.txt | Complete Chapter 10.1, Chapter 10.2, and Chapter 10.3 exercises | | |
| Lecture 1 | What is Data Science? | Postscript | [March 28, 2016] | |
| Lecture 2 | Data Science Process | Postscript | [April 3, 2016] | |
| Lecture 3 | Data Acquisition | Postscript | [April 3, 2016] | |
| Lecture 4 | Tabular Data | Postscript | [April 3, 2016] | |
| Lecture 5 | Textual Data | Postscript | [April 5, 2016] | |
| Lecture 6 | XML Data | Postscript | [April 11, 2016] | |
| Lecture 7 | Document Object Model (DOM) | Postscript | [April 11, 2016] | |
| Lecture 8 | HTML and Beautiful Soup | Postscript | [April 20, 2016] | |
| Lecture 9 | Maps and JSON | Postscript | [April 20, 2016] | |
| Lecture 14 | Recommendation Predictions | Postscript | [May 11, 2016] | |
| Lecture 15 | Supervised Learning (Classification) | Postscript | [May 18, 2016] | |
| Lecture 16 | Unsupervised Learning (Clustering) | Postscript | [May 23, 2016] |