Instructor: Alexander Dekhtyar, dekhtyar@calpoly.edu, 14-210
Office Hours:
Day | Time | Who | Where |
Monday | 9:10am - 10:00am | Alex | 14-212 |
Tuesday | 9:10am - 10:00am | Alex | 14-212 |
Wednesday | 9:10am - 11:00am | Alex | 14-212 |
Additional appointments: send email.
DATA 301 Survey |
Syllabus | Postscript | ||
Dennis Sun's Textbook | Github | ||
Jupyter Labs Server | https://dev2.csc.calpoly.edu:5000/ |
Lab 1 | Due: April 4 | Python quiz | Instructions | Python Notebook | [March 29, 2022] | |
Lab 2 | Due: April 7 | Python Data Frames | Python Notebook | [April 7, 2022] | ||
Lab 3 | Due: April 12 | Work with Data Frame columns | Python Notebook | [April 7, 2022] | ||
Lab 4 | Due: April 14 | Visualizations and Variable Transformations | Python Notebook 1, Python Notebook 2 | [April 12, 2022] | ||
Lab 5 | Due: April 19 | Visualizations and Variable Transformations | Python Notebook 1, Python Notebook 2 | [April 14, 2022] | ||
Lab 6 | Due: April 21 | Data Cubes and Contingency Tables | Python Notebook 1, Python Notebook 2 | [April 19, 2022] | ||
Lab 7 | Due: April 26 | Analysis of Numeric Variables | Python Notebook 1, Python Notebook 2 | [April 26, 2022] | ||
Mini Project 1 | Due: May | Data Preparation and Warehousing | Postscript | Superstore.csv | [April 26, 2022] | |
Lab 8 | Due: April 28 | Analysis of Categorical Variables | Python Notebook | [April 28, 2022] | ||
Lab 9 | Due: May 3 | Linear Regression | Python Notebook | [April 28, 2022] | ||
Lab 10 | Due: May 5 | K-Nearest Neighbors | Python Notebook | [May 5, 2022] | ||
Lab 11 | Due: May 10 | Evaluation of Regression Models | Python Notebook 1, Python Notebook 2 | [May 5, 2022] | ||
Lab 12 | Due: May 17 | Hyperparameter Tuning and Model Selection | Python Notebook 1 | [May 12, 2022] | ||
Lab 13 | Due: May 19 | Classification | Python Notebook 1, Python Notebook 2 | [May 19, 2022] | ||
Lab 14 | Due: May 26 | Clustering | Python Notebook 1 | [May 19, 2022] | ||
Lab 15 | Due: May 28 | Vectorization, TF-IDF | Python Notebook 1, Python Notebook 2 | [May 23, 2022] | ||
Lab 16 | Due: May 31 | Visualization | Python Notebook 1, Python Notebook 2, Python Notebook 3 | [May 26, 2022] | ||
Mini Project 2 | Due: June | Data Science in action | Postscript | [May 26, 2022] |
DoW | Date | Lecture | Notebooks | Lab | Materials | Other |
---|---|---|---|---|---|---|
Tuesday | March 29 | Syllabus | Lab quiz (Lab 1) | Lecture 1, Lecture 2 | ||
Tuesday | April 5 | Data Science Process | Chapter 1.1, titanic.csv, tips.csv | Complete Chapter 1.1 exercises | Lecture 1, Lecture 2 | Original Chapter 1.1 notebook |
Thursday | April 7 | Data Acquisition | Chapter 1.2, titanic.csv, tips.csv | Complete Chapter 1.2 exercises | Lecture 3 | Original Chapter 1.2 notebook |
Tuesday | April 12 | Simple Visualization; Data Manipulation/Feature Engineering | Chapter 1.3, Chapter 1.4, titanic.csv, tips.csv, AmesHousing.txt | Complete Chapter 1.3 and Chapter 1.4 exercises | Tabular Data | Original Chapter 1.3 notebook, Chapter 1.4 notebook |
Thursday | April 14 | Grouping and Aggregation | Chapter 2.1, Chapter 2.2, titanic.csv, tips.csv | Complete Chapter 2.1 and Chapter 2.2 exercises | Original Chapter 2.1 and Chapter 2.2 notebooks | |
Tuesday | April 19 | Data Cubes and Analysis of Variables | Chapter 2.3, Chapter 3.2, titanic.csv, tips.csv | Complete Chapter 2.3 and Chapter 3.2 | Original Chapter 2.3 notebook, Chapter 3.2 notebook | |
Thursday | April 21 | Analysis of Numeric Variables | Chapter 3.1, Chapter 4.1, titanic.csv, tips.csv, reds.csv | Complete Chapter 3.1 and Chapter 4.1 exercises | Original Chapter 3.1 and Chapter 4.1 notebooks | |
Tuesday | April 26 | Analysis of Categorical Variables | Chapter 4.2, titanic.csv, tips.csv | Complete Chapter 4.2, Mini-project 1 | Original Chapter 4.1 notebook | |
Thursday | April 28 | Linear Regression | Chapter 5.1, AmesHousing.txt, reds.csv | Complete Chapter 5.1 exercises | ||
Tuesday | May 3 | KNN Regression | Chapter 5.2, AmesHousing.txt, tips.csv | Complete Chapter 5.2 exercises | ||
Thursday | May 5 | Evaluation of Regression Methods | Chapter 5.3, Chapter 5.4, AmesHousing.txt, tips.csv | Complete Chapter 5.3 and Chapter 5.4 exercises | ||
Tuesday | May 10 | Lab Test | ||||
Thursday | May 12 | Hyperparameter Tuning, Model Selection, Ensembles | Chapter 5.5, Chapter 5.5, AmesHousing.txt | Complete Chapter 5.5 | ||
Tuesday | May 17 | KNN Classification | Chapter 6.1, Chapter 6.1, reds.csv, whites.csv, titanic.csv | Complete Chapter 6.1 and Chapter 6.2 exercises | Lecture Notes | |
Thursday | May 19 | Classification: KNN Classifier | Chapter 7.1, iris.csv, two_moons.csv, satellite.csv, reds.csv, whites.csv, titanic.csv | Complete Chapter 7.1 exercises | Lecture Notes | |
Tuesday | May 23 | Work with Text Collections, Vectorization | Chapter 13.1, Chapter 13.2, SMSSpamCollection.txt, profiles.csv, greeneggsandham.txt | Complete Chapter 13.1 and Chapter 13.2 exercises | Lecture Notes | |
Thursday | May 26 | Visualization of Data | Chapter 10.1, Chapter 10.2, Chapter 10.3, titanic.csv, AmesHousing.txt | Complete Chapter 10.1, Chapter 10.2, and Chapter 10.3 exercises |
Lecture 1 | What is Data Science? | Postscript | [March 28, 2016] | |
Lecture 2 | Data Science Process | Postscript | [April 3, 2016] | |
Lecture 3 | Data Acquisition | Postscript | [April 3, 2016] | |
Lecture 4 | Tabular Data | Postscript | [April 3, 2016] | |
Lecture 5 | Textual Data | Postscript | [April 5, 2016] | |
Lecture 6 | XML Data | Postscript | [April 11, 2016] | |
Lecture 7 | Document Object Model (DOM) | Postscript | [April 11, 2016] | |
Lecture 8 | HTML and Beautiful Soup | Postscript | [April 20, 2016] | |
Lecture 9 | Maps and JSON | Postscript | [April 20, 2016] | |
Lecture 14 | Recommendation Predictions | Postscript | [May 11, 2016] | |
Lecture 15 | Supervised Learning (Classification) | Postscript | [May 18, 2016] | |
Lecture 16 | Unsupervised Learning (Clustering) | Postscript | [May 23, 2016] |