Instructor: Alexander Dekhtyar, dekhtyar@calpoly.edu, 14210
Office Hours:
Day  From  To  Who  Where
Monday  9:10am  10:00am  Alex  14212 
Tuesday  9:10am  10:00am  Alex  14212 
Wednesday  9:10am  11:00am  Alex  14212 
Additional appointments: send email.
DATA 301 Survey 
Syllabus  Postscript  
Dennis Sun's Textbook  GitHub
Jupyter Labs Server  https://dev2.csc.calpoly.edu:5000/ 
Lab 1  Due: April 4  Python quiz  Instructions  Python Notebook  [March 29, 2022]  
Lab 2  Due: April 7  Python Data Frames  Python Notebook  [April 7, 2022]  
Lab 3  Due: April 12  Work with Data Frame columns  Python Notebook  [April 7, 2022]  
Lab 4  Due: April 14  Visualizations and Variable Transformations  Python Notebook 1, Python Notebook 2  [April 12, 2022]
Lab 5  Due: April 19  Visualizations and Variable Transformations  Python Notebook 1, Python Notebook 2  [April 14, 2022]
Lab 6  Due: April 21  Data Cubes and Contingency Tables  Python Notebook 1, Python Notebook 2  [April 19, 2022]
Lab 7  Due: April 26  Analysis of Numeric Variables  Python Notebook 1, Python Notebook 2  [April 26, 2022]
Mini Project 1  Due: May  Data Preparation and Warehousing  Postscript  Superstore.csv  [April 26, 2022]  
Lab 8  Due: April 28  Analysis of Categorical Variables  Python Notebook  [April 28, 2022]  
Lab 9  Due: May 3  Linear Regression  Python Notebook  [April 28, 2022]  
Lab 10  Due: May 5  K-Nearest Neighbors  Python Notebook  [May 5, 2022]
Lab 11  Due: May 10  Evaluation of Regression Models  Python Notebook 1, Python Notebook 2  [May 5, 2022]
Lab 12  Due: May 17  Hyperparameter Tuning and Model Selection  Python Notebook 1  [May 12, 2022]  
Lab 13  Due: May 19  Classification  Python Notebook 1, Python Notebook 2  [May 19, 2022]
Lab 14  Due: May 26  Clustering  Python Notebook 1  [May 19, 2022]  
Lab 15  Due: May 28  Vectorization, TF-IDF  Python Notebook 1, Python Notebook 2  [May 23, 2022]
Lab 16  Due: May 31  Visualization  Python Notebook 1, Python Notebook 2, Python Notebook 3  [May 26, 2022]
Mini Project 2  Due: June  Data Science in action  Postscript  [May 26, 2022] 
DoW  Date  Lecture  Notebooks  Lab  Materials  Other 

Tuesday  March 29  Syllabus  Lab quiz (Lab 1)  Lecture 1, Lecture 2  
Tuesday  April 5  Data Science Process  Chapter 1.1, titanic.csv, tips.csv  Complete Chapter 1.1 exercises  Lecture 1, Lecture 2  Original Chapter 1.1 notebook
Thursday  April 7  Data Acquisition  Chapter 1.2, titanic.csv, tips.csv  Complete Chapter 1.2 exercises  Lecture 3  Original Chapter 1.2 notebook
Tuesday  April 12  Simple Visualization, Data Manipulation/Feature Engineering  Chapter 1.3, Chapter 1.4, titanic.csv, tips.csv, AmesHousing.txt  Complete Chapter 1.3 and Chapter 1.4 exercises  Tabular Data  Original Chapter 1.3 notebook, Chapter 1.4 notebook
Thursday  April 14  Grouping and Aggregation  Chapter 2.1, Chapter 2.2, titanic.csv, tips.csv  Complete Chapter 2.1 and Chapter 2.2 exercises  Original Chapter 2.1, Chapter 2.2 notebooks

Tuesday  April 19  Data Cubes and Analysis of Variables  Chapter 2.3, Chapter 3.2, titanic.csv, tips.csv  Complete Chapter 2.3 and Chapter 3.2  Original Chapter 2.3 notebook, Chapter 3.2 notebook

Thursday  April 21  Analysis of Numeric Variables  Chapter 3.1, Chapter 4.1, titanic.csv, tips.csv, reds.csv  Complete Chapter 3.1 and Chapter 4.1 exercises  Original Chapter 3.1, Chapter 4.1 notebooks

Tuesday  April 26  Analysis of Categorical Variables  Chapter 4.2, titanic.csv, tips.csv  Complete Chapter 4.2, Mini Project 1  Original Chapter 4.1 notebook

Thursday  April 28  Linear Regression  Chapter 5.1, AmesHousing.txt, reds.csv  Complete Chapter 5.1 exercises
Tuesday  May 3  KNN Regression  Chapter 5.2, AmesHousing.txt, tips.csv  Complete Chapter 5.2 exercises
Thursday  May 5  Evaluation of Regression Methods  Chapter 5.3, Chapter 5.4, AmesHousing.txt, tips.csv  Complete Chapter 5.3 and Chapter 5.4 exercises

Tuesday  May 10  Lab Test  
Thursday  May 12  Hyperparameter Tuning, Model Selection, Ensembles  Chapter 5.5, Chapter 5.5, AmesHousing.txt  Complete Chapter 5.5
Tuesday  May 17  KNN Classification  Chapter 6.1, Chapter 6.1, reds.csv, whites.csv, titanic.csv  Complete Chapter 6.1 and Chapter 6.2 exercises  Lecture Notes
Thursday  May 19  Classification: KNN Classifier  Chapter 7.1, iris.csv, two_moons.csv, satellite.csv, reds.csv, whites.csv, titanic.csv  Complete Chapter 7.1 exercises  Lecture Notes
Tuesday  May 23  Work with Text Collections, Vectorization  Chapter 13.1, Chapter 13.2, SMSSpamCollection.txt, profiles.csv, greeneggsandham.txt  Complete Chapter 13.1 and Chapter 13.2 exercises  Lecture Notes
Thursday  May 26  Visualization of Data  Chapter 10.1, Chapter 10.2, Chapter 10.3, titanic.csv, AmesHousing.txt  Complete Chapter 10.1, Chapter 10.2, and Chapter 10.3 exercises
Lecture 1  What is Data Science?  Postscript  [March 28, 2016]  
Lecture 2  Data Science Process  Postscript  [April 3, 2016]  
Lecture 3  Data Acquisition  Postscript  [April 3, 2016]  
Lecture 4  Tabular Data  Postscript  [April 3, 2016]  
Lecture 5  Textual Data  Postscript  [April 5, 2016]  
Lecture 6  XML Data  Postscript  [April 11, 2016]  
Lecture 7  Document Object Model (DOM)  Postscript  [April 11, 2016]  
Lecture 8  HTML and Beautiful Soup  Postscript  [April 20, 2016]  
Lecture 9  Maps and JSON  Postscript  [April 20, 2016]  
Lecture 14  Recommendation Predictions  Postscript  [May 11, 2016]  
Lecture 15  Supervised Learning (Classification)  Postscript  [May 18, 2016]  
Lecture 16  Unsupervised Learning (Clustering)  Postscript  [May 23, 2016] 