CSC 466: Knowledge Discovery From Data
Fall 2023

Instructor: Alexander Dekhtyar, dekhtyar@csc.calpoly.edu, 14-212

Office Hours:
When Who Where
Monday 11:00am - 12:00pm Alex 14-212
Tuesday 1:00pm - 3:00pm Alex 14-212
Wednesday 10:00am - 11:00am Alex 14-212

Additional appointments: send email.

News and Notes

Course Materials

Syllabus Postscript PDF
Jupyter Labs Server Login Page Log in with Cal Poly credentials
Sample Lab Test files Task1.ipynb, data1.csv Task2.ipynb, data5.csv Task3.ipynb, data8.csv
Sample Lab Test Solutions Task1-solution.ipynb Task2-solution.ipynb Task3-solution.ipynb


Labs

Lab 1 Due: September 28 (Tuesday) Insight From Data Postscript PDF Data [September 25, 2023]
Lab 2 Due: October 9 (Monday) Association Rules Postscript PDF Data [September 29, 2023]
Lab 3 Due: October 20 (Friday) - Part 1; October 27 (Friday) - Part 2 Supervised Learning (Classification) Postscript PDF Data [October 11, 2023]
Lab 4 Due: November 10 (Friday) Unsupervised Learning Postscript PDF Data [November 1, 2023]
Lab 5 Due: November 27 Collaborative Filtering Postscript PDF Data [November 13, 2023]
Lab 6 Due: December 4 (Monday) Information Retrieval/Text Mining Postscript PDF Data [November 17, 2023]
Lab 7 Due: December 12 (Tuesday) Link Analysis Postscript PDF Data [December 6, 2023]


Projects

Analytical Project Due: December 13 (multiple deadlines) Project Specification Postscript PDF [November 1, 2023]

Lecture Notes

Lecture 1 What is KDD? Postscript PDF
Lecture 2 Association Rules Mining: Apriori Postscript PDF Apriori Example (PDF) Apriori Example (Googledoc, read-only)
Lecture 3 Association Rules Mining: Apriori examples Postscript PDF Sample dataset (CSV)
Lecture 4 Classification. Decision Trees Postscript PDF
Lecture 5 Classification: C4.5. example Postscript PDF Sample dataset (CSV) Decision Tree (JSON)
Lecture 6 Classification: Beyond C4.5. Postscript PDF
Lecture 6.5 Predictive Linear Regression Postscript PDF
Lecture 7 Clustering: K-means Postscript PDF
Lecture 8 Distance Measures Postscript PDF
Lecture 9 Clustering: Hierarchical Postscript PDF
Lecture 10 Clustering: Density-Based Postscript PDF
Lecture 11 Collaborative Filtering: Intro Postscript PDF
Lecture 12 Collaborative Filtering: Evaluation Postscript PDF
Lecture 13 Information Retrieval: measures, models Postscript PDF
Lecture 14 Information Retrieval: extending VSM Postscript PDF
Lecture 15 Social Network/Graph Mining Postscript PDF
Lecture 16 PageRank: The Algorithm Postscript PDF
Lecture 17 PageRank: The Math Postscript PDF
Lecture 18 Community Discovery Postscript PDF
Lecture 19 Naive Bayes Postscript PDF


September 18, 2021, dekhtyar at csc.calpoly.edu