CSC 466: Knowledge Discovery From Data
Fall 2021

Instructor: Alexander Dekhtyar, 14-212

Office Hours:
When Who Where
Monday 9:00am - 11:00am Alex 14-212
Tuesday 5:00pm - 6:00pm Alex 14-212
Wednesday 9:00am - 10:00am Alex 14-212

Additional appointments: send email.

Final Exam Dates

Section 01 Thursday, December 9 10:10am - 1:00pm
Section 03 Tuesday, December 7 4:00pm - 7:00pm

Note: There is no final exam in this course, but we will use the final exam time for project presentations.

News and Notes

Course Materials

Syllabus Postscript PDF
Jupyter Labs Server Login Page Log in with Cal Poly credentials
Spring 2018 Exam files Task1.ipynb, data1.csv Task2.ipynb, data5.csv
Spring 2018 Exam Solutions Task1-solution.ipynb Task2-solution.ipynb


Lab 1 Due: September 28 (Tuesday) Insight From Data Postscript PDF Data [September 19, 2021]
Lab 2 Due: October 7 (Thursday) Association Rules Postscript PDF Data [September 30, 2021]
Lab 3-1 Due: October 19 (Tuesday) Supervised Learning (Classification) Postscript PDF Data [October 7, 2021]
Lab 3-2 Due: October 26 (Tuesday) Supervised Learning (Classification) Part 2 Postscript PDF Data [October 14, 2021]
Lab 4 Due: November 5 (Friday) Unsupervised Learning Postscript PDF Data [October 26, 2021]
Lab 5 Due: November 19 (Friday) Information Retrieval/Text Mining Postscript PDF Data [November 9, 2021]
Lab 7 Due: December 3 (Friday) Link Analysis Postscript PDF Data [November 18, 2021]


Analytical Project Due: December 10 Project Specification Postscript PDF [November 2, 2021]
Ethics and KDD Assignment Due: December 10 Science Fiction Dystopia Story Postscript PDF [November 2, 2021]

Lecture Notes

Lecture 1 What is KDD? Postscript PDF
Lecture 2 Association Rules Mining: Apriori Postscript PDF Apriori Example (PDF) Apriori Example (Googledoc, read-only)
Lecture 3 Association Rules Mining: Apriori examples Postscript PDF Sample dataset (CSV)
Lecture 4 Classification. Decision Trees Postscript PDF
Lecture 5 Classification: C4.5 example Postscript PDF Sample dataset (CSV) Decision Tree (JSON)
Lecture 6 Classification: Beyond C4.5. Postscript PDF
Lecture 6.5 Predictive Linear Regression Postscript PDF
Lecture 7 Clustering: K-means Postscript PDF
Lecture 8 Distance Measures Postscript PDF
Lecture 9 Clustering: Hierarchical Postscript PDF
Lecture 10 Clustering: Density-Based Postscript PDF
Lecture 11 Collaborative Filtering: Intro Postscript PDF
Lecture 12 Collaborative Filtering: Evaluation Postscript PDF
Lecture 13 Information Retrieval: measures, models Postscript PDF
Lecture 14 Information Retrieval: extending VSM Postscript PDF
Lecture 15 Social Network/Graph Mining Postscript PDF
Lecture 16 PageRank: The Algorithm Postscript PDF
Lecture 17 PageRank: The Math Postscript PDF
Lecture 18 Community Discovery Postscript PDF
Lecture 19 Naive Bayes Postscript PDF

September 18, 2021, dekhtyar at