CSC 466: Knowledge Discovery From Data
Winter 2025

Instructor: Alexander Dekhtyar, dekhtyar@csc.calpoly.edu, 14-212

Office Hours:
When Who Where
Tuesday 1:10pm - 3:00pm Alex 14-212
Wednesday 3:10pm - 4:00pm Alex 14-212
Friday 3:10pm - 4:00pm Alex 14-212

Additional appointments: send email.

News and Notes

Course Materials

Syllabus Postscript PDF
Jupyter Labs Server Login Page Log in with Cal Poly credentials
Sample Lab Test files Task1.ipynb, data1.csv Task2.ipynb, data5.csv Task3.ipynb, data8.csv
Sample Lab Test Solutions Task1-solution.ipynb Task2-solution.ipynb Task3-solution.ipynb


Labs

Lab 0 Due: January 13 (Monday) Distance Metrics (and Python) Postscript PDF [January 8, 2025]
Lab 1 Due: January 20 (Tuesday) K-Nearest Neighbors Postscript PDF [January 12, 2025]
Lab 2 Due: January 31 (Friday) Classification: Decision Trees Postscript PDF Data [January 22, 2025]
Lab 3 Due: February 7 (Friday) Classification: Random Forests Postscript PDF Data [January 31, 2025]
Lab 4, Lab 4 Addendum Due: February 21 (Friday) Unsupervised Learning, Cluster Evaluation Postscript, Addendum (Postscript) PDF, Addendum (PDF) Data [February 10, 2025]
Lab 5 Due: February 28 Collaborative Filtering Postscript PDF Data [February 17, 2025]
Lab 6 Due: March 7 Information Retrieval/Text Mining Postscript PDF Data [February 27, 2025]
Lab 7 Due: Link Analysis Postscript PDF Data [March 7, 2025]


Projects

Analytical Project Due: February 7/March 19 Project Specification Postscript PDF [February 7, 2025]
Analytical Project: Final Deliverables Due: March 19 Project Specification Postscript PDF [March 10, 2025]

Lecture Notes

(January 6, Monday) Lecture 1 What is KDD? Postscript PDF
(January 8, Wednesday) Lecture 2 Distance and Similarity Metrics Postscript PDF
(January 10, Friday) Lecture 3 Introduction to Supervised Learning Postscript PDF Iris-example.ipynb
(January 17, Friday) Lecture 4 Classification. Decision Trees Postscript PDF
(January 27, Monday) Lecture 5 Classification: C4.5. example Postscript PDF Sample dataset (CSV) Decision Tree (JSON)
(January 29, Wednesday) Lecture 6 Classification: Ensemble Methods Postscript PDF
(February 3, Monday) Lecture 7 Classification: Naive Bayes Postscript PDF
(February 5, Wednesday) Lecture 8 Clustering: K-means Postscript PDF KMeans Clustering Demo (.ipynb)
(February 12, Wednesday) Lecture 9 Evaluation of Clustering Methods Postscript PDF
(February 12, Wednesday) Lecture 10 Clustering: Hierarchical Postscript PDF
(February 12, Wednesday) Lecture 11 Clustering: Density-Based Postscript PDF
(February 19, Wednesday) Lecture 12 Collaborative Filtering: Intro Postscript PDF
(February 19, Wednesday) Lecture 13 Collaborative Filtering: Evaluation Postscript PDF
(February 26, Wednesday) Lecture 14 Information Retrieval: measures, models Postscript PDF
(February 26, Wednesday) Lecture 15 Information Retrieval: extending VSM Postscript PDF
(March 3, Monday) Lecture 16 Graph Mining Postscript PDF
(March 5, Wednesday) Lecture 17 PageRank: The Algorithm Postscript PDF
(March 5, Wednesday) Lecture 18 PageRank: The Math Postscript PDF
(March 10, Monday) Lecture 19 Association Rule Mining: Apriori Postscript PDF
(March 10, Monday) Lecture 20 Association Rule Mining Example Postscript PDF


January 4, 2025, dekhtyar at csc.calpoly.edu