CSC 466: Knowledge Discovery From Data
Spring 2009

Instructor: Alexander Dekhtyar, dekhtyar@csc.calpoly.edu, 14-215

Office Hours:
When Who Where
Tuesday 11:10am - 12:00pm Alex 14-215
Wednesday 9:00am - 12:00pm Alex 14-215
Thursday 11:10am - 12:00pm Alex 14-215

Additional appointments: send email.

Final Exam Date: Tuesday, June 9, 2009, 4:10 - 7:00pm

(note: there is no final exam, but we may use the time for course-related activities)


News and Notes

Old News and Notes

Course Materials

Syllabus Postscript PDF
CSC 466 Wiki HTML
Datasets Wiki HTML


Labs

Lab 0 Due: April 2, 2009 SQL recap Postscript PDF [April 5, 2009]
Lab 1 Due: April 14, 2009 Data Warehousing Postscript PDF [April 5, 2009]
Lab 2 Due: April 21, 2009 Mining Association Rules Postscript PDF [April 13, 2009]
Lab 3 Due: April 30, 2009 Supervised Learning (Classification) Postscript PDF [April 20, 2009]
Lab 4 Due: May 12, 2009 Unsupervised Learning (Clustering) Postscript PDF [April 29, 2009]
Lab 5 Due: May 19, 2009 Collaborative Filtering Postscript PDF [May 13, 2009]
Lab 6 Due: May 28, 2009 Information Retrieval Postscript PDF [May 17, 2009]
Lab 7 Due: June 5, 2009 Link Analysis Postscript PDF [May 28, 2009]


Projects

Design Project: Stage 0 Due: April 21/23, 2009 Team formation Postscript PDF [April 16, 2009]
Design Project: Stage 1 Due: May 7, 2009 Initial design Postscript PDF [May 4, 2009]
Analytical Project Due: June 4, 2009 Assignment Postscript PDF [May 13, 2009]
Design Project: Stage 2 Due: May 7, 2009 Final Design Postscript PDF [May 26, 2009]

Lab Data
Lab 0 Dataset EXTENDED BAKERY (1000 Receipts) Description(PDF)
Lab 0 Dataset README(wiki) README (download)
Lab 1 Data Warehouse Design Tool designTool.zip Instructions
Lab 1 Dataset EXTENDED BAKERY (wiki) Description(PDF) README(wiki)
Lab 2 Dataset EXTENDED BAKERY (wiki) All CSV files in a zip (wiki) README(wiki)
Lab 2 Sample Dataset example.zip
Lab 3 Dataset ELECTIONS dataset home(wiki) dataset description (wiki)
Lab 4 Datasets Wiki Zip

Project and Lab Tests
Lab 1 Test Script 1 lab1_examples.sh [April 5, 2009]
Lab 1 Test Script 1 outputs 1000 receipts, 5000 receipts, 20,000 receipts, 75,000 receipts [April 12, 2009]
Lab 3 Restrictions file for small tree restrictions-tree02.csv [April 12, 2009]

JDBCTest.java
classes12.jar (JDBC driver)

Lecture Notes

Lecture 1 April 2 What is KDD? Postscript PDF [March 31, 2009]
Lecture 2 April 7 Data Warehouses: Data Cubes Postscript PDF [March 31, 2009]
Lecture 3 April 9 Data Warehouses: OLAP Operations []
Lecture 4 April 14 Association Rules Mining: Apriori Postscript PDF [April 12, 2009]
Lecture 4-1 April 14 Association Rules Mining: Apriori examples Postscript PDF [April 14, 2009]
Lecture 5 April 16 Association Rules Mining: Rule Generation []
Lecture 6 April 21 Classification: Decision Trees Postscript PDF [April 20, 2009]
Lecture 7 April 23 Classification: C4.5. example Postscript PDF [April 22, 2009]
Lecture 8 April 28 Classification: Beyond C4.5. Postscript PDF [April 25, 2009]
Lecture 9 April 30 Clustering: K-means Postscript PDF [April 29, 2009]
Lecture 9.5 Distance Measures Postscript PDF [April 29, 2009]
Lecture 10 May 5 Clustering: Hierarchical Postscript PDF [May 7, 2009]
Lecture 11 May 7 Clustering: Evaluation []
Lecture 12 May 12 Collaborative Filtering: Intro Postscript PDF [May 14, 2009]
Lecture 13 May 14 Collaborative Filtering: producing recommendations []
Lecture 14 May 19 Information Retrieval: measures, models Postscript PDF [May 20, 2009]
Lecture 15 May 21 Information Retrieval: Vector Space Model (VSM) []
Lecture 16 May 26 Information Retrieval: extending VSM Postscript PDF [May 25, 2009]
Lecture 17 May 28 PageRank: The Algorithm Postscript PDF [May 28, 2009]
Lecture 18 June 1 PageRank: The Math Postscript PDF [June 1, 2009]
Lecture 19 June 4 Naive Bayes Postscript PDF [June 3, 2009]


March 30, 2009, dekhtyar at csc.calpoly.edu