CPE 466: Knowledge Discovery in Data
Lab 4 materials

DATA

Reuters 50-50 C50.zip

Stopwords

Stopwords in MySQL: MySQL stopwords
Onix Text Retrieval Toolkit: Stopword list

Stopword Files

Ranks.nl small stopwords-short.txt
Ranks.nl medium stopwords-medium.txt
Ranks.nl large stopwords-long.txt
MySQL stopwords-mysql.txt
Onix stopwords-onix.txt

Stemming Materials

Porter Stemming Algorithm Official Web Page
Porter Algorithm original paper def.txt
Porter Algorithm in Java java.txt
Porter Algorithm in Python python.txt


May 11, 2018 dekhtyar at calpoly.edu