This page is under construction. The schedule may change and the slides will be updated during the course.

No. Topic Resources

1 Introduction to text mining, Course Organisation slides pdf
2 Classification, KNN, Features slides pdf, diabetes.csv, classifier-KNN.ipynb
3 TF.IDF, Simple classifier slides pdf, lecture-tfidf.ipynb, 4docs.csv, lecture-simple-text-classification.ipynb, simple-review.csv
4 Word embedding, word2vec, CBOW, Skip-gram slides pdf, visualisation at https://ronxin.github.io/wevi/
5 CNN for text classification pdf
6 Python implementation, training embeddings, pre-trained embeddings slides pdf, sample code on Google Colab
7 More recent research in text representation, BERT slides pdf, Use BERT in Python
8 Challenges and some research projects slides pdf
9 Discussion  
10 Text clustering, HAC pdf
11 K-means, DBSCAN pdf
12 Recommender Systems pdf
13 Content-based recommender systems pdf
14 Information Retrieval, Google and PageRank pdf
15 Personalised Search, Evaluation
16 Query Expansion  
17 Machine Translation pdf
18 Review

Presentation sign up

Week 2, Friday, 5 March: topics related to text classification or clustering, such as new algorithms, deep learning models, or their applications
Sign up: Conner, Asher, Bonny (Yuguang)
Week 4, Friday, 19 March: topics related to text representation, such as word2vec, word embedding, word rank, new measures for word similarity
Sign up: Ethan, Wenhao (Roy) Word2Vec-Roy(Wenhao Li).pptx, Hannah
Week 7, Monday, 19 April: topics related to clustering algorithms, opinion mining, information extraction
Sign up: Phillip, Peter, Zihan, Joshua
Week 8, Friday, 30 April: topics related to recommender systems, such as the systems used by Netflix, Amazon, YouTube, etc.
Sign up: Finn Sargisson, George, Yunhan (John), Mathew
Week 10, Friday, 14 May: topics related to information retrieval, query expansion, personalised search, such as new search engines and new web services
Sign up: Alex, Guoqiang, Shengkun, Hugh
Week 12, Friday, 28 May: other topics, including machine translation and other natural language processing tasks
Sign up: Roger, Teh Yule Kim, William

Topic attachments
Attachment Size Date Who
4docs.csv 147 bytes 01 Mar 2021 - 08:18 Main.xgao
classifier-KNN.ipynb 2 K 26 Feb 2021 - 05:43 Main.xgao
diabetes.csv 23 K 26 Feb 2021 - 05:43 Main.xgao
lecture-simple-text-classification.ipynb 10 K 01 Mar 2021 - 08:21 Main.xgao
lecture-tfidf.ipynb 5 K 01 Mar 2021 - 08:22 Main.xgao
simple-review.csv 4 MB 01 Mar 2021 - 08:22 Main.xgao