This page is under construction. The schedule may change and the slides will be updated during the course.

| num | Topic | Resources |
| 1 | Introduction to text mining, Course Organisation | slides pdf |
| 2 | Classification, KNN, Features | slides pdf, diabetes.csv, classifier-KNN.ipynb |
| 3 | TF.IDF, Simple classifier | slides pdf, lecture-tfidf.ipynb, 4docs.csv, lecture-simple-text-classification.ipynb, simple-review.csv |
| 4 | Word embedding, word2vec, CBOW, skip-gram | slides pdf, visualisation at https://ronxin.github.io/wevi/ |
| 5 | CNN for text classification | pdf |
| 6 | Python implementation, training embeddings, pre-trained embeddings | slides pdf, Sample code on Google Colab |
| 7 | More recent research in text representation, BERT | slides pdf, Use BERT in Python |
| 8 | Challenges and some research projects | slides pdf |
| 9 | Discussion | |
| 10 | Recommender Systems | pdf |
| 11 | Content-based Recommender Systems | pdf |
| 12 | Information Retrieval, Google and PageRank | pdf |
| 13 | Personalised Search, Evaluation | pdf |
| 14 | Language modeling | pdf |
| 15 | Machine Translation | pdf |
| 16 | Review, NLP and applications | pdf |

Presentation sign up

week 2 Friday, 5 March, topics related to text classification or clustering, such as new algorithms, deep learning models, or their applications
Sign up: Conner, Asher, Bonny (Yuguang)
week 4 Friday, 19 March, topics related to text representation, such as word2vec, word embedding, word rank, new measures for word similarity
Sign up: Ethan, Wenhao (Roy) [Word2Vec-Roy(Wenhao Li).pptx], Hannah
week 7 Monday, 19 April, topics related to clustering algorithms, opinion mining, information extraction
Sign up: Peter, Zihan, Joshua
week 8 Friday, 30 April, topics related to recommender systems, such as the systems used by Netflix, Amazon, YouTube, etc.
Sign up: Finn Sargisson, Yunhan (John), Mathew
week 10 Friday, 14 May, topics related to information retrieval, query expansion, personalised search, such as new search engines, new web services
Sign up: Hugh, Guoqiang, Alex
week 12 Friday, 28 May, other topics, including machine translation and other natural language processing tasks
Sign up: Roger (Yuheng), William, Shengkun, Finn Schofield

Topic attachments
| Attachment | Size | Date | Who |
| 4docs.csv | 147 bytes | 01 Mar 2021 - 08:18 | Main.xgao |
| classifier-KNN.ipynb | 2 K | 26 Feb 2021 - 05:43 | Main.xgao |
| diabetes.csv | 23 K | 26 Feb 2021 - 05:43 | Main.xgao |
| lecture-simple-text-classification.ipynb | 10 K | 01 Mar 2021 - 08:21 | Main.xgao |
| lecture-tfidf.ipynb | 5 K | 01 Mar 2021 - 08:22 | Main.xgao |
| simple-review.csv | 4 MB | 01 Mar 2021 - 08:22 | Main.xgao |