CS249: Paper List – Winter 2020
This is the list of suggested papers that students may want to read and present. Most of these papers have been highly influential and are considered “required reading” in NLP. Every group will have to pick and present one paper for two hours during the quarter. If you want to see a few example summaries, here are sample submissions from earlier years (on a paper not listed here).
By Instructor
- (Week 1 Mon) Class Introduction [(Slides)](…/slides/Lec01 Introduction.pptx)
- (Week 1 Wed) Yoshua Bengio, et al.: A Neural Probabilistic Language Model, J. of Machine Learning Research, 2003. [(Slides)](…/slides/Lec02 Neural Language Model.pptx)
- (Week 2 Mon) Tomas Mikolov, et al.: Distributed Representations of Words and Phrases and their Compositionality, NIPS 2013. [(Slides)](…/slides/Lec03 Neural Language Model.pptx)
- (Week 2 Wed) Jeffrey Pennington, et al.: GloVe: Global Vectors for Word Representation, 2014. [(Slides)](…/slides/Lec04 Word Embedding.pptx)
- (Week 3 Mon) Martin Luther King, Jr. Holiday
By Students
- (Week 3 Wed) Kamal Nigam, et al.: Text Classification from Labeled and Unlabeled Documents using EM, Machine Learning, 1999. [(Slides)](…/slides/Lec05 Text Classification.pptx)
- (Week 4 Mon) Adam Berger, Stephen Della Pietra, Vincent Della Pietra: A Maximum Entropy Approach to Natural Language Processing, J. of Computational Linguistics, 1996.
- (Week 4 Wed) Michael Collins: Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms, EMNLP 2002.
- (Week 5 Mon) Project Presentations
- (Week 5 Wed) Project Presentations
- (Week 6 Mon) John Lafferty, Andrew McCallum, Fernando C.N. Pereira: Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data, ICML 2001.
- (Week 6 Wed) Ryan McDonald, et al.: Non-Projective Dependency Parsing using Spanning-Tree Algorithms, EMNLP 2005.
- (Week 7 Mon) Presidents’ Day Holiday
- (Week 7 Wed) Danqi Chen, Christopher D. Manning: A Fast and Accurate Dependency Parser using Neural Networks, EMNLP 2014.
- (Week 8 Mon) Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio: Neural Machine Translation by Jointly Learning to Align and Translate, ICLR 2015.
- (Week 8 Wed) Ashish Vaswani, et al.: Attention is All You Need, NIPS 2017.
- (Week 9 Mon) Jacob Devlin, et al.: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2018.
- (Week 9 Wed) Project Presentations
- (Week 10 Mon) Project Presentations
- (Week 10 Wed) Project Presentations
Helpful Tutorials
- Maya R. Gupta, Yihua Chen: Theory and Use of the EM Algorithm, 2010.
- Kevin Knight: Statistical MT Tutorial Workbook, 1999.
- Charles Sutton, Andrew McCallum: An Introduction to Conditional Random Fields, 2012.
- Kevin Knight: Bayesian Inference with Tears, 2009.
- Bela A. Frigyik, Amol Kapila, Maya R. Gupta: Introduction to the Dirichlet Distribution and Related Processes, 2010.
- Philip Resnik, Eric Hardisty: Gibbs Sampling for the Uninitiated, 2010.
- Maya R. Gupta: A Measure Theory Tutorial (Measure Theory for Dummies), 2006.