UPPSALA UNIVERSITET : Inst. f. lingvistik och filologi : STP




Statistical Methods for NLP

Credits: 7.5 hp (högskolepoäng)
Syllabus: 5LN704
Teachers: Joakim Nivre, Evelina Andersson (teaching assistant)

News

Schedule

No.  Date  Time   Room    Content                  Reading
1    23/1  13-15  9-2029  Probability theory       Schay, ch. 1, 3, 4 (not 4.2-4.3)
2    2/2   13-15  9-2029  Statistical inference    Schay, ch. 5 (not 5.3-5.6), 7
3    13/2  15-17  9-2029  Bayesian classification  Mitchell, 1-2.3; Jurafsky & Martin, 20.1-20.2; Androutsopoulos et al.
4    16/2  13-15  9-2029  Hidden variables and EM  Prescher; Nigam et al.
5    27/2  13-15  9-2029  Sequence models          Jurafsky & Martin, 5.5, 6.1-6.5
6    12/3  13-15  9-2029  Stochastic grammars      Jurafsky & Martin, 14.1-14.6; Prescher

Each lecture has slides and two recordings (Slides, Recording1, Recording2).

All lectures will be broadcast through SUNET's Adobe Connect server. Connect through:

Flash Player 8.0.0.0 or later is required, and you will be prompted to allow an add-in to be installed.

Intended Learning Outcomes

In order to pass the course, a student must be able to
  1. apply basic probability theory and principles of statistical inference to natural language data,
  2. implement simple statistical models for classification and sequence labeling in language technology,
  3. construct treebank grammars for use in natural language parsing,
  4. apply the principles of expectation-maximization to models with hidden variables,
with a certain degree of independent creativity, clearly stating and critically discussing methodological assumptions, applying state-of-the-art methods for evaluation, and presenting the result in a professionally adequate manner.

Examination and Grading Criteria

The course is examined by means of four assignments:
  1. Estimation and hypothesis testing
  2. Naive Bayes
  3. Hidden Markov models
  4. Treebank grammars
In order to pass the course, a student must pass each one of these. In order to pass the course with distinction (Väl godkänt), a student must pass at least two assignments with distinction.

Reading List

NB: Schay (2007) is my suggestion for those who do not already have a book on probability theory, but any introductory textbook on the topic will do fine.

Course Evaluation

Course evaluation questionnaire