Uppsala University : Department of Linguistics and Philology : STP




Language Technology: Research and Development

Credits: 15 hp
Syllabus: 5LN714
Teachers: Joakim Nivre, Eva Pettersson

Schedule

Session | Date | Time | Room | Content | Reading
L1 | 3/9 | 10-12 | 16-0054 | Introduction (Universal Dependencies, Historical Text Processing) |
L2 | 11/9 | 10-12 | 16-0054 | Science and research | Okasha
S3.1 | 18/9 | 10-12 | 2-0025 (Hist1), 2-0026 (UD1) | Seminar - research papers |
S3.2 | 18/9 | 12-14 | 2-0025 (Hist2), 2-0023 (UD2) | Seminar - research papers |
L4 | 21/9 | 10-12 | 2-1024 | Language technology research and development | Cunningham, Lee
S5.1 | 3/10 | 12-14 | 2-0025 (Hist1), 2-0026 (UD1) | Seminar - research papers |
S5.2 | 3/10 | 14-16 | 2-0025 (Hist2), 2-0026 (UD2) | Seminar - research papers |
L6 | 8/10 | 13-15 | 16-0043 | R&D projects - from proposal to implementation | Zobel 10-11, 13
T | 10/10 | 13-15 | Chomsky | LaTeX tutorial |
S7.1 | 16/10 | 10-12 | 2-0023 (Hist1), 2-0026 (UD1) | Seminar - research papers |
S7.2 | 16/10 | 12-14 | 2-0028 (Hist2), 9-1016 (UD2) | Seminar - research papers |
S8a | 23/10 | 10-12 | 2-0025 (Hist), 2-0026 (UD) | Seminar - project proposals |
S8b | 23/10 | 13-14 | 2-0025 (Hist), 2-0026 (UD) | Seminar - project proposals |
S8c | 23/10 | 14-15 | 6-0023 (Hist), 22-0025 (UD) | Seminar - project proposals |
S9.1 | 7/11 | 10-12 | 2-0025 (Hist1), 2-0026 (UD1) | Seminar - progress report |
S9.2 | 7/11 | 13-15 | 2-0027 (Hist2), 2-0026 (UD2) | Seminar - progress report |
L10 | 14/11 | 10-12 | 6-K1031 | Dissemination of research results | Zobel 1-9, 14
S11.1 | 21/11 | 10-12 | 2-0025 (Hist1), 2-0026 (UD1) | Seminar - progress report |
S11.2 | 21/11 | 13-15 | 2-0023 (Hist2), 2-0026 (UD2) | Seminar - progress report |
T | 23/11 | 10-12 | Chomsky | LaTeX tutorial |
L12 | 28/11 | 10-12 | 6-K1031 | Review of scientific articles | Zobel 14
S13.1 | 5/12 | 10-12 | 2-0025 (Hist1), 2-0026 (UD1) | Seminar - progress report |
S13.2 | 5/12 | 13-15 | 2-0025 (Hist2), 2-0026 (UD2) | Seminar - progress report |
S14 | 12/12 | 13-15 | 2-0025 (Hist), 2-0026 (UD) | Seminar - progress report |
S16 | 9/1 | 10-16 | 6-K1031 | Seminar - term paper presentations |

Content

The course gives a theoretical and practical introduction to research and development in language technology. The theoretical part covers basic philosophy of science, research methods in language technology, project planning, and writing and reviewing of scientific papers. The practical part consists of a small project within a research area common to a subgroup of course participants, including a state-of-the-art survey in a reading group, the planning and implementation of a research task, and the writing of a paper according to the standards for scientific publications in language technology. The research areas for 2018 are:
  1. Universal Dependencies
  2. Historical Text Processing

Examination

The course is examined by means of five assignments with different weights (see below). In order to pass the course, a student must pass each one of these. In order to pass the course with distinction, a student must pass at least 50% of the weighted graded assignments with distinction.

Assignments

  1. Take-home exam on philosophy of science (15%)
    • This assignment will be based on your reading of Okasha's book. You will be asked to discuss issues in the philosophy of science and (sometimes) relate them to the area of language technology. The questions will be handed out September 24, and the report should be handed in September 28.
  2. Research paper presentation (15%)
    • You will present one of the papers discussed in the seminars. The task is to introduce the paper and lead the discussion, not to make a formal presentation. This assignment is not graded and does not qualify for distinction.
  3. Project proposal (15%)
    • You will put together a 3-page proposal describing the project you are going to work on for the rest of the course, using an adapted version of the Swedish Research Council's guidelines for research plans. You will also give a short presentation of the proposal in a seminar (10 minutes with slides). The deadline for the proposal is October 19, and the seminars will take place October 23.
  4. Review of term papers (15%)
  5. Term paper (40%)
    • You will report your project in a paper following the guidelines of Transactions of the Association for Computational Linguistics (except that the page limit for your papers is 4-7 pages + references). The deadline is December 14 for the first version and January 11 for the revised version. On January 9, you will also give an oral presentation of the paper.
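
The weighting described under Examination can be illustrated with a small sketch. The weights are taken from the list above; everything else is an assumption made for illustration, not an official grading procedure. In particular, the sketch assumes that only assignment 2 is excluded from the graded weight (the text says nothing about the others), and that "at least 50%" is measured against the remaining graded weight of 85%. The names `WEIGHTS`, `UNGRADED`, and `course_grade` are hypothetical.

```python
# Weights in percent, as given in the assignment list.
WEIGHTS = {
    "take_home_exam": 15,
    "paper_presentation": 15,
    "project_proposal": 15,
    "term_paper_review": 15,
    "term_paper": 40,
}

# Assumption: only the research paper presentation is ungraded,
# since it is the only one the text marks as not qualifying for distinction.
UNGRADED = {"paper_presentation"}


def course_grade(results):
    """Return "fail", "pass", or "distinction" for one student.

    `results` maps each assignment name to "fail", "pass", or "distinction".
    A missing assignment is treated as failed.
    """
    # Every assignment must be passed to pass the course.
    if any(results.get(name, "fail") == "fail" for name in WEIGHTS):
        return "fail"

    graded = {name: w for name, w in WEIGHTS.items() if name not in UNGRADED}
    earned = sum(w for name, w in graded.items() if results[name] == "distinction")

    # Assumption: the 50% threshold is relative to the graded weight (85).
    if 2 * earned >= sum(graded.values()):
        return "distinction"
    return "pass"


all_pass = {name: "pass" for name in WEIGHTS}
print(course_grade({**all_pass, "term_paper": "distinction",
                    "take_home_exam": "distinction"}))
```

Under these assumptions, a distinction on the term paper alone (40 of 85) would not be enough, while the term paper plus any one other graded assignment (55 of 85) would be. If the threshold is instead meant relative to the full 100%, the comparison would need to change accordingly.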

Submitting and Reviewing Term papers

We will use EasyChair for submission and review of papers. Please make sure that you have an EasyChair account and then use the following link to log in: LT:R&D19. To be added as a reviewer, you must send the email address used in your EasyChair account to Joakim.

Final Seminar/Workshop

The final seminar will be organized as a workshop with term paper presentations. The time slot for each paper is 15 minutes, divided into 12 minutes of presentation and 3 minutes of discussion. The session chairs will enforce the times strictly. More information will follow.

Research Groups

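Group | Member | Project
Historical Text Processing 1 | Fatoumata | Spelling normalization of early French
Historical Text Processing 1 | Till | Spelling normalization of historical German
Historical Text Processing 1 | Crina | Structure and position of VP in Icelandic
Historical Text Processing 1 | Ivan | Lexical normalization using Soundex
Historical Text Processing 1 | Stergios | Extrinsic evaluation of phonetic normalization
Historical Text Processing 1 | Helena |
Historical Text Processing 2 | Samuel | Building a normalization corpus for Spanish based on Twitter
Historical Text Processing 2 | Ellinor | Lexical normalization for Swedish social media text
Historical Text Processing 2 | Anthi | Sentiment analysis on the #MeToo movement
Historical Text Processing 2 | Hikaru | Word segmentation on Japanese Twitter
Historical Text Processing 2 | Sujoung | NER and normalization on Korean and English Twitter
Historical Text Processing 2 | Yaxi | NER on Twitter
Universal Dependencies 1 | Maja | Analyzing word order in UD treebanks
Universal Dependencies 1 | Revekka | Identification of verbal MWEs in Greek
Universal Dependencies 1 | Yuxin | Question-answer generation using UD
Universal Dependencies 1 | Claudia | Annotation guidelines for Swedish Twitter
Universal Dependencies 1 | Xiao | Parsing Chinese social media text
Universal Dependencies 1 | Robin | Dependency parsing with missing diacritics
Universal Dependencies 2 | Ailsa | Cross-lingual learning for Faroese
Universal Dependencies 2 | Andrew | Synthetic corpora for low-resource multilingual dependency parsing
Universal Dependencies 2 | Feiruoza | Rule-based morphology for Uyghur dependency parsing
Universal Dependencies 2 | Tonghe | Parsing low-resource languages via joint training on similar languages
Universal Dependencies 2 | Guanchen | Word embeddings for dependency parsing
Universal Dependencies 2 | Johannes | ELMo for dependency parsing and domain adaptation
Universal Dependencies 2 | Elena | ELMo for multilingual dependency parsing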

Presentation of Project Proposals

Time | Historical Text Processing | Universal Dependencies
10:15 | Samuel | Maja
10:30 | Ellinor | Revekka
10:45 | Anthi | Yuxin
11:00 | Break
11:15 | Fatoumata | Claudia
11:30 | Till | Xiao
11:45 | Crina | Robin
12:00 | Break
13:00 | | Ailsa
13:15 | Hikaru | Andrew
13:30 | Sujoung | Feiruoza
13:45 | Yaxi | Tonghe
14:00 | Break
14:15 | Ivan | Guanchen
14:30 | Stergios | Johannes
14:45 | | Elena

Reading

Science and Research

Universal Dependencies

Historical Text Processing