Information Retrieval (Informationssökning), 7.5 hp, autumn term (HT) 2010
Course code: 5LN440
Syllabus
Teachers: Jörg Tiedemann (JT), Martin Hassel (MH), Magnus Rosell (MR)
News
- Sample exam (from last year)
- Schedule change (see below): new time on 2010-10-08
- 2010-09-20: two small changes to the schedule (see below)
- 2010-05-27: the course page is online
- 2010-09-08: topics for presentations (individual or in groups of 2)
- 2010-09-14: Deadline for lab 1 is 2010-09-20 (send report by e-mail to Martin Hassel, see lecture notes for his e-mail address)
Preliminary schedule
| Type | Date | Time | Room | Teacher | Content | Reading |
|---|---|---|---|---|---|---|
| F1 | 2010-09-06 | 10-12 | Turing | JT | Introduction & overview | ch1, ch2 |
| F2 | 2010-09-08 | 10-12 | Turing | MH | IR basics & Link Analysis | ch6, ch8, ch21 |
| F3 | 2010-09-14 | 10-12 | Turing | MH | Web Crawlers & LSA/RI | ch18-ch20, MS |
| L1 | 2010-09-14 | 13-15 | Turing | MH | wget, LSA & RI | deadline: 2010-09-20 |
| F4 | 2010-09-21 | 10-12 | Turing | JT | Dictionaries & tolerant retrieval | ch3 |
| Cancelled! | | | | | | |
| F5 | 2010-09-27 | 10-12 | Turing | JT | Ranked Retrieval | ch6, ch7, mid-course evaluation |
| F6 | 2010-09-28 | 10-12 | Turing | MR | Clustering | ch16, ch17, MR |
| L3 | 2010-09-28 | 13-16 | Turing | MR | Clustering | |
| F7 | 2010-10-08 | 13-15 | Turing | MH | Text Extraction & Summarization | MH07 (ch 2-4 except 4.4-4.5) |
| L2 | 2010-10-08 | 15-17 | Chomsky | MH | stemming & regular expressions | |
| F8 | 2010-10-11 | 10-12 | Turing | JT | Text classification | ch13 |
| F9 | 2010-10-13 | 10-12 | Turing | JT | Question answering | JM 23.2, Bouma, Wikipedia |
| Seminar | 2010-10-25 | 10-12 | Turing | JT | Presentations | course evaluation |
| Seminar | 2010-10-27 | 10-12 | Turing | JT | Presentations | course evaluation |
| Exam | 2010-11-05 | 8-12 | Gimog. 4, room 1 | | Written exam | |
F = lecture (föreläsning), L = lab session (laboration).
Chomsky = computer lab 9-2043. Turing = computer lab 9-2042.
Examination
Examination consists of mandatory lab sessions and hand-in assignments (which must be passed), an oral presentation (at the last lecture session, 25% of the grade), and a written exam (75% of the grade). Both graded parts receive a score between 1 and 10, and the average score determines the final grade for the course. You need a score of 6 or more for G and a score of 8 or more for VG. To obtain VG for the presentation, you need to show a deep understanding of your topic and present it in a pedagogical way that makes it easy for your fellow students to follow.
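As an illustration of the grading rule above, here is a minimal sketch in Python. It rests on assumptions that are not stated on this page: that "average" means the 25%/75% weighted average of the two scores, that no rounding is applied before comparing against the thresholds, and that a failing result is written "U". The function name `final_grade` is made up for this example. Ask the teacher if you need the exact rule.

```python
def final_grade(presentation: float, exam: float, labs_passed: bool = True) -> str:
    """Return 'VG', 'G' or 'U' for a presentation score and an exam score.

    Assumptions (not confirmed by the course page): weighted average with
    weights 0.25 / 0.75, no rounding, and 'U' as the label for a fail.
    """
    for score in (presentation, exam):
        if not 1 <= score <= 10:
            raise ValueError("scores must lie between 1 and 10")

    # Labs and hand-in assignments are mandatory and must be passed.
    if not labs_passed:
        return "U"

    # Assumed weighting: presentation 25%, written exam 75%.
    average = 0.25 * presentation + 0.75 * exam

    if average >= 8:
        return "VG"
    if average >= 6:
        return "G"
    return "U"


if __name__ == "__main__":
    print(final_grade(presentation=10, exam=5))
    print(final_grade(presentation=8, exam=8))
```

Under these assumptions the exam dominates: a perfect presentation (10) with an exam score of 5 gives a weighted average of 6.25.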
Course literature
- Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press. 2008. Selected sections. Available online: link
- [MS] Magnus Sahlgren: An Introduction to Random Indexing
- [MH07] Martin Hassel: Resource Lean and Portable Automatic Text Summarization, Doctoral Thesis, ch 2-4
- Wikipedia: Question answering
- Gosse Bouma, Ismail Fahmi, Jori Mur, Gertjan van Noord, Lonneke van der Plas, and Jörg Tiedemann: Linguistic Knowledge and Question Answering. In Traitement Automatique des Langues (TAL), 2005/3
Additional material may be added.
Topics for the presentation (individual or 2 students)
- Erik Sterneberg: Index construction (ch4)
- Rebecka Sundström & Camilla Liljhammar: Index compression (ch5)
- 1 student: Evaluation in IR (ch8)
- Björn Lindström: Relevance feedback & Query expansion (ch9)
- 1 student: XML retrieval (ch10)
- Erik Margaronis & Gustav Hallstensson: Probabilistic IR & LM for IR (ch11 & ch12)
- Lina Stadell & Cornelius Fath: Vector space classification (ch14)
- Martin Kjellin: SVMs & machine learning (ch15)
- Your own topic (talk to me first)
2010-10-25, 10-12
- 10:15-10:30 Erik Sterneberg: Index construction (ch4)
- 10:35-10:55 Rebecka Sundström & Camilla Liljhammar: Index compression (ch5)
- 11:00-11:20 Erik Margaronis & Gustav Hallstensson: Prob. IR & LMs (ch11, ch12)
2010-10-27, 10-12
- 10:15-10:30 Björn Lindström: Relevance feedback & Query expansion (ch9)
- 10:35-10:55 Lina Stadell & Cornelius Fath: Vector space classification (ch14)
- 11:00-11:20 Martin Kjellin: SVMs & machine learning (ch15)
- 11:25- Course evaluation
Links
- Web Data Mining, Exploring Hyperlinks, Contents and Usage Data. Bing Liu, Springer, December, 2006
- Text REtrieval Conference
- Cross-Language Evaluation Forum (CLEF)
- Random Indexing
- PhD thesis. Magnus Rosell, 2009: "Text Clustering Exploration - Swedish Text Representation and Clustering Results Unraveled"
