Uppsala universitet  

Ongoing Projects

TYDLIGT - Techniques for Yielding Discourse Level Improvements in German Translation

Duration: December 2015 - November 2018
Place: University of Uppsala, Sweden
Description: In this project, I will go beyond word boundaries and investigate sentence- and discourse-level phenomena for SMT into German. My main interest is in multiword expressions whose components may be spread widely across the sentence in German. Later, I will also investigate anaphora resolution techniques for SMT into German and work on domain adaptation.
Collaborators: Joakim Nivre, Sabine Schulte im Walde, Jonas Kuhn and Alexander Fraser

SWE-CLARIN

Duration: July 2015 - December 2018
Place: University of Uppsala, Sweden
Description: CLARIN (Common Language Resources and Technology Infrastructure) is a European project that aims to make language resources and tools available and usable to researchers in the humanities and social sciences.

PARSEME

Duration: March 2013 - March 2017
Place: University of Uppsala, Sweden
Description: The IC1207 COST Action, PARSEME, is an interdisciplinary scientific network devoted to the role of multi-word expressions (MWEs) in parsing. Besides parsing, we are also interested in multi-word expressions in the context of multilingual processing and machine translation. I am the leader of the Germanic language group for the upcoming MWE extraction shared task.

Past Projects

Models of Morphosyntax for Statistical Machine Translation

Duration: October 2009 - June 2015
Places: University of Stuttgart (until October 2013), LMU Munich, Germany (November 2013 - June 2015)
Description: In this project, we developed an SMT pipeline for translation between English and German, with a particular focus on translating into German. We modelled compound splitting prior to translation and compound merging after translation, which allowed us to translate and generate new compounds that had not occurred in the training data. Moreover, we removed all nominal inflections not present in English from the German data prior to translation. After translation, we re-predicted the morphological features and thus created coherently inflected sequences, possibly with inflectional variants that had not occurred in the training data.
Collaborators: Hinrich Schütze (PI), Alexander Fraser, Marion Weller, Anita Ramm, Fabienne Braune

Combining Contextual Information Sources for Disambiguation in Parsing and Choice in Generation (SFB-732-D2)

Duration: October 2014 - March 2015
Place: University of Stuttgart, Germany
Description: The SFB-732 (SonderForschungsBereich) is a long-term collaborative research center focussing on Incremental Specification in Context. It consists of over 40 researchers from the Institute for Natural Language Processing and the Institute of Linguistics at the University of Stuttgart.
Collaborators: Jonas Kuhn (PI), Ina Rösiger

TTC - Terminology Extraction, Translation Tools and Comparable Corpora

Duration: April 2010 - September 2010
Place: University of Stuttgart, Germany
Description: TTC was a project funded by the European Union. It aimed to provide specialised terminology for machine translation, derived from comparable corpora.
Collaborators: Ulrich Heid, Alexander Fraser

D-SPIN - Deutsche Sprachressourcen Infrastruktur

Duration: September 2008 - September 2009
Place: University of Stuttgart, Germany
Description: D-SPIN was a German sub-project within the CLARIN preparatory phase. We collected available NLP resources (such as tokenizers, taggers and parsers) and built an initial linguistic web service for term and collocation extraction. Years later, these initial efforts resulted in WEBLICHT (WEB-based LInguistic CHaining Tool), which has been implemented and maintained by the University of Tübingen.
Collaborators: Hinrich Schütze (PI), Ulrich Heid, Helmut Schmid, Andreas Madsack and Max Kisselew from Stuttgart, as well as Erhard Hinrichs, Marie Hinrichs and Thomas Zastrow from Tübingen University.