Uppsala University
Department of Linguistics and Philology
Language Technology Program
Machine Translation
Lab 4

Statistical Machine Translation: Lab 4

Goals

In this lab, you will have the opportunity to train and examine a phrase-based SMT system and to explore how the dynamic programming beam search algorithm for phrase-based statistical machine translation works.

Model training

First, you will train a complete SMT system on the familiar Blockworld corpus to get to know the Moses training pipeline. You can still find the data in /local/kurs/mt/lab3/data. Copy it to your home directory if you haven't already done so.

Choose which language (English or Swedish) you want to use as your source language and which as your target language. To begin with, you need a language model for the target language. You may still have one lying around from your earlier lab assignments; if not, follow the instructions from part 2 of lab 2 to train a model with SRILM. Use an n-gram order that you found worked well in the experiments of lab 2.

Next, use the training scripts provided with Moses to train your model:

/local/kurs/mt/mosesdecoder/scripts/training/train-model.perl --corpus corpus \
    --f src --e trg --root-dir moses.output --lm 0:order:lm-file \
    --external-bin-dir /local/kurs/mt/bin >logfile 2>&1
Here, root-dir is the name of a directory that will be created to hold the models. The other placeholders stand for the corpus name, the source and target language suffixes, the LM order, and the LM file. If your corpus is stored in corpus.parallel.swe and corpus.parallel.eng, you should give corpus as corpus.parallel; src and trg are then eng and swe or vice versa, depending on which direction you want to translate. Note that you must give the full path to the LM file, e.g. /home/stp15/YOURNAME/Documents/.... Once the training is done, take a look at the training log to see what happened and whether everything went well.
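For example, if you translate from Swedish to English with a trigram language model, the call might look as follows; the LM file name is only illustrative and should be replaced by the full path to your own model:

/local/kurs/mt/mosesdecoder/scripts/training/train-model.perl --corpus corpus.parallel \
    --f swe --e eng --root-dir moses.output --lm 0:3:/home/stp15/YOURNAME/blockworld.eng.lm \
    --external-bin-dir /local/kurs/mt/bin >logfile 2>&1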

Models

Examine the files generated by the training process. Try to figure out what information they contain by looking at them. You may consult the Moses webpage to read about the training process. The training log may help you understand what goes on during training. Try to relate the training log output to the training pipeline on lecture slides. For the lab report, include a list of all the files generated and a brief (1-3 lines per file) explanation of their contents.

For the following assignments, locate the phrase table and the decoder configuration file.

Phrase table

The phrase table contains five fields separated by " ||| " markers: source phrase, target phrase, feature values, word alignments and some counts from the training corpus. Some of the feature values are probabilities that sum to 1 over a certain set of alternatives; others are not.
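To make the format concrete, a single entry might look roughly like the line below. The phrases, scores and counts are invented purely to illustrate the layout; in a standard Moses phrase table the four scores are typically the inverse phrase translation probability, the inverse lexical weighting, the direct phrase translation probability and the direct lexical weighting, but check this against your own table.

den röda cirkeln ||| the red circle ||| 0.75 0.31 0.60 0.42 ||| 0-0 1-1 2-2 ||| 4 5 3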

Decoder configuration file

The Moses configuration file contains different sections. The [feature] section contains pointers to the phrase table file and language model file.

The configuration file also contains some feature weights. Note that the phrase table has 4 weights, one for each feature contained in the phrase table.
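As a rough sketch, the relevant parts of a moses.ini generated by a recent Moses version might look like the excerpt below; the paths, the exact feature lines and the weight values are placeholders, so expect your own file to differ in detail.

[feature]
UnknownWordPenalty
WordPenalty
PhraseDictionaryMemory name=TranslationModel0 num-features=4 path=/path/to/phrase-table.gz input-factor=0 output-factor=0
Distortion
KENLM name=LM0 factor=0 path=/path/to/lm-file order=3

[weight]
UnknownWordPenalty0= 1
WordPenalty0= -1
TranslationModel0= 0.20 0.20 0.20 0.20
Distortion0= 0.30
LM0= 0.50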

There are no questions to answer about this file, but take a good look at it and make yourself familiar with the main parameters defining a phrase-based SMT model.

Testing your Blockworld model

Try running Moses with the model you've just trained:
/local/kurs/mt/mosesdecoder/bin/moses -f moses.ini
You can find the test sentences from lab 2 in /local/kurs/mt/lab2/data/test_meningar.language. Feed them into the decoder and examine the output. How does it compare to the word-based systems you used in earlier labs?
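For example, assuming Swedish is your source language and the Swedish test file is test_meningar.swe, you could translate the whole file in one go; the training script normally writes the configuration file to moses.output/model/moses.ini, and the output file name below is just an example:

/local/kurs/mt/mosesdecoder/bin/moses -f moses.output/model/moses.ini \
    < /local/kurs/mt/lab2/data/test_meningar.swe > blockworld.output.eng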

Exploring the search algorithm

For the rest of the assignments, we're going to use a real-world Swedish-English model trained on Europarl data. You can find the model in /local/kurs/mt/lab-moses/europarl.sv-en. It's substantially larger than the Blockworld model. There's a ready-made moses.ini for you to use. Copy it to your directory. Note that this configuration file was made with an earlier version of Moses, so it probably looks a bit different from the one you created in the previous section.

The model works with lowercased and tokenised text. You can use the script preprocess.sh in the model directory to preprocess your test sentences in the same way.
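Assuming preprocess.sh reads from standard input and writes to standard output (check the script itself to be sure), preprocessing a file with your own sentences might look like this; the file names are only examples:

/local/kurs/mt/lab-moses/europarl.sv-en/preprocess.sh < my_sentences.sv > my_sentences.prep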

Start the decoder with this model, enter a few sentences and look at the translations you get. You can make up your own sentences or copy some sentences from a newspaper website such as DN or Svenska Dagbladet. You can quit the decoder by pressing Control-D. Look at the BEST TRANSLATION line to see the scores. The decoder outputs the total score as well as the vector of individual core feature scores. If you wonder which score corresponds to which feature, stop the decoder and run it again as

/local/kurs/mt/mosesdecoder/bin/moses -f moses.ini -show-weights

This will output the feature names and their corresponding weights.

You can increase the decoder's verbosity level to see what it does. If you run the decoder with the -v 2 option, it will tell you how many hypotheses were expanded, recombined, etc. With the -v 3 option, the decoder will dump information about all the hypotheses it expands to standard error. The -report-segmentation option will show you how the input sentence was segmented into phrases.
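For example, to see hypothesis statistics and the phrase segmentation while translating the preprocessed file from above, you could start the decoder as in the first line below; for the very verbose -v 3 output it is a good idea to redirect standard error to a file, as in the second line (file names are again only examples):

/local/kurs/mt/mosesdecoder/bin/moses -f moses.ini -v 2 -report-segmentation < my_sentences.prep
/local/kurs/mt/mosesdecoder/bin/moses -f moses.ini -v 3 < my_sentences.prep 2> search.log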

Another way to gather information about how decoder parameters affect the output is by looking at n-best-lists containing the n best hypotheses in the part of the search space explored by the decoder. To generate n-best output, start the decoder with the -n-best-list file size option. This will output n-best-lists of the given size to the file you specify. Use an n-best size of around 100 to obtain a fair impression of the best output hypotheses.
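For example, the following call writes the 100 best hypotheses for each input sentence to the (arbitrarily named) file nbest.txt:

/local/kurs/mt/mosesdecoder/bin/moses -f moses.ini -n-best-list nbest.txt 100 < my_sentences.prep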

Here are some options you can use to influence the search; an example call combining several of them follows the list:

-stack S sets the stack size S for histogram pruning (default: 100)
-beam-threshold eta sets the beam threshold eta for threshold pruning (default: 0.00001, which effectively disables threshold pruning in most cases!)
-max-phrase-length p segments the input into phrases of length at most p (default: 10, which is more than the maximum phrase length in our phrase table!)
-distortion-limit d sets the distortion limit (maximum jump) to d (default: 6; 0 means no reordering permitted, -1 means unlimited)
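
As an example, a run with a small stack, a tighter beam threshold and no reordering could be started as below; the particular values are only a starting point for your own experiments:

/local/kurs/mt/mosesdecoder/bin/moses -f moses.ini -stack 10 -beam-threshold 0.01 \
    -distortion-limit 0 < my_sentences.prep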

You can also change the ttable-limit directly in moses.ini; this affects how many translation options are loaded for each span.

As for finding the search error, what you should do is find out what the best solution in your reference system looks like (segmentation, phrase translations, ordering). Then look at the search log (the -v 3 output) of the target system, starting with the empty hypothesis at the beginning (number 0), and try to follow the search path that would generate the same solution. I suggest you load the search log into a text editor so that you can search for hypothesis numbers and see how they are expanded. The search error occurs at the point where the last hypothesis that is a prefix of the correct solution stops being expanded, because it is pruned or otherwise removed from the stack. Depending on how exactly you set up your target system, it may also fail to generate the best solution in the first place, e.g. because the ttable-limit or the distortion limit prevents it; in that case, that is the source of the search error.

Overall impression of phrase-based SMT

Lab report

In your lab report, you should discuss the questions in the assignments and the experiments you carried out to answer them. You should also hand in your search graph drawing.

Send your report by e-mail to Aaron Smith. The deadline is 17th May 2016. You may hand in your search graph on paper by dropping it into my pigeonhole at the department (on the 3rd floor) if it's easier for you. Don't forget to put your name on it!