take the block on the green circle                        P = 0.004500
put the block on the circle on the red circle             P = 0.002699
put the green cone on the square                          P = 0.003000
take the red cone                                         P = 0.020000
put the green block on the square                         P = 0.006000
take the blue cone on the red circle                      P = 0.000375
take the cube on the red circle                           P = 0.004500
put the blue cone on the red circle on the green circle   P = 0.000042
put the block on the green circle                         P = 0.004500
put the blue block on the circle                          P = 0.006000

Probability of corpus:             8.3671E-27
Entropy (base 2), per sentence:    8.6627
Entropy (base 2), per word:        1.1398
Perplexity, per sentence:          405.2681
Perplexity, per word:              2.2035

NB: The test corpus was generated with the same probabilistic grammar as the training corpus. The sentence probabilities are derived from the grammar and sum to 1 over the set of all possible strings given the 12-word vocabulary.

The probability of the corpus was computed by multiplying the sentence probabilities, which amounts to assuming that sentences are probabilistically independent of each other.

The entropy (per sentence and per word) was estimated by taking the negative logarithm (base 2) of the corpus probability and dividing by the number of tokens (10 sentences and 76 words, respectively). This empirical estimate of the entropy is sometimes simply called the logprob (LP).

The perplexity is 2 raised to the power of the entropy (i.e., the entropy is the base-2 logarithm of the perplexity).
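
The computation described in the note can be sketched in a few lines of Python. This is only an illustration, not the program that produced the figures above: the function name `corpus_stats` and the dictionary keys are made up for the example, and the sentence probabilities are the rounded values listed in the table, so the results should agree with the reported figures only up to rounding.

```python
import math

# (sentence, probability under the grammar), copied from the table above.
corpus = [
    ("take the block on the green circle", 0.004500),
    ("put the block on the circle on the red circle", 0.002699),
    ("put the green cone on the square", 0.003000),
    ("take the red cone", 0.020000),
    ("put the green block on the square", 0.006000),
    ("take the blue cone on the red circle", 0.000375),
    ("take the cube on the red circle", 0.004500),
    ("put the blue cone on the red circle on the green circle", 0.000042),
    ("put the block on the green circle", 0.004500),
    ("put the blue block on the circle", 0.006000),
]


def corpus_stats(corpus):
    """Hypothetical helper: corpus probability, entropy and perplexity."""
    # Summing log-probabilities is the same as multiplying probabilities
    # (the independence assumption), but avoids numerical underflow.
    log2_prob = sum(math.log2(p) for _, p in corpus)
    n_sentences = len(corpus)
    n_words = sum(len(sentence.split()) for sentence, _ in corpus)

    # Empirical entropy: negative log2 probability per token.
    h_sentence = -log2_prob / n_sentences
    h_word = -log2_prob / n_words

    return {
        "n_sentences": n_sentences,
        "n_words": n_words,
        "corpus_probability": 2.0 ** log2_prob,
        "entropy_per_sentence": h_sentence,
        "entropy_per_word": h_word,
        # Perplexity is 2 raised to the power of the entropy.
        "perplexity_per_sentence": 2.0 ** h_sentence,
        "perplexity_per_word": 2.0 ** h_word,
    }


stats = corpus_stats(corpus)
for name, value in stats.items():
    print(f"{name}: {value:.6g}")
```

Working in log space is a design choice rather than a necessity here: with only ten sentences the direct product does not underflow, but for a larger corpus it quickly would.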