Automatic Decryption of Historical Manuscripts

Thousands of enciphered historical manuscripts are buried in libraries and archives. Examples include diplomatic correspondence, intelligence reports, private letters and diaries, as well as manuscripts related to secret societies or other (religious) groups on the margins of society. The bulk of these manuscripts will remain undeciphered unless we can automate, in part or in full, the processes involved in decoding them.

Our aim is to develop computer-aided tools for automatic and semi-automatic decoding of historical source material through cross-disciplinary research involving computer science, language technology, linguistics and philology. The DECODE project involves the collection of ciphertexts and keys from Early Modern times, the systematic automatic detection of various cipher types, the development of algorithms for (semi-)automatic decryption of different types of ciphers, and the creation of language models and pattern dictionaries for early variants of fifteen European languages: Czech, Dutch, English, Finnish, French, German, Hungarian, Icelandic, Italian, Latin, Polish, Portuguese, Russian, Spanish, and Swedish.
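To give a flavour of what automatic detection of cipher types can build on, here is a minimal sketch of one classic statistic, the index of coincidence. It is offered only as an illustration and is not a description of the project's own tools; the function names and the toy Caesar cipher are assumptions made for this example. A simple (monoalphabetic) substitution leaves the statistic unchanged, whereas polyalphabetic ciphers tend to flatten it, so it can serve as a first hint about the cipher type.

```python
from collections import Counter


def index_of_coincidence(symbols):
    """Probability that two symbols drawn without replacement are identical."""
    counts = Counter(symbols)
    n = sum(counts.values())
    if n < 2:
        return 0.0
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def caesar_shift(text, shift):
    """Toy monoalphabetic cipher (hypothetical helper, for illustration only).

    Works on lowercase a-z; all other characters are dropped.
    """
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]
    return "".join(
        chr((ord(ch) - ord("a") + shift) % 26 + ord("a")) for ch in letters
    )


# Normalise a sample sentence (shift 0 only strips non-letters), then encipher it.
plaintext = caesar_shift("the bulk of these manuscripts will remain undeciphered", 0)
ciphertext = caesar_shift(plaintext, 3)

# A simple substitution only renames symbols, so the statistic is unchanged.
ic_plain = index_of_coincidence(plaintext)
ic_cipher = index_of_coincidence(ciphertext)
print(ic_plain == ic_cipher)
```

Historical ciphers often use invented symbols, homophones and nomenclature codes rather than a plain alphabet, so in practice such a statistic would be computed over transcribed symbols and combined with other evidence.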

The cipher database, the historical language collections and the language models will be available soon.

Database of historical ciphertexts and keys

Language models for historical texts

Check out the ciphers we solved: The Copiale Cipher, The Cipher

The project is financed by the Swedish Research Council (grant E0067801) for the period 2015-2017.

Principal Investigator: Beáta Megyesi, Department of Linguistics and Philology, Uppsala University
Participants: Kevin Knight and Nada Aldarrab, ISI, University of Southern California; Nicklas Bergman, Nils Blomqvist, and Eva Pettersson, Department of Linguistics and Philology, Uppsala University