find-hyphs.py is a Python program for finding hyphenations in a text
file. The output is a word list showing which hyphenation points have
been used, where "-" indicates one instance, "=" two, and "#" more
than two.
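
To illustrate the output format, here is a minimal sketch of how such a
list could be built. It is not the actual code of find-hyphs.py: the
function names (marker, find_hyphenations, render, hyphenation_list) and
the regular expression are assumptions made for this example, and real
input will need more care (case, punctuation, words that are always
written with a hyphen).

```python
import re
from collections import Counter

# Assumed pattern: a word fragment ending in "-" at the end of a line,
# continued by another fragment at the start of the next line.
LINE_END_HYPHEN = re.compile(r"(\w+)-[ \t]*\n[ \t]*(\w+)")


def marker(count):
    """Map a usage count to the marker described above."""
    if count == 1:
        return "-"
    if count == 2:
        return "="
    return "#"


def find_hyphenations(text):
    """Return {word: Counter({split_position: count})} for line-end hyphens."""
    points = {}
    for match in LINE_END_HYPHEN.finditer(text):
        head, tail = match.group(1), match.group(2)
        word = (head + tail).lower()
        points.setdefault(word, Counter())[len(head)] += 1
    return points


def render(word, counter):
    """Insert a marker into the word at every position that was used."""
    out = []
    for i, ch in enumerate(word):
        if i in counter:
            out.append(marker(counter[i]))
        out.append(ch)
    return "".join(out)


def hyphenation_list(text):
    """Return the sorted word list with hyphenation markers."""
    return sorted(render(w, c) for w, c in find_hyphenations(text).items())
```

With this sketch, a text in which "hyphenation" is broken twice as
hyphen-ation and once as hy-phenation would be listed as hy-phen=ation.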
To use it with a PDF file, combine it with pdftotext, for instance:
pdftotext -layout foo.pdf - | python find-hyphs.py
or in a Makefile with
%-hy.txt : %.txt
	python find-hyphs.py <$^ | sort >$@

%.txt : %.pdf
	pdftotext -layout $^
The -layout option is needed to keep the hyphenations at the line ends.
(This approach won't work with two-column text.)
Per Starbäck, starback@stp.lingfil.uu.se