Hyphenation exceptions

While TeX's hyphenation rules are good, they're not infallible: you will occasionally find words TeX just gets wrong. So for example, TeX's default hyphenation rules (for American English) don't know the word “manuscript”, and since it's a long word you may find you need to hyphenate it. You can “write the hyphenation out” each time you use the word:

snippet.latex
... man\-u\-script ...

Here, each of the \- commands is converted to a hyphenated break, if (and only if) necessary.

That technique can rapidly become tedious: you'll probably only accept it if there are no more than one or two wrongly-hyphenated words in your document. The alternative is to set up hyphenations in the document preamble. To do that, for the hyphenation above, you would write:

snippet.latex
\hyphenation{man-u-script}

and the hyphenation would be set for the whole document. Barbara Beeton publishes articles containing lists of these “hyphenation exceptions”, in TUGboat; the hyphenation “man-u-script” comes from one of those articles.
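If you want to confirm that an exception has taken effect, LaTeX's \showhyphens command reports the break points TeX will use for a word. The following is only a minimal sketch: the document class and the sample sentence are arbitrary choices for illustration.

```latex
\documentclass{article}
% Exceptions go in the preamble; several words may be listed
% in one \hyphenation command, separated by spaces.
\hyphenation{man-u-script}
\begin{document}
% \showhyphens writes the permitted break points to the terminal
% and the log file; with the exception above in force it should
% report man-u-script.
\showhyphens{manuscript}
... manuscript ...
\end{document}
```

The report appears in the log as part of an “Underfull \hbox” message, which is harmless here.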

What if you have more than one language in your document? Simple: select the appropriate language, and do the same as above:

snippet.latex
\usepackage[french]{babel}
\selectlanguage{french}
\hyphenation{re-cher-cher}

(nothing clever here: this is the “correct” hyphenation of the word, in the current tables). However, there's a problem: just as words with accent macros in them won't break, so a \hyphenation command with accent macros in its argument will produce an error:

snippet.latex
\usepackage[french]{babel}
\selectlanguage{french}
\hyphenation{r\'e-f\'e-rence}

tells us that the hyphenation is “improper”, and that it will be “flushed”. But, just as hyphenation of words is enabled by selecting an 8-bit font encoding, so \hyphenation commands are rendered proper again by selecting that same 8-bit font encoding. For the hyphenation patterns provided for “legacy” (8-bit) TeX engines, the encoding is Cork (T1), so the complete sequence is:

snippet.latex
\usepackage[T1]{fontenc}
\usepackage[french]{babel}
\selectlanguage{french}
\hyphenation{r\'e-f\'e-rence}

The same sort of performance goes for any language for which 8-bit fonts and corresponding hyphenation patterns are available. Since you have to select both the language and the font encoding to have your document typeset correctly, it should not be a great imposition to do the selections before setting up hyphenation exceptions.
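Putting the pieces together for a document in two languages, a preamble might look like the sketch below. It assumes that babel's english option selects the American English patterns mentioned earlier (usually, but not necessarily, the case) and that French is the main language. The sample text is only illustrative.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}             % encoding first ...
\usepackage[english,french]{babel}   % ... then languages (the last is the main one)
% Each \hyphenation list is recorded for the language that is
% current when the command is read.
\selectlanguage{english}
\hyphenation{man-u-script}
\selectlanguage{french}
\hyphenation{r\'e-f\'e-rence}
\begin{document}
... r\'ef\'erence ... \foreignlanguage{english}{manuscript} ...
\end{document}
```

The order matters: the font encoding and the language must both be in place before the exception that depends on them is declared.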

Modern TeX variants (principally XeTeX and LuaTeX) use Unicode internally, and distributions that offer them also offer UTF-8-encoded patterns; since the hyphenation team do all the work “behind the scenes”, the use of Unicode hyphenation is deceptively similar to what we are used to.
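As an illustration of that similarity, the French exception above might be written as follows for a Unicode engine. This is a sketch which assumes the file is saved as UTF-8 and compiled with xelatex or lualatex. No fontenc line is needed, since these engines use a Unicode font encoding by default.

```latex
% Compile with xelatex or lualatex; save this file as UTF-8.
\documentclass{article}
\usepackage[french]{babel}
\selectlanguage{french}
% Accented letters are typed directly, so no accent macros
% appear in the argument of \hyphenation.
\hyphenation{ré-fé-rence}
\begin{document}
... référence ...
\end{document}
```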
