ROT13 (good version)



What makes this version better than other implementations?

It handles combining characters correctly.

Simplistic implementations of ROT13, such as the one at, when presented with the word café, leave the é (U+00E9 LATIN SMALL LETTER E WITH ACUTE) untouched, and produce pnsé.

When this string is sent through a Unicode-aware text system, it may be subject to normalization, changing from pnsé to pnsé. Although these two strings look identical, in the latter, the U+00E9 LATIN SMALL LETTER E WITH ACUTE has been decomposed into two characters, U+0065 LATIN SMALL LETTER E followed by U+0301 COMBINING ACUTE ACCENT.

When simplistic ROT13 is applied again, the U+0065 LATIN SMALL LETTER E does get decoded, and the "decoded" string is cafŕ, which is not the original input.
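The failure described above can be reproduced in a few lines. This is only a sketch: `naive_rot13` is a hypothetical stand-in for a simplistic implementation, not the code of any particular site, and the call to `unicodedata.normalize` plays the role of the Unicode-aware text system that may normalize the string in transit.

```python
import unicodedata


def naive_rot13(text: str) -> str:
    """Hypothetical simplistic ROT13: rotate ASCII letters, pass everything else through."""
    out = []
    for ch in text:
        if "a" <= ch <= "z":
            out.append(chr((ord(ch) - ord("a") + 13) % 26 + ord("a")))
        elif "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") + 13) % 26 + ord("A")))
        else:
            # U+00E9 and U+0301 both land here and are left untouched.
            out.append(ch)
    return "".join(out)


original = "caf\u00e9"                               # café, precomposed é
encoded = naive_rot13(original)                      # é is left alone
in_transit = unicodedata.normalize("NFD", encoded)   # é becomes e + U+0301
decoded = naive_rot13(in_transit)                    # the bare e is now rotated to r

# The round trip does not return the original, in any normalization form.
round_trip_ok = unicodedata.normalize("NFC", decoded) == original
```

Here `decoded` ends in r followed by U+0301, which renders as ŕ, so `round_trip_ok` is false.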

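Here, we avoid this issue by applying NFD normalization before encoding/decoding. NFD always splits é into U+0065 LATIN SMALL LETTER E followed by U+0301 COMBINING ACUTE ACCENT, so the base letter is rotated no matter which form the input arrived in. The correct encoded form of café, whether its é is precomposed or decomposed, is pnsŕ, and the decoded form of pnsŕ is, of course, café.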
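A minimal sketch of the normalize-first approach follows. The function name `rot13_nfd` and the translation table are illustrative assumptions, not this page's actual implementation. The sketch also assumes the output is left in decomposed form, which the text above does not specify.

```python
import string
import unicodedata

# Translation table mapping each ASCII letter to the one 13 places along.
_ROT13_TABLE = str.maketrans(
    string.ascii_lowercase + string.ascii_uppercase,
    string.ascii_lowercase[13:] + string.ascii_lowercase[:13]
    + string.ascii_uppercase[13:] + string.ascii_uppercase[:13],
)


def rot13_nfd(text: str) -> str:
    """Decompose to NFD first, then rotate the ASCII base letters."""
    decomposed = unicodedata.normalize("NFD", text)
    # Combining marks such as U+0301 are not in the table and pass through unchanged.
    return decomposed.translate(_ROT13_TABLE)


precomposed = "caf\u00e9"    # é as a single code point
decomposed = "cafe\u0301"    # e followed by COMBINING ACUTE ACCENT

# Both input forms encode to the same string.
same_encoding = rot13_nfd(precomposed) == rot13_nfd(decomposed)

# Decoding also works if the encoded text was recomposed to NFC in transit.
recomposed = unicodedata.normalize("NFC", rot13_nfd(precomposed))
decoded = rot13_nfd(recomposed)
```

Because the result is in decomposed form, comparing it against precomposed text requires normalizing both sides to the same form first, for example with `unicodedata.normalize("NFC", ...)`.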
