It handles combining characters correctly.
Simplistic implementations of ROT13, such as the one at rot13.com, when presented with the word café, leave the é (U+00E9 LATIN SMALL LETTER E WITH ACUTE) untouched and produce pnsé.
When this string is sent through a Unicode-aware text system, it may be subject to normalization, changing from pnsé to pnsé. Although these two strings look identical, in the latter the U+00E9 LATIN SMALL LETTER E WITH ACUTE has been decomposed into two characters: U+0065 LATIN SMALL LETTER E followed by U+0301 COMBINING ACUTE ACCENT.
When simplistic ROT13 is applied again, the now-separate U+0065 LATIN SMALL LETTER E does get rotated (to r) while the combining accent stays in place, so the "decoded" string is cafŕ, which is not the original input.
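The failure described above can be reproduced with a short sketch. It uses Python's built-in rot_13 text transform as a stand-in for a simplistic implementation (an assumption; rot13.com's actual code is not shown here), and unicodedata.normalize to simulate a text system that applies NFD in transit. The names naive_rot13, original, encoded, normalized, and decoded are illustrative only.

```python
import codecs
import unicodedata


def naive_rot13(text: str) -> str:
    # Rotates ASCII letters only; every other code point passes through unchanged.
    return codecs.encode(text, "rot_13")


original = "caf\u00e9"                              # café with precomposed U+00E9
encoded = naive_rot13(original)                     # U+00E9 is left untouched
normalized = unicodedata.normalize("NFD", encoded)  # U+00E9 becomes U+0065 U+0301
decoded = naive_rot13(normalized)                   # the bare U+0065 is now rotated

# Print with escapes so precomposed and decomposed forms can be told apart.
for label, value in [("encoded", encoded), ("normalized", normalized), ("decoded", decoded)]:
    print(label, value.encode("unicode_escape").decode("ascii"))
```

Printing with escapes makes the difference visible, since the precomposed and decomposed strings render identically in most fonts.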
Here, we avoid this issue by applying NFD normalization before encoding/decoding. The correct encoded form of both café (precomposed) and café (decomposed) is pnsŕ, and the decoded form of pnsŕ is, of course, café.
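A minimal sketch of the normalize-first approach, assuming Python and its standard unicodedata and codecs modules; the function name rot13_nfd is hypothetical and this is not necessarily how the tool itself is implemented.

```python
import codecs
import unicodedata


def rot13_nfd(text: str) -> str:
    # Decompose first so every base letter is a separate code point,
    # then rotate the ASCII letters; combining marks pass through unchanged.
    decomposed = unicodedata.normalize("NFD", text)
    return codecs.encode(decomposed, "rot_13")


precomposed = "caf\u00e9"    # é as a single code point, U+00E9
decomposed = "cafe\u0301"    # e followed by U+0301 COMBINING ACUTE ACCENT

encoded = rot13_nfd(precomposed)
same_encoding = encoded == rot13_nfd(decomposed)
round_trip = rot13_nfd(encoded)
```

Because the function normalizes its input on every call, the round trip also survives a system that recomposes the encoded text in transit (for example, turning r followed by U+0301 into U+0155 LATIN SMALL LETTER R WITH ACUTE). The decoded result comes back in NFD form, so compare it with the original after normalizing both sides.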