Tools.h++ Manual

104011 Tandem Computers Incorporated 4-1
Internationalization 4
Gone are the days when we could ignore our neighbors across the sea (or over
the fence), writing software only for local consumption. Professional software
development today demands not only awareness of the needs of users in other
cultures, but accommodation of those needs. This accommodation is called
localization; making software easily localized is called internationalization.[1]
Internationalization actually involves many different activities, potentially as
many as the ways in which cultures differ from one another. In practice, it
usually means accommodating differences in alphabets, languages, currencies,
numbers, and date- and time-keeping notations. Let us consider each of these
in turn.
Accommodation of different alphabets begins with allowing them to be
represented. A first step in this direction is making code “8-bit clean”, that is,
free of any assumption that the high bit of each byte is unused, which lets it
tolerate 8-bit extensions of ASCII. Still, eight bits just isn’t enough to represent all the
character glyphs we use, even in English. Some extension beyond 8 bits is
required, and in fact several are in use, falling into two families: multibyte and
wide-character encodings.
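
To make the “8-bit clean” idea concrete, here is a minimal sketch of a routine that transforms only the 7-bit ASCII letters and passes every other byte through unchanged. The function name asciiUpper is hypothetical and is not part of Tools.h++; the sketch assumes the default "C" locale and uses only the standard C++ library.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not a Tools.h++ function): upper-case the ASCII
// letters in s and leave every byte with the high bit set untouched.
std::string asciiUpper(const std::string& s) {
    std::string out(s);
    for (std::size_t i = 0; i < out.size(); ++i) {
        // Convert to unsigned char first: plain char may be signed, and
        // passing a negative value to std::toupper is undefined behavior.
        unsigned char c = static_cast<unsigned char>(out[i]);
        if (c < 0x80) {
            out[i] = static_cast<char>(std::toupper(c));
        }
        // Bytes >= 0x80 are copied as-is; they are never masked with 0x7F.
    }
    return out;
}
```

Code that instead masks each byte with 0x7F, or indexes a lookup table with a possibly negative char, is the kind that fails once text contains characters outside ASCII.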
Multibyte encodings use a sequence of one or more bytes to represent a single
character. (Typically the ASCII characters are still one byte long.) This gives a
compact encoding, but is inconvenient for indexing and substring operations.
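
The indexing cost can be illustrated with a short sketch. It assumes UTF-8 as the multibyte encoding, which is a choice made here for illustration only; the text above does not name a particular encoding. The helper names isContinuation, charCount, and byteOffset are hypothetical, and the input is assumed to be well formed.

```cpp
#include <cstddef>
#include <string>

// True for a UTF-8 continuation byte (bit pattern 10xxxxxx).
inline bool isContinuation(unsigned char b) {
    return (b & 0xC0) == 0x80;
}

// Number of characters in a well-formed UTF-8 string. Every byte must be
// examined, so the cost is linear in the byte length.
std::size_t charCount(const std::string& s) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (!isContinuation(static_cast<unsigned char>(s[i]))) ++n;
    }
    return n;
}

// Byte offset at which the index-th character (counting from 0) starts,
// or s.size() if the string holds fewer characters. Unlike s[index] on a
// fixed-width encoding, this needs a scan from the start of the string.
std::size_t byteOffset(const std::string& s, std::size_t index) {
    std::size_t seen = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (!isContinuation(static_cast<unsigned char>(s[i]))) {
            if (seen == index) return i;
            ++seen;
        }
    }
    return s.size();
}
```

For example, the string "na\xC3\xAFve" occupies six bytes but holds five characters, so the fourth character ("v") starts at byte offset 4 rather than 3. Substring operations have to perform this kind of translation before they can slice the underlying bytes.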
1. “Internationalization” is a horrendous word, widely abbreviated “i18n”; 18 is the number of letters elided.