Software Internationalization Guide
Software Characteristics That Vary by Locale
Software Internationalization Guide—526225-002
Multibyte Code Sets
Multibyte code sets represent characters whose encoded values require more than
one byte to store. Eight-bit code sets are often subsets of multibyte code sets.
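The following sketch illustrates the point using Python's built-in codecs. EUC-JP is an arbitrary choice of multibyte code set made for this example, and the helper name encoded_length is hypothetical; neither comes from this guide. The sketch assumes only that the standard ascii and euc_jp codecs are available.

```python
# Illustrative sketch: count the bytes one character occupies in a
# multibyte code set. EUC-JP is an assumption chosen for the example.

def encoded_length(text: str, codec: str) -> int:
    """Return the number of bytes needed to store text in the given code set."""
    return len(text.encode(codec))

LATIN_A = "A"          # U+0041, Latin capital letter A
KANJI_DAY = "\u65e5"   # U+65E5, a Japanese kanji

# The Latin letter keeps its single-byte encoding inside EUC-JP.
latin_length = encoded_length(LATIN_A, "euc_jp")
same_as_ascii = LATIN_A.encode("euc_jp") == LATIN_A.encode("ascii")

# The kanji needs more than one byte in the same code set.
kanji_length = encoded_length(KANJI_DAY, "euc_jp")

print(latin_length, kanji_length, same_as_ascii)  # prints: 1 2 True
```

Because the single-byte values are unchanged, text that uses only those characters is stored identically in both code sets.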
East Asian Code Sets
The Chinese, Japanese, and Korean languages each use several thousand
ideographic characters. Eight bits can distinguish only 256 values, which is not
enough to represent these characters; 16 bits or more are required. The Chinese
National Standard (CNS), Chinese Guo Biao (GB), Japanese Industrial Standard
(JIS), and Korean Standard (KS) are families of national standards that define
standard code sets.
Table 2-3 lists a few of the East Asian standard code sets and the languages
they support.
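As a rough illustration of why eight bits are insufficient, the sketch below tries to store one Chinese character in an eight-bit code set and in GB 2312. It relies on Python's standard iso8859_1 and gb2312 codecs; the helper name fits_in_code_set and the sample character are assumptions made for this example.

```python
# Illustrative sketch: an eight-bit code set has no slot for an ideograph,
# while a 16-bit-per-character East Asian code set does.

def fits_in_code_set(char: str, codec: str) -> bool:
    """Return True if the code set named by codec can encode char."""
    try:
        char.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

HAN_SAMPLE = "\u4e2d"  # U+4E2D, a common Chinese character

eight_bit_capacity = 2 ** 8      # distinct values in one byte
sixteen_bit_capacity = 2 ** 16   # distinct values in two bytes

fits_latin1 = fits_in_code_set(HAN_SAMPLE, "iso8859_1")
fits_gb2312 = fits_in_code_set(HAN_SAMPLE, "gb2312")
gb2312_length = len(HAN_SAMPLE.encode("gb2312"))

print(eight_bit_capacity, sixteen_bit_capacity)   # prints: 256 65536
print(fits_latin1, fits_gb2312, gb2312_length)    # prints: False True 2
```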
Table 2-2. ISO 8859 Code Sets

Code Set Name   Languages Supported
ISO 8859-1      Western European – Danish, Dutch, English, Faeroese, Finnish,
                French, German, Icelandic, Irish, Italian, Norwegian,
                Portuguese, Spanish, Swedish
ISO 8859-2      Eastern European – Albanian, Czechoslovakian, English, German,
                Hungarian, Polish, Romanian, Serbo-Croatian
ISO 8859-3      Southeastern European – Afrikaans, Catalan, Dutch, English,
                Esperanto, German, Italian, Maltese, Spanish, Turkish
ISO 8859-4      Northern European – Danish, English, Estonian, Finnish, German,
                Greenlandic, Sami (Lappish), Latvian, Lithuanian, Norwegian,
                Swedish
ISO 8859-5      Latin and Cyrillic-Based – Bulgarian, Byelorussian, English,
                Macedonian, Russian, Serbo-Croatian, Ukrainian
ISO 8859-6      Latin and Arabic
ISO 8859-7      Latin and Greek
ISO 8859-8      Latin and Hebrew
ISO 8859-9      Western European and Turkish – Danish, Dutch, English, Finnish,
                French, German, Irish, Italian, Norwegian, Portuguese, Spanish,
                Swedish, Turkish
ISO 8859-10     Danish, English, Estonian, Finnish, German, Greenlandic,
                Icelandic, Sami (Lappish), Latvian, Lithuanian, Norwegian,
                Faeroese, Swedish
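One practical consequence of Table 2-2 is that the same eight-bit value can stand for different characters depending on which ISO 8859 code set is in effect. The sketch below decodes a single byte under three of the listed code sets using Python's standard codecs; the byte value 0xE4 and the variable names are assumptions chosen for this example.

```python
# Illustrative sketch: one byte value, three ISO 8859 code sets,
# three different characters.

SAMPLE_BYTE = bytes([0xE4])
PARTS = ("iso8859_1", "iso8859_5", "iso8859_7")

# Map each code set name to the character that 0xE4 represents in it.
decoded = {codec: SAMPLE_BYTE.decode(codec) for codec in PARTS}

for codec, char in decoded.items():
    # Print the Unicode code point rather than the glyph so the output
    # does not depend on the terminal's own code set.
    print(f"{codec}: U+{ord(char):04X}")

# prints:
# iso8859_1: U+00E4   (Latin small letter a with diaeresis)
# iso8859_5: U+0444   (Cyrillic small letter ef)
# iso8859_7: U+03B4   (Greek small letter delta)
```

Data stored in one of these code sets therefore needs to carry, or be accompanied by, the name of the code set used to write it.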
Table 2-3. East Asian Code Sets

Code Set Name   Languages Supported
CNS 11643       Traditional Chinese
GB 18030        Modern Chinese
GB 2312         Simplified Chinese