Software Internationalization Guide—526225-002
Glossary
ANSI. The American National Standards Institute.
Arabic-based writing system. A writing system with letters that are derived from the Arabic 
alphabet. Not all languages that use Arabic characters are related linguistically to 
Arabic.
ASCII. American Standard Code for Information Interchange. A single-byte code set that 
uses only 7 of the 8 bits in a byte to represent each character. The ASCII code set 
contains the uppercase and lowercase characters of the U.S. English alphabet, some 
punctuation symbols, the digits 0 through 9, and some symbols and control characters. 
Because of its limited characters, and because the 8th bit is sometimes used in ASCII 
programs as a utility bit, the ASCII code set is not appropriate for use in international 
software.
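As an illustration of the 7-bit property described in the ASCII entry, the following C 
sketch checks whether a buffer holds only 7-bit ASCII values. It assumes 8-bit bytes, and 
the function name is_7bit_ascii is hypothetical, not part of any library named in this guide.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if every byte in buf[0..len) has its
 * 8th (high) bit clear, that is, lies in the ASCII range 0x00-0x7F.
 * Returns 0 as soon as a byte with the high bit set is found. */
int is_7bit_ascii(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);    /* a NULL buffer must be empty */

    for (size_t i = 0; i < len; i++) {
        if (buf[i] & 0x80) {
            return 0;                   /* high bit set: not ASCII */
        }
    }
    return 1;
}
```

A program that stores flags in the 8th bit would make text fail this check, which is one 
reason such data cannot be passed safely through software that expects 8-bit code sets.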
base character. A character that can be combined with one or more combining characters 
to form a composite character.
Basic Multilingual Plane (BMP). The first plane of the ISO 10646 character layout, addressed 
by the lower two octets (row and cell) of the four-octet code value. The two-octet form 
that covers the BMP is known as Universal Coded Character Set - 2 (UCS-2).
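The row and cell octets can be sketched in C as follows. This is a minimal illustration 
that assumes code values are held in a 32-bit unsigned integer; the helper names 
is_in_bmp, bmp_row, and bmp_cell are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: a code value lies in the BMP when its upper two
 * octets (group and plane) are zero, so only row and cell are significant. */
int is_in_bmp(uint32_t code)
{
    return (code >> 16) == 0;
}

/* Hypothetical helpers: extract the row (high octet) and the cell
 * (low octet) from the lower two octets of a code value. */
unsigned bmp_row(uint32_t code)
{
    return (unsigned)((code >> 8) & 0xFFu);
}

unsigned bmp_cell(uint32_t code)
{
    return (unsigned)(code & 0xFFu);
}
```

For example, the code value 0x0633 is in the BMP, with row 0x06 and cell 0x33, whereas 
0x10000 lies outside it.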
block-based writing system. A writing system, such as that of English, French, or Russian, 
that is composed of single letters that stand alone in printed text.
BMP. See Basic Multilingual Plane (BMP). 
byte. An ordered set of bits that represents a character or a part of a character. The number 
of bits per byte is implementation-dependent; a byte usually contains 8 bits. 
An 8-bit byte is also called an octet.
C locale. A special locale defined by the ANSI C standard. Every standard C program 
always starts up in the C locale, which means that no locale-specific action takes 
place, and the program operates in the ASCII mode. All library functions behave as 
they do in standard C. Unless the program calls the setlocale() function, none of 
the behavior changes. Also called the POSIX locale or the C/POSIX locale.
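The startup behavior described in the C locale entry can be observed with a short C 
sketch. The helper names in_c_locale and use_environment_locale are hypothetical; only 
setlocale() and LC_ALL come from the standard library.

```c
#include <locale.h>
#include <string.h>

/* Hypothetical helper: returns nonzero while the program is still in the
 * "C" locale. Passing NULL queries the current locale without changing it. */
int in_c_locale(void)
{
    const char *name = setlocale(LC_ALL, NULL);
    return name != NULL && strcmp(name, "C") == 0;
}

/* Hypothetical helper: adopts the locale named by the environment.
 * Returns nonzero on success; on failure the locale is left unchanged. */
int use_environment_locale(void)
{
    return setlocale(LC_ALL, "") != NULL;
}
```

A program that never calls use_environment_locale() (or setlocale() directly) keeps 
reporting the C locale for its whole run.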
character. A sequence of one or more bytes representing a single character; used for the 
organization, representation, or control of data. A single-byte character consists of 
eight bits that represent a character. A multibyte character uses one or more bytes to 
represent a character. A wide character is a fixed-width character wide enough to hold 
any coded character supported by an implementation.
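The difference between bytes and characters can be sketched in C by converting a multibyte 
string to wide characters one at a time. The function name count_characters is 
hypothetical, and the result depends on the current locale's code set.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: counts the characters (not bytes) in the multibyte
 * string s, decoded according to the current locale.
 * Returns (size_t)-1 if s contains an invalid or incomplete sequence. */
size_t count_characters(const char *s)
{
    mbstate_t state;
    size_t remaining = strlen(s);
    size_t count = 0;

    memset(&state, 0, sizeof state);    /* start in the initial shift state */

    while (remaining > 0) {
        wchar_t wc;                     /* one fixed-width wide character */
        size_t used = mbrtowc(&wc, s, remaining, &state);

        if (used == (size_t)-1 || used == (size_t)-2) {
            return (size_t)-1;          /* invalid or truncated input */
        }
        if (used == 0) {
            break;                      /* reached a null character */
        }
        s += used;                      /* skip the bytes just consumed */
        remaining -= used;
        count++;
    }
    return count;
}
```

In a single-byte code set the result equals strlen(s); in a multibyte code set it can be 
smaller, because one character may occupy several bytes.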
character class. A named set of characters sharing an attribute associated with the name 
of the class.
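A character class can be looked up by name with the standard wctype() and iswctype() 
functions. The wrapper below is a minimal sketch; the name in_character_class is 
hypothetical, and which characters belong to a class depends on the current locale.

```c
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: tests whether the wide character wc belongs to the
 * character class called class_name (for example "alpha" or "digit").
 * Returns 0 if the class name is not valid in the current locale. */
int in_character_class(wint_t wc, const char *class_name)
{
    wctype_t desc = wctype(class_name);    /* look up the class by name */

    if (desc == (wctype_t)0) {
        return 0;                           /* unknown class name */
    }
    return iswctype(wc, desc) != 0;
}
```

For instance, in_character_class(L'a', "alpha") is nonzero in the C locale, while 
in_character_class(L'1', "alpha") is zero.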
character encoding. A method in which each member of a character set is mapped to a 
specific numeric code value.










