Software Internationalization Guide
Software Characteristics That Vary by Locale
Software Internationalization Guide—526225-002
two-byte and four-byte character data types. Some character types are not defined as 
single or multiple byte, but can support various combinations.
Single-Byte Characters
A single-byte character data type uses eight bits to represent each character. 
ISO 8859 characters are single-byte characters. A single byte can represent up to 256 
distinct characters.
Multibyte Characters
A multibyte character is a coded character that occupies one or more bytes in a data 
stream. A single stream can include characters of varying widths, as shown in 
Figure 2-9. Multibyte characters are typically encoded according to a defined code set. 
For example, a multibyte data stream can contain single-byte ASCII characters as well as 
multibyte ideographic characters. In most situations, a mechanism such as shift-in and 
shift-out sequences is needed to mark the boundary between single-byte and multibyte 
characters.
Multibyte characters are used for file codes, which are the external representations of 
data. A file code is the format of data that is stored on disk. See File Codes and 
Process Codes on page 2-11 for more information.
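The boundary problem described above can be sketched in C. The example below is only 
an illustration: it assumes the stream is encoded in UTF-8, which this guide does not 
name, because in UTF-8 the first byte of each character announces its width and no 
shift sequences are needed. The function names utf8_width and split_stream are 
hypothetical, and the code does not validate continuation bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Width in bytes of the UTF-8 character that starts with `lead`,
   or 0 if `lead` cannot start a character. Assumes UTF-8 input. */
size_t utf8_width(unsigned char lead)
{
    if (lead < 0x80) return 1;            /* 0xxxxxxx: single-byte ASCII */
    if ((lead & 0xE0) == 0xC0) return 2;  /* 110xxxxx: 2-byte character  */
    if ((lead & 0xF0) == 0xE0) return 3;  /* 1110xxxx: 3-byte character  */
    if ((lead & 0xF8) == 0xF0) return 4;  /* 11110xxx: 4-byte character  */
    return 0;                             /* continuation or invalid byte */
}

/* Walks the first `len` bytes of `buf` and stores the width of each
   character in `widths` (capacity `max`). Returns the number of
   characters found, or (size_t)-1 if a lead byte is invalid, the last
   character is truncated, or `widths` is too small.
   Sketch only: continuation bytes are not checked. */
size_t split_stream(const unsigned char *buf, size_t len,
                    size_t *widths, size_t max)
{
    size_t pos = 0, count = 0;

    assert(buf != NULL && widths != NULL);
    while (pos < len) {
        size_t w = utf8_width(buf[pos]);
        if (w == 0 || w > len - pos || count == max)
            return (size_t)-1;
        widths[count++] = w;
        pos += w;                         /* jump to the next boundary */
    }
    return count;
}
```

For a 12-byte stream that holds U+1F600, U+00E9, 'A', U+1F600, and 'B' in UTF-8, 
split_stream reports five characters with widths 4, 2, 1, 4, and 1, the same layout 
that Figure 2-9 depicts. A stateful code set that relies on shift-in and shift-out 
sequences would need to track the current shift state instead of inspecting lead bytes.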
Wide Characters
A wide character is a fixed-width character wide enough to hold any coded character 
supported by an implementation. A wide character is an object of type wchar_t, a type 
definition included in ISO C to support internationalization. 
Wide characters promote code-set independence by removing dependencies on 
specific code sets or encoding methods, and replacing them with general functions that 
can process any encoding. The wide character data type provides flexibility because it 
can store any character up to the widest one in the supported code set. 
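As a minimal sketch of this idea, the following C fragment converts a multibyte string 
to wide characters with the ISO C function mbstowcs, after which each character 
occupies exactly one wchar_t element. The wrapper name to_wide is hypothetical and is 
not part of any library; the code set used for the conversion is whatever the current 
locale selects.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Converts the multibyte string `mb` into the wide character buffer
   `out`, which has room for `cap` elements including the terminator.
   Returns the number of wide characters stored (excluding the
   terminator), or (size_t)-1 if `mb` contains a byte sequence that is
   invalid in the current locale. Input that does not fit is truncated. */
size_t to_wide(const char *mb, wchar_t *out, size_t cap)
{
    size_t n;

    assert(mb != NULL && out != NULL && cap > 0);
    n = mbstowcs(out, mb, cap - 1);  /* leave room for the terminator */
    if (n == (size_t)-1)
        return n;                    /* invalid multibyte sequence */
    out[n] = L'\0';                  /* mbstowcs may stop without one */
    return n;
}
```

After the conversion, out[i] is always the i-th character regardless of how many bytes 
that character used in the multibyte form, and each element occupies sizeof(wchar_t) 
bytes. A program would normally call setlocale(LC_ALL, "") from <locale.h> first so 
that mbstowcs interprets the input in the user's code set; without that call, the 
default "C" locale applies.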
All wide characters in a single data stream are the same size. The size is defined by 
the implementation and is set at compile time. The most common wide character sizes 
are 1, 2, or 4 bytes (8, 16, or 32 bits). For example, if wchar_t is defined as 4 bytes, 
all wide character data is processed in 4-byte groups, including all characters from 
Figure 2-9. Multibyte Character Data Stream
[Figure: a data stream of twelve 1-byte units, with ellipses at each end, grouped into five characters that are 4, 2, 1, 4, and 1 bytes wide.]










