C/C++ Programmer's Guide (G06.27+, H06.08+, J06.03+)

returns the identifier of the default character set. The Guardian Procedure Calls Reference Manual
describes this system procedure in detail.
The internal representation of the characters of these languages is internal to HP and might not
conform to any ISO standard. HP can change this internal representation at any time.
Multibyte Characters
The basic difficulty in an Asian environment is the huge number of ideograms, such as Chinese
characters, that are needed for I/O. To work within the constraints of usual computer
architectures, these ideograms are encoded as sequences of bytes. The associated operating
systems, application programs, and terminals understand these byte sequences as individual
ideograms. Moreover, all these encodings allow intermixing of regular single-byte C characters
with the ideogram byte sequences.
The term “multibyte character” denotes a byte sequence that encodes an ideogram. The byte
sequence contains one or more codes where each code can be represented in a C character
data type: char, signed char, or unsigned char. All multibyte characters are members of the
so-called extended character set. A regular single-byte C character is just a special case of a
multibyte sequence where the sequence has a length of one.
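As a minimal sketch of what "a byte sequence that encodes a character" means in practice, the following counts characters rather than bytes by stepping through a string with the standard mblen function. The helper name count_mb_chars is hypothetical and is not part of the HP library. The result depends on the current LC_CTYPE locale; in the default "C" locale every character is one byte long, so the count equals the byte length.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: count the characters (not bytes) in a
 * multibyte string. Returns -1 if an invalid sequence is found. */
long count_mb_chars(const char *s)
{
    size_t remaining;
    long count = 0;

    assert(s != NULL);
    remaining = strlen(s);
    (void)mblen(NULL, 0);            /* reset any shift state */

    while (remaining > 0) {
        /* mblen reports how many bytes form the next character */
        int len = mblen(s, remaining);
        if (len <= 0)                /* invalid or incomplete sequence */
            return -1;
        s += len;
        remaining -= (size_t)len;
        count++;
    }
    return count;
}
```

A single-byte C character is handled by the same loop: mblen simply reports a length of one.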
Wide Characters
Some of the inconvenience of handling multibyte characters is eliminated if all characters have
a uniform size. A 16-bit integer value is used to represent all members of the extended character
set because there can be thousands or tens of thousands of ideograms in an Asian character set.
Wide characters are integers of type wchar_t, defined in the headers stddef.h and
stdlib.h as:
typedef unsigned short wchar_t;
Such an integer can represent distinct codes for each of the characters in the extended character
set. The codes for the basic C character set have the same values as their single-character
forms.
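The statement that basic characters keep their values as wide characters can be checked with a short sketch such as the one below. The function names same_code and basic_codes_match are hypothetical, and the sample characters are chosen arbitrarily. The code does not assume any particular width for wchar_t, so it is not specific to the 16-bit definition shown above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: nonzero if the wide code wc equals the
 * single-byte code of c. */
int same_code(char c, wchar_t wc)
{
    return (wchar_t)(unsigned char)c == wc;
}

/* Compare a few basic characters with their wide counterparts.
 * Returns 1 if every pair has the same code, otherwise 0. */
int basic_codes_match(void)
{
    static const char    narrow[] = "Az09 ";
    static const wchar_t wide[]   = L"Az09 ";
    size_t i;

    /* both arrays hold the same number of elements */
    assert(sizeof narrow == sizeof wide / sizeof wide[0]);

    for (i = 0; narrow[i] != '\0'; i++)
        if (!same_code(narrow[i], wide[i]))
            return 0;
    return 1;
}
```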
Relationship Between Multibyte and Wide Characters
Multibyte characters are convenient for communicating between the program and the outside
world.
Wide characters are convenient for manipulating text within a program.
The fixed size of wide characters simplifies handling both individual characters and arrays
of characters.
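One way to picture this division of labor is sketched below: text arrives as a multibyte string, is converted to wide characters with the standard mbstowcs function, is manipulated by simple array indexing, and is converted back with wcstombs for output. The function reverse_text and the limit MAX_CHARS are illustrative assumptions, not part of the HP library, and the conversions depend on the current LC_CTYPE locale.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CHARS 64   /* arbitrary limit chosen for this sketch */

/* Hypothetical helper: reverse the characters of the multibyte
 * string in, writing a multibyte result to out.
 * Returns 0 on success, -1 on failure. */
int reverse_text(const char *in, char *out, size_t out_size)
{
    wchar_t wide[MAX_CHARS];
    size_t n, i, written;

    assert(in != NULL && out != NULL);

    /* A multibyte string never holds more characters than bytes,
     * so this check guarantees that wide[] is large enough. */
    if (strlen(in) >= MAX_CHARS)
        return -1;

    /* outside world -> program: multibyte to wide */
    n = mbstowcs(wide, in, MAX_CHARS);
    if (n == (size_t)-1)
        return -1;

    /* fixed-size elements make character-level work plain indexing */
    for (i = 0; i < n / 2; i++) {
        wchar_t tmp = wide[i];
        wide[i] = wide[n - 1 - i];
        wide[n - 1 - i] = tmp;
    }

    /* program -> outside world: wide to multibyte */
    written = wcstombs(out, wide, out_size);
    if (written == (size_t)-1 || written >= out_size)
        return -1;   /* invalid character, or no room for the terminator */
    return 0;
}
```

Reversing the multibyte string byte by byte would scramble any character longer than one byte, which is why the manipulation is done on the wide form.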