C/C++ Programmer's Guide (G06.27+, H06.03+)

HP C Implementation-Defined Behavior
HP C/C++ Programmer’s Guide for NonStop Systems, 429301-010
G.5 Common Extensions
multibyte characters are members of the so-called extended character set. A
regular single-byte C character is just a special case of a multibyte sequence
where the sequence has a length of one.
Wide Characters
Much of the inconvenience of handling multibyte characters disappears if every
character occupies the same number of bytes. A 16-bit integer value is used to
represent every member of the extended character set, because an Asian character
set can contain thousands or tens of thousands of ideograms.
Wide characters are integers of type wchar_t, defined in the headers stddef.h
and stdlib.h as:
typedef unsigned short wchar_t;
Such an integer can represent distinct codes for each of the characters in the
extended character set. The codes for the basic C character set have the same
values as their single-character forms.
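The equality between basic-character codes and their wide forms can be checked directly. The following is a minimal sketch, not part of the original guide; the helper name basic_codes_match is hypothetical.

```c
#include <assert.h>
#include <stddef.h>   /* wchar_t, size_t */

/* Hypothetical helper: compare the first n characters of a narrow
   string with the first n elements of a wide string.
   Returns 1 if every pair has the same code value, otherwise 0.
   Intended only for characters of the basic C character set. */
int basic_codes_match(const char *narrow, const wchar_t *wide, size_t n)
{
    size_t i;

    assert(narrow != NULL && wide != NULL);
    for (i = 0; i < n; i++) {
        if ((wchar_t)narrow[i] != wide[i])
            return 0;
    }
    return 1;
}
```

For example, basic_codes_match("Az09", L"Az09", 4) is expected to return 1, because each basic character keeps its single-character value when widened.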
Relationship Between Multibyte and Wide Characters
Multibyte characters are convenient for communicating between the program and
the outside world.
Wide characters are convenient for manipulating text within a program.
The fixed size of wide characters simplifies handling both individual characters and
arrays of characters.
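One way to picture this division of labor is a routine that accepts multibyte text, manipulates it in wide form, and returns multibyte text. The sketch below reverses the characters of a string; the function name reverse_chars and the limit MAX_CHARS are assumptions made for illustration, and the code assumes a locale without shift states.

```c
#include <assert.h>
#include <stdlib.h>   /* mbstowcs, wcstombs */

#define MAX_CHARS 64  /* hypothetical limit for this sketch */

/* Reverse the characters (not the bytes) of the multibyte string src.
   At most dst_size bytes, including the terminator, are written to dst.
   Returns 0 on success, -1 on a conversion error or lack of space. */
int reverse_chars(const char *src, char *dst, size_t dst_size)
{
    wchar_t wide[MAX_CHARS + 1];
    size_t n, i, written;

    assert(src != NULL && dst != NULL);

    /* External (multibyte) form to internal (wide) form. */
    n = mbstowcs(wide, src, MAX_CHARS + 1);
    if (n == (size_t)-1 || n > MAX_CHARS)
        return -1;

    /* Every element has the same size, so indexing is safe. */
    for (i = 0; i < n / 2; i++) {
        wchar_t tmp = wide[i];
        wide[i] = wide[n - 1 - i];
        wide[n - 1 - i] = tmp;
    }

    /* Internal (wide) form back to external (multibyte) form. */
    written = wcstombs(dst, wide, dst_size);
    if (written == (size_t)-1 || written >= dst_size)
        return -1;
    return 0;
}
```

Reversing the bytes of a multibyte string directly could split a two-byte character; working on the wide array avoids that problem.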
MB_CUR_MAX Macro
The MB_CUR_MAX macro specifies the maximum number of bytes used in
representing a multibyte character in the current locale (category LC_CTYPE). The
MB_CUR_MAX macro is defined in the header STDLIBH as:
#define MB_CUR_MAX 2
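A typical use of MB_CUR_MAX is sizing a buffer for the worst case before converting wide characters to multibyte form. The following sketch assumes nothing beyond the standard library; the names mb_buffer_size and wide_to_mb_alloc are hypothetical. It uses the macro rather than the literal value 2 so that it does not depend on one implementation.

```c
#include <assert.h>
#include <stdlib.h>   /* MB_CUR_MAX, malloc, free, wcstombs */

/* Worst-case byte count for n_chars multibyte characters plus the
   terminating null byte in the current locale. */
size_t mb_buffer_size(size_t n_chars)
{
    return n_chars * (size_t)MB_CUR_MAX + 1;
}

/* Convert a null-terminated wide string of n_chars characters to a
   newly allocated multibyte string. The caller frees the result.
   Returns NULL on allocation failure or conversion error. */
char *wide_to_mb_alloc(const wchar_t *wide, size_t n_chars)
{
    size_t size = mb_buffer_size(n_chars);
    size_t written;
    char *buf;

    assert(wide != NULL);
    buf = malloc(size);
    if (buf == NULL)
        return NULL;

    written = wcstombs(buf, wide, size);
    if (written == (size_t)-1 || written >= size) {
        free(buf);   /* invalid character or unterminated result */
        return NULL;
    }
    return buf;
}
```

With MB_CUR_MAX defined as 2, mb_buffer_size(3) would yield 7 bytes: two per character plus one for the terminator.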
Conversion Functions
The run-time library functions that manage multibyte characters and wide
characters are:
Function      Description
mblen()       Determines the length of a multibyte character.
mbtowc()      Converts a multibyte character to a wide character.
wctomb()      Converts a wide character to a multibyte character.
mbstowcs()    Converts a string of multibyte characters to a string of wide characters.
wcstombs()    Converts a string of wide characters to a string of multibyte characters.
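The single-character functions can be combined as in the following sketch, which counts the characters in a multibyte string with mblen() and round-trips one character through mbtowc() and wctomb(). The function names count_chars and round_trip are hypothetical, and error handling is kept to a minimum.

```c
#include <assert.h>
#include <limits.h>   /* MB_LEN_MAX */
#include <stdlib.h>   /* mblen, mbtowc, wctomb */
#include <string.h>   /* strlen */

/* Count the characters (not bytes) in the multibyte string s.
   Returns -1 if an invalid sequence is found. */
long count_chars(const char *s)
{
    size_t left;
    long count = 0;

    assert(s != NULL);
    left = strlen(s);
    (void)mblen(NULL, 0);            /* reset any shift state */
    while (left > 0) {
        int len = mblen(s, left);
        if (len <= 0)                /* invalid or unexpected null */
            return -1;
        s += len;
        left -= (size_t)len;
        count++;
    }
    return count;
}

/* Convert the first character of in to wide form and back again.
   out must have room for MB_LEN_MAX bytes.
   Returns the number of bytes written to out, or -1 on error. */
int round_trip(const char *in, char *out)
{
    wchar_t wc;
    int len;

    assert(in != NULL && out != NULL);
    (void)mbtowc(NULL, NULL, 0);     /* reset shift states */
    (void)wctomb(NULL, 0);
    len = mbtowc(&wc, in, strlen(in) + 1);
    if (len < 0)
        return -1;
    return wctomb(out, wc);
}
```

For whole strings, mbstowcs() and wcstombs() do the same work in a single call; the per-character functions are useful when the text is processed incrementally.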