Standard C++ Library Reference ISO/IEC (VERSION3)
You can write multibyte characters in C source text as part of a comment, a character constant,
a string literal, or a filename in an include directive. How such characters print is
implementation defined. Each sequence of multibyte characters that you write must begin and
end in the initial shift state. The program can also include multibyte characters in
null-terminated C strings used by several library functions, including the format strings for
printf and scanf. Each such character string must begin and end in the initial shift state.
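The shift-state requirement can be checked at run time. The following is a minimal sketch, assuming the string is encoded for the current locale; the helper name ends_in_initial_shift_state is hypothetical and not part of the library. It scans the bytes before the terminating null with mbrtowc and then asks mbsinit whether the conversion state is back to the initial shift state.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (not a library function): returns 1 if s is a valid
   multibyte string in the current locale that ends in the initial shift
   state, and 0 otherwise. */
int ends_in_initial_shift_state(const char *s)
{
    mbstate_t st;
    size_t len, n;

    assert(s != NULL);
    memset(&st, 0, sizeof st);      /* a zeroed mbstate_t is the initial state */
    len = strlen(s);                /* scan only the bytes before the null */

    while (len > 0) {
        n = mbrtowc(NULL, s, len, &st);
        if (n == (size_t)-1 || n == (size_t)-2)
            return 0;               /* invalid or incomplete sequence */
        if (n == 0)
            break;                  /* defensive: unexpected null character */
        s += n;
        len -= n;
    }
    return mbsinit(&st) != 0;       /* nonzero means initial shift state */
}
```

In encodings that have no shift states, such as the one used by the "C" locale, the helper reports 1 for every valid string.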
Wide-Character Encoding
Each character in the extended character set also has an integer representation, called a
wide-character encoding. Each extended character has a unique wide-character value. The
value zero always corresponds to the null wide character. The type definition wchar_t
specifies the integer type that represents wide characters.
You write a wide-character constant as L'mbc', where mbc represents a single multibyte
character. You write a wide-character string literal as L"mbs", where mbs represents a
sequence of zero or more multibyte characters. The wide-character string literal L"xyz"
becomes an array of type wchar_t whose successive elements hold the wide-character
constants, followed by a null wide character:
{L'x', L'y', L'z', L'\0'}
The following library functions help you convert between the multibyte and wide-character
representations of extended characters: btowc, mblen, mbrlen, mbrtowc, mbsrtowcs,
mbstowcs, mbtowc, wcrtomb, wcsrtombs, wcstombs, wctob, and wctomb.
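As an illustration of two of these functions, the sketch below converts a multibyte string to wide characters with mbstowcs and back again with wcstombs. The helper name mb_roundtrip and the fixed 64-element scratch buffer are assumptions made for the example; real code would size the buffer to its input.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

#define WIDE_CAPACITY 64   /* assumed scratch size for this sketch */

/* Hypothetical helper: converts src to wide characters and back into dst,
   which has room for dst_size bytes. Returns the number of bytes stored,
   not counting the terminating null, or (size_t)-1 on failure. */
size_t mb_roundtrip(const char *src, char *dst, size_t dst_size)
{
    wchar_t wide[WIDE_CAPACITY];
    size_t nwide, nbytes;

    assert(src != NULL && dst != NULL);

    nwide = mbstowcs(wide, src, WIDE_CAPACITY);
    if (nwide == (size_t)-1 || nwide == WIDE_CAPACITY)
        return (size_t)-1;          /* invalid input, or no room for the null */

    nbytes = wcstombs(dst, wide, dst_size);
    if (nbytes == (size_t)-1 || nbytes == dst_size)
        return (size_t)-1;          /* unconvertible, or no room for the null */

    return nbytes;
}
```

Both mbstowcs and wcstombs return the count they were given when the output fills up before a terminator is stored, which is why the helper treats that case as a failure.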
The macro MB_LEN_MAX specifies the length of the longest possible multibyte sequence
required to represent a single character in any locale that the implementation supports.
The macro MB_CUR_MAX specifies the length of the longest possible multibyte sequence
required to represent a single character defined for the current locale.
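A common use of these macros is sizing a buffer for a single converted character. The sketch below assumes a caller-supplied buffer of MB_LEN_MAX bytes, which is large enough in any locale; the helper name encode_one is hypothetical. MB_LEN_MAX comes from <limits.h> and MB_CUR_MAX from <stdlib.h>.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: stores the multibyte form of wc in buf, which must
   have room for at least MB_LEN_MAX bytes. Returns the number of bytes
   stored, or -1 if wc has no multibyte representation in the current locale. */
int encode_one(wchar_t wc, char *buf)
{
    assert(buf != NULL);
    /* The current locale never needs more bytes than the global maximum. */
    assert(MB_CUR_MAX <= (size_t)MB_LEN_MAX);
    return wctomb(buf, wc);
}
```

Because MB_CUR_MAX can change when the locale changes, it is not necessarily a compile-time constant, so the fixed-size buffer is declared with MB_LEN_MAX instead.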
To compare the two forms, the string literal "hello" becomes an array of six char:
{'h', 'e', 'l', 'l', 'o', 0}
while the wide-character string literal L"hello" becomes an array of six integers of type
wchar_t:
{L'h', L'e', L'l', L'l', L'o', 0}
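The two layouts above can be confirmed with sizeof. The following is a small sketch; the array and function names are chosen only for illustration. Both arrays have six elements, but the wide array occupies six times sizeof(wchar_t) bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

static const char    narrow_hello[] = "hello";
static const wchar_t wide_hello[]   = L"hello";

/* Number of elements in each array, including the terminator. */
size_t narrow_count(void) { return sizeof narrow_hello / sizeof narrow_hello[0]; }
size_t wide_count(void)   { return sizeof wide_hello / sizeof wide_hello[0]; }

/* Storage occupied by the wide array, in bytes. */
size_t wide_bytes(void)   { return sizeof wide_hello; }

/* Both arrays end with a null element of their own type. */
void check_terminators(void)
{
    assert(narrow_hello[5] == '\0');
    assert(wide_hello[5] == L'\0');
}
```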
Copyright © 1989-2001 by P.J. Plauger and Jim Brodie. All rights reserved.