C/C++ Programmer's Guide (G06.27+, H06.03+)

HP C Implementation-Defined Behavior
HP C/C++ Programmer’s Guide for NonStop Systems, 429301-010
No characters have been added to the execution set required by the ISO/ANSI C
standard.
The direction of printing is left to right, and top to bottom.
The decimal point character is a period (.).
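As a minimal sketch, a program can confirm the decimal-point character at run time through the standard localeconv() function instead of hard-coding it. The helper name decimal_point_is_period is hypothetical, and the sketch assumes the program has not called setlocale() to leave the default "C" locale.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the current locale's decimal-point
   string is a single period, and 0 otherwise. */
int decimal_point_is_period(void)
{
    const struct lconv *lc = localeconv();

    assert(lc != NULL);
    return strcmp(lc->decimal_point, ".") == 0;
}
```

A caller might check decimal_point_is_period() once at startup before parsing or formatting numeric text by hand.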
G.5 Common Extensions
There are no common extensions to the formats for time and date.
Multibyte Characters and Wide Characters
Multibyte characters and wide characters support Asian alphabets that often contain a
very large number of characters. The Guardian native C run-time library functions,
except for the strcoll() and strxfrm() functions, support these character sets:
Tandem Kanji, Chinese Big 5, Chinese PC, Hangul and KSC5601.
This discussion of multibyte characters applies only to the Guardian environment. For more
details on multibyte characters in the Open System Services (OSS) environment, see
the Software Internationalization Manual.
The Guardian native C run-time library functions mblen(), mbtowc(), mbstowcs(),
wctomb(), and wcstombs() do not support multibyte characters for programs that use
the 32-bit (or wide) data model as described in this section. Guardian programs that
use the 32-bit data model must use the Guardian system procedures that support
multibyte characters instead. For more details, see the Guardian Programmer’s Guide.
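For programs where the standard functions do apply, the following sketch shows one common way to step through a multibyte string with mblen(), counting characters rather than bytes. The function name mb_char_count is hypothetical. The sketch assumes a stateless encoding and uses only ISO C library calls; it does not show the Guardian system procedures that programs using the 32-bit data model need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts the characters (not bytes) in a
   null-terminated multibyte string, as interpreted by the current
   locale. Returns (size_t)-1 if an invalid sequence is found. */
size_t mb_char_count(const char *s)
{
    const char *end;
    size_t count = 0;

    assert(s != NULL);
    end = s + strlen(s);

    (void)mblen(NULL, 0);               /* reset any internal shift state */
    while (s < end) {
        /* Examine no more bytes than remain in the string. */
        int len = mblen(s, (size_t)(end - s));

        if (len <= 0)                   /* invalid or incomplete sequence */
            return (size_t)-1;
        s += len;                       /* skip the whole character */
        count++;
    }
    return count;
}
```

In the default "C" locale every character occupies one byte, so mb_char_count("abc") returns 3. With a multibyte locale selected, the same call advances by each character's full byte length.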
The default character set supported by a system is configured at system installation
time and cannot be changed during program execution. The Guardian procedure
MBCS_DEFAULTCHARSET_ returns the identifier of the default character set. The
Guardian Procedure Calls Reference Manual describes this system procedure in
detail.
The internal representation of the characters of these languages is HP internal and
might not conform to any ISO standard. HP can choose to change this internal
representation at any time.
Multibyte Characters
The basic difficulty in an Asian environment is the huge number of ideograms, such
as Chinese characters, that are needed for I/O. To work within the constraints
of usual computer architectures, these ideograms are encoded as sequences of
bytes. The associated operating systems, application programs, and terminals
understand these byte sequences as individual ideograms. Moreover, all these
encodings allow intermixing of regular single-byte C characters with the ideogram
byte sequences.
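The intermixing described above can be illustrated with a toy scanner. This is a sketch only: the lead-byte range 0x81 through 0xFE, the fixed two-byte width, and the names is_lead_byte and count_mixed_chars are all hypothetical, chosen for illustration. They do not describe the HP internal representation of any supported character set.

```c
#include <assert.h>
#include <stddef.h>

/* HYPOTHETICAL rule, for illustration only: a byte in the range
   0x81..0xFE starts a two-byte ideogram; any other byte is a
   single-byte character. */
static int is_lead_byte(unsigned char b)
{
    return b >= 0x81 && b <= 0xFE;
}

/* Counts the characters in a buffer of n bytes that mixes single-byte
   characters with two-byte sequences. Returns (size_t)-1 if the buffer
   ends in the middle of a two-byte sequence. */
size_t count_mixed_chars(const unsigned char *buf, size_t n)
{
    size_t i = 0;
    size_t count = 0;

    assert(buf != NULL);
    while (i < n) {
        size_t width = is_lead_byte(buf[i]) ? 2 : 1;

        if (width > n - i)              /* truncated sequence */
            return (size_t)-1;
        i += width;                     /* step over the whole character */
        count++;
    }
    return count;
}
```

Under this assumed rule, the four bytes 'A', 0xA4, 0x40, 'B' hold three characters: two single-byte characters around one two-byte sequence. The second byte of the sequence (0x40) would be an ordinary character on its own, which is why code must scan from the start of the data instead of inspecting bytes in isolation.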
The term “multibyte character” denotes a byte sequence that encodes an
ideogram. The byte sequence contains one or more codes where each code can
be represented in a C character data type: char, signed char, or unsigned char. All