Guardian Programmer's Guide

Formatting and Manipulating Character Data
Guardian Programmer’s Guide 421922-014
Dealing With Fragments of Multibyte Characters
If a read operation on a text string that contains multibyte characters ends because
the specified read count is satisfied, you cannot be sure whether the last byte read is
the last byte of a character or the first byte of a multibyte character. If it is the first
byte of a multibyte character, its meaning is lost without the trailing byte. You should
therefore call the MBCS_TRIMFRAGMENT_ procedure, which checks the last byte read
and removes it from the string if it is the first byte of a multibyte character.
To use the MBCS_TRIMFRAGMENT_ procedure, you must supply it with a pointer to
the text string and the length of the text string in bytes. For example:
     INT BUFFER[0:79];                     !input buffer
     STRING .SBUFFER := @BUFFER '<<' 1;    !byte pointer to input
                                           ! buffer
     .
     .
     !FILE^NUM is an assumed name for the file number of the
     ! open input file
     CALL READ(FILE^NUM,BUFFER,RCOUNT,BYTES^READ);
     IF <> THEN CALL DEBUG;
     IF BYTES^READ = RCOUNT THEN
        CALL MBCS_TRIMFRAGMENT_(@SBUFFER,
                                BYTES^READ);
On return, the bytes-read parameter specifies the number of bytes in the text string
after the multibyte fragment is removed.
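The guide does not describe how MBCS_TRIMFRAGMENT_ works internally. As a rough
illustration only, the following Python sketch shows one way the trimming logic could
look. It assumes a Shift-JIS-style two-byte encoding in which lead bytes fall in the
ranges %H81 through %H9F and %HE0 through %HFC; those ranges and the names
is_lead_byte and trim_fragment are assumptions made for this sketch and do not come
from the procedure itself.

```python
def is_lead_byte(b: int) -> bool:
    """Assumed Shift-JIS-style lead-byte ranges; not taken from the guide."""
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def trim_fragment(data: bytes) -> int:
    """Return the length of data after dropping a dangling lead byte, if any."""
    i = 0
    while i < len(data):
        # A lead byte starts a two-byte character; anything else is one byte.
        step = 2 if is_lead_byte(data[i]) else 1
        if i + step > len(data):
            # The last byte starts a character whose trailing byte is missing.
            return i
        i += step
    return len(data)
```

The sketch walks the string from the start rather than testing only the final byte,
because under the assumed encoding a trailing byte can have the same value as a lead
byte. For example, trim_fragment(b"AB\x82") gives 2, while trim_fragment(b"\x82\x82")
gives 2 because the final byte is the second half of a complete character.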
Handling Multibyte Blank Characters
Many applications expect an ASCII blank character (%H20) as a word delimiter in text
strings. Multibyte character sets typically use a multibyte character to represent a
blank. Some conversion therefore needs to be done if an application written for
standard ASCII input is to work for multibyte character sets. This conversion is done
using the MBCS_REPLACEBLANK_ procedure.
To use the MBCS_REPLACEBLANK_ procedure, you must supply it with a pointer to
the text string to be converted and the length of the text string as follows:
     CALL MBCS_REPLACEBLANK_(@SBUFFER,
                             BYTES^READ);
On return, the text buffer contains the same text as the input, except that each
multibyte blank character is replaced by a pair of ASCII blanks. An application that
expects ASCII blank characters can now process the text string correctly. Because
each multibyte blank is replaced by two ASCII blanks, the text string keeps the same
length and its structure is preserved.
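To make the length-preserving replacement concrete, the following Python sketch
mimics the effect on a byte string. It is not the implementation of
MBCS_REPLACEBLANK_. It assumes a Shift-JIS-style encoding whose two-byte blank is
%H81 %H40 and whose lead bytes fall in the ranges %H81 through %H9F and %HE0
through %HFC; the names MB_BLANK, is_lead_byte, and replace_blank are hypothetical.

```python
MB_BLANK = b"\x81\x40"  # assumed two-byte blank (Shift-JIS ideographic space)


def is_lead_byte(b: int) -> bool:
    """Assumed Shift-JIS-style lead-byte ranges; not taken from the guide."""
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def replace_blank(data: bytes) -> bytes:
    """Replace each two-byte blank with two ASCII blanks, keeping the length."""
    out = bytearray(data)
    i = 0
    while i < len(out):
        if is_lead_byte(out[i]) and i + 1 < len(out):
            # Compare only on character boundaries, never across two characters.
            if out[i:i + 2] == MB_BLANK:
                out[i:i + 2] = b"  "
            i += 2
        else:
            i += 1
    return bytes(out)
```

Because two bytes are always replaced by two bytes, byte offsets computed before the
call remain valid afterward. The sketch steps through the string one character at a
time so that a trailing byte followed by an unrelated byte is never mistaken for a
blank.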
Determining the Character Size of a Multibyte Character Set
All currently supported multibyte character sets have two bytes per character. To
prepare your programs for future expansion, however, you may need to know the