Software Internationalization Guide
Software Characteristics That Vary by Locale
Software Internationalization Guide—526225-002
2-16
Stroke Count
One approach to collating ideographic characters is based on the number of strokes 
that make up the character. Characters containing fewer strokes sort first, followed by 
characters with more strokes.
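As a rough illustration of stroke-count collation, the following Python sketch sorts a few characters by number of strokes. The STROKE_COUNT table and the function names are hypothetical and cover only six characters. A real implementation would take its counts from a complete character database. Breaking ties by code point is an assumption made here only to keep the order deterministic.

```python
# Hypothetical stroke-count table for a handful of ideographs.
# A production system would load counts from a full character database.
STROKE_COUNT = {"一": 1, "二": 2, "人": 2, "三": 3, "山": 3, "木": 4}


def stroke_key(ch):
    # Primary key: number of strokes (fewer strokes sort first).
    # Secondary key: code point, an arbitrary tie-breaker chosen for this sketch.
    return (STROKE_COUNT[ch], ord(ch))


def sort_by_strokes(chars):
    # Return a new list ordered by the stroke-count key.
    return sorted(chars, key=stroke_key)


if __name__ == "__main__":
    print(sort_by_strokes(["木", "三", "一", "山", "人", "二"]))
```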
Radical Base
Ideographic characters can be collated using a scheme based on radicals, which are 
the root structure of ideographs.
Phonetics
Ideographic characters can also be collated by pronunciation. This is the most 
difficult approach to collating ideographs, because it is hard to show how one 
element relates to a neighboring element.
Other Collation Considerations
When collating characters, you can define characters that are given no weight. These 
characters are called “don’t-care characters.” If a hyphen is defined as a don’t-care 
character, for example, the words re-creation and recreation collate to the same 
position.
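The don't-care behavior can be sketched in Python as a sort key that drops weightless characters before comparison. The names DONT_CARE and collation_key are hypothetical, and treating only the hyphen as a don't-care character is an assumption taken from the example above.

```python
# Assumption: the hyphen is the only don't-care character.
DONT_CARE = {"-"}


def collation_key(word):
    # Build the comparison key by skipping characters that carry no weight.
    return "".join(ch for ch in word if ch not in DONT_CARE)


if __name__ == "__main__":
    # Both spellings produce the same key, so they collate to the same position.
    print(collation_key("re-creation") == collation_key("recreation"))
    print(sorted(["record", "re-creation", "react"], key=collation_key))
```

Because the key ignores the hyphen, a stable sort keeps the two spellings next to each other in whatever order they arrived.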
In n-to-one character mappings, a string of characters is treated as a single collating 
element. An example is ch in traditional Spanish collation, which is treated as one 
element that sorts between c and d. There are also one-to-n character mappings, where 
a single character is mapped to a string of collating elements. For example, the 
German character ß collates as ss.
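Both kinds of mapping can be sketched as Python sort keys. This is a simplified illustration, not a complete collation algorithm: the names ELEMENTS, RANK, spanish_key, and german_key are hypothetical, the Spanish key handles only the unaccented letters a to z plus ch, and the German key handles only the ß expansion.

```python
import string

# n-to-one: insert "ch" as a single collating element between c and d.
ELEMENTS = list(string.ascii_lowercase)
ELEMENTS.insert(ELEMENTS.index("d"), "ch")
RANK = {element: position for position, element in enumerate(ELEMENTS)}


def spanish_key(word):
    # Split the word into collating elements, consuming "ch" as one unit.
    # Assumption: input contains only the letters a-z (no accents, no ñ).
    word = word.lower()
    key = []
    i = 0
    while i < len(word):
        if word.startswith("ch", i):
            key.append(RANK["ch"])
            i += 2
        else:
            key.append(RANK[word[i]])
            i += 1
    return key


def german_key(word):
    # one-to-n: expand the single character ß to the two elements "ss".
    return word.lower().replace("ß", "ss")


if __name__ == "__main__":
    print(sorted(["dama", "chico", "cuna", "casa"], key=spanish_key))
    print(german_key("Straße") == german_key("Strasse"))
```

With a plain code-point sort, chico would fall between casa and cuna. With the key above it sorts after every word that begins with c followed by another letter.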
Numeric Representation
Date formats, time formats, and monetary figures are represented in many ways 
around the world. Internationalized software must therefore be flexible: it cannot 
rely on a single fixed representation for dates, times, and monetary values. 
Date Formats 
Date formats vary among countries and cultures. A date consists of the year, the 
month, and the day, but the order in which these elements are presented varies.
Table 2-6 on page 2-17 shows how some languages represent dates.
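As a minimal sketch of avoiding a fixed date representation, the following Python example keeps the date value separate from its presentation and selects a pattern by locale name. The DATE_PATTERNS table, the locale names, and the format_date function are illustrative assumptions. They are not taken from Table 2-6, and a real application would normally obtain the patterns from the platform's locale facilities.

```python
from datetime import date

# Illustrative patterns only; these entries are assumptions for this sketch.
DATE_PATTERNS = {
    "en_US": "%m/%d/%Y",  # month/day/year
    "de_DE": "%d.%m.%Y",  # day.month.year
    "ja_JP": "%Y/%m/%d",  # year/month/day
}


def format_date(d, locale_name):
    # Look up the presentation pattern at run time instead of hard-coding one.
    return d.strftime(DATE_PATTERNS[locale_name])


if __name__ == "__main__":
    sample = date(2024, 3, 5)
    for name in DATE_PATTERNS:
        print(name, format_date(sample, name))
```

The same date value is stored once. Only the pattern lookup changes from one locale to another.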










