The LC_COLLATE category definition in a locale source file establishes the relative order between collating elements in the locale that is compiled from that source by the LOCALDEF utility. The LC_COLLATE keywords establish a collation sequence that assigns each element one or more collation values.
The following keywords are recognized in a collation sequence definition:
copy
Specifies the name of an existing locale to be used as
the source for the definition of this category. If this keyword
is specified, no other keyword shall be present in this category.
If the locale is not found, an error is reported and no locale
output is created. The copy keyword cannot specify a locale that
also specifies the copy keyword for the same category.
collating-element
Defines a collating-element
symbol representing a multicharacter collating element. This
keyword is optional.
In addition to the collating elements in the character set, the collating-element keyword can be used to define multicharacter collating elements. The syntax is:
"collating-element %s from %s\n", <collating-element>, <string>
The <collating-element> should be a symbolic name enclosed between angle brackets (< and >), and should not duplicate any symbolic name in the current charmap file (if any), or any other symbolic name defined in this collation definition. The string operand is a string of two or more characters that collate as an entity. A <collating-element> defined with this keyword is only recognized within the LC_COLLATE category.
For example:
collating-element <ch> from "<c><h>"
collating-element <e-acute> from "<acute><e>"
collating-element <ll> from "ll"
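How a multicharacter collating element is recognized in a string can be sketched in Python. This is only an illustration: the function name split_collating_elements and the longest-match rule are assumptions made for the sketch, not behavior documented for the LOCALDEF utility or the C runtime.

```python
def split_collating_elements(text, multichar_elements):
    """Split text into collating elements (hypothetical helper).

    Assumption: when several multicharacter elements could start at the
    same position, the longest one wins.
    """
    # Try longer elements first so that "ch" is preferred over plain "c".
    candidates = sorted(multichar_elements, key=len, reverse=True)
    elements = []
    i = 0
    while i < len(text):
        for candidate in candidates:
            if text.startswith(candidate, i):
                elements.append(candidate)
                i += len(candidate)
                break
        else:
            # No multicharacter element starts here: take one character.
            elements.append(text[i])
            i += 1
    return elements


print(split_collating_elements("chill", ["ch", "ll"]))
```

With the elements <ch> and <ll> from the example above, the string "chill" is treated as three collating elements rather than five characters.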
collating-symbol
Defines a collating symbol for
use in collation order statements.
The collating-symbol keyword defines a symbolic name that can be associated with a relative position in the character order sequence. While such a symbolic name does not represent any collating element, it can be used as a weight. This keyword is optional.
This construct can define symbols for use in collation sequence statements, between the order_start and order_end keywords.
The syntax is:
"collating-symbol %s\n", <collating-symbol>
The <collating-symbol> must be a symbolic name, enclosed between angle brackets (< and >), and should not duplicate any symbolic name in the current charmap file (if any), or any other symbolic name defined in this collation definition. A <collating-symbol> defined with this keyword is only recognized within the LC_COLLATE category.
For example:
collating-symbol <UPPER_CASE>
collating-symbol <HIGH>
substitute
Defines a substring
substitution in a string to be collated. This keyword is
optional. The following operands are supported with the
substitute keyword:
"substitute %s with %s\n", <regular-expr>,<replacement>
The first operand is treated as a basic regular expression. The replacement operand consists of zero or more characters and regular expression back-references (for example, \1 through \9). The back-references consist of the backslash followed by a digit from 1 to 9. If the backslash is followed by two or three digits, it is interpreted as an octal constant.
When strings are collated according to a collation definition containing substitute statements, the collation behaves as if occurrences of substrings matching the basic regular expression are replaced by the replacement string before the strings are compared based on the specified collation sequence. Ranges in the regular expression are interpreted according to the current character collation sequence, and character classes are interpreted according to the character classification specified by the LC_CTYPE environment variable at collation time. If more than one substitute statement is present in the collation definition, the collation process behaves as if the substitute statements are applied to the strings in the order in which they occur in the source definition. Substitutions for the substitute statements are processed before any substitutions for one-to-many mappings.
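The "behaves as if" rule can be sketched in Python as follows. This is a rough model only: Python's re module uses its own regular expression syntax rather than POSIX basic regular expressions, and the SUBSTITUTIONS table and the apply_substitutions name are hypothetical, chosen purely for illustration.

```python
import re

# Hypothetical substitute statements, listed in source order.
# Each pair is (regular expression, replacement).
SUBSTITUTIONS = [
    (r"ß", "ss"),              # plain replacement string
    (r"^Mc(.)", r"Mac\1"),     # replacement with a back-reference
]


def apply_substitutions(text, substitutions=SUBSTITUTIONS):
    """Apply every substitution in the order it was defined."""
    for pattern, replacement in substitutions:
        text = re.sub(pattern, replacement, text)
    return text


# Strings are compared as if the substitutions had already been applied.
names = ["McKay", "Mabel", "Madsen"]
print(sorted(names, key=apply_substitutions))
```

Here "McKay" is compared as if it were spelled "MacKay", so it sorts between "Mabel" and "Madsen" instead of after both.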
The support of the substitute keyword is an IBM VisualAge for C++ extension to the POSIX standard.
order_start
Defines collating rules. This statement is followed by
one or more collation order statements, assigning character
collation values and collation weights to collating elements.
The order_start keyword must precede collation order entries. It defines the number of weights for this collation sequence definition and other collation rules.
The syntax of the order_start keyword is:
order_start <sort-rule1>;<sort-rule2>;...;<sort-rulen>
The operands of the order_start keyword are optional. If present, the operands define rules to be applied when strings are compared. The number of operands defines how many weights each element is assigned; if no operands are present, one forward operand is assumed. If any operands are present, the first operand defines the rules to be applied when comparing strings using the first (primary) weight, the second operand defines the rules for the second weight, and so on. Operands are separated by semicolons (;). Each operand consists of one or more collation directives separated by commas (,). If the number of operands exceeds the limit of 6, the LOCALDEF utility issues a warning message.
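The effect of assigning several weights to each element can be sketched in Python. The sketch assumes that every level is compared in the forward direction and that a later level is consulted only when all earlier levels compare equal; the WEIGHTS table and the collation_key name are hypothetical and do not come from any real locale.

```python
# Hypothetical two-level weights: (primary, secondary).
# Upper and lower case share a primary weight and differ only
# in the secondary weight.
WEIGHTS = {
    "a": (1, 1),
    "A": (1, 2),
    "b": (2, 1),
    "B": (2, 2),
}


def collation_key(text, levels=2):
    """Build one list of weights per level, most significant level first."""
    return tuple(
        [WEIGHTS[ch][level] for ch in text]
        for level in range(levels)
    )


words = ["b", "Ab", "ab", "B"]
print(sorted(words, key=collation_key))
```

With a single level (levels=1) the strings "ab" and "Ab" produce the same key and compare equal; the second level is what separates them.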
The order_start keyword supports the following directives:
order_end
Terminates the collating order
entries.
Example: LC_COLLATE Locale Category Definition