The charmap File

The charmap file maps the symbolic names of characters to the hexadecimal values that encode those characters in a given coded character set. Optionally, it can also provide alternate symbolic names for characters.

To make locale source files more generic for creating a locale, refer to characters by their symbolic names or alternate symbolic names. The resulting locale source files are independent of the encoding of the character set they represent.
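
The independence from encoding can be illustrated with a minimal sketch. The two dictionaries below are hypothetical, hand-written stand-ins for charmap files (they are not parsed from real charmap sources), the symbolic names `<A>`, `<a>`, and `<zero>` are assumed for illustration, and `encode_symbols` is an illustrative helper, not part of any locale tool.

```python
# Hypothetical, hand-written stand-ins for two charmap files: each maps a
# symbolic character name to its encoded value in one coded character set.
CHARMAP_IBM850 = {"<A>": 0x41, "<a>": 0x61, "<zero>": 0x30}
CHARMAP_IBM037 = {"<A>": 0xC1, "<a>": 0x81, "<zero>": 0xF0}  # an EBCDIC code set


def encode_symbols(symbols, charmap):
    """Resolve a sequence of symbolic names to bytes using one charmap."""
    return bytes(charmap[name] for name in symbols)


# The same encoding-independent sequence of symbolic names...
source = ["<A>", "<a>", "<zero>"]

# ...resolves to different byte values depending on the charmap used.
ibm850_bytes = encode_symbols(source, CHARMAP_IBM850)
ibm037_bytes = encode_symbols(source, CHARMAP_IBM037)
```

Because the source refers only to symbolic names, nothing in it has to change when a different charmap is supplied.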

Each charmap file must contain at least the definition of the portable character set and the character symbolic names associated with each character.

The charmap file consists of a number of optional symbol definitions, followed by two main sections:

  1. The character symbolic name to hexadecimal mapping section, or CHARMAP
  2. The character symbolic name to character set identifier section, or CHARSETID

The optional symbols are:

  • <code_set_name>
  • <mb_cur_max>
  • <mb_cur_min>
  • <escape_char>
  • <comment_char>

Each symbol definition consists of a symbol from the list above, starting in column 1, including the surrounding brackets, followed by one or more blanks, followed by the value to be assigned to the symbol.

    <code_set_name>
    The string literal containing the name of the coded character set.

    <mb_cur_max>
    The maximum number of bytes in a multibyte character, which can be 1 or 2. If it is 1, each character in the character set defined in this charmap is encoded by a one-byte value. If it is 2, each character in the character set defined in this charmap is encoded by a one- or two-byte value. If this symbol is not defined or a value other than 1 or 2 is given, the default value of 1 is assumed.

    <mb_cur_min>
    The minimum number of bytes in a multibyte character. It can be set to 1 only. If a value other than 1 is specified, a warning message is issued and the default value of 1 is assumed.

    <escape_char>
    Specifies the escape character that is used to specify hexadecimal or octal notation for numeric values. It defaults to the hexadecimal value 0x5C, which represents the \ character in the coded character set IBM-850.

    <comment_char>
    Denotes the character chosen to indicate a comment within a charmap file. It defaults to the hexadecimal value 0x23, which represents the # character in the coded character set IBM-850.
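
The rules for the optional symbol definitions can be summarized in a short sketch. This is not how any actual locale tool is implemented. `parse_symbol_definitions` and `_to_int` are hypothetical helper names, and the handling of double quotes around the `<code_set_name>` value is an assumption, since the text above does not say whether the string literal is quoted.

```python
import warnings

# Defaults as described above; <code_set_name> has no stated default.
DEFAULTS = {
    "<code_set_name>": None,
    "<mb_cur_max>": 1,
    "<mb_cur_min>": 1,
    "<escape_char>": "\\",  # 0x5C
    "<comment_char>": "#",  # 0x23
}


def _to_int(text):
    """Return text as an int, or None if it is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        return None


def parse_symbol_definitions(lines):
    """Collect optional symbol definitions from the head of a charmap file."""
    settings = dict(DEFAULTS)
    for line in lines:
        parts = line.split(None, 1)
        # Skip blank lines, symbols without a value, and symbols that do
        # not start in column 1.
        if len(parts) != 2 or not line.startswith(parts[0]):
            continue
        symbol, value = parts[0], parts[1].strip()
        if symbol not in DEFAULTS:
            continue
        if symbol == "<mb_cur_max>":
            number = _to_int(value)
            # Any value other than 1 or 2 falls back to the default of 1.
            settings[symbol] = number if number in (1, 2) else 1
        elif symbol == "<mb_cur_min>":
            if _to_int(value) != 1:
                warnings.warn("<mb_cur_min> must be 1; using default of 1")
            settings[symbol] = 1
        elif symbol == "<code_set_name>":
            # Assumption: surrounding double quotes, if any, are dropped.
            settings[symbol] = value.strip('"')
        else:
            settings[symbol] = value
    return settings
```

For example, passing the lines `<code_set_name> "IBM-850"` and `<mb_cur_max> 2` would yield a code set name of `IBM-850` and a maximum of 2 bytes per character, with the remaining symbols left at their defaults.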


