Liblouis User's and Programmer's Manual
2.3 Character-Definition Opcodes
These opcodes are needed to define attributes such as digit,
punctuation, letter, etc. for all characters and their dot patterns.
liblouis has no built-in character definitions, but such definitions
are essential to the operation of the context opcode (see context),
the correct opcode (see correct), the multipass opcodes and the
back-translator. If the dot pattern is a single cell, it is used to
define the mapping between dot patterns and characters, unless a
display opcode (see display) for that character-dot-pattern pair has
been used previously. If only a single-cell dot pattern has been given
for a character, that dot pattern is defined with the character’s own
attributes.
You may have multiple definitions of a character using the same or different dot patterns. If you use different dot patterns for the same character, only the first dot pattern will be used during forward translation. However, during back-translation, all the relevant dot patterns will back-translate to the character you defined.
You can also define a character multiple times using the same dot
pattern for the character, but using different character classes. The
following example defines the character ‘*’ (star) with both the math
opcode (see math) and the sign opcode (see sign).
math * 16
sign * 16
Likewise, you can define multiple characters as the same dot pattern. The characters you define this way will be forward translated to the same dot pattern. However, when back-translating, the dot pattern will always back-translate to the first character that was defined with this pattern.
This technique may be useful when defining characters that have one representation in the Windows character set (CP1252) and another representation in the Unicode character set, e.g. the Euro sign, ‘€’. It may also be of use when you have to define several variants of the same letter with different accents, which may be represented in your Braille code by the same dot pattern. This is a very common practice for accented letters that are foreign to the Braille code. In the following example, both e acute (‘é’) and e grave (‘è’) are defined as dot 4 followed by dots 1 and 5.
lowercase \x00e9 4-15 # E acute
lowercase \x00e8 4-15 # E grave
In this example, the dot pattern would always back-translate to e
acute, since this is the first definition. You could use the correct
opcode (see correct) to correct at least the most common errors on
that account. However, there is no fail-safe way to know what accented
letter to use when you back-translate from a dot pattern representing
more than one variant.
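The precedence rules described above can be illustrated with a small Python sketch. This is a toy model only, not liblouis code: the RULES list, the build_maps helper and the second pattern for ‘*’ (35) are made up for illustration. It keeps the first dot pattern seen for each character (forward translation) and the first character seen for each dot pattern (back-translation).

```python
# Toy model of the "first definition wins" behaviour described above.
# Not liblouis code: the rule list and helper name are illustrative only.

RULES = [
    ("lowercase", "\u00e9", "4-15"),  # e acute, defined first
    ("lowercase", "\u00e8", "4-15"),  # e grave, same dot pattern
    ("sign", "*", "16"),              # first pattern for the star
    ("sign", "*", "35"),              # hypothetical second pattern for the star
]


def build_maps(rules):
    """Return (forward, backward) lookup tables for a list of rules."""
    forward = {}   # character -> dot pattern used in forward translation
    backward = {}  # dot pattern -> character used in back-translation
    for _opcode, char, dots in rules:
        forward.setdefault(char, dots)   # keep the first pattern per character
        backward.setdefault(dots, char)  # keep the first character per pattern
    return forward, backward


forward, backward = build_maps(RULES)

print(forward["*"])      # pattern used when translating the star forward
print(backward["35"])    # the second pattern still back-translates to the star
print(backward["4-15"])  # always the first character defined with 4-15
```

In this model both e acute and e grave translate forward to 4-15, while 4-15 back-translates only to e acute, which mirrors the ambiguity the text warns about.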
space character dots
Defines a character as a space and also defines the dot pattern as such. For example:
space \s 0 \s is the escape sequence for blank; 0 means no dots.
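To make the dots notation concrete, the following Python sketch converts a dots operand into Unicode braille pattern characters. It is an illustration under stated assumptions rather than part of liblouis: the function names are hypothetical, and only the digits 1 to 8 and the blank cell 0 are handled; any other dot designation is rejected. Cells are separated by hyphens, as in the 4-15 example earlier in this section.

```python
# Illustrative helper, not part of liblouis. Handles dots 1-8 and "0" only.

def cell_to_braille(cell):
    """Convert one cell such as "15" (dots 1 and 5) to a Unicode braille character."""
    if not cell:
        raise ValueError("empty cell")
    if cell == "0":
        return "\u2800"  # blank cell: no dots
    bits = 0
    for dot in cell:
        if dot not in "12345678":
            raise ValueError(f"unsupported dot {dot!r} in cell {cell!r}")
        # In the Unicode braille block, dot n corresponds to bit n-1.
        bits |= 1 << (int(dot) - 1)
    return chr(0x2800 + bits)


def dots_to_braille(dots):
    """Convert a dots operand such as "4-15" to a string of braille characters."""
    return "".join(cell_to_braille(cell) for cell in dots.split("-"))


print(dots_to_braille("0"))     # one blank cell
print(dots_to_braille("4-15"))  # dot 4, then dots 1 and 5
```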
punctuation character dots
Associates a punctuation mark in the particular language with a braille representation and defines the character and dot pattern as punctuation. For example:
punctuation . 46 dot pattern for period in NAB computer braille
digit character dots
Associates a digit with a dot pattern and defines the character as a digit. For example:
digit 0 356 NAB computer braille
letter character dots
Associates a letter in the language with a braille representation and defines the character as a letter. This is intended for letters which are neither uppercase nor lowercase.
lowercase character dots
Associates a character with a dot pattern and defines the character as a lowercase letter. Both the character and the dot pattern have the attributes lowercase and letter.
uppercase character dots
Associates a character with a dot pattern and defines the character as an uppercase letter. Both the character and the dot pattern have the attributes uppercase and letter.
litdigit digit dots
Associates a digit with the dot pattern which should be used to represent it in literary texts. For example:
litdigit 0 245
litdigit 1 1
sign character dots
Associates a character with a dot pattern and defines both as a sign. This opcode should be used for things like at sign (‘@’), percent (‘%’), dollar sign (‘$’), etc. Do not use it to define ordinary punctuation such as period and comma. For example:
sign % 4-25-1234 literary percent sign
math character dots
Associates a character and a dot pattern and defines them as a mathematical symbol. It should be used for less than (‘<’), greater than (‘>’), equals (‘=’), plus (‘+’), etc. For example:
math + 346 plus
grouping name characters dots,dots
This opcode is different from the previous ones in that it defines two characters in one rule, and associates them with each other. The opcode is used to indicate pairs of grouping symbols used in processing mathematical expressions. These symbols are usually generated by the MathML interpreter in liblouisutdml. They are used in multipass opcodes. The name operand must contain only letters (a-z and A-Z); both uppercase and lowercase letters are allowed, and case is significant. The characters operand must contain exactly two Unicode characters. The dots operand must contain exactly two braille cells, separated by a comma. For example:
grouping mrow \x0001\x0002 1e,2e
grouping mfrac \x0003\x0004 3e,4e
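The operand constraints of the grouping opcode can be summarised as a small check. The Python sketch below is only an illustration of the three constraints stated above; the function name is hypothetical, it is not how liblouis validates tables, and it expects the characters operand to be already decoded (so \x0001\x0002 in a table becomes a two-character string).

```python
import re

# Illustrative check of the grouping operand constraints; not liblouis code.

def check_grouping_operands(name, characters, dots):
    """Return a list of problems found in the operands of a grouping rule."""
    problems = []
    if not re.fullmatch(r"[A-Za-z]+", name):
        problems.append("name must contain only the letters a-z and A-Z")
    if len(characters) != 2:
        problems.append("characters must be exactly two Unicode characters")
    cells = dots.split(",")
    if len(cells) != 2 or not all(cells):
        problems.append("dots must be exactly two cells separated by a comma")
    return problems


# Operands of the mrow rule, with the escapes already decoded.
print(check_grouping_operands("mrow", "\u0001\u0002", "1e,2e"))
# A rule that violates all three constraints.
print(check_grouping_operands("m-row", "\u0001", "1e"))
```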
base attribute <derived character> <base character>
This opcode is different in that it does not associate a character with a dot pattern, but it associates a character with another already defined character. The derived character inherits the dot pattern of the base character, and braille indicators (see Braille Indicator Opcodes) are used to distinguish them. The attribute operand refers to the character class (see Character-Class Opcodes) to which the character should be added. By defining braille indicator rules associated with this character class, you can determine the braille indicators to be inserted. The character operands are the derived character and the base character, respectively. A typical use of this opcode is for defining a pair of letters, a lowercase and the corresponding uppercase. For example:
lowercase a 1
base uppercase A a
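The effect of these two rules can be pictured with a toy Python model. It is a sketch under assumptions, not liblouis behaviour as implemented: the dictionaries and the forward_dots function are hypothetical, and the indicator for the uppercase class is assumed to be dot 6, which would have to be set by a separate braille indicator rule in a real table.

```python
# Toy model of the base opcode; not liblouis code.

DOTS = {"a": "1"}                 # lowercase a 1
BASE = {"A": ("uppercase", "a")}  # base uppercase A a
INDICATOR = {"uppercase": "6"}    # assumed indicator for the class (hypothetical)


def forward_dots(char):
    """Return the dots for char, resolving derived characters via their base."""
    if char in DOTS:
        return DOTS[char]
    attribute, base_char = BASE[char]
    # The derived character reuses the base pattern, preceded by the indicator.
    return INDICATOR[attribute] + "-" + DOTS[base_char]


print(forward_dots("a"))  # the base character's own pattern
print(forward_dots("A"))  # indicator cell followed by the base pattern
```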