GetSinhalaCategory

Signature

function GetSinhalaCategory(CP: Cardinal): Integer;

Purpose

Pure lookup from a Unicode codepoint to its Sinhala syllabic category. No font state is required. Returns a code from the 13-value category numbering shared with GetDevanagariCategory; the table below lists the codes that apply to Sinhala.

Return values

Code  Category                      Example codepoints
0     Other                         Unassigned / reserved within block
1     Consonant                     U+0D9A–U+0DB1 (KA..NA), U+0DB3–U+0DBB (NDA..RA), U+0DBD LA, U+0DC0–U+0DC6 (VA..FA)
2     Independent vowel             U+0D85–U+0D96
3     Matra (dependent vowel sign)  U+0DCF–U+0DD1, U+0DD2–U+0DD4, U+0DD6, U+0DD8–U+0DDF, U+0DF2–U+0DF3
4     Virama (AL-LAKUNA)            U+0DCA
6     Bindu                         U+0D81–U+0D82 (candrabindu, anusvara)
7     Visarga                       U+0D83
9     Digit                         U+0DE6–U+0DEF
10    ZWJ                           U+200D
11    ZWNJ                          U+200C
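
The ranges in the table above map naturally onto a single case statement. The following is a minimal sketch of such a lookup, not the library's actual implementation: the function name SketchSinhalaCategory and the SC_* constant names are hypothetical, and only the numeric codes and codepoint ranges are taken from the table.

```pascal
program SinhalaCategorySketch;

{$IFDEF FPC}{$MODE OBJFPC}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

const
  // Hypothetical names; only the numeric values come from the table above.
  SC_OTHER       = 0;
  SC_CONSONANT   = 1;
  SC_INDEP_VOWEL = 2;
  SC_MATRA       = 3;
  SC_VIRAMA      = 4;
  SC_BINDU       = 6;
  SC_VISARGA     = 7;
  SC_DIGIT       = 9;
  SC_ZWJ         = 10;
  SC_ZWNJ        = 11;

// Hypothetical stand-in for GetSinhalaCategory: a pure codepoint lookup
// with no font state. Anything not listed falls through to SC_OTHER.
function SketchSinhalaCategory(CP: Cardinal): Integer;
begin
  case CP of
    $0D81..$0D82: Result := SC_BINDU;
    $0D83:        Result := SC_VISARGA;
    $0D85..$0D96: Result := SC_INDEP_VOWEL;
    $0D9A..$0DB1,
    $0DB3..$0DBB,
    $0DBD,
    $0DC0..$0DC6: Result := SC_CONSONANT;
    $0DCA:        Result := SC_VIRAMA;
    $0DCF..$0DD4,
    $0DD6,
    $0DD8..$0DDF,
    $0DF2..$0DF3: Result := SC_MATRA;
    $0DE6..$0DEF: Result := SC_DIGIT;
    $200C:        Result := SC_ZWNJ;
    $200D:        Result := SC_ZWJ;
  else
    Result := SC_OTHER;
  end;
end;

begin
  WriteLn(SketchSinhalaCategory($0D9A)); // KA, a consonant: prints 1
  WriteLn(SketchSinhalaCategory($0DCA)); // AL-LAKUNA: prints 4
  WriteLn(SketchSinhalaCategory($0DB2)); // gap between the consonant ranges: prints 0
end.
```

Because the gaps inside the block (for example U+0DB2, U+0DBC, U+0DD5, U+0DD7) are simply left out of the case labels, they fall through to Other without any extra checks.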

Notable Sinhala-specific assignments

  • Three pre-base matras (MatraPos = 1): E (U+0DD9), EE (U+0DDA), AI (U+0DDB). Sinhala has more pre-base matras than any other Phase 8f Brahmic script.
  • Above-base matras (MatraPos = 3): I (U+0DD2), II (U+0DD3).
  • Below-base matras (MatraPos = 4): U (U+0DD4), UU (U+0DD6).
  • Post-base matras (MatraPos = 2): AA (U+0DCF), AE (U+0DD0), AAE (U+0DD1), vocalic R (U+0DD8), vocalic L (U+0DDF), vocalic RR (U+0DF2), vocalic LL (U+0DF3).
  • Split matras (MatraPos = 5): O (U+0DDC), OO (U+0DDD), AU (U+0DDE). ApplySinhalaReorder decomposes all three according to the Unicode 16.0 canonical decompositions; U+0DDD OO is a three-part split (pre + post + post).
  • Halant is U+0DCA (named AL-LAKUNA in Sinhala).
  • Two bindu codepoints at U+0D81–U+0D82 (candrabindu and anusvara); visarga at U+0D83.
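
The matra positions listed above, together with the split-matra decomposition, can be sketched as two small lookups. This is an illustration under stated assumptions rather than the code behind ApplySinhalaReorder: SketchSinhalaMatraPos, SketchDecomposeSplitMatra and TMatraParts are hypothetical names, returning 0 for a non-matra codepoint is an assumption, and the decomposition sequences are the full Unicode canonical decompositions, which may not match the order the library emits.

```pascal
program SinhalaMatraSketch;

{$IFDEF FPC}{$MODE OBJFPC}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

type
  // Hypothetical fixed buffer: a split matra has at most three parts.
  TMatraParts = array[0..2] of Cardinal;

// MatraPos values follow the list above (1 = pre, 2 = post, 3 = above,
// 4 = below, 5 = split). Returning 0 for non-matras is an assumption.
function SketchSinhalaMatraPos(CP: Cardinal): Integer;
begin
  case CP of
    $0DCF..$0DD1: Result := 2; // AA, AE, AAE
    $0DD2..$0DD3: Result := 3; // I, II
    $0DD4, $0DD6: Result := 4; // U, UU
    $0DD8:        Result := 2; // vocalic R
    $0DD9..$0DDB: Result := 1; // E, EE, AI
    $0DDC..$0DDE: Result := 5; // O, OO, AU
    $0DDF:        Result := 2; // vocalic L
    $0DF2..$0DF3: Result := 2; // vocalic RR, vocalic LL
  else
    Result := 0;
  end;
end;

// Full Unicode canonical decomposition of the three split matras.
// Returns the number of parts written, or 0 if CP is not a split matra.
function SketchDecomposeSplitMatra(CP: Cardinal; out Parts: TMatraParts): Integer;
begin
  Parts[0] := 0;
  Parts[1] := 0;
  Parts[2] := 0;
  case CP of
    $0DDC: // O = E + AA
      begin
        Parts[0] := $0DD9; Parts[1] := $0DCF;
        Result := 2;
      end;
    $0DDD: // OO = E + AA + AL-LAKUNA
      begin
        Parts[0] := $0DD9; Parts[1] := $0DCF; Parts[2] := $0DCA;
        Result := 3;
      end;
    $0DDE: // AU = E + vocalic L
      begin
        Parts[0] := $0DD9; Parts[1] := $0DDF;
        Result := 2;
      end;
  else
    Result := 0;
  end;
end;

var
  Parts: TMatraParts;
  Count, I: Integer;
begin
  WriteLn('MatraPos(U+0DD9) = ', SketchSinhalaMatraPos($0DD9));
  Count := SketchDecomposeSplitMatra($0DDD, Parts);
  WriteLn('U+0DDD parts: ', Count);
  for I := 0 to Count - 1 do
    WriteLn('  U+', IntToHex(Int64(Parts[I]), 4));
end.
```

In every case the first part is the pre-base E sign (U+0DD9), which is why a shaper has to decompose these matras before it can move the pre-base piece in front of the consonant.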

Version history

  • v2.119.77 — Introduced in Phase 8f.8.