ApplyKannadaReorder
Signature
function ApplyKannadaReorder(const Wide: UnicodeString): UnicodeString;
Purpose
Applies the Kannada reorder pre-pass to Wide and returns
the reordered UnicodeString ready for cmap + GSUB consumption.
Non-Kannada content passes through byte-identical.
Kannada specifics
- R1 Repha enabled: Ra (U+0CB0) + Halant (U+0CCD) at syllable start is detected, the pair is stripped from the cluster and re-emitted after the reordered output so the font's 'rphf' GSUB feature can substitute the Repha glyph.
- No pre-base matras: I (U+0CBF) and E (U+0CC6) are above-base in Kannada, not pre-base. The pre-base buffer is always empty for valid Kannada text.
- Five split matras with Unicode 16.0 canonical decompositions:
  - U+0CC0 II → U+0CBF (above) + U+0CD5 (post-base length mark).
  - U+0CC7 EE → U+0CC6 (above) + U+0CD5 (post-base length mark).
  - U+0CC8 AI → U+0CC6 (above) + U+0CD6 (above-base AI length mark); both components are above-base.
  - U+0CCA O → U+0CC6 (above) + U+0CC2 (post-base UU).
  - U+0CCB OO → U+0CC6 (above) + U+0CC2 (post) + U+0CD5 (post); a three-part split, unique among Phase 8f scripts.
- Above-base matras: I (U+0CBF), E (U+0CC6), AU (U+0CCC), AI length mark (U+0CD6).
- Below-base matras: Vocalic R / RR (U+0CC3–U+0CC4), Vocalic L / LL matras (U+0CE2–U+0CE3).
- Post-base matras: AA (U+0CBE), U / UU (U+0CC1–U+0CC2), post-base length mark (U+0CD5).
- Halant is U+0CCD.
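The five split-matra decompositions listed above can be sketched as a small lookup. This is only an illustration under stated assumptions, not the library's implementation: `DecomposeKannadaMatra`, `DecomposeKannadaText`, and `ToCodePoints` are hypothetical helper names introduced here, and the sketch performs the decomposition step alone, with no reordering.

```pascal
program KannadaSplitMatraSketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

// Hypothetical helper (not part of the documented API): returns the
// components of a Kannada split matra, or the input code unit unchanged
// when it is not one of the five split matras.
function DecomposeKannadaMatra(Ch: WideChar): UnicodeString;
begin
  case Ord(Ch) of
    $0CC0: Result := #$0CBF#$0CD5;       // II -> I + length mark
    $0CC7: Result := #$0CC6#$0CD5;       // EE -> E + length mark
    $0CC8: Result := #$0CC6#$0CD6;       // AI -> E + AI length mark
    $0CCA: Result := #$0CC6#$0CC2;       // O  -> E + UU
    $0CCB: Result := #$0CC6#$0CC2#$0CD5; // OO -> E + UU + length mark
  else
    Result := Ch;                        // everything else passes through
  end;
end;

// Hypothetical helper: applies the lookup to every code unit of S.
function DecomposeKannadaText(const S: UnicodeString): UnicodeString;
var
  I: Integer;
begin
  Result := '';
  for I := 1 to Length(S) do
    Result := Result + DecomposeKannadaMatra(S[I]);
end;

// Formats a string as space-separated U+XXXX code units for display.
function ToCodePoints(const S: UnicodeString): string;
var
  I: Integer;
begin
  Result := '';
  for I := 1 to Length(S) do
  begin
    if I > 1 then
      Result := Result + ' ';
    Result := Result + 'U+' + IntToHex(Integer(Ord(S[I])), 4);
  end;
end;

var
  Src: UnicodeString;
begin
  Src := #$0C95#$0CCB; // KA + OO matra
  WriteLn(ToCodePoints(DecomposeKannadaText(Src)));
  // prints: U+0C95 U+0CC6 U+0CC2 U+0CD5
end.
```

All Kannada code points lie in the Basic Multilingual Plane, so iterating over UTF-16 code units is sufficient for this sketch.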
Reorder rules applied
- R1 Repha: Ra + Halant at syllable start re-emitted at the end of the syllable.
- R3 Above-base matras: I / E / AU / AI-length-mark emit after the base block.
- R4 Below-base matras: Vocalic R / RR / L / LL emit after the above-base block.
- R5 Post-base matras: AA / U / UU / post-base length mark emit after the below-base block.
- Split matra decomposition: five splits routed to above + post (II, EE), above + above (AI), above + post (O), and above + post + post three-part (OO).
Output layout per syllable: [base + halant + bindu/visarga/modifier] + [above-matras] + [below-matras] + [post-matras] + [Repha: Ra Halant]?. Conjuncts (C + Halant + C) preserved in the base block. Single-pass and idempotent.
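The output layout above can be illustrated with a simplified bucket sort over one syllable. This is a minimal sketch, assuming the input is a single syllable whose split matras are already decomposed; `SlotOf`, `ReorderSyllable`, and `ToCodePoints` are hypothetical names for this example only. The real category lookup is GetKannadaCategory, whose signature is not shown here, and syllable segmentation is out of scope.

```pascal
program KannadaSyllableLayoutSketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

type
  TSlot = (slBase, slAbove, slBelow, slPost);

// Hypothetical classifier mirroring the matra lists in this section.
function SlotOf(Ch: WideChar): TSlot;
begin
  case Ord(Ch) of
    $0CBF, $0CC6, $0CCC, $0CD6: Result := slAbove;
    $0CC3, $0CC4, $0CE2, $0CE3: Result := slBelow;
    $0CBE, $0CC1, $0CC2, $0CD5: Result := slPost;
  else
    Result := slBase; // consonants, halant, bindu/visarga/modifiers
  end;
end;

// Emits [base] + [above] + [below] + [post] + [Ra Halant]? for one syllable.
function ReorderSyllable(const Syl: UnicodeString): UnicodeString;
var
  Buckets: array[TSlot] of UnicodeString;
  Repha: UnicodeString;
  Slot: TSlot;
  I, Start: Integer;
begin
  for Slot := Low(TSlot) to High(TSlot) do
    Buckets[Slot] := '';
  Repha := '';
  Start := 1;
  // R1 (simplified assumption): Ra + Halant at syllable start, followed by
  // at least one more code unit, is held back and emitted last.
  if (Length(Syl) > 2) and (Ord(Syl[1]) = $0CB0) and (Ord(Syl[2]) = $0CCD) then
  begin
    Repha := Copy(Syl, 1, 2);
    Start := 3;
  end;
  // R3/R4/R5: route each remaining code unit to its bucket, keeping order.
  for I := Start to Length(Syl) do
  begin
    Slot := SlotOf(Syl[I]);
    Buckets[Slot] := Buckets[Slot] + Syl[I];
  end;
  Result := Buckets[slBase] + Buckets[slAbove] + Buckets[slBelow]
          + Buckets[slPost] + Repha;
end;

// Formats a string as space-separated U+XXXX code units for display.
function ToCodePoints(const S: UnicodeString): string;
var
  I: Integer;
begin
  Result := '';
  for I := 1 to Length(S) do
  begin
    if I > 1 then
      Result := Result + ' ';
    Result := Result + 'U+' + IntToHex(Integer(Ord(S[I])), 4);
  end;
end;

var
  Syl: UnicodeString;
begin
  // Ra + Halant + KA + E (above) + UU (post) + length mark (post)
  Syl := #$0CB0#$0CCD#$0C95#$0CC6#$0CC2#$0CD5;
  WriteLn(ToCodePoints(ReorderSyllable(Syl)));
  // prints: U+0C95 U+0CC6 U+0CC2 U+0CD5 U+0CB0 U+0CCD
end.
```

Running `ReorderSyllable` on its own output for this sample returns the same string, since the result no longer starts with Ra + Halant and each bucket is already in emit order.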
Example
var
  Wide: UnicodeString;
begin
  // Input: KA (U+0C95) + OO-matra (U+0CCB, three-part split)
  Wide := Doc.ApplyKannadaReorder(#$0C95#$0CCB);
  // Wide is now: KA + E (U+0CC6, above) + UU (U+0CC2, post) + length-mark (U+0CD5, post)
end;
See also
- ApplyIndicReorder: total dispatcher.
- ApplyDevanagariReorder: Devanagari counterpart.
- ApplyBengaliReorder: Bengali counterpart.
- ApplyGujaratiReorder: Gujarati counterpart.
- ApplyTamilReorder: Tamil counterpart.
- ApplyTeluguReorder: Telugu counterpart.
- GetKannadaCategory: Unicode codepoint → category lookup.
Standards
- Unicode 16.0 §12.8 (Kannada)
- Unicode 16.0 IndicSyllabicCategory.txt, IndicPositionalCategory.txt, and UnicodeData.txt (canonical decomposition source)
- ISO 32000-1 §9.10 (extraction of text content)
- OpenType Kannada shaping spec (script tag 'knda')
Version history
- v2.119.75: Introduced in Phase 8f.6. Complete shaper (R1 + R3 + R4 + R5 + five split-matra decompositions, including a three-part split for U+0CCB OO). Kannada becomes the sixth registered Indic script.