ApplyBengaliReorder
Signature
function ApplyBengaliReorder(const Wide: UnicodeString): UnicodeString;
Purpose
Applies the Bengali reorder pre-pass to Wide and returns
the reordered UnicodeString ready for cmap + GSUB consumption.
Non-Bengali content (Latin, digits, punctuation, other scripts including
other Indic scripts) passes through byte-identical.
Reorder rules applied
- R1 Repha: when a syllable starts with Ra (U+09B0) + Halant (U+09CD) + Consonant, the (Ra, Halant) pair moves to the syllable end.
- R2 Pre-base matras: U+09BF I, U+09C7 E, U+09C8 AI move to the syllable start. Note that Bengali E/AI are pre-base, unlike Devanagari, where they are above-base.
- R3 Above-base matras: (empty for Bengali; no above-base matras in the main block)
- R4 Below-base matras: U+09C1–U+09C4 U/UU/Vocalic R/RR, U+09E2–U+09E3 Vocalic L/LL emit after the base.
- R5 Post-base matras: U+09BE AA, U+09C0 II, U+09D7 AU length mark emit after below-base.
- Split matras: U+09CB Oo decomposes to U+09C7 (pre) + U+09BE (post); U+09CC AU decomposes to U+09C7 (pre) + U+09D7 (post).
Output layout per syllable: [pre-matras] + [base + halant + nukta + bindu/visarga/modifier] + [below-matras] + [post-matras] + [Repha: Ra Halant]?
Conjuncts (C + Halant + C) preserved in the base block.
Single-pass and idempotent.
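The pre-base and split-matra rules can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: ReorderSimplePair and ToHex are hypothetical helpers introduced here, and the sketch assumes a syllable of exactly one base consonant followed by one matra (no Repha, conjuncts, nukta, or modifiers).

```pascal
program SplitMatraSketch;
{$IFDEF FPC}{$mode objfpc}{$H+}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

// Hypothetical helper for illustration only; not part of the documented API.
// Handles exactly one base consonant followed by one matra.
function ReorderSimplePair(Base, Matra: WideChar): UnicodeString;
begin
  Result := '';
  case Ord(Matra) of
    $09BF, $09C7, $09C8:
      begin
        // R2: a pre-base matra is emitted before the base
        Result := Result + Matra;
        Result := Result + Base;
      end;
    $09CB:
      begin
        // Split matra Oo: U+09C7 (pre) + base + U+09BE (post)
        Result := Result + WideChar($09C7);
        Result := Result + Base;
        Result := Result + WideChar($09BE);
      end;
    $09CC:
      begin
        // Split matra AU: U+09C7 (pre) + base + U+09D7 (post)
        Result := Result + WideChar($09C7);
        Result := Result + Base;
        Result := Result + WideChar($09D7);
      end;
  else
    begin
      // Any other matra stays after the base in this simplified sketch
      Result := Result + Base;
      Result := Result + Matra;
    end;
  end;
end;

// Hypothetical helper: formats a string as space-separated U+XXXX code units.
function ToHex(const S: UnicodeString): string;
var
  I: Integer;
begin
  Result := '';
  for I := 1 to Length(S) do
  begin
    if I > 1 then
      Result := Result + ' ';
    Result := Result + 'U+' + IntToHex(Integer(Ord(S[I])), 4);
  end;
end;

begin
  // KA + Oo (split matra)
  WriteLn(ToHex(ReorderSimplePair(WideChar($0995), WideChar($09CB))));
  // KA + I (pre-base matra)
  WriteLn(ToHex(ReorderSimplePair(WideChar($0995), WideChar($09BF))));
  // KA + AA (post-base matra, order unchanged)
  WriteLn(ToHex(ReorderSimplePair(WideChar($0995), WideChar($09BE))));
end.
```

Under these assumptions the program prints U+09C7 U+0995 U+09BE, then U+09BF U+0995, then U+0995 U+09BE. The first line matches the documented result for KA + U+09CB.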
Example
var
  Wide: UnicodeString;
begin
  // Input: KA + Oo-matra (single codepoint U+09CB)
  Wide := Doc.ApplyBengaliReorder(#$0995#$09CB);
  // Wide is now: U+09C7 + KA + U+09BE (Oo decomposed to pre+post)
end;
See also
- ApplyIndicReorder: total dispatcher covering all registered Indic scripts.
- ApplyDevanagariReorder: Devanagari single-script counterpart.
- GetBengaliCategory: Unicode codepoint → category lookup.
Standards
- Unicode 16.0 §12.2 (Bengali)
- Unicode 16.0 IndicSyllabicCategory.txt and IndicPositionalCategory.txt
- ISO 32000-1 §9.10 (extraction of text content)
- OpenType Bengali shaping spec
Version history
- v2.119.71: Introduced in Phase 8f.2. Complete shaper (R1, R2, R4, R5 + split-matra decomposition). Bengali becomes the second registered Indic script after Devanagari.