Syriac / Mongolian / Devanagari Shaping Support
Multi-script capability surfaces (v2.119.53 - v2.119.55)
HotPDF exposes joining-class / positional-analysis / Indic syllabic-category capability surfaces for three complex scripts beyond Arabic: Syriac (U+0700-U+074F), Mongolian (U+1800-U+18AF), and Devanagari (U+0900-U+097F). Each capability lets callers drive the script through the existing OpenType GSUB engine for correct shaping, while keeping the heavy lifting (cluster boundary detection, BiDi resolution, GPOS positioning) outside HotPDF's scope.
Syriac shaping capability (v2.119.53)
Two new methods expose Syriac's joining behaviour:
function GetSyriacJoiningClass(CP: Cardinal): TJoiningClass;
function GetSyriacPosition(const Run: array of Cardinal; Index: Integer): TPosition;
Syriac follows the same four-class framework as Arabic: Right-Joining, Dual-Joining, Transparent, and Non-Joining.
Unlike Arabic, the Syriac block has no Presentation Forms pre-encoded in Unicode: there is no Syriac equivalent of the U+FB50-FDFF / U+FE70-FEFC Arabic Presentation Forms blocks. Consumers must therefore drive Syriac shaping through font-defined GSUB lookups.
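The positional analysis described above can be sketched as a small standalone program. This is a minimal illustration of the general joining rule (skip Transparent characters, then check whether each neighbour can connect), not HotPDF's implementation. The types TJoinClass and TJoinPos, the function PositionAt, and the sample run are hypothetical stand-ins for TJoiningClass, TPosition, and GetSyriacPosition, whose actual enumeration members are not shown in this document.

```pascal
program JoiningPositionSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFNDEF FPC}{$APPTYPE CONSOLE}{$ENDIF}

type
  // Hypothetical stand-ins for HotPDF's TJoiningClass / TPosition.
  TJoinClass = (jcNonJoining, jcRightJoining, jcDualJoining, jcTransparent);
  TJoinPos = (jpIsolated, jpInitial, jpMedial, jpFinal);

const
  PosName: array[TJoinPos] of string = ('isol', 'init', 'medi', 'fina');
  // Sample run in logical order: D, T, D, R, D (illustrative classes only).
  Run: array[0..4] of TJoinClass =
    (jcDualJoining, jcTransparent, jcDualJoining, jcRightJoining, jcDualJoining);

// Only dual-joining letters connect to the letter that follows them.
function JoinsForward(C: TJoinClass): Boolean;
begin
  Result := C = jcDualJoining;
end;

// Dual- and right-joining letters connect to the letter that precedes them.
function JoinsBackward(C: TJoinClass): Boolean;
begin
  Result := C in [jcDualJoining, jcRightJoining];
end;

function PositionAt(const Classes: array of TJoinClass; Index: Integer): TJoinPos;
var
  P, N: Integer;
  PrevJoins, NextJoins: Boolean;
begin
  Result := jpIsolated;
  // In this sketch a transparent mark has no positional form of its own.
  if Classes[Index] = jcTransparent then
    Exit;
  // Find the nearest non-transparent neighbour on each side.
  P := Index - 1;
  while (P >= 0) and (Classes[P] = jcTransparent) do
    Dec(P);
  N := Index + 1;
  while (N <= High(Classes)) and (Classes[N] = jcTransparent) do
    Inc(N);
  PrevJoins := (P >= 0) and JoinsForward(Classes[P])
    and JoinsBackward(Classes[Index]);
  NextJoins := (N <= High(Classes)) and JoinsBackward(Classes[N])
    and JoinsForward(Classes[Index]);
  if PrevJoins and NextJoins then
    Result := jpMedial
  else if PrevJoins then
    Result := jpFinal
  else if NextJoins then
    Result := jpInitial;
end;

var
  I: Integer;
begin
  // Prints: 0: init, 1: isol, 2: medi, 3: fina, 4: isol (one per line).
  for I := 0 to High(Run) do
    WriteLn(I, ': ', PosName[PositionAt(Run, I)]);
end.
```

The last letter comes out isolated because the right-joining letter before it cannot connect forward. That is the behaviour the four-class framework is meant to capture.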
Mongolian shaping capability (v2.119.54)
Two parallel methods for Mongolian:
function GetMongolianJoiningClass(CP: Cardinal): TJoiningClass;
function GetMongolianPosition(const Run: array of Cardinal; Index: Integer): TPosition;
Coverage includes basic Mongolian (U+1820-U+1842), Todo (U+1843-U+185C), Sibe (U+185D-U+1872), Manchu (U+1873-U+1877), and the Ali Gali extensions (U+1880-U+18A8). The Free Variation Selectors FVS1 / FVS2 / FVS3 (U+180B-U+180D), the NIRUGU stem extender (U+180A), and the Ali Gali vowel marks are all classified as Transparent (T-class), so the positional walk skips over them instead of treating them as a break in joining.
Like Syriac, Mongolian has no Presentation Forms pre-encoded in Unicode. Mongolian's traditional vertical layout, complex letter-shape variation rules, and FVS-driven shape selection are all expected to be driven by font-defined GSUB lookups.
Devanagari Indic shaping capability (v2.119.55)
Devanagari is fundamentally different from Arabic / Syriac / Mongolian: it is an Indic abugida in which the meaningful unit is a syllable cluster (akshara), not a single letter, and the rendered order of glyphs inside a cluster often differs from the logical Unicode order. Two methods expose Devanagari's Indic capability layer:
function GetDevanagariCategory(CP: Cardinal): TIndicCategory;
procedure ApplyDevanagariReorder(var Run: array of Cardinal);
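To illustrate why logical and rendered order differ, the sketch below moves the pre-base vowel sign I (U+093F) in front of the consonant cluster it follows in logical order. It is a simplified illustration under stated assumptions, not the behaviour of ApplyDevanagariReorder: the helper names IsConsonant and ReorderIMatra are hypothetical, only the core consonants U+0915-U+0939 are recognised, and Repha, nukta, and other cluster components are ignored.

```pascal
program IMatraReorderSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$IFNDEF FPC}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

const
  VIRAMA  = $094D; // DEVANAGARI SIGN VIRAMA
  I_MATRA = $093F; // DEVANAGARI VOWEL SIGN I (rendered before its cluster)

// Simplifying assumption: only the core consonants KA..HA count.
function IsConsonant(CP: Cardinal): Boolean;
begin
  Result := (CP >= $0915) and (CP <= $0939);
end;

// Hypothetical helper: moves each I-matra to the start of its cluster.
procedure ReorderIMatra(var Run: array of Cardinal);
var
  I, J, Start: Integer;
begin
  for I := 1 to High(Run) do
    if (Run[I] = I_MATRA) and IsConsonant(Run[I - 1]) then
    begin
      Start := I - 1;
      // Extend left across conjuncts: (consonant, virama) pairs.
      while (Start >= 2) and (Run[Start - 1] = VIRAMA)
        and IsConsonant(Run[Start - 2]) do
        Dec(Start, 2);
      // Shift the cluster one slot right and place the matra first.
      for J := I downto Start + 1 do
        Run[J] := Run[J - 1];
      Run[Start] := I_MATRA;
    end;
end;

var
  // "Hindi" in logical order: HA, I-matra, NA, virama, DA, II-matra.
  Run: array[0..5] of Cardinal = ($0939, $093F, $0928, $094D, $0926, $0940);
  I: Integer;
begin
  ReorderIMatra(Run);
  // Prints: 093F 0939 0928 094D 0926 0940
  for I := 0 to High(Run) do
    Write(IntToHex(Integer(Run[I]), 4), ' ');
  WriteLn;
end.
```

After reordering, the I-matra precedes HA, which matches the visual order in which the glyphs are drawn.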
Automatic integration (v2.119.67): when
Typical workflow (Syriac)
PDF.RegisterUnicodeTTF('Estrangelo', 'SyrCOMEdessa.otf');
PDF.SetGSUBScript('syrc'); // see GSUB engine doc
for i := 0 to Length(Run) - 1 do
begin
  Pos := PDF.GetSyriacPosition(Run, i); // init / medi / fina / isol
  // query GSUB for the position-appropriate substitute glyph
  // emit + MarkUnicodeGlyphUsed
end;
Typical workflow (Devanagari with automatic reorder)
PDF.RegisterUnicodeTTF('NotoDeva', 'NotoSansDevanagari-Regular.ttf');
PDF.ShapingFeatures := [sfIndicShaping]; // auto Repha + I-matra reorder
PDF.CurrentPage.SetFont('NotoDeva', [], 14);
PDF.CurrentPage.UnicodeTextOut(50, 700, 0,
  UnicodeString(#$0939#$093F#$0928#$094D#$0926#$0940)); // "Hindi"
Scope and limitations
Current creator-side integration
Syriac, Mongolian, Tibetan, and Indic shaping now have explicit API reference pages. Syriac can be enabled via
The current scope also covers N'Ko and Adlam as cursive RTL scripts, Thai/Lao for SARA AM and tone marks, Hebrew for niqqud ordering, and Javanese for pre-base signs. These paths are described in script shaping preprocess methods.
See also: Arabic / Persian / Urdu Shaping Support, Automatic Shaping Pipeline (Phase 8), OpenType GSUB Substitution Engine, THotPDF.AssignSyntheticCodepointForGID