GetTamilCategory
Signature
function GetTamilCategory(CP: Cardinal): Integer;
Purpose
Pure lookup from a Unicode codepoint to a Tamil syllabic category. No font
state is required. Returns a category code using the same 13-code numbering as
GetDevanagariCategory; the codes that apply to Tamil are listed below.
Return values
| Code | Category | Example codepoints |
|---|---|---|
| 0 | Other | U+0BD0 OM, U+0BF0–U+0BFF Tamil numerals/symbols |
| 1 | Consonant | U+0B95–U+0BB9 (coarse range; reserved gaps treated as consonants for simplicity — real Tamil text never uses them) |
| 2 | Independent vowel | U+0B85–U+0B8A, U+0B8E–U+0B90, U+0B92–U+0B94 |
| 3 | Matra (dependent vowel sign) | U+0BBE–U+0BC2, U+0BC6–U+0BC8, U+0BCA–U+0BCC, U+0BD7 |
| 4 | Virama (PULLI) | U+0BCD |
| 6 | Bindu (anusvara) | U+0B82 |
| 7 | Visarga (aytham) | U+0B83 |
| 9 | Digit | U+0BE6–U+0BEF |
| 10 | ZWJ | U+200D |
| 11 | ZWNJ | U+200C |
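
The table above can be read as a single range lookup. The following is a minimal sketch of that mapping, not the shipped implementation: the function name `SketchTamilCategory` and the `CAT_*` constant names are hypothetical, and only the numeric codes and codepoint ranges are taken from the table.

```pascal
program TamilCategorySketch;
{$IFDEF FPC}{$MODE OBJFPC}{$ENDIF}

const
  // Hypothetical names; the numeric values follow the table above.
  CAT_OTHER       = 0;
  CAT_CONSONANT   = 1;
  CAT_INDEP_VOWEL = 2;
  CAT_MATRA       = 3;
  CAT_VIRAMA      = 4;
  CAT_BINDU       = 6;
  CAT_VISARGA     = 7;
  CAT_DIGIT       = 9;
  CAT_ZWJ         = 10;
  CAT_ZWNJ        = 11;

// Sketch of a codepoint-to-category lookup built from the documented ranges.
function SketchTamilCategory(CP: Cardinal): Integer;
begin
  case CP of
    $0B95..$0BB9:                             Result := CAT_CONSONANT; // coarse range
    $0B85..$0B8A, $0B8E..$0B90, $0B92..$0B94: Result := CAT_INDEP_VOWEL;
    $0BBE..$0BC2, $0BC6..$0BC8,
    $0BCA..$0BCC, $0BD7:                      Result := CAT_MATRA;
    $0BCD:                                    Result := CAT_VIRAMA;    // PULLI
    $0B82:                                    Result := CAT_BINDU;
    $0B83:                                    Result := CAT_VISARGA;   // aytham
    $0BE6..$0BEF:                             Result := CAT_DIGIT;
    $200D:                                    Result := CAT_ZWJ;
    $200C:                                    Result := CAT_ZWNJ;
  else
    Result := CAT_OTHER; // includes U+0BD0 OM and U+0BF0..U+0BFF
  end;
end;

begin
  WriteLn(SketchTamilCategory($0B95)); // KA    -> 1
  WriteLn(SketchTamilCategory($0BCD)); // PULLI -> 4
  WriteLn(SketchTamilCategory($0BD0)); // OM    -> 0
end.
```

Because the lookup depends only on the codepoint, it can be called before any font is loaded.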
Notable Tamil-specific assignments
- I-matra (U+0BBF): MatraPos = 2 (post-base), unlike Devanagari, Bengali, and Gujarati, which use a pre-base I.
- II-matra (U+0BC0): MatraPos = 3 (above-base).
- E (U+0BC6) / EE (U+0BC7) / AI (U+0BC8): MatraPos = 1 (pre-base).
- O (U+0BCA) / OO (U+0BCB) / AU (U+0BCC): MatraPos = 5 (split; decomposed by ApplyTamilReorder).
- AU length mark (U+0BD7): MatraPos = 2 (post-base); emitted by the U+0BCC split.
- Halant is named PULLI in Tamil (U+0BCD).
- Visarga (U+0B83) is the Tamil aytham.
- No category 5 (Nukta), since Tamil has no Nukta codepoint in its main block, and no Danda, since Tamil uses Latin punctuation.
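
The matra positions above, and the split of the two-part vowel signs, can be sketched as follows. This is an illustration under assumptions rather than the actual ApplyTamilReorder code: `SketchTamilMatraPos` and `SketchSplitTamilMatra` are hypothetical names, the `-1` return value is a placeholder for matras this page does not list, and the split pairs are the Unicode canonical decompositions of U+0BCA, U+0BCB, and U+0BCC, which ApplyTamilReorder is assumed to follow.

```pascal
program TamilMatraSketch;
{$IFDEF FPC}{$MODE OBJFPC}{$ENDIF}

uses
  SysUtils;

// Returns the MatraPos values listed above; -1 is this sketch's own
// placeholder for codepoints the list does not cover.
function SketchTamilMatraPos(CP: Cardinal): Integer;
begin
  case CP of
    $0BC6..$0BC8: Result := 1; // E, EE, AI: pre-base
    $0BBF, $0BD7: Result := 2; // I, AU length mark: post-base
    $0BC0:        Result := 3; // II: above-base
    $0BCA..$0BCC: Result := 5; // O, OO, AU: split
  else
    Result := -1;
  end;
end;

// Splits a two-part vowel sign into its pre-base and post-base halves,
// using the Unicode canonical decompositions. Returns False otherwise.
function SketchSplitTamilMatra(CP: Cardinal; out Pre, Post: Cardinal): Boolean;
begin
  Result := True;
  case CP of
    $0BCA: begin Pre := $0BC6; Post := $0BBE; end; // O  = E  + AA
    $0BCB: begin Pre := $0BC7; Post := $0BBE; end; // OO = EE + AA
    $0BCC: begin Pre := $0BC6; Post := $0BD7; end; // AU = E  + AU length mark
  else
    begin
      Pre := 0;
      Post := 0;
      Result := False;
    end;
  end;
end;

var
  Pre, Post: Cardinal;
begin
  WriteLn(SketchTamilMatraPos($0BBF)); // 2
  WriteLn(SketchTamilMatraPos($0BCC)); // 5
  if SketchSplitTamilMatra($0BCC, Pre, Post) then
    WriteLn(IntToHex(Int64(Pre), 4), ' ', IntToHex(Int64(Post), 4)); // 0BC6 0BD7
end.
```

After the split, each half has a simple position of its own: the first half is pre-base (MatraPos = 1) and U+0BD7 is post-base (MatraPos = 2), consistent with the list above.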
Version history
- v2.119.73 — Introduced in Phase 8f.4.