property CharacterGenerated[Index: Integer]: Boolean; // read only
| Parameter | Description |
| --- | --- |
| Index | Zero-based character index on the current page, in the range 0 to CharacterCount - 1. |
CharacterGenerated returns True when the character at the
specified index was generated internally by PDFium rather than being present in the
original content stream. Typical examples are: ligature decomposition (splitting an
fi ligature glyph into f + i), the trailing
space that PDFium inserts between adjacent text runs that are visually separated, and
the line-break characters injected at the end of each visual line.
Generated characters have no drawing operation of their own in the page content stream, so their bounding boxes are derived from neighboring glyphs. Treat their on-page position with caution if you need pixel-accurate hit-testing: their CharacterRectangle can collapse to a zero-width or zero-height region.
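The caution about collapsed rectangles can be turned into a small guard. The following is a minimal sketch, not part of the component: IsDegenerateRect is a hypothetical helper, and the Epsilon tolerance is an assumed value in page units that you may need to tune.

```pascal
// Hypothetical helper (not part of the component): returns True when the
// rectangle has effectively zero width or zero height.
function IsDegenerateRect(Left, Top, Right, Bottom: Double): Boolean;
const
  Epsilon = 1e-6; // assumed tolerance in page units
begin
  Result := (Abs(Right - Left) < Epsilon) or (Abs(Top - Bottom) < Epsilon);
end;
```

A possible use during hit-testing is shown below. It assumes that CharacterRectangle is indexed like CharacterGenerated and exposes Left, Top, Right and Bottom fields; check the CharacterRectangle entry for the actual shape.

```pascal
var
  I: Integer;
begin
  for I := 0 to Pdf.CharacterCount - 1 do
  begin
    // Skip generated characters whose rectangle has no usable area
    if Pdf.CharacterGenerated[I] and
       IsDegenerateRect(Pdf.CharacterRectangle[I].Left, Pdf.CharacterRectangle[I].Top,
                        Pdf.CharacterRectangle[I].Right, Pdf.CharacterRectangle[I].Bottom) then
      Continue;
    // ... test the mouse position against Pdf.CharacterRectangle[I] here ...
  end;
end;
```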
For tasks that need exact verbatim extraction (digital signatures, hash-of-text workflows, exact comparison with the page content stream), filter out generated characters first. For normal text search and clipboard copy, the flag is purely informational: PDFium has already inserted these characters because users expect them.
Do not assume that generated characters are always invisible glyphs: the f and i produced by decomposing a ligature correspond to visible ink on the page.
// Build a verbatim string that contains only characters from the content stream
var
  I: Integer;
  S: WString;
begin
  S := '';
  for I := 0 to Pdf.CharacterCount - 1 do
    if not Pdf.CharacterGenerated[I] then
      S := S + Pdf.Character[I]; // keep only characters present in the content stream
  Memo1.Text := S;
end;