PDFiumVCL Docs

Text method

This API entry keeps identifiers, signatures, code blocks, and PDF terms in their original form.
Component: TPdf  ·  Unit: PDFium
Extracts a text string from the current page. The StartIndex and Count parameters determine which characters are extracted. StartIndex is zero-based.

Syntax

function Text(StartIndex: Integer = 0; Count: Integer = MaxInt): WString;

StartIndex: Integer. Zero-based index of the first character to extract. Default is 0 (start of page text).
Count: Integer. Maximum number of characters to extract. Default is MaxInt, which extracts all remaining characters from StartIndex to the end of the page.
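
The defaults described above give three common call forms. The fragment below is a minimal sketch, assuming Pdf1 is a TPdf instance with a document already loaded and a page selected; the variable names and the index values are illustrative only.

```pascal
// Sketch only: Pdf1 is assumed to be a TPdf with a page already selected.
var
  Whole, Tail, Head: WString;
begin
  Whole := Pdf1.Text;         // StartIndex = 0, Count = MaxInt: entire page text
  Tail  := Pdf1.Text(100);    // from character 100 to the end of the page
  Head  := Pdf1.Text(0, 50);  // at most the first 50 characters
end;
```

If the page holds fewer than 101 characters, the second call falls under the out-of-range case and returns an empty string.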

Return Value

A WString (Unicode string) containing the extracted characters. Returns an empty string if StartIndex is beyond the end of the page text or if the page contains no extractable text.

Description

Text extracts the Unicode text content from the current page as a plain string. It reads from the page's internal text layer (the same layer PDF viewers use for text selection and search), so the result reflects the logical reading order as determined by PDFium, which may differ from the visual order in complex layouts.

The optional StartIndex and Count parameters allow extracting a substring of the full page text. StartIndex is zero-based; omitting both parameters returns the entire page text. Use CharacterCount to determine the total number of extractable characters on the page before using index-based access.
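
Index-based access can be combined with CharacterCount to read a page in fixed-size pieces. The following is a minimal sketch rather than part of the component: ReadPageInChunks is a hypothetical helper, CharacterCount is assumed to be readable as an Integer, and the final call is assumed to return fewer than ChunkSize characters, since Count is documented as a maximum.

```pascal
uses
  System.Classes, PDFium;  // PDFium is the unit named in this entry

// Hypothetical helper: collects the current page's text in chunks of
// at most ChunkSize characters.
procedure ReadPageInChunks(Pdf: TPdf; ChunkSize: Integer; Chunks: TStrings);
var
  Total, Start: Integer;
begin
  Chunks.Clear;
  if ChunkSize <= 0 then
    Exit;                        // nothing sensible to extract
  Total := Pdf.CharacterCount;   // assumed: total extractable characters
  Start := 0;
  while Start < Total do
  begin
    // Count is a maximum, so the last chunk may be shorter
    Chunks.Add(Pdf.Text(Start, ChunkSize));
    Inc(Start, ChunkSize);
  end;
end;
```

A caller might pass a TStringList and a chunk size such as 1000. Note that TStrings.Add can change chunk boundaries if a chunk contains line breaks and the list is later read back through its Text property.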

This method extracts text from a loaded PDF page. Text added via AddText is not included in the result until the document has been saved and reloaded. To extract text within a specific rectangular region, use TextInRectangle instead.

Example

// Extract all text from the first page
Pdf1.LoadFromFile('C:\Docs\report.pdf');
Pdf1.PageIndex := 0;
ShowMessage(Pdf1.Text);

// Extract characters 10..29 (20 characters)
var S: WString;
begin
  S := Pdf1.Text(10, 20);
end;

See Also

TextInRectangle, CharacterCount, FindFirst, Character