PDFiumVCL Documentation

Text method

This API entry keeps identifiers, signatures, code blocks, and PDF terminology in their original form.
Component: TPdf  ·  Unit: PDFium
Extracts a text string from the current page. The StartIndex and Count parameters determine which characters are extracted. StartIndex is 0-based.

Syntax

function Text(StartIndex: Integer = 0; Count: Integer = MaxInt): WString;

StartIndex: Integer. Zero-based index of the first character to extract. Default is 0 (start of page text).
Count: Integer. Maximum number of characters to extract. Default is MaxInt, which extracts all remaining characters from StartIndex to the end of the page.

Return Value

A WString (Unicode string) containing the extracted characters. Returns an empty string if StartIndex is beyond the end of the page text or if the page contains no extractable text.
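
The empty-string case described above can be checked cheaply by requesting a single character. The following is a minimal sketch, not part of the component: PageHasText is a hypothetical helper name, and it assumes a TPdf instance whose document is loaded and whose page is already selected.

```pascal
uses
  PDFium; // unit named in this entry; TPdf is declared there

// Hypothetical helper (not part of PDFiumVCL): reports whether the
// current page yields any extractable text.
function PageHasText(Pdf: TPdf): Boolean;
begin
  // Ask for at most one character starting at index 0. Per the Return
  // Value section, an empty string means no extractable text.
  Result := Pdf.Text(0, 1) <> '';
end;
```

Requesting one character avoids building the full page string just to test for emptiness.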

Description

Text extracts the Unicode text content from the current page as a plain string. It reads from the page's internal text layer (the same layer PDF viewers use for text selection and search), so the result follows the logical reading order as determined by PDFium, which may differ from the visual order in complex layouts.

The optional StartIndex and Count parameters allow extracting a substring of the full page text. StartIndex is zero-based; omitting both parameters returns the entire page text. Use CharacterCount to determine the total number of extractable characters on the page before using index-based access.
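
Index-based access combined with CharacterCount can be sketched as a chunked read of the page text. This is an illustrative sketch under assumptions: AppendTextChunks is a hypothetical helper name, CharacterCount is assumed to return an Integer for the current page, and the chunk size is chosen by the caller.

```pascal
uses
  Classes, // TStrings
  PDFium;  // unit named in this entry; TPdf is declared there

// Hypothetical helper (not part of PDFiumVCL): appends the current
// page's text to Lines in pieces of at most ChunkSize characters.
procedure AppendTextChunks(Pdf: TPdf; ChunkSize: Integer; Lines: TStrings);
var
  Start, Total, Len: Integer;
begin
  if ChunkSize <= 0 then
    Exit; // guard against a loop that never advances

  Total := Pdf.CharacterCount; // assumed: Integer count for the current page
  Start := 0;                  // StartIndex is zero-based
  while Start < Total do
  begin
    // Clamp the final chunk so it never reads past the end of the page text
    Len := Total - Start;
    if Len > ChunkSize then
      Len := ChunkSize;

    Lines.Add(Pdf.Text(Start, Len));
    Inc(Start, Len);
  end;
end;
```

A call such as AppendTextChunks(Pdf1, 256, Memo1.Lines) would add the page text to a memo in 256-character pieces. Memo1 is likewise a hypothetical control name.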

This method extracts text from a loaded PDF page. Text objects added via AddText are not included in the result until the document has been saved and reloaded. To extract text within a specific rectangular region, use TextInRectangle instead.

Example

// Extract all text from the first page
Pdf1.LoadFromFile('C:\Docs\report.pdf');
Pdf1.PageIndex := 0;
ShowMessage(Pdf1.Text);

// Extract 20 characters starting at index 10 (indices 10..29)
var
  S: WString;
begin
  S := Pdf1.Text(10, 20);
  ShowMessage(S);
end;

See Also

TextInRectangle, CharacterCount, FindFirst, Character