JBIG2 Compression Support

Overview

HotPDF now includes comprehensive JBIG2 compression support. JBIG2, developed by the Joint Bi-level Image Experts Group, is a compression standard with both lossless and lossy modes, designed specifically for bi-level (monochrome) images. In PDF documents it is commonly used for scanned text pages and line art.

Key Features

  • High-efficiency compression for bi-level images
  • Support for both lossy and lossless compression modes
  • Optimized for text and line art compression
  • Significant file size reduction compared to traditional compression methods
  • Full PDF/A compliance support
  • Multi-page document support with shared dictionaries

Compression Benefits

  • Superior Compression Ratios: Output up to 5x smaller than CCITT Group 4, depending on content
  • Pattern Recognition: Intelligent recognition of repeated patterns and symbols
  • Adaptive Compression: Automatically adjusts compression strategy based on content type
  • Memory Efficient: Optimized for processing large scanned documents

Typical Use Cases

  • Scanned text documents
  • Technical drawings and blueprints
  • Forms and invoices
  • Maps and diagrams
  • Mixed content documents with text and graphics

Technical Implementation

JBIG2 support is implemented in the HPDFJBIG2.pas unit, which provides:

  • THPDFJBIG2Decoder: Core decoder class for JBIG2 compressed data
  • Stream Processing: Efficient handling of JBIG2 data streams
  • Global Dictionary Support: Shared symbol dictionaries for multi-page documents
  • Scanline Access: Row-by-row data access for memory-efficient processing
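
To put the memory benefit of row-by-row access into numbers, the sketch below compares the size of one row with the size of a whole decoded page. It assumes the decoded image is stored at 1 bit per pixel with each row padded to a whole byte; that layout is an assumption, not something this page states about THPDFJBIG2Decoder. PackedRowBytes and FullBitmapBytes are hypothetical helper names and are not part of HPDFJBIG2.pas.

```pascal
// Hypothetical helpers, not part of HPDFJBIG2.pas.
// Assumption: 1 bit per pixel, each row padded to a whole number of bytes.

// Bytes needed to hold one packed row of the given pixel width.
function PackedRowBytes(Width: Integer): Integer;
begin
  Result := (Width + 7) div 8;
end;

// Bytes needed to hold every row of the image at once.
function FullBitmapBytes(Width, Height: Integer): Int64;
begin
  Result := Int64(PackedRowBytes(Width)) * Height;
end;
```

For a 4960 x 7016 pixel page (roughly A4 at 600 dpi), PackedRowBytes(4960) is 620 bytes, while FullBitmapBytes(4960, 7016) is 4,349,920 bytes. Under these assumptions, a caller that handles one row at a time only needs a buffer of the first size.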

Usage Example


// Delphi example - Processing JBIG2 compressed data
// ImageData must already contain the JBIG2-compressed bytes
// (for example, read from a file or stream by the caller).
procedure ProcessJBIG2Data(const ImageData: TBytesArray);
var
  PDF: THotPDF;
  Decoder: THPDFJBIG2Decoder;
  ScanlineData: TBytesArray;
  Row: Integer;
begin
  // Start from nil so the finally block is safe if a constructor raises
  Decoder := nil;
  PDF := THotPDF.Create(nil);
  try
    Decoder := THPDFJBIG2Decoder.Create;

    PDF.BeginDoc;
    PDF.AddPage;

    // Load JBIG2 compressed image data
    if Decoder.LoadFromByteArray(ImageData) then
    begin
      // Process scanline by scanline for memory efficiency
      for Row := 0 to Decoder.Height - 1 do
      begin
        if Decoder.GetScanline(Row, ScanlineData) then
        begin
          // Process scanline data as needed
          // Add to PDF page
        end;
      end;
    end;

    PDF.EndDoc;
  finally
    // Free is a no-op on nil, so both calls are always safe
    Decoder.Free;
    PDF.Free;
  end;
end;
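
The "process scanline data" step depends on how the row is laid out. As a rough sketch, the helper below expands a packed row into one byte per pixel. It assumes the row returned by GetScanline is packed at 1 bit per pixel, most significant bit first, and that TBytesArray can be passed as an open array of Byte; none of this is confirmed by this page, so check the unit source before relying on it. UnpackRow and TPixelRow are hypothetical names.

```pascal
// Hypothetical helper, not part of HPDFJBIG2.pas.
type
  TPixelRow = array of Byte;

// Expands a packed 1-bit-per-pixel row into one byte per pixel (0 or 1).
// Assumption: bit 7 of PackedRow[0] is the leftmost pixel (MSB first).
function UnpackRow(const PackedRow: array of Byte; Width: Integer): TPixelRow;
var
  X, Count: Integer;
begin
  // Never read past the end of the supplied buffer
  Count := Width;
  if Count > Length(PackedRow) * 8 then
    Count := Length(PackedRow) * 8;
  if Count < 0 then
    Count := 0;

  SetLength(Result, Count);
  for X := 0 to Count - 1 do
    Result[X] := (PackedRow[X shr 3] shr (7 - (X and 7))) and 1;
end;
```

Inside the loop of the example above, a call such as UnpackRow(ScanlineData, Decoder.Width) would then yield one value per pixel. Whether 1 means black or white depends on the decoder's output convention, which is not documented here.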
        

Class Reference

THPDFJBIG2Decoder

  • LoadFromByteArray: Loads JBIG2 data from a byte array; returns True on success
  • GetScanline: Retrieves the data for a given row into the supplied array; returns True on success
  • Width, Height: Image dimension properties

Performance Optimization

  • Memory-efficient scanline processing
  • Optimized for large document handling
  • Minimal memory footprint during decompression
  • Fast pattern recognition algorithms

Standards Compliance

  • ITU-T T.88 standard compliance
  • ISO/IEC 14492 standard support
  • PDF/A archival format compatibility
  • Cross-platform consistency
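
ITU-T T.88 (Annex D) defines an 8-byte ID string that opens a standalone JBIG2 file. JBIG2 streams embedded in a PDF through the JBIG2Decode filter omit this file header. The sketch below tells the two apart before a buffer is handed to a decoder. HasJBIG2FileHeader is a hypothetical helper, not part of HPDFJBIG2.pas, and this page does not say which of the two forms LoadFromByteArray expects.

```pascal
// Hypothetical helper, not part of HPDFJBIG2.pas.
// ID string that opens a standalone JBIG2 file (ITU-T T.88, Annex D).
const
  JBIG2FileID: array[0..7] of Byte =
    ($97, $4A, $42, $32, $0D, $0A, $1A, $0A);

// Returns True when Data begins with the standalone JBIG2 file ID string.
function HasJBIG2FileHeader(const Data: array of Byte): Boolean;
var
  I: Integer;
begin
  Result := False;
  // Too short to contain the ID string at all
  if Length(Data) < Length(JBIG2FileID) then
    Exit;
  for I := 0 to High(JBIG2FileID) do
    if Data[I] <> JBIG2FileID[I] then
      Exit;
  Result := True;
end;
```

A buffer for which this returns False is not necessarily invalid: it may simply be an embedded stream taken from a PDF.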

See Also