JBIG2 Compression Support

Overview

HotPDF now includes JBIG2 compression support. JBIG2, developed by the Joint Bi-level Image Experts Group, is a compression standard designed specifically for bi-level (monochrome) images, and it supports both lossless and lossy modes. It is commonly used in PDF documents for scanned text pages and line art.

Key Features

High-efficiency compression for bi-level images
Support for both lossy and lossless compression modes
Optimized for text and line art compression
Significant file size reduction compared to traditional compression methods
Full PDF/A compliance support
Multi-page document support with shared dictionaries

Compression Benefits

Superior Compression Ratios: Output is typically 3-5x smaller than CCITT Group 4, depending on content
Pattern Recognition: Intelligent recognition of repeated patterns and symbols
Adaptive Compression: Automatically adjusts compression strategy based on content type
Memory Efficient: Optimized for processing large scanned documents

Typical Use Cases

Scanned text documents
Technical drawings and blueprints
Forms and invoices
Maps and diagrams
Mixed content documents with text and graphics

Technical Implementation

The JBIG2 support is implemented through the HPDFJBIG2.pas unit, which provides:

THPDFJBIG2Decoder: Core decoder class for JBIG2 compressed data
Stream Processing: Efficient handling of JBIG2 data streams
Global Dictionary Support: Shared symbol dictionaries for multi-page documents
Scanline Access: Row-by-row data access for memory-efficient processing

Usage Example


// Delphi example: processing JBIG2-compressed data.
// ImageData must hold the complete JBIG2 stream before the call.
procedure ProcessJBIG2Data(ImageData: TBytesArray);
var
  PDF: THotPDF;
  Decoder: THPDFJBIG2Decoder;
  ScanlineData: TBytesArray;
  Row: Integer;
begin
  PDF := THotPDF.Create(nil);
  try
    // Nested try..finally so PDF is released even if the decoder cannot be created
    Decoder := THPDFJBIG2Decoder.Create;
    try
      PDF.BeginDoc;
      PDF.AddPage;

      // Load the JBIG2-compressed image data; skip processing if loading fails
      if Decoder.LoadFromByteArray(ImageData) then
      begin
        // Process one scanline at a time to keep memory usage low
        for Row := 0 to Decoder.Height - 1 do
        begin
          if Decoder.GetScanline(Row, ScanlineData) then
          begin
            // Process scanline data as needed
            // Add to PDF page
          end;
        end;
      end;

      PDF.EndDoc;
    finally
      Decoder.Free;
    end;
  finally
    PDF.Free;
  end;
end;
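
The "process scanline data" step in the example above is left open. As a minimal sketch of what it might involve, the program below reads individual pixels from a packed 1-bit-per-pixel row. It assumes that GetScanline returns rows packed MSB-first with (Width + 7) div 8 bytes per row, which this document does not confirm. TBytes from SysUtils stands in for HotPDF's TBytesArray, and BytesPerRow, PixelIsSet, and CountSetPixels are hypothetical helper names, not part of HotPDF.

```pascal
program ScanlineSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

// Bytes needed to store one packed 1-bit-per-pixel row of the given width.
function BytesPerRow(Width: Integer): Integer;
begin
  Result := (Width + 7) div 8;
end;

// True when the bit for column X is set.
// Assumption: MSB-first packing (bit 7 of byte 0 is column 0). Whether a set
// bit means black or white depends on the decoder's output convention.
function PixelIsSet(const Row: TBytes; X: Integer): Boolean;
begin
  Result := (Row[X div 8] and ($80 shr (X mod 8))) <> 0;
end;

// Counts the set bits in the first Width columns of a packed row,
// ignoring any padding bits in the last byte.
function CountSetPixels(const Row: TBytes; Width: Integer): Integer;
var
  X: Integer;
begin
  Result := 0;
  for X := 0 to Width - 1 do
    if PixelIsSet(Row, X) then
      Inc(Result);
end;

var
  Row: TBytes;
begin
  // Hypothetical 10-pixel-wide row: bits 1010 0000 11 followed by 6 padding bits
  SetLength(Row, BytesPerRow(10));
  Row[0] := $A0;
  Row[1] := $C0;
  WriteLn('Bytes per row: ', BytesPerRow(10)); // prints "Bytes per row: 2"
  WriteLn('Set pixels: ', CountSetPixels(Row, 10)); // prints "Set pixels: 4"
end.
```

Working on one packed row at a time keeps the working set to a single row buffer, which matches the row-by-row access pattern of the usage example.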

Class Reference

THPDFJBIG2Decoder

LoadFromByteArray: Loads JBIG2 data from a byte array; returns True on success
GetScanline: Retrieves the data for a given row (zero-based) into the supplied buffer; returns True on success
Width/Height: Image dimension properties

Performance Optimization

Memory-efficient scanline processing
Optimized for large document handling
Minimal memory footprint during decompression
Fast pattern recognition algorithms

Standards Compliance

ITU-T T.88 standard compliance
ISO/IEC 14492 standard support
PDF/A archival format compatibility
Cross-platform consistency
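
As a small illustration of the file format defined by ITU-T T.88, a standalone JBIG2 file begins with a fixed 8-byte ID string, while JBIG2 streams embedded in PDF omit the file header. The sketch below checks a buffer for that ID string before it is handed to a decoder. HasJBIG2FileHeader is a hypothetical helper, not part of HotPDF, and this document does not state whether THPDFJBIG2Decoder expects standalone files or embedded streams.

```pascal
program Jbig2SignatureSketch;

{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

const
  // ID string that opens a standalone JBIG2 file (ITU-T T.88, Annex D).
  JBIG2_FILE_ID: array[0..7] of Byte =
    ($97, $4A, $42, $32, $0D, $0A, $1A, $0A);

// Hypothetical helper: True when Data begins with the JBIG2 file ID string.
function HasJBIG2FileHeader(const Data: TBytes): Boolean;
var
  I: Integer;
begin
  Result := Length(Data) >= Length(JBIG2_FILE_ID);
  if Result then
    for I := 0 to High(JBIG2_FILE_ID) do
      if Data[I] <> JBIG2_FILE_ID[I] then
      begin
        Result := False;
        Break;
      end;
end;

var
  Sample: TBytes;
  I: Integer;
begin
  // Build a sample buffer: the ID string plus one arbitrary trailing byte.
  SetLength(Sample, 9);
  for I := 0 to 7 do
    Sample[I] := JBIG2_FILE_ID[I];
  Sample[8] := $01;
  WriteLn(HasJBIG2FileHeader(Sample)); // buffer starts with the ID string

  Sample[0] := $00;
  WriteLn(HasJBIG2FileHeader(Sample)); // first byte no longer matches
end.
```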

See Also