LZW Compression Support
Overview
HotPDF now includes LZW (Lempel-Ziv-Welch) compression support for PDF streams.
LZW is a lossless data compression algorithm that is particularly effective for text and simple graphics.
PDF documents can use it, through the LZWDecode filter, to compress stream data.
Key Features
- Full LZW decompression support for PDF streams
- Configurable predictor support for streams that were encoded with a predictor
- Multiple fill order options (foTop, foBottom) that select the bit order used when reading codes
- Early code change support for compatibility
- Memory-efficient stream processing
- PDF parameter integration (Predictor, Colors, BitsPerComponent, Columns)
Technical Implementation
The LZW compression support is implemented through the HPDFLZW.pas unit, which provides:
- TPDFLZWDecompressor: Main decompression class
- TPDFLZWParms: Parameter structure for PDF-specific settings
- TPDFLZWFillOrder: Enumeration for bit order processing
Class Reference
TPDFLZWDecompressor
The main class for LZW decompression operations.
Properties:
- FillOrder: Specifies bit order (foBottom, foTop)
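- EarlyChange: Controls whether the code size increases one code early (corresponds to the PDF EarlyChange parameter, whose default in PDF is 1)
- InitialCodeSize: Initial code size in bits (9 for PDF LZWDecode streams)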
Methods:
- Decompress(Input: AnsiString): AnsiString - Basic decompression
- Decompress(Input: AnsiString; Parms: TPDFLZWParms): AnsiString - Decompression with PDF parameters
Usage Examples
Basic LZW Decompression
// Delphi example - Basic LZW decompression
// Requires HPDFLZW in the uses clause; ProcessDecompressedData is a placeholder for application code.
procedure DecompressLZWData(const CompressedData: AnsiString);
var
  Decompressor: TPDFLZWDecompressor;
  DecompressedData: AnsiString;
begin
  Decompressor := TPDFLZWDecompressor.Create;
  try
    // Configure decompressor
    Decompressor.FillOrder := foTop;
    Decompressor.EarlyChange := True;
    Decompressor.InitialCodeSize := 9;
    // Decompress the data passed in by the caller
    DecompressedData := Decompressor.Decompress(CompressedData);
    // Use decompressed data
    ProcessDecompressedData(DecompressedData);
  finally
    Decompressor.Free;
  end;
end;
Advanced LZW Decompression with PDF Parameters
// Delphi example - Advanced LZW decompression with PDF parameters
// Requires HPDFLZW in the uses clause; ProcessImageData is a placeholder for application code.
procedure DecompressLZWWithParms(const CompressedData: AnsiString);
var
  Decompressor: TPDFLZWDecompressor;
  DecompressedData: AnsiString;
  Parms: TPDFLZWParms;
begin
  Decompressor := TPDFLZWDecompressor.Create;
  try
    // Configure PDF parameters
    Parms.Predictor := 2;           // Horizontal differencing predictor
    Parms.Colors := 3;              // RGB color space
    Parms.BitsPerComponent := 8;    // 8 bits per component
    Parms.Columns := 100;           // Image width in pixels
    Parms.ExpandedTo8Bit := True;   // Expand to 8-bit components
    Parms.ColorSpace := 'DeviceRGB';
    // Decompress with parameters
    DecompressedData := Decompressor.Decompress(CompressedData, Parms);
    // Process the decompressed image data
    ProcessImageData(DecompressedData, Parms);
  finally
    Decompressor.Free;
  end;
end;
PDF Parameter Support
TPDFLZWParms Structure
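- Predictor: Predictor function (0=none, 2=horizontal differencing, i.e. TIFF Predictor 2). Note that the PDF specification itself uses 1 as the "no prediction" default.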
- Colors: Number of color components (1=grayscale, 3=RGB, 4=CMYK)
- BitsPerComponent: Bits per color component (1, 2, 4, 8, 16)
- Columns: Number of samples per row
- ExpandedTo8Bit: Whether to expand to 8-bit components
- ColorSpace: Color space identifier
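To illustrate what Predictor 2 means for the decoded bytes, here is a minimal, self-contained sketch of undoing horizontal differencing for 8-bit components. It is not HotPDF's code: the procedure name UndoHorizontalDifferencing and the open-array interface are assumptions made for this example, and other BitsPerComponent values would need additional bit unpacking.

```pascal
program PredictorSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

// Hypothetical helper (not part of HPDFLZW): reverses TIFF Predictor 2
// in place, assuming BitsPerComponent = 8.
// Each row holds Columns samples of Colors bytes. Within a row, every
// byte after the first sample was stored as the difference from the
// same component of the sample to its left.
procedure UndoHorizontalDifferencing(var Data: array of Byte;
  Colors, Columns: Integer);
var
  RowBytes, RowStart, I: Integer;
begin
  RowBytes := Colors * Columns;
  if RowBytes <= 0 then
    Exit;
  RowStart := 0;
  while RowStart + RowBytes <= High(Data) + 1 do
  begin
    // The first sample of each row is stored unchanged.
    for I := RowStart + Colors to RowStart + RowBytes - 1 do
      Data[I] := (Data[I] + Data[I - Colors]) and $FF;
    Inc(RowStart, RowBytes);
  end;
end;

var
  // One row of three RGB samples, stored as differences.
  Row: array[0..8] of Byte = (10, 20, 30, 2, 2, 2, 3, 3, 3);
  I: Integer;
begin
  UndoHorizontalDifferencing(Row, 3, 3);
  for I := 0 to High(Row) do
    Write(Row[I], ' ');
  WriteLn; // prints: 10 20 30 12 22 32 15 25 35
end.
```

The addition wraps modulo 256, which mirrors how the differences are taken on the encoding side.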
Algorithm Details
The LZW implementation includes:
- Dictionary-based Compression: Builds and maintains a dynamic dictionary
- Variable Code Length: Supports code lengths from 9 to 12 bits
- Clear Code Handling: Handles the clear-table code (256) and the end-of-data code (257)
- String Table Management: Efficient string table for pattern recognition
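The items above can be sketched as a small stand-alone decoder. This is an illustrative sketch only, not the HPDFLZW implementation: the names DecodeLZW and BytesToAnsi are hypothetical, the string table is a plain array of strings chosen for readability rather than speed, and codes are assumed to be packed most significant bit first.

```pascal
program LZWSketch;
{$IFDEF FPC}{$MODE DELPHI}{$H+}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

const
  ClearCode = 256;   // resets the table and the code width
  EODCode   = 257;   // end of data
  FirstFree = 258;   // first code available for new table entries
  MaxCodes  = 4096;  // 12-bit codes allow at most 4096 entries

// Copies raw bytes into an AnsiString without code page conversion.
function BytesToAnsi(const Bytes: array of Byte): AnsiString;
begin
  SetLength(Result, Length(Bytes));
  if Length(Bytes) > 0 then
    Move(Bytes[0], Result[1], Length(Bytes));
end;

// Hypothetical decoder sketch; EarlyChange mirrors the PDF parameter.
function DecodeLZW(const Data: AnsiString; EarlyChange: Boolean): AnsiString;
var
  Table: array[0..MaxCodes - 1] of AnsiString;
  BitBuf, BitCount, Position, CodeLen, NextCode, Code, Early, I: Integer;
  Prev, Entry: AnsiString;
begin
  Result := '';
  for I := 0 to 255 do
    Table[I] := AnsiChar(I);
  if EarlyChange then Early := 1 else Early := 0;
  BitBuf := 0; BitCount := 0; Position := 1;
  CodeLen := 9; NextCode := FirstFree; Prev := '';
  while True do
  begin
    // Refill the bit buffer, most significant bit first.
    while (BitCount < CodeLen) and (Position <= Length(Data)) do
    begin
      BitBuf := (BitBuf shl 8) or Ord(Data[Position]);
      Inc(Position);
      Inc(BitCount, 8);
    end;
    if BitCount < CodeLen then
      Break; // input ended without an end-of-data code
    Dec(BitCount, CodeLen);
    Code := BitBuf shr BitCount;
    BitBuf := BitBuf and ((1 shl BitCount) - 1);

    if Code = ClearCode then
    begin
      CodeLen := 9; NextCode := FirstFree; Prev := '';
      Continue;
    end;
    if Code = EODCode then
      Break;

    if Code < NextCode then
      Entry := Table[Code]
    else if (Code = NextCode) and (Prev <> '') then
      Entry := Prev + Prev[1] // code refers to the entry being built
    else
      raise Exception.CreateFmt('Invalid LZW code %d', [Code]);
    Result := Result + Entry;

    // Every code except the first after a clear adds one table entry.
    if (Prev <> '') and (NextCode < MaxCodes) then
    begin
      Table[NextCode] := Prev + Entry[1];
      Inc(NextCode);
    end;
    Prev := Entry;

    // Widen the code from 9 up to 12 bits as the table grows.
    if NextCode + Early >= 2048 then CodeLen := 12
    else if NextCode + Early >= 1024 then CodeLen := 11
    else if NextCode + Early >= 512 then CodeLen := 10
    else CodeLen := 9;
  end;
end;

begin
  // Nine bytes holding the 9-bit codes 256, 45, 258, 258, 65, 259, 66, 257.
  WriteLn(DecodeLZW(BytesToAnsi([$80, $0B, $60, $50, $22, $0C, $0C, $85, $01]), True));
  // prints: -----A---B
end.
```

Building the input from a byte array rather than a string literal avoids code page conversion of bytes above $7F in Unicode-enabled Delphi versions.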
Performance Characteristics
- Memory Efficient: Optimized for large data streams
- Fast Decompression: Highly optimized decompression algorithms
- Predictable Performance: Consistent performance across different data types
- Low Memory Footprint: Minimal memory overhead during processing (with 12-bit codes the string table never exceeds 4096 entries)
Common Use Cases
- PDF content stream decompression
- Image data decompression in PDF files
- Text stream decompression
- Form data decompression
- PostScript stream decompression
Error Handling
The LZW decompressor includes robust error handling for:
- Invalid code sequences
- Corrupted data streams
- Memory allocation failures
- Dictionary overflow conditions
Standards Compliance
- PDF 1.2+ LZWDecode filter compliance
- PostScript Level 2 LZW compatibility
- TIFF LZW compression compatibility
- Adobe LZW implementation compatibility