Enhanced ZLib Compression Support

Overview

HotPDF now includes ZLib-compatible compression support for PDF streams and content, backed by a bundled copy of zlib-ng. The API provides deflate/inflate compression, which PDF documents use for the FlateDecode filter, and it can also be used for general data compression.

Key Features

  • ZLib-compatible API backed by bundled zlib-ng
  • PDF FlateDecode parameters support
  • Stream-based compression and decompression
  • Enhanced error handling and debugging capabilities
  • Memory-efficient processing for large data streams
  • Win64 one-shot image compression bridge for zlib-ng length ABI safety
  • Win64x and diagnostic MSVC Win64 zlib-ng runtime SIMD dispatch with generic fallbacks
  • 32-byte aligned C allocation bridge for SIMD-friendly native buffers
  • Cross-platform C library integration
  • Multiple compression levels (none, fastest, default, maximum)

Technical Implementation

The enhanced ZLib support is implemented through the HPDFZLib.pas unit, which provides:

  • Complete ZLib API: Full implementation of ZLib functions
  • Stream Classes: TCustomZStream and derived classes for stream processing
  • PDF Integration: TPDFFlateParms for PDF-specific parameters
  • Enhanced Functions: InflateStr, InflateStrParms for simplified usage

The 64-bit zlib-ng object sets include runtime-selected SSE2, SSSE3, SSE4.1, SSE4.2, PCLMULQDQ, and AVX2 paths. Win32 uses the generic zlib-ng object set to stay compatible with the bcc32c OMF toolchain, while Win32 JPEG/TIFF objects remain on classic bcc32.

Regression Coverage

The zlib-ng Flate backend is covered by the automated Delphi and C++Builder regression suites. The current validation set passes 20 Delphi DUnitX tests on Win32 and 20 on Win64, plus 17 C++Builder GoogleTest tests on Win32 and 17 on Win64x. The native object rebuilds also pass from clean outputs: 206 Win32 objects, 205 Win64x objects, and 206 MSVC Win64 objects. The covered workflows include page-stream compression, embedded font and ToUnicode CMap compression, Flate image streams, standard JPEG image placement, TIFF import, barcode-heavy output, copy/merge/edit scenarios, and demo-generated PDFs.

Compression Levels

  • Z_NO_COMPRESSION (0): No compression
  • Z_BEST_SPEED (1): Fastest compression
  • Z_BEST_COMPRESSION (9): Maximum compression
  • Z_DEFAULT_COMPRESSION (-1): Default balance of speed and compression

Usage Examples

Basic String Decompression


// Delphi example - Basic string inflation
function DecompressString(const CompressedData: AnsiString;
  DebugOutput: Boolean = False): AnsiString;
begin
  if DebugOutput then
    // Same call with debug output enabled
    Result := InflateStr(CompressedData, True)
  else
    // Simple decompression
    Result := InflateStr(CompressedData);
end;

Stream-based Decompression


// Delphi example - Stream-based decompression
procedure DecompressStream(InStream, OutStream: TStream;
  IsRawDeflate: Boolean = False);
begin
  if IsRawDeflate then
    // Raw deflate data (without ZLib headers)
    PLZDecompressStreamRaw(InStream, OutStream)
  else
    // ZLib-wrapped data: decompress from input stream to output stream
    PLZDecompressStream(InStream, OutStream);
end;
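
Which of the two calls applies depends on whether the data carries the two-byte ZLib header defined in RFC 1950. The sketch below shows one way to check for that header. LooksLikeZLibHeader is a hypothetical helper, not part of HPDFZLib, and the check is only a heuristic, because raw deflate data can occasionally start with bytes that pass it.

```pascal
// Hypothetical helper based on RFC 1950; not part of HPDFZLib.
function LooksLikeZLibHeader(const Data: AnsiString): Boolean;
var
  CMF, FLG: Byte;
begin
  Result := False;
  if Length(Data) < 2 then
    Exit;
  CMF := Ord(Data[1]);
  FLG := Ord(Data[2]);
  // Low nibble of CMF is the compression method (8 = deflate).
  // High nibble is log2(window size) - 8 and must not exceed 7.
  // CMF * 256 + FLG must be a multiple of 31 (the FCHECK bits).
  Result := ((CMF and $0F) = 8) and
            ((CMF shr 4) <= 7) and
            ((Integer(CMF) * 256 + FLG) mod 31 = 0);
end;
```

The common header bytes $78 $9C pass this test: the method is 8, the window field is 7, and $789C = 30876 = 31 × 996.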
        

PDF FlateDecode with Parameters


// Delphi example - PDF FlateDecode with parameters
function DecompressPDFStream(const CompressedData: AnsiString;
  Predictor, Colors, BitsPerComponent, Columns: Integer): AnsiString;
var
  Parms: TPDFFlateParms;
begin
  // Configure PDF parameters
  Parms.Predictor := Predictor;
  Parms.Colors := Colors;
  Parms.BitsPerComponent := BitsPerComponent;
  Parms.Columns := Columns;
  Parms.ExpandedTo8Bit := True;
  Parms.ColorSpace := 'DeviceRGB';

  // Decompress with PDF parameters
  Result := InflateStrParms(CompressedData, Parms);
end;
        

Low-level ZLib Operations


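// Delphi example - Low-level ZLib operations
// (Exception requires SysUtils in the uses clause)
procedure CompressData;
var
  strm: z_stream;
  ret: Integer;
  input: AnsiString;
  output: AnsiString;
begin
  input := 'Data to compress';
  // Leave headroom: very short or incompressible input can grow slightly
  SetLength(output, Length(input) * 2 + 64);

  // Initialize stream
  FillChar(strm, SizeOf(strm), 0);
  ret := deflateInit(strm, Z_DEFAULT_COMPRESSION);
  if ret <> Z_OK then
    raise Exception.CreateFmt('deflateInit failed (code %d)', [ret]);

  try
    // Set input
    strm.next_in := PByte(PAnsiChar(input));
    strm.avail_in := Length(input);

    // Set output
    strm.next_out := PByte(PAnsiChar(output));
    strm.avail_out := Length(output);

    // Compress the whole input in a single call
    ret := deflate(strm, Z_FINISH);

    if ret = Z_STREAM_END then
      // Compression successful: trim output to the bytes actually written
      SetLength(output, Length(output) - Integer(strm.avail_out))
    else
      raise Exception.CreateFmt('deflate failed (code %d)', [ret]);
  finally
    // Always release the stream state
    deflateEnd(strm);
  end;
end;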
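
The one-shot call above assumes the whole input and output fit in memory. For larger data, the same functions can be driven in a loop over fixed-size buffers. The following is a minimal sketch under stated assumptions: that HPDFZLib exports Z_NO_FLUSH and Z_STREAM_ERROR with their standard zlib meaning, that deflateInit and deflate have the signatures used above, and that Source.Read returns fewer bytes than requested only at the end of the stream. DeflateChunked is a hypothetical name.

```pascal
// Sketch only; requires Classes, SysUtils and HPDFZLib in the uses clause.
procedure DeflateChunked(Source, Dest: TStream; Level: Integer);
const
  ChunkSize = 64 * 1024;
var
  strm: z_stream;
  InBuf, OutBuf: array of Byte;
  ret, flushMode, bytesRead, have: Integer;
begin
  SetLength(InBuf, ChunkSize);
  SetLength(OutBuf, ChunkSize);
  FillChar(strm, SizeOf(strm), 0);
  ret := deflateInit(strm, Level);
  if ret <> Z_OK then
    raise Exception.CreateFmt('deflateInit failed (code %d)', [ret]);
  try
    repeat
      // Feed the next chunk; a short read marks the final chunk
      bytesRead := Source.Read(InBuf[0], ChunkSize);
      strm.next_in := @InBuf[0];
      strm.avail_in := bytesRead;
      if bytesRead < ChunkSize then
        flushMode := Z_FINISH
      else
        flushMode := Z_NO_FLUSH;

      // Drain the output buffer until deflate leaves free space in it
      repeat
        strm.next_out := @OutBuf[0];
        strm.avail_out := ChunkSize;
        ret := deflate(strm, flushMode);
        if ret = Z_STREAM_ERROR then
          raise Exception.Create('deflate reported a stream state error');
        have := ChunkSize - Integer(strm.avail_out);
        if have > 0 then
          Dest.WriteBuffer(OutBuf[0], have);
      until strm.avail_out <> 0;
    until flushMode = Z_FINISH;
  finally
    deflateEnd(strm);
  end;
end;
```

Level can be any of the constants listed under Compression Levels. Memory use stays at two 64 KB buffers regardless of the input size.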
        

PDF FlateDecode Parameters

TPDFFlateParms Structure

  • Predictor: Predictor function (1=no prediction, 2=TIFF, 10-15=PNG)
  • Colors: Number of color components
  • BitsPerComponent: Bits per color component
  • Columns: Number of samples per row
  • ExpandedTo8Bit: Whether to expand to 8-bit components
  • ColorSpace: Color space identifier
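
As a rough illustration of how these fields interact, the sketch below computes the byte length of one decoded row and the size of the inflated data before the predictor is undone. It follows the PDF specification's definition of the fields rather than HotPDF's internal code, and both function names are hypothetical.

```pascal
// Hypothetical helpers; they mirror the PDF specification, not HPDFZLib internals.

// Bytes in one decoded row of samples, rounded up to a whole byte.
function FlateRowBytes(Colors, BitsPerComponent, Columns: Integer): Integer;
begin
  Result := (Colors * BitsPerComponent * Columns + 7) div 8;
end;

// Bytes of inflated data expected before the predictor is undone.
// PNG predictors (10..15) prefix every row with a one-byte filter tag;
// predictor 1 (none) and predictor 2 (TIFF) add no per-row overhead.
function FlatePredictedSize(Predictor, Colors, BitsPerComponent,
  Columns, Rows: Integer): Integer;
var
  RowBytes: Integer;
begin
  RowBytes := FlateRowBytes(Colors, BitsPerComponent, Columns);
  if Predictor >= 10 then
    Inc(RowBytes);
  Result := RowBytes * Rows;
end;
```

For an 8-bit RGB image 100 samples wide, FlateRowBytes(3, 8, 100) is 300, and with a PNG predictor each row occupies 301 bytes in the inflated data.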

Enhanced Features

  • Memory Management: Optimized memory allocation and deallocation
  • Error Handling: Comprehensive error checking and reporting
  • Debug Support: Optional debug output for troubleshooting
  • Performance Optimization: zlib-ng backend with runtime SIMD dispatch on 64-bit targets
  • Stream Integration: Seamless integration with Delphi stream classes

Performance Benefits

  • Fast compression and decompression, with runtime-selected SIMD paths on 64-bit targets
  • Good compression ratios for text and simple graphics
  • Low memory footprint through stream-based processing
  • Well suited to PDF content streams
  • Low CPU overhead at the faster compression levels

Standards Compliance

  • RFC 1950 (ZLib format specification)
  • RFC 1951 (Deflate compression algorithm)
  • PDF FlateDecode filter specification
  • PNG predictor algorithm support
  • TIFF predictor algorithm support
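
To make the two predictor families concrete, here is a rough sketch of the reverse step a decoder applies after inflating, for 8-bit components only. The routines follow the TIFF horizontal-differencing and PNG "Up" filter definitions. They are hypothetical illustrations, not HPDFZLib code.

```pascal
// Hypothetical illustrations for 8-bit components; not HPDFZLib code.

// TIFF predictor 2: each byte was stored as the difference from the
// same color component of the pixel to its left.
procedure UndoTiffPredictor8(var Row: array of Byte; Colors: Integer);
var
  i: Integer;
begin
  for i := Colors to High(Row) do
    Row[i] := (Row[i] + Row[i - Colors]) and $FF;
end;

// PNG filter type 2 (Up): each byte was stored as the difference from
// the byte directly above it. PrevRow must be at least as long as Row;
// for the first row it is all zeros.
procedure UndoPngUp(var Row: array of Byte; const PrevRow: array of Byte);
var
  i: Integer;
begin
  for i := 0 to High(Row) do
    Row[i] := (Row[i] + PrevRow[i]) and $FF;
end;
```

Both routines work in place and wrap modulo 256. Row holds the sample bytes only, so the leading PNG filter-tag byte has to be stripped before calling UndoPngUp.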

Error Codes and Handling

  • Z_OK (0): Success
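  • Z_STREAM_END (1): End of stream reached (normal completion, not an error)
  • Z_NEED_DICT (2): A preset dictionary is required to continue inflating
  • Z_ERRNO (-1): File or I/O error
  • Z_STREAM_ERROR (-2): Inconsistent stream state or invalid parameter
  • Z_DATA_ERROR (-3): Input data is corrupted or not in the expected format
  • Z_MEM_ERROR (-4): Memory allocation error
  • Z_BUF_ERROR (-5): No progress possible, usually because the output buffer is full or no input remains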
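
When logging or raising exceptions, it can help to turn these numeric codes into readable text. The function below is a small sketch of that idea. ZStatusText is a hypothetical helper, not part of HPDFZLib, and it uses the literal values listed above so that it does not depend on any particular unit.

```pascal
// Hypothetical helper; requires SysUtils for IntToStr.
function ZStatusText(Code: Integer): string;
begin
  case Code of
     0: Result := 'Z_OK: success';
     1: Result := 'Z_STREAM_END: end of stream reached';
     2: Result := 'Z_NEED_DICT: preset dictionary required';
    -1: Result := 'Z_ERRNO: file or I/O error';
    -2: Result := 'Z_STREAM_ERROR: stream state error';
    -3: Result := 'Z_DATA_ERROR: data integrity error';
    -4: Result := 'Z_MEM_ERROR: memory allocation error';
    -5: Result := 'Z_BUF_ERROR: buffer error, no progress possible';
  else
    // Any value not listed in this document
    Result := 'Unknown zlib status code ' + IntToStr(Code);
  end;
end;
```

A typical use is in an error path, for example raise Exception.Create(ZStatusText(ret)) after a call that returns something other than Z_OK or Z_STREAM_END.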

See Also