LZSSWCompressor

class tinycompress.lzssw.LZSSWCompressor(ringsize=4096, maxmatchlen=16, commonbyte=32)[source]

LZSSW compression implementation.

This compressor implements the LZSSW algorithm using a sliding window approach. It maintains a ring buffer and binary tree structure to efficiently find repeated sequences in previously processed data.

The compressor can be tuned via three main parameters: - Ring buffer size: Controls look-back distance for finding matches - Maximum match length: Limits how long matched sequences can be - Common byte: Value used to initialize the ring buffer

Attributes

eof

Indicates whether the compressor has reached end of file state.

Methods

__init__

Initializes a new LZSSW compressor instance.

compress

Compresses the given data using LZSSW encoding.

flush

Flushes the compressor and returns any remaining compressed data.

reset

Resets the compressor to its initial state.

__init__(ringsize=4096, maxmatchlen=16, commonbyte=32)[source]

Initializes a new LZSSW compressor instance.

Parameters:
  • ringsize (int) – Size of the ring buffer, must be between RING_SIZE_MIN and RING_SIZE_MAX. Larger sizes allow finding matches further back.

  • maxmatchlen (int) – Maximum length of matched sequences, between MAX_MATCH_LEN_MIN and MAX_MATCH_LEN_MAX. Larger values can improve compression but increase processing time.

  • commonbyte (int) – Value used to fill the initial ring buffer, must be 0-255. Should be set to a commonly occurring byte value in the input.

Raises:

ValueError – If any parameters are out of their valid ranges.

__weakref__

list of weak references to the object

_delete_node(p)[source]

Removes a node from the binary tree.

This internal method removes a position from the binary tree when it moves out of the sliding window, maintaining the tree’s structure.

Parameters:

p (int) – Position in the ring buffer to remove from the tree.

The method handles all cases of node removal (leaf nodes, nodes with one child, and nodes with two children) while preserving the tree balance.

_init_tree()[source]

Initializes the binary tree for match finding.

This internal method sets up the initial state of the binary tree by: 1. Setting right-hand links for all leaf nodes to point to ringsize 2. Setting all parent pointers to ringsize to mark nodes as unconnected

The binary tree structure is used to efficiently find matches in the ring buffer during compression. Each node in the tree represents a position in the buffer, with the tree organized to enable quick string comparisons.

_insert_node(r)[source]

Inserts a new node into the binary tree.

This internal method adds a new position in the ring buffer to the binary tree, maintaining the tree’s balance and match-finding properties.

Parameters:

r (int) – Position in the ring buffer to insert into the tree.

The method also updates the match length and position if it finds a better match while traversing the tree during insertion.

_iter_encode()[source]

Generator that implements the main compression loop.

This internal method is the core of the LZSSW compression algorithm. It: 1. Initializes the sliding window with the common byte 2. Processes input bytes one at a time 3. Maintains the binary tree to find matches 4. Outputs literal bytes or match position/length pairs 5. Updates the ring buffer as compression progresses

Yields:

None on each iteration, accepting the next input byte or None/Ellipsis to indicate end of input.

compress(data)[source]

Compresses the given data using LZSSW encoding.

The method feeds data to the compression generator, which looks for matches in the sliding window and outputs either literal bytes or match references.

Parameters:

data (Union[bytes, bytearray, memoryview, Iterable[int]]) – Input data to compress.

Returns:

Compressed data as bytearray.

Raises:

LZSSWException – If the compressor has already been flushed.

property eof: bool

Indicates whether the compressor has reached end of file state.

Returns:

bool – True if in EOF state (initialization state < 0), False otherwise.

flush()[source]

Flushes the compressor and returns any remaining compressed data.

This method signals the end of input to the compressor, which may output additional compressed data based on its internal state. After calling flush, the compressor cannot be used for further compression without calling reset.

Returns:

Any remaining compressed data as a bytearray, or empty if already flushed.

reset()[source]

Resets the compressor to its initial state.

Clears all internal buffers, match tracking variables, and the binary tree. This allows the compressor to be reused for a new compression task.