Compression Sort Transform

The New Sort Transform Data Compression Algorithm Explained

by Jason Williams (jasonw@tdv.com)
written 10 May 1993

This article is classified "Fictional"


Recent developments in data compression technology, such as the JPEG scheme,
have mostly involved applying existing compression techniques to data after
passing it through a special transform that renders the data more
compressible.

Most of us have heard of discrete cosine transforms, wavelet transforms,
and power transformers, but a new field of compression transform technology
is rapidly emerging - the Compression Sort Transform.

The mathematical complexity of this type of transform is extremely high,
demanding an IQ of at least 60 for basic comprehension, so we will not go
into the details in this article, but instead refer the topmost 2% of our
readership to some interesting background material: The Art of Computer
Programming, Volume 3: Sorting and Searching by D.E. Knuth, and Well I'll
Be Damned, It Works! by S. Woolford.

The basic principle of the Compression Sort Transform goes like this:

Text and data, says Woolford, are typically stored as bytes of information.
Normally, if we compress this information, the sudden changes in byte value
across a sentence or an X-rated picture confuse even the best compression
algorithms, often resulting in poor compression performance and frustration
when you can only fit 5 X-rated pictures onto a disc.

However, if we sort the bytes making up our data by ascending value, the
file is rearranged into 256 zones of data, such that all of the bytes in
each zone are equal.
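
For readers who prefer code to prose, the transform itself amounts to
little more than a counting sort.  The C sketch below is our own
reconstruction, not Woolford's actual code, and the names are ours:

    /*
     * A sketch of the Compression Sort Transform, assuming plain unsigned
     * bytes.  This is our own reconstruction; Woolford's code has not been
     * published.
     */
    #include <stddef.h>
    #include <string.h>

    /* Rearrange buf[0..len-1] into 256 ascending zones of equal bytes. */
    void sort_transform(unsigned char *buf, size_t len)
    {
        size_t counts[256] = {0};
        size_t i, pos = 0;
        int v;

        for (i = 0; i < len; i++)       /* count how many of each byte value */
            counts[buf[i]]++;

        for (v = 0; v < 256; v++) {     /* rewrite the buffer zone by zone */
            memset(buf + pos, v, counts[v]);
            pos += counts[v];
        }
    }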

A simple pass over this data with a modified run-length compression engine
(details of which we are unable to disclose) will result in any file of up
to 4 gigabytes being compressed to 1024 bytes.  With a smart compressor,
most files can be reduced to even less than this.
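
We are, as noted, unable to disclose the modified run-length engine itself,
but the published figures suggest one plausible layout: a table of 256 run
lengths, four bytes each, which comes to exactly 1024 bytes for any input
of up to 4 gigabytes.  The C sketch below is written under that assumption:

    /*
     * A guess at the "modified run-length compression engine": the
     * compressed file is simply the length of each of the 256 zones,
     * stored as a 32-bit count, for a fixed 1024 bytes of output.  The
     * real engine's details remain undisclosed; this layout is an
     * assumption.
     */
    #include <stddef.h>
    #include <stdint.h>

    /* sorted: output of sort_transform(); runs: 256 32-bit run lengths. */
    void run_length_pass(const unsigned char *sorted, size_t len,
                         uint32_t runs[256])
    {
        size_t i = 0;
        int v;

        for (v = 0; v < 256; v++)
            runs[v] = 0;

        while (i < len) {               /* walk the sorted data run by run */
            unsigned char value = sorted[i];
            size_t start = i;
            while (i < len && sorted[i] == value)
                i++;
            runs[value] = (uint32_t)(i - start);
        }
    }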

Researchers are currently working on a binary adaptation of this algorithm
which sorts bits instead of bytes.  Once operational, this method should be
able to compress any binary stream of up to 4 billion bits into less than
8 bytes of storage!
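
The binary adaptation is, if anything, even simpler: once the bits are
sorted, every 0 bit precedes every 1 bit, so two 32-bit counts (8 bytes in
total) describe any stream of up to 4 billion bits.  The sketch below is
our own illustration of that idea, not the researchers' prototype:

    /*
     * A sketch of the bit-level variant: after sorting, all the 0 bits
     * precede all the 1 bits, so two 32-bit counts suffice.  Our own
     * illustration, not the researchers' code.
     */
    #include <stddef.h>
    #include <stdint.h>

    void bit_sort_transform(const unsigned char *buf, size_t len,
                            uint32_t *zeros, uint32_t *ones)
    {
        size_t i;
        int b;

        *zeros = 0;
        *ones = 0;
        for (i = 0; i < len; i++)       /* tally every bit in the stream */
            for (b = 0; b < 8; b++) {
                if ((buf[i] >> b) & 1)
                    (*ones)++;
                else
                    (*zeros)++;
            }
    }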

Unlike many other transforms, this one entirely preserves the exact value
of each byte or bit, so the compression is not "lossy," a significant
advantage.

The beauty of this system lies in its speed and quality -- products are
already available which replace the full Sort Transform and run-length
compression passes with a single "Accumulator" pass that approximates the
full transform reasonably well (exactly, in fact) and executes extremely
quickly.
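
As far as we can tell, the "Accumulator" pass works roughly as follows: a
single histogram pass over the unsorted data yields the same 256 run
lengths as the full sort followed by run-length coding, which is why the
approximation is exact.  The vendors' actual products are proprietary, so
the sketch below is merely our guess:

    /*
     * The "Accumulator" pass as we understand it: one histogram pass over
     * the unsorted data produces the same 256 run lengths as the full sort
     * followed by run-length coding, with no sorting at all.  A guess at
     * proprietary code, not the products themselves.
     */
    #include <stddef.h>
    #include <stdint.h>

    void accumulator_pass(const unsigned char *buf, size_t len,
                          uint32_t counts[256])
    {
        size_t i;

        for (i = 0; i < 256; i++)
            counts[i] = 0;
        for (i = 0; i < len; i++)       /* one pass, no sorting required */
            counts[buf[i]]++;
    }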

New products based upon early prototypes of this technique are already
flooding the data compression market, and millions of computer users are
benefiting from the sudden huge gain in data storage capacity.  Hard drive
manufacturers have resumed the manufacture of 5MB drives, and some large
drive manufacturers have already announced the end of production of drives
of more than 20MB capacity, regarding such large capacities as "pointless
in the light of new data compression technologies."

The newly formed Las Stinken Richos research establishment in Rio de Janeiro
stated in a recent press release: "A decompressor for the Sort Transform
compression algorithm is forthcoming."

See also:
  • Binary
  • Poke And Hope Programming
  • RSA Broken By The Russians?
  • Random Sort Method
