Monthly Archives: October 2012

An Easy to Understand Explanation of the Burrows Wheeler Transform

Data Compression is one of the key components that has allowed an explosion in communications and the ability to store massive amounts of data.

There are many ways to compress data, however, one of the most elegant and effective is the Burrows-Wheeler Transform. In this article I will explain how the Burrows-Wheeler Transform works and offer an easy to understand java implementation.

 The Theory

An Overview of Burrows-Wheeler
The Burrows-Wheeler Transform is a text transformation that takes an input string and produces an output string with a high amount of repeating characters. For example the word “banana” is transformed into “annb$aa” ( the “$” will be explained shortly). This output can be efficiently compressed.

The magic of Burrows-Wheeler is it’s ability to create output that contains long runs of identical characters and it’s ability to transform the output back to it’s original state. I’ll explain how this works in two sections, Burrows-Wheeler Forward and Burrows-Wheeler Back.
Continue reading