Geek Page - Wavelet Image Compression

Beating the bandwidth bottleneck.

The military needs to send real-time video to hand-held receivers using only a 4800-baud satellite link. Sixteen Gbytes of remote sensing imagery need to be distributed on a single CD-ROM. These are examples of the bandwidth bottleneck that is holding back many multimedia applications. But a previously obscure mathematical tool known as wavelet analysis may eliminate the problem.

According to a growing number of proponents, wavelets allow unprecedented image-compression ratios at landmark speeds; they promise to surpass alternative techniques such as JPEG or fractal compression. However, the technique is only now moving from the mathematics community into industry.

The concept behind image compression is the same no matter what method is used. Any image can be described by listing the color of each pixel. But that's a waste of space. If a group of neighboring pixels is the same color, it is more efficient to use a single description for the region. Take it one step further. What about a group of pixels that are almost the same color? By replacing them with their average, we distort the image only slightly, and the description becomes much more compact. This is known as lossy compression and is acceptable for most imaging applications.
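To make the idea concrete, here is a minimal Python sketch (with made-up gray values) of replacing a nearly uniform region with its average:

```python
# Four neighboring gray values that are almost identical (hypothetical data).
region = [200, 202, 198, 201]

# Store one number instead of four; the average differs from each original
# pixel by at most about 2 gray levels, an error too small to notice.
average = sum(region) / len(region)
print(average)   # 200.25
```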

What does vary among compression methods is how the regions of similarly colored pixels are detected and described. The technique used by wavelet compression represents an image in terms of special mathematical functions called wavelets. For example, an image that is black on the left and white on the right could be succinctly represented by the mathematical function color(x,y) = 0 if x < 0.5 and color(x,y) = 1 if x > 0.5. Of course, most images are much more complex, and require more elaborate functions to describe. This is where wavelets come into their own. Because wavelets can be flexibly shaped and molded to describe regions in terms of averages, they are the perfect building blocks to describe an image.

What makes wavelets better than older compression methods such as JPEG is their ability to adapt to the size and location of regions in the image. While JPEG works in terms of eight-by-eight squares, wavelets can describe regions of varying size, shape, and location.

How do we identify the regions that can be coalesced into a single description without significantly distorting the image? By the fast wavelet transform.

The fast wavelet transform takes an image and computes its wavelet coefficients. These numbers, combined with the wavelet function, can later be used to reconstruct the image. The coefficients are computed in different ways depending on the particular wavelet function we want to use. There are many: some smooth, others fractal-like, but they all boil down to fancy versions of averaging and differencing.

With the simplest wavelet, known as the Haar function, we take pairs of neighboring pixels and compute their average and their difference. So, for a row of pixels (14, 8, 4, 6), the averages would be (11, 5) and the differences (6, -2). The averages provide a copy of the original image at half resolution; the differences provide the wavelet coefficients at that level of resolution. Put the differences aside and continue with the half-resolution image. Again, take averages and differences, put the differences aside, and continue with the yet smaller average image. Eventually, we obtain a single overall average and all the differences - the wavelet coefficients - at the various levels of resolution.
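Here is a rough Python sketch of that averaging-and-differencing recursion, using the same unnormalized Haar convention as the example above (an illustration, not a production codec):

```python
def haar_forward(row):
    """One level of the (unnormalized) Haar transform on a row of pixels:
    pairwise averages plus pairwise differences (the detail coefficients)."""
    averages = [(a + b) / 2 for a, b in zip(row[0::2], row[1::2])]
    details = [a - b for a, b in zip(row[0::2], row[1::2])]
    return averages, details

def full_haar(row):
    """Repeat the transform on the averages until a single value remains."""
    coefficients = []
    while len(row) > 1:
        row, details = haar_forward(row)
        coefficients.append(details)
    return row[0], coefficients   # overall average, details per level

overall, details = full_haar([14, 8, 4, 6])
print(overall, details)   # 8.0 [[6, -2], [6.0]]
```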

So far, nothing has been compressed. However, when two neighboring pixels have the same value, their difference, and therefore their associated wavelet coefficient, will be zero. There is no need to store this zero. We can also throw away coefficients that are close to zero without significantly distorting the image. We can then encode the remaining coefficients and transmit them along with the overall average value.
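A small sketch of that thresholding step, applied to the coefficients from the example above (the cutoff value is an arbitrary choice for illustration):

```python
def threshold(details_per_level, cutoff):
    """Zero out detail coefficients whose magnitude is below the cutoff.
    Long runs of zeros are what make the encoded coefficients compact."""
    return [[d if abs(d) >= cutoff else 0 for d in level]
            for level in details_per_level]

print(threshold([[6, -2], [6.0]], cutoff=3.0))   # [[6, 0], [6.0]]
```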

On the other end, the image can be reconstructed by performing the inverse of the original transform: we start with the overall average, add in the differences for that level of resolution, and then repeat the process until we have expanded the image back to its original size. Some detail will be lost because of the discarded coefficients, but the important regions - such as object edges, where color differences and coefficients are large - will have been preserved.
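Reconstruction simply runs the recursion in reverse. Here is a companion sketch to the forward transform above, shown with both the exact and the thresholded coefficients:

```python
def haar_inverse(averages, details):
    """Invert one level: recover each pair of pixels from its average a
    and difference d, since the pair is (a + d/2, a - d/2)."""
    row = []
    for a, d in zip(averages, details):
        row.extend([a + d / 2, a - d / 2])
    return row

def full_inverse(overall, details_per_level):
    """Rebuild the pixel row from the overall average and all the details."""
    row = [overall]
    for details in reversed(details_per_level):
        row = haar_inverse(row, details)
    return row

# Exact coefficients give back the original row ...
print(full_inverse(8.0, [[6, -2], [6.0]]))   # [14.0, 8.0, 4.0, 6.0]
# ... while thresholded coefficients give a close, slightly smoothed copy.
print(full_inverse(8.0, [[6, 0], [6.0]]))    # [14.0, 8.0, 5.0, 5.0]
```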

Of course, a slightly more complex technique will be used in real-world applications. The wavelet function, for example, will use a much fancier version of averaging and differencing that operates on more than two pixels at a time. But the basic ideas remain the same.

Several considerations are important when selecting a compression algorithm. The most obvious is space saved, and wavelets give consistently better results than other methods. Just as important is the time required for encoding and decoding. While all compression algorithms perform better when given more time, wavelets achieve superb results with encoding times only about twice their decoding times. Compare this with fractal compression, which decompresses quickly but requires Herculean effort to compress. This makes wavelets particularly advantageous for applications such as live video, in which we need to provide compression on the fly.

These advantages are attracting an increasing number of companies that need cutting-edge compression. Magnavox, for example, is incorporating wavelet compression into a number of its video products. The next step will be for this mathematical tool to prove itself in the entrenched world of standards committees.

Peter Schröder (ps@math.scarolina.edu), whose license plate reads "WAVELET," is a postdoc at the University of South Carolina. Wim Sweldens also contributed to this article.