FRACTAL IMAGE COMPRESSION
Ironically, it was one of Barnsley's PhD students, Arnaud Jacquin, that
made the Graduate Student Algorithm obsolete. In March 1988, according to
Barnsley, he published a modified scheme for representing images called Partitioned
Iterated Function Systems (PIFS). This broke the ice for a new approach of
research in image encoding.
It is obvious that the output images converge to the Sierpinski triangle. This final image is called attractor for this photocopying machine. Any initial image (letter 'A' in our case) will be transformed to the attractor if we repeatedly run the machine. On the other words, the attractor for this machine is always the same image without regardless of the initial image. This feature is one of the keys to the fractal image compression.
How can we describe behaviour of the machine ? Transformations of the form as follows will help us.
Such transformations are called affine transformations. Affine transformations are able to skew, stretch, rotate, scale and translate an input image.
Barnsley suggested that perhaps storing images as collections of transformations could lead to image compression. His argument went from possibility to describe the image as a few parameters of affine transformations. If we have parameters of affine transformations (or photocopying machine) we can produce the image.
The machine produces self-similarity images called fractals.We need only a few parameters of affine transformations to produce Sierpinski triangle from any initial image. But, how to find parameters for any image ? (that is inverse problem mentioned earlier) Barnsley's dream is to discover algorithm for calculating these parameters. Unfortunately, images from the nature are very inapropriate for generating using affine transformations because natural images are not exactly self-similar. His dream had never came true.
However, the trick exists. But, first of all, lets describe Iterated Function Systems because Fractal Image Compression is based on it.
Iterated Function Systems
Behaviour of the photocopying machine is described with mathematical model called an Iterated Function System (IFS). An iterated function system consists of a collection of contractive affine transformations. This collection of transformations defines a map
For an input set S, we can compute for each i, take the union of these sets, and get a new set W(S). Hutchinson proved an important fact in Iterated Function Systems: if the are contractive, then W is contractive. Hutchinson's theorem tells us that the map W will have a unique fixed point in the space of all images. That means, whatever image (or set) we start with, we can repeatedly apply W to it and our initial image will converge to a fixed image. Thus W completely determine a unique image.
In other words, given an input image (or set) , we can repeatedly apply W (or photocopying machine described with W) and we will get as a first copy, as a second copy and so on. The attractor, unique image, as the result of the transformations is
Self-Similarity in Images
Now, we want to find a map W which takes an input image and yields an output image. If we want to know when W is contractive, we will have to define a distance between two images. The distance is defined as
where f and g are value of the level of grey of pixel (for greyscale image), P is the space of the image, and x and y are the coordinates of any pixel. This distance defines position (x,y) where images f and g differ the most.
Natural images are not exactly self similar. Lena image, a typical image of a face, does not contain the type of self-similarity that can be found in the Sierpinski triangle. But next image shows that we can find self-similar portions of the image.
where represents the contrast, the brightness of the transformation and .
The transformation W has to be contractive in all three directions: x, y and z. That transformation will be contractive when z distances are shrunk by factor less than 1 or when each < 1 (some experiments show that 1.2 is still safe).
Now, when we have mathematical "background" we are able to consider process of encoding images.
Suppose we have an image f that we want to encode. On the other words, we want to find a collection of maps with and . That is, we want f to be the fixed point of the map W. The fixed point is
We seek a partition of f into N non-overlapped pieces of the image to which we apply the transforms and get back f. This is too optimistic, because the image is not exactly composed of pieces that can be transformed and fit somewhere else in the image. What we can expect is another image with small. On the other words, we seek a transformation W whose fixed point is not f, but it is close to.
The measure of approximation is distance
We should find pieces and maps , so that when we apply a to the part of the image over , we should get something that is very close to the any other part of the image over . Finding the pieces and corresponding by minimizing distances between them is the goal of the problem.
At the end, let consider one remarkable example:
Suppose we have to encode 256x256 pixel greyscale image. Let be the 8x8 pixel non-overlapping sub-squares of the image, and let D be the collection of all overlapping 16x16 pixel sub-squares of the image (general, domain is 4 times greater than range). The collection D contains (256-16+1) * (256-16+1) = 58,081 squares. For each search through all of D to find with minimal distances. There are 8 ways (the square can be rotated to 4 orientations or fliped and rotated into further 4 orientations) to map one square onto another. That means, we have to compare 8 * 58,081 = 464,648 squares with each of the 1024 range squares. Also, a square in D is 4 times smaller as an , so we have to average a square from D and prepare it for comparing.
To find most-likely sub-squares we have to minimize distance equation. That means we must find a good choice for that most looks like the image above (in any of 8 ways of orientations) and find a good contrast and brightness . For each , pair we can compute contrast and brightness using least squares regression which has the least root means square (rms) difference
. The squared Euclidean distance is used for this purpose
where is pixel intensities from , is pixel intensities from , and s and o are contrast and brightness respectively. To compute contrast and brightness, we have to compute a minimum of the distance. The minimum of E occurs when the partial derivatives with respect to s and o are zero
A choice of with a corresponding and determines a map . Once we have the collection we can decode the image by estimating .
Each transformation requires 8 bits in the x and y direction to locate the position of , 7 bits for , 5 bits for and 3 bits to determine a rotation and flip operation for mapping to . We need 31 bits per transformation of 8x8 pixel sub-squares (8x8x8=512 bits), giving a compression ratio 512/31 = 16.5:1.
Compression ratio is one of the parameters which determine compression algorithm. Th second parameter is peak-to-peak signal-to-noise ratio (PSNR). For 8-bit gray scale images the PSNR is defined as
Encoding Color Images
There is very little work has been published on color fractal image compression. Recommendation is that not to encode the RGB components separately but YIQ channels. YIQ channels can be encoded individually; the I and Q channels should be encode at a lower bit rate than the Y channel.
Other suggestion is to encode the green component of RGB components individually and try to predict the other (R and B) components.
Fractal Image Compression versus JPEG Compression
At high compression ratios, fractal compression has similar compression and quality performance as the JPEG standard. Let see some comparisons:
Original Lena image (184,320 bytes)