FRACTAL IMAGE COMPRESSION

This page was copied from
http://www.rasip.fer.hr/research/compress/algorithms/adv/fraccomp/index.html
1. Introduction
Ten to fifteen years ago, fractal techniques were introduced in computer
graphics. These new ideas came from a mathematical theory called Iterated
Function Systems (IFS), developed in 1981 by John Hutchinson. Michael
Barnsley, a researcher at the Georgia Institute of Technology, wrote the
popular book Fractals Everywhere. The book presents the mathematics of IFS
and a new tool for modeling techniques in computer graphics called the
Collage Theorem. The Collage Theorem states what an IFS must be like in order
to represent a given image.
This presented an intriguing possibility. If fractal mathematics is good for
generating natural-looking images (such as clouds, mountains, ...), could it
not be used to compress images? Going from a given image to an IFS that can
generate (a more or less close approximation of) the original is known as the
inverse problem. This problem remains unsolved. Barnsley, however, thought he
had it solved. He left the Institute and founded Iterated Systems
Incorporated with Alan Sloan. In January 1988 he announced, in an article in
BYTE Magazine, calculations showing high compression ratios, but the article
did not offer a solution to the inverse problem. At that time nobody knew
exactly how to reproduce images at reasonable compression rates with IFS. The
only suggestion, offered in desperation, for IFS-based compression of a given
image was the so-called Graduate Student Algorithm.
Graduate Student Algorithm:
 - Acquire a graduate student.
 - Give the student a picture, and a room with a graphics workstation.
 - Lock the door.
 - Wait until the student has reverse engineered the picture.
 - Open the door.
Ironically, it was one of Barnsley's PhD students, Arnaud Jacquin, who made
the Graduate Student Algorithm obsolete. In March 1988, according to
Barnsley, he published a modified scheme for representing images, called
Partitioned Iterated Function Systems (PIFS). This broke the ice for a new
line of research in image encoding.
The idea in Jacquin's method was very simple. An image should not be thought
of as a collage of copies of the entire image, but as a collage of copies of
smaller parts of it. For instance, a part of a cloud does not look like an
entire landscape with mountains, sky and clouds, but it probably does look
like another section of some cloud or some other part of the image.
Fractal Image Compression was born.
2. What is Fractal Image Compression?
Consider a special type of photocopying machine that reduces the input image
to half its size and reproduces it three times on the copy. What happens when
we feed the output of the machine back into its input? The next figure shows
the output images:

[Figure: successive outputs of the machine, converging to the Sierpinski triangle]
It is obvious that the output images converge to the Sierpinski triangle. This
final image is called the attractor of this photocopying machine. Any
initial image (the letter 'A' in our case) will be transformed into the
attractor if we repeatedly run the machine. In other words, the attractor of
this machine is always the same image, regardless of the initial image. This
feature is one of the keys to fractal image compression.
How can we describe the behaviour of the machine? Transformations of the
following form will help us:

w_i \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a_i & b_i \\ c_i & d_i \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} e_i \\ f_i \end{bmatrix}
Such transformations are called affine transformations. Affine
transformations are able to skew, stretch, rotate, scale and translate an
input image.
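For instance, the photocopying machine above can be written as three such
maps, each halving the image (a_i = d_i = 1/2, b_i = c_i = 0) and differing
only in the translation; one standard choice that yields the Sierpinski
triangle is

w_1(x, y) = (x/2, \; y/2), \quad w_2(x, y) = (x/2 + 1/2, \; y/2), \quad w_3(x, y) = (x/2 + 1/4, \; y/2 + 1/2)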
Barnsley suggested that perhaps storing images as collections of
transformations could lead to image compression. His argument was that an
image can be described by a few parameters of affine transformations. If we
have the parameters of the affine transformations (i.e., of the photocopying
machine), we can reproduce the image.
The machine produces self-similar images called fractals. We need only a few
affine transformation parameters to produce the Sierpinski triangle from any
initial image. But how do we find the parameters for an arbitrary image?
(That is the inverse problem mentioned earlier.) Barnsley's dream was to
discover an algorithm for calculating these parameters. Unfortunately,
natural images are very inappropriate for generation by affine
transformations, because they are not exactly self-similar. His dream never
came true.
However, a trick exists. But first of all, let us describe Iterated Function
Systems, because Fractal Image Compression is based on them.
Iterated Function Systems
The behaviour of the photocopying machine is described by a mathematical
model called an Iterated Function System (IFS). An iterated function system
consists of a collection of contractive affine transformations
w_1, w_2, ..., w_N. This collection of transformations defines a map

W(S) = \bigcup_{i=1}^{N} w_i(S)

For an input set S, we can compute w_i(S) for each i, take the union of these
sets, and get a new set W(S).
Hutchinson proved an important fact about Iterated Function Systems: if the
w_i are contractive, then W is contractive. Hutchinson's theorem tells us
that the map W has a unique fixed point in the space of all images. That
means, whatever image (or set) we start with, we can repeatedly apply W to it
and our initial image will converge to that fixed image. Thus W completely
determines a unique image.
In other words, given an initial image f_0, we can repeatedly apply W (or the
photocopying machine described by W) and get W(f_0) as the first copy,
W(W(f_0)) = W^2(f_0) as the second copy, and so on. The attractor, the unique
image that results from the transformations, is

x_W = \lim_{n \to \infty} W^n(f_0)
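As a minimal sketch of this convergence (assuming NumPy is available; the
half-size reduction here is crude nearest-neighbour subsampling rather than
true averaging), the photocopying machine from section 2 can be simulated on
a binary image:

import numpy as np

def copy_machine(img):
    """One application of W: three half-size copies of the input."""
    h, w = img.shape
    small = img[::2, ::2]                 # crude half-size reduction
    sh, sw = small.shape
    out = np.zeros_like(img)
    out[h - sh:, :sw] |= small            # bottom-left copy
    out[h - sh:, w - sw:] |= small        # bottom-right copy
    out[:sh, (w - sw) // 2:(w - sw) // 2 + sw] |= small   # top-middle copy
    return out

img = np.ones((256, 256), dtype=bool)     # any initial image works
for _ in range(8):                        # W^8(img) approximates the attractor
    img = copy_machine(img)

Starting from an all-white image, about eight iterations already give a good
approximation of the Sierpinski triangle; any other starting image converges
to the same attractor.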

Self-Similarity in Images
Now we want to find a map W which takes an input image and yields an output
image. If we want to know when W is contractive, we have to define a distance
between two images. The distance is defined as

d(f, g) = \sup_{(x, y) \in P} |f(x, y) - g(x, y)|

where f and g are the grey-level values of the pixels (for a greyscale
image), P is the space on which the image is defined, and x and y are the
coordinates of any pixel. This distance finds the position (x, y) at which
the images f and g differ the most.
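A sketch of this supremum distance for two greyscale images stored as NumPy
arrays:

import numpy as np

def sup_distance(f, g):
    """Supremum (L-infinity) distance: the largest pixel difference."""
    return np.max(np.abs(f.astype(float) - g.astype(float)))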
Natural images are not exactly self-similar. The Lena image, a typical image
of a face, does not contain the kind of self-similarity found in the
Sierpinski triangle. But the next image shows that we can find self-similar
portions of the image.
[Figure: the original Lena image (left) and self-similar portions of the image (right)]
A part of her hat is similar to a portion of the reflection of the hat in the
mirror. The main distinction between the kind of self-similarity found in the
Sierpinski triangle and in the Lena image is that the triangle is formed of
copies of its whole self under appropriate affine transformations, while the
Lena image is formed of copies of properly transformed parts of itself.
These parts are not exactly the same, which means that the image we encode as
a set of transformations will not be an identical copy of the original image.
Experimental results suggest that most images, such as images of trees,
faces, houses, clouds etc., contain similar portions within themselves.
Partitioned Iterated Function Systems
To encode an image, we divide it into non-overlapping pieces (so-called
ranges, R), each of which is encoded separately. A natural image as a whole
is rarely a collage of copies of itself, so it is difficult to encode with a
plain IFS; this is why we use Partitioned Iterated Function Systems (PIFS).
A PIFS is also based on affine transformations, but now each transformation
maps one piece of the image (a so-called domain, D) onto another (a range,
R). The affine transformation mentioned earlier can "geometrically" transform
a part of the image, but it cannot transform the grey level of the pixels.
For that reason we have to add a new dimension to the affine transformation:

w_i \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} a_i & b_i & 0 \\ c_i & d_i & 0 \\ 0 & 0 & s_i \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} + \begin{bmatrix} e_i \\ f_i \\ o_i \end{bmatrix}

where s_i represents the contrast, o_i the brightness of the transformation,
and z = f(x, y) is the grey level of the pixel at (x, y).
The transformation W has to be contractive in all three directions x, y and
z. The transformation is contractive when the z distances are shrunk by a
factor less than 1, i.e. when each |s_i| < 1 (some experiments show that
s_i up to 1.2 is still safe).
Now that we have the mathematical "background", we are able to consider the
process of encoding images.
Encoding Images
Suppose we have an image f that we want to encode. In other words, we want to
find a collection of maps w_1, w_2, ..., w_N with W = \bigcup_{i=1}^{N} w_i
and f = W(f). That is, we want f to be the fixed point of the map W:

f = W(f) = w_1(f) \cup w_2(f) \cup \cdots \cup w_N(f)

We seek a partition of f into N non-overlapping pieces of the image to which
we apply the transforms w_i and get back f. This is too optimistic, because
the image is not exactly composed of pieces that can be transformed and fit
somewhere else in the image. What we can expect is another image f' = W(f')
with d(f, f') small. In other words, we seek a transformation W whose fixed
point is not f itself, but close to it. The measure of the approximation is
the distance d(f, f').
We should find pieces D_i and maps w_i, so that when we apply a w_i to the
part of the image over D_i, we get something that is very close to the part
of the image over R_i. Finding the pieces R_i and the corresponding D_i by
minimizing the distances between them is the heart of the problem.
Finally, let us consider one instructive example. Suppose we have to encode a
256x256 pixel greyscale image. Let the ranges R_i be the 8x8 pixel
non-overlapping sub-squares of the image, and let D be the collection of all
overlapping 16x16 pixel sub-squares of the image (in general, a domain is 4
times larger than a range). The collection D contains
(256-16+1) * (256-16+1) = 58,081 squares. For each R_i we search through all
of D to find the D_i with minimal distance. There are 8 ways to map one
square onto another (the square can be rotated into 4 orientations, or
flipped and rotated into a further 4 orientations). That means we have to
compare 8 * 58,081 = 464,648 squares with each of the 1024 range squares.
Also, a square in D is 4 times larger than an R_i, so we have to average each
2x2 pixel group of a square from D to prepare it for the comparison.
To find the best-matching sub-squares we have to minimize the distance
equation. That means we must find a good choice for a D_i that looks most
like the image above R_i (in any of the 8 orientations), together with a good
contrast s and brightness o. For each (R_i, D_i) pair we can compute the
contrast and brightness using least squares regression, choosing the values
with the least root mean square (rms) difference. The squared Euclidean
distance is used for this purpose:

E = \sum_{j=1}^{n} (s \, a_j + o - b_j)^2

where the a_j are the pixel intensities from the (averaged) D_i, the b_j are
the pixel intensities from R_i, and s and o are the contrast and brightness,
respectively. To compute the contrast and brightness, we have to find the
minimum of this distance. The minimum of E occurs when the partial
derivatives with respect to s and o are zero, which occurs when

s = \frac{n \sum_{j} a_j b_j - \sum_{j} a_j \sum_{j} b_j}{n \sum_{j} a_j^2 - \left( \sum_{j} a_j \right)^2}

and

o = \frac{1}{n} \left( \sum_{j} b_j - s \sum_{j} a_j \right)
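A direct NumPy sketch of this fit for one range/domain pair (a and b are
flattened arrays of the averaged-domain and range pixel intensities; the
function name is illustrative, not from any standard library):

import numpy as np

def fit_contrast_brightness(a, b):
    """Least-squares contrast s and brightness o mapping domain pixels a
    onto range pixels b, plus the resulting squared error E."""
    n = a.size
    sa, sb = a.sum(), b.sum()
    denom = n * (a * a).sum() - sa * sa
    s = 0.0 if denom == 0 else (n * (a * b).sum() - sa * sb) / denom
    o = (sb - s * sa) / n
    e = ((s * a + o - b) ** 2).sum()
    return s, o, e

The encoder calls this for every candidate (R_i, D_i, orientation) triple and
keeps the domain with the smallest error E.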

A choice of D_i, with a corresponding s_i and o_i, determines a map w_i. Once
we have the collection w_1, ..., w_N, we can decode the image by iterating W
from an arbitrary initial image, thereby estimating the fixed point.
Each transformation requires 8 bits in each of the x and y directions to
locate the position of D_i, 7 bits for o_i, 5 bits for s_i, and 3 bits to
determine the rotation and flip operation for mapping D_i to R_i. So we need
31 bits per transformation, while an 8x8 pixel sub-square itself takes
8x8x8 = 512 bits, giving a compression ratio of 512/31 = 16.5:1.
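As an illustration of that 31-bit budget (this particular field layout is an
assumption made for the sketch, not a standardized format), one
transformation could be packed into a single integer like this:

def pack_transform(x, y, o, s, sym):
    """Pack one map into 31 bits: 8 + 8 bits of domain position,
    7 bits of quantized brightness, 5 bits of quantized contrast,
    3 bits of symmetry (rotation/flip index)."""
    assert 0 <= x < 256 and 0 <= y < 256
    assert 0 <= o < 128 and 0 <= s < 32 and 0 <= sym < 8
    code = x
    code = (code << 8) | y     # domain position
    code = (code << 7) | o     # brightness index
    code = (code << 5) | s     # contrast index
    code = (code << 3) | sym   # orientation
    return code                # fits in 31 bits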
The compression ratio is one of the parameters that characterize a
compression algorithm. The second parameter is the peak signal-to-noise ratio
(PSNR). For 8-bit greyscale images the PSNR is defined as

\mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\mathrm{MSE}} \; \mathrm{dB}

where MSE is the mean squared error between the original and the decoded
image.
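A sketch of this measurement for two 8-bit greyscale images held in NumPy
arrays:

import numpy as np

def psnr(original, decoded):
    """Peak signal-to-noise ratio in dB for 8-bit greyscale images."""
    mse = np.mean((original.astype(float) - decoded.astype(float)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(255.0 ** 2 / mse)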
There are many modifications of the algorithm described above, such as
quadtree partitioning, HV-partitioning, triangular partitioning, self vector
quantization, convolution transform coding, etc.
Encoding Color Images
Very little work has been published on colour fractal image compression. The
recommendation is not to encode the RGB components separately, but to encode
the YIQ channels instead. The YIQ channels can be encoded individually; the I
and Q channels should be encoded at a lower bit rate than the Y channel.
Another suggestion is to encode the green component of RGB individually and
to try to predict the other two (R and B) components from it.
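For reference, the standard NTSC conversion from RGB to YIQ (coefficients
rounded to three decimals) is

\begin{bmatrix} Y \\ I \\ Q \end{bmatrix} = \begin{bmatrix} 0.299 & 0.587 & 0.114 \\ 0.596 & -0.274 & -0.322 \\ 0.211 & -0.523 & 0.312 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}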
Fractal Image Compression versus JPEG Compression
At high compression ratios, fractal compression offers compression and
quality performance similar to the JPEG standard. Let us see some
comparisons:
Original Lena image: 184,320 bytes

                  JPEG                        FIF
Max. quality:     32,072 bytes (5.75:1)       30,368 bytes (6.07:1)
Med. quality:     11,278 bytes (16.34:1)      7,339 bytes (25.11:1)
Min. quality:     8,247 bytes (22.35:1)       3,924 bytes (46.97:1)

[Figure: the decoded Lena images for each of the six settings]
These are the expected results for fractal compression. Some images can be
compressed 80:1 or more, although image quality visibly degrades at the
higher compression ratios.
Fractal compression is an asymmetric process: compression takes a long time
compared with decompression. JPEG/DCT is a symmetric process: encoding and
decoding take the same amount of time. Fractal compression times can be
decreased by using dedicated hardware. For example, for a 640 x 400 pixel
24-bit colour image, JPEG software compression and decompression took 6
seconds each, while fractal compression took 68 seconds and decompression
only one second. (Reading the original uncompressed image required two
seconds.)
Another very important feature of Fractal Image Compression is Resolution
Independence.
A fractal-encoded image has no natural size. It is a set of transformations
defined on domains and ranges of the image. To decode the image, the only
thing we have to do is apply these transformations to an arbitrary initial
image (usually a grey rectangle, for greyscale images) iteratively. Every
iteration brings us a step closer to the decoded image, and after each
iteration the details of the decoded image become sharper. This means that
the image can be decoded at any size. The extra detail needed for decoding at
larger sizes is generated automatically by the encoding transforms. The next
question is: if we decode an image at larger and larger sizes, will we be
able to see the atoms of Lena's face? The answer is, of course, no, because
we encoded a digitized image, which is only an approximation of the natural
image; but at larger sizes we do not get the "pixelization" effect, we get
sharper details instead.
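A minimal sketch of such a decoder, assuming the 256x256 scheme from the
encoding section (each stored transform holds the range position, domain
position, symmetry index, contrast and brightness; all names here are
illustrative, and the transforms are assumed to tile the image's ranges
completely):

import numpy as np

def decode(transforms, size=256, r=8, iterations=10):
    """Iteratively apply the stored maps to an arbitrary start image."""
    img = np.full((size, size), 128.0)                 # grey rectangle
    for _ in range(iterations):
        out = np.empty_like(img)
        for (rx, ry, dx, dy, sym, s, o) in transforms:
            d = img[dy:dy + 2 * r, dx:dx + 2 * r]      # 16x16 domain
            d = (d[0::2, 0::2] + d[1::2, 0::2] +
                 d[0::2, 1::2] + d[1::2, 1::2]) / 4.0  # average to 8x8
            if sym >= 4:                               # flip, then rotate
                d = d[:, ::-1]
            d = np.rot90(d, sym % 4)
            out[ry:ry + r, rx:rx + r] = s * d + o      # grey-level map
        img = out
    return img.clip(0, 255)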
[Figure: Lena's eye; the original image enlarged 4 times (left) vs. the image decoded at 4 times its encoding size (right)]
This feature of Fractal Image Compression is unique.
So why isn't everyone using Fractal Image Compression?
Fractal Image Compression is still under development. Many different
researchers and companies are trying to develop new algorithms to reach
shorter encoding times and smaller files. But there are still some problems
with fractal compression. A fractal compression standard does not exist! FIF,
the Fractal Image Format, is not standardized and is not supported by any
browsers, etc.
In spite of all this, the total number of publications on fractal image
compression is growing; over 280 papers were published last year.
The show must go on ...