Computer Networks

The JPEG Format

JPEG stands for Joint Photographic Experts Group and is a graphics compression method that does not store the information exactly (lossy compression).

In the first step the image is divided into blocks of 8x8 pixels. In an optional step, the representation is split into luminance (brightness) and chrominance (hue). Since the human eye perceives colour less distinctly than differences in brightness, colour values can be coded more coarsely and thus compressed better. Optionally, several pixels can be combined into blocks in which the colour values are averaged.

In a second step the representation is transformed into another domain by means of the discrete cosine transform (DCT). This in itself causes no data reduction, but it allows the intensity to be distinguished by different coefficients, which divide into strongly and weakly contributing ones. In contrast to the better-known Fourier transform, special image elements such as straight lines are better preserved by the cosine transform. Compression is obtained by the following techniques:

Tables of factors determine the degree of reduction (and hence the loss); the DCT coefficients are divided by the factors and rounded to the nearest integer. Large factors therefore reduce the resolution of the corresponding coefficients more strongly than small factors.
A "zigzag" coding places the large coefficients for the low frequencies at the beginning. The remaining coefficients are then more likely to have the value zero.
For successive 8x8 blocks only the differences need to be coded if the blocks resemble each other.
Longer runs of zero values can be skipped.
The resulting coefficient sequences can then be compressed further by Huffman or arithmetic coding.
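Two of the techniques above, coding only the differences between successive blocks and skipping runs of zeros, can be sketched in a few lines of Python. All values below are invented for illustration; this is not the actual JPEG bitstream format:

```python
# Sketch of two of the compression techniques listed above.
# Values are invented for illustration only.

def dc_differences(dc_values):
    """Store the first DC value, then only block-to-block differences."""
    return [dc_values[0]] + [b - a for a, b in zip(dc_values, dc_values[1:])]

def run_length(coeffs):
    """Collapse runs of zero coefficients into (zero_run, value) pairs."""
    pairs, zeros = [], 0
    for c in coeffs:
        if c == 0:
            zeros += 1
        else:
            pairs.append((zeros, c))
            zeros = 0
    pairs.append((zeros, 0))  # trailing-zero marker, similar in spirit to JPEG's EOB
    return pairs

print(dc_differences([120, 123, 121, 121]))   # [120, 3, -2, 0]
print(run_length([15, 0, 0, -2, 0, 0, 0, 1])) # [(0, 15), (2, -2), (3, 1), (0, 0)]
```

Similar blocks produce small differences, and runs of zeros shrink to short pairs; both feed well into the final Huffman stage.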

JPEG defines four modes,

  1. Sequential Mode
  2. Lossless Mode
  3. Progressive Mode
  4. Hierarchical Mode

which are explained in more detail in the literature. In particular, lossless coding is applied only to difference images.

 

The following page was copied from

http://www.rasip.fer.hr/research/compress/algorithms/adv/jpeg/index.html

JPEG basics

Introduction

The following text is intended to shed light on some basic procedures in JPEG file creation. For detailed reference see the bottom of this page.

JPEG stands for Joint Photographic Experts Group, the committee that wrote the standard in the late eighties and early nineties. The format is ISO standard 10918.

Since there was no adequate standard for compressing 24-bit-per-pixel colour data, the committee came up with an algorithm for compressing full-colour or greyscale images depicting real-world scenes (such as photographs). The main benefit JPEG exploits is the human eye's (in)sensitivity to certain image aspects, which yields powerful compression ratios of up to 100:1 (usually 10:1 to 20:1 without noticeable degradation). JPEG is not so good with line art, cartoons, and single-coloured blocks.

Obviously the technique is a lossy one, meaning that decompression will not reproduce the original image perfectly, but a near match. The quality is left to the user to select as he sees fit, with the preferred disk space / quality trade-off in mind.



The algorithm is divided into four general steps, as follows:

 

Matrix creation and colour scheme conversion
Discrete cosine transform
Quantization
Additional encoding




1. Matrix creation and colour scheme conversion

The source image is broken into 8x8-pixel samples (an 8x8 matrix filled with number values). Next comes colour scheme conversion. Usually the RGB data is normalized and converted to the YUV (or some other) colour space (YUV gives us luminance, i.e. brightness, and chrominance, i.e. hue, information). The human eye is much more sensitive to luminance, so additional space is gained by discarding some chrominance information. The conversion is not strictly necessary for either greyscale or full-colour images, but in the latter case it yields better compression.
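The colour conversion can be sketched as follows. The weighting factors below are the common ITU-R BT.601 ones typically used for JPEG's YCbCr space; the standard itself leaves the exact colour space to the implementation:

```python
# Sketch of an RGB -> YCbCr (luminance/chrominance) conversion
# using the common BT.601 coefficients; inputs are 0..255 channel values.

def rgb_to_ycbcr(r, g, b):
    y  =  0.299 * r + 0.587 * g + 0.114 * b            # luminance (brightness)
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b   # blue chrominance
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b   # red chrominance
    return y, cb, cr

# For a pure grey pixel, all information ends up in Y;
# Cb and Cr stay at their neutral value of 128.
print(rgb_to_ycbcr(100, 100, 100))
```

Because Cb and Cr carry the less visible information, they are the channels that can be subsampled or quantized more coarsely later on.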

The final matrix is then sent on to the next step.

2. Discrete cosine transform

 

  1   0  20  11   0   9  43   3
  0  12  22  32  21  17  28   9
 12   0   7  26  13  31   4   0
  8  22   4   0  19   2  21  12
 22  19  11   9   5  12  27  40
  6   0  17  24  19   2  11  19
 10  33  20   0   3   7  11  16
 21   0  15  23  11  14  25   0

Figure 1.

In fig. 1 we have an 8x8 matrix representing part of some picture (brightness/hue information).

A two-dimensional discrete cosine transform (DCT) is computed over the block (one dimension for the rows and one for the columns), giving us 64 DCT coefficients representing the initial image frequencies:

t(i,j) = 1/4 * C(i) * C(j) * sum(m=0..7) sum(n=0..7) f(m,n) * cos[(2m+1)*i*pi/16] * cos[(2n+1)*j*pi/16]

where C(0) = 1/sqrt(2) and C(k) = 1 for k > 0; f(m,n) are the pixel values and t(i,j) are the frequency coefficients.

This looks nasty, but it is nothing more than four nested for loops.
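Indeed, a naive version of the transform is literally four nested loops. This minimal Python sketch is far slower than the fast DCT algorithms real encoders use, but it computes the same coefficients:

```python
import math

# Naive 8x8 DCT-II: the outer two loops run over the output
# frequencies (i, j), the inner two sum over the pixels f(m, n).

def dct_8x8(f):
    N = 8
    t = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            s = 0.0
            for m in range(N):
                for n in range(N):
                    s += (f[m][n]
                          * math.cos((2 * m + 1) * i * math.pi / (2 * N))
                          * math.cos((2 * n + 1) * j * math.pi / (2 * N)))
            # normalization: C(0) = 1/sqrt(2), C(k) = 1 otherwise
            ci = 1 / math.sqrt(2) if i == 0 else 1.0
            cj = 1 / math.sqrt(2) if j == 0 else 1.0
            t[i][j] = 0.25 * ci * cj * s
    return t

# A constant block transforms into a single DC coefficient;
# all AC coefficients come out (numerically) zero.
flat = [[8] * 8 for _ in range(8)]
print(round(dct_8x8(flat)[0][0]))  # 64
```

The flat test block shows the key property: smooth regions concentrate all their energy in a few low-frequency coefficients.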

The first coefficient (usually called DC) represents the average brightness of the whole block (up to a scale factor); the remaining 63 are called the AC coefficients. The more important frequencies are grouped around the upper left corner. Another change is a greater overall zero count (coefficients with value 0, see figure 2), which will contribute to the final compression ratio.


 40   0  20   0   0   0  15   0
  0   0   0   0   0   0   0   0
 47   0  25   0   5   0   9   0
  0   0   0   0   0   0   0   0
 22   0   0   0  11   0   0   0
  0   0   0   0   0   0   0   0
  2   0  12   0  16   0   4   0
  0   0   0   0   0   0   0   0

Figure 2.

 

These are only random numbers, not actually computed from the original matrix; they merely illustrate what the output of a cosine transform might look like.

So, step two produced a matrix of 64 DCT coefficients, which follows us into the next step.

3. Quantization

Finally, we come to the lossy part of the story. By defining the preferred image quality, the user selects two constant tables, one for luminance and one for chrominance. Those tables are saved in the file header for later decompression. Every value in the matrix is divided by the corresponding table constant and rounded to an integer.

The punch line is that quantization "destroys" the unimportant frequencies, while the important ones lose some of their previous precision. This part is the crucial one, since this is where we choose the quality of the output picture.
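The divide-and-round step can be sketched as follows. The coefficient and factor values are invented for illustration; a real encoder uses full 8x8 quantization tables:

```python
# Sketch of JPEG-style quantization: divide each DCT coefficient
# by its table factor and round to the nearest integer.
# All values below are invented for illustration.

def quantize(coefficients, factors):
    return [round(c / f) for c, f in zip(coefficients, factors)]

coeffs  = [240.0, 16.0, -21.0, 3.0, 1.0, -0.4]  # DCT output, low to high frequency
factors = [16,    11,   12,    24,  40,  64]    # larger factors for higher frequencies

print(quantize(coeffs, factors))  # [15, 1, -2, 0, 0, 0]
```

The small high-frequency coefficients collapse to zero, which is exactly what makes the later run-length and Huffman stages so effective.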

 

4. Additional encoding

Prior to encoding, the matrix is unfolded into a one-dimensional array by means of the so-called zigzag pattern (figure 3), lower frequencies coming first and higher ones last. The higher ones are likely to be zeroes, and the overall compression is improved.

 

Figure 3.
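The zigzag unfolding can be sketched as follows. Here the traversal order is derived from the diagonal index; real encoders typically use a hard-coded 64-entry lookup table instead:

```python
# Sketch of the zigzag unfolding of a coefficient matrix into a
# one-dimensional array, low frequencies first. Cells on the same
# anti-diagonal share the index sum i + j; alternating the secondary
# sort key produces the snake-like traversal.

def zigzag(matrix):
    n = len(matrix)
    order = sorted(((i, j) for i in range(n) for j in range(n)),
                   key=lambda p: (p[0] + p[1],
                                  p[0] if (p[0] + p[1]) % 2 else p[1]))
    return [matrix[i][j] for i, j in order]

# A 4x4 matrix filled in zigzag order reads back as 1..16:
m = [[ 1,  2,  6,  7],
     [ 3,  5,  8, 13],
     [ 4,  9, 12, 14],
     [10, 11, 15, 16]]
print(zigzag(m))  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
```

After this reordering, the trailing portion of the array is mostly zeros, ready for run-length and Huffman coding.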

There are two encoding mechanisms in use today: Huffman and arithmetic coding. Due to some legal mix-ups, the Huffman algorithm is the preferred one (the arithmetic encoder is partly covered by patents held by certain companies).

Huffman coding is explained in detail in the standard algorithms section.

 

JPEG types

Two widely used types are baseline and progressive JPEG.

The difference in structure is a small one: progressive JPEG additionally reorders the compressed data. While decoding, baseline JPEG draws line after line until the whole image is shown. Progressive JPEG draws the whole image at once, but in very low quality (blurry). In each following pass, another 'layer' of data is added over the previous one and the quality improves. In the end we have the full-quality image drawn on screen.

So where is the catch? Progressive JPEG is used in web publishing and design. Internet connections are often slow and unreliable. Using the progressive technique, a user can make out the image before it is fully downloaded, and thus save time, nerves and money. Many JPEG images found on the web are progressive.

Transparent JPEG

In contrast to the GIF format, transparency is a big problem for JPEG. GIF uses one colour (not otherwise used in the picture) to mark the area of the image that will be transparent. That would be just fine, but JPEG does not preserve exact colour values. During compression, a cell's value is combined with the values of surrounding cells and rounded to the nearest integer. Every time the image is compressed the value changes, and we cannot be certain which colour (intensity) a pixel will have after the next decompression. That is why icons and cursors are done in GIF format. The solution to this problem is still in development.