Understanding JPEG compression

April 20, 2023

Understanding JPEG Compression: How It Optimizes Image Storage

Introduction:

JPEG, named after the Joint Photographic Experts Group that created it, is a widely used image compression standard that allows efficient storage of images while maintaining acceptable visual quality. In this article, we will explore the key aspects of JPEG compression and how it takes advantage of the human visual system. We will also look at chroma subsampling and how it reduces file sizes without significant loss of image quality.

The Basics of JPEG Compression:

JPEG employs a lossy compression algorithm: it reduces the size of image files by discarding some of the original data. It involves an encoder that converts a bitmap image into the JPEG format and a decoder that reconstructs an approximation of the bitmap image from the compressed format.

Leveraging Human Visual Sensitivity:

Scientific evidence shows that human eyes are more sensitive to changes in brightness than to colors. The JPEG compression scheme capitalizes on this characteristic to achieve effective image compression.

RGB Color Spaces:

The RGB (Red, Green, Blue) color space is a three-dimensional representation of colors, where each axis corresponds to one color component. By mapping all possible colors onto a cube, the RGB color space allows us to understand the relationship between color values. Moreover, the diagonal line from the origin (0, 0, 0) to the color (255, 255, 255) runs from black through shades of gray to white, so moving along it represents an increase in brightness.

Introducing the YCbCr Color Space:

The YCbCr color space separates the luminance (brightness) and chrominance (color) components of an image. Y represents the luma or brightness, while Cb and Cr encode the color information as blue-difference and red-difference values. This separation matches the way human eyes perceive images: because the eye is most sensitive to brightness, the Y component is the most important one, and the Cb and Cr components can be stored with less detail.
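
As an illustration of the conversion, here is a minimal sketch in Python. It assumes the full-range weights used by the JFIF file format (derived from BT.601); the article does not say which variant it has in mind, and the function name `rgb_to_ycbcr` is chosen for this example only.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (JFIF / BT.601 weights)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round to integers and clamp to the 8-bit range 0..255.
    return tuple(max(0, min(255, round(v))) for v in (y, cb, cr))


# Any gray pixel has neutral chroma (Cb = Cr = 128); only Y changes.
print(rgb_to_ycbcr(0, 0, 0))
print(rgb_to_ycbcr(255, 255, 255))
```

Points on the gray diagonal of the RGB cube differ only in Y, which is why brightness and color can be handled separately after the conversion.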

Chroma Subsampling:

Chroma subsampling compresses an image by reducing the amount of color information while keeping the luma component at full resolution. In the common 4:2:0 scheme, the Cb and Cr channels are divided into 2x2 blocks (an 8x8 image yields a 4x4 grid of blocks), and each block is represented by a single value. That value is typically the average of the four pixels, although a simpler encoder may just keep one of them, such as the top-left pixel.
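
The averaging variant can be sketched as follows. This is a simplified illustration that assumes the channel is a list of rows with even width and height; `subsample_420` is a hypothetical helper name, not part of any library.

```python
def subsample_420(channel):
    """Replace each 2x2 block of a chroma channel by the average of its pixels."""
    height, width = len(channel), len(channel[0])
    out = []
    for y in range(0, height, 2):
        row = []
        for x in range(0, width, 2):
            total = (channel[y][x] + channel[y][x + 1]
                     + channel[y + 1][x] + channel[y + 1][x + 1])
            row.append(round(total / 4))
        out.append(row)
    return out


cb = [
    [100, 102, 50, 50],
    [104, 106, 50, 50],
    [0, 0, 200, 200],
    [0, 0, 200, 204],
]
print(subsample_420(cb))  # [[103, 50], [0, 201]]
```

The result has half the width and half the height of the input, so each chroma channel keeps one quarter of its samples.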

Advantages of Chroma Subsampling:

By merging each 2x2 block of the Cb and Cr channels into one value, each chroma channel shrinks to a quarter of its original size. This may introduce visible differences in a tiny 8x8 test image, but in real-world images the differences are usually hard to notice. Overall, 4:2:0 subsampling halves the amount of raw data before any further compression: three full-resolution channels become one full channel plus two quarter-size channels, i.e. 1.5 channels' worth of samples instead of 3.
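
The arithmetic behind that figure can be checked with a few lines of Python. The helper below is a hypothetical illustration that counts raw samples only; the final file size also depends on the later compression stages.

```python
def samples_per_image(width, height, subsample_chroma):
    """Count raw Y, Cb and Cr samples, with or without 4:2:0 subsampling."""
    luma = width * height
    if subsample_chroma:
        # Each chroma channel keeps one sample per 2x2 block.
        chroma = 2 * (width // 2) * (height // 2)
    else:
        chroma = 2 * width * height
    return luma + chroma


full = samples_per_image(8, 8, subsample_chroma=False)  # 192
sub = samples_per_image(8, 8, subsample_chroma=True)    # 96
print(full, sub, sub / full)
```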

The JPEG Encoder:

The JPEG encoder does the actual work of compressing an image. This part of the article looks at its inner workings: the Discrete Cosine Transform (DCT), quantization, and the encoding methods run-length encoding and Huffman encoding.

Discrete Cosine Transform (DCT):

The DCT represents a signal as a weighted sum of cosine waves of different frequencies. Applying the DCT to an 8-pixel signal yields 8 coefficients, one per cosine wave, each giving the contribution of that wave to the original signal. The magnitude of a coefficient reflects how strongly the pixel values vary at that frequency, and its sign reflects the direction of the variation. The first coefficient corresponds to a constant wave and captures the average level of the signal.
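
A direct, unoptimized sketch of the 1D transform is shown below. It assumes the orthonormal DCT-II scaling, which for 8 samples matches the scaling used by JPEG; real encoders use much faster factorized algorithms.

```python
import math


def dct_1d(signal):
    """Orthonormal DCT-II: one coefficient per cosine frequency."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        total = sum(s * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for i, s in enumerate(signal))
        coeffs.append(scale * total)
    return coeffs


# A constant signal contains only the zero-frequency wave, so every
# coefficient except the first is (numerically) zero.
flat = dct_1d([10] * 8)
print([round(c, 3) for c in flat])
```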

The 2D DCT and Quantization:

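To encode an image, the JPEG encoder divides each channel into 8x8 blocks and applies the 2D DCT to each block, which is equivalent to applying the 1D DCT to every row and then to every column. The result is an 8x8 grid of 64 coefficients per block, with the lowest frequencies in the top-left corner. Next, the precision of the coefficients is reduced through quantization: the encoder divides each coefficient by the corresponding entry of a quantization table and rounds the result to an integer. Because the table entries are generally larger for higher frequencies, many high-frequency coefficients round to zero. The table determines the compression quality. The JPEG standard provides example tables, and encoders commonly scale them according to a user-selected quality factor.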
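
The two steps can be sketched for a single block as follows. The table is the example luminance table from Annex K of the JPEG standard; the function names are hypothetical, the 2D transform is built naively from row and column passes, and the subtraction of 128 (the level shift applied to 8-bit samples before the DCT) is a detail the article does not mention.

```python
import math

# Example luminance quantization table (JPEG standard, Annex K).
LUMA_TABLE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def dct_1d(signal):
    """Orthonormal DCT-II of a sequence of samples."""
    n = len(signal)
    return [
        (math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n))
        * sum(s * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
              for i, s in enumerate(signal))
        for k in range(n)
    ]


def dct_2d(block):
    """2D DCT: transform every row, then every column of the result."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(col) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]  # transpose back to row-major


def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round to an integer."""
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, table)]


# A flat block of value 144: after the level shift every sample is 16,
# so only the top-left (zero-frequency) coefficient is non-zero.
block = [[144 - 128] * 8 for _ in range(8)]
quantized = quantize(dct_2d(block), LUMA_TABLE)
print(quantized[0])
```

For textured blocks the high-frequency coefficients are small relative to their table entries, which is what produces the long runs of zeros in the quantized output.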

Quantization Tables and Color Channels:

JPEG employs separate quantization tables for the luma (brightness) and chroma (color) channels. The example tables in the standard were derived from visual experiments, and the chroma table quantizes more coarsely than the luma table because the eye is less sensitive to color detail. The quantization process reduces the precision of higher frequency components, contributing to compression while introducing little perceptible loss.
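
How a quality setting turns into an actual table is an encoder convention rather than part of the standard. The sketch below assumes the scaling rule used by the widely deployed IJG libjpeg library (quality 1 to 100, with 50 leaving the base table unchanged); the article itself does not specify a rule, and `scale_table` is a hypothetical name.

```python
def scale_table(base_table, quality):
    """Scale a base quantization table using the libjpeg quality convention."""
    quality = max(1, min(100, quality))
    # Below 50 the table entries grow (coarser); above 50 they shrink (finer).
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    return [[max(1, min(255, (q * scale + 50) // 100)) for q in row]
            for row in base_table]


base = [[16, 11], [12, 12]]  # top-left corner of a luminance table, for brevity
print(scale_table(base, 50))  # unchanged
print(scale_table(base, 25))  # entries doubled
print(scale_table(base, 75))  # entries halved
```

Larger entries mean coarser rounding and smaller files; at quality 100 every entry is clamped to 1, so quantization only rounds the coefficients to integers.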

Run-Length Encoding and Huffman Encoding:

After quantization, the JPEG encoder applies a combination of run-length encoding and Huffman encoding. The coefficients of each block are read in a zigzag order from low to high frequency, so the many zeros produced by quantization tend to form long runs. Run-length encoding stores each run of consecutive zeros as a count instead of repeating the zeros. Huffman encoding then assigns variable-length codes to the resulting symbols, with shorter codes for more frequent ones. Both steps are lossless, so they reduce the file size further without any additional loss of image quality.
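
The following sketch illustrates both ideas on one quantized block. It is deliberately simplified and is not the exact JPEG bitstream format: it reads the block in zigzag order, emits (zero run, value) pairs for the 63 non-DC coefficients with (0, 0) as an end-of-block marker, and then computes Huffman code lengths from symbol frequencies. Real JPEG additionally caps runs at 15, encodes values by size category, and usually uses predefined code tables. All function names are hypothetical.

```python
import heapq
from collections import Counter


def zigzag_order(n=8):
    """Positions (row, col) of an n x n block in zigzag scan order."""
    return sorted(
        ((i, j) for i in range(n) for j in range(n)),
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else -p[0]),
    )


def run_length_encode(values):
    """Encode a sequence as (zeros_before, value) pairs plus an end marker."""
    pairs, run = [], 0
    for v in values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    if run:
        pairs.append((0, 0))  # end of block: only zeros remain
    return pairs


def huffman_code_lengths(symbols):
    """Code length per symbol; frequent symbols get shorter codes."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {sym: 1 for sym in freq}
    # Heap entries: (count, unique tie-breaker, {symbol: depth so far}).
    heap = [(count, idx, {sym: 0}) for idx, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        c1, _, d1 = heapq.heappop(heap)
        c2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (c1 + c2, next_id, merged))
        next_id += 1
    return heap[0][2]


block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0], block[1][1] = 8, 3, -2, 5
scanned = [block[i][j] for i, j in zigzag_order()]
pairs = run_length_encode(scanned[1:])  # skip the DC coefficient
print(pairs)  # [(0, 3), (0, -2), (1, 5), (0, 0)]
print(huffman_code_lengths("aaaabbc"))  # a gets 1 bit, b and c get 2 bits
```

In this example 63 coefficients collapse into four symbols, which shows why the zeros created by quantization matter so much for the final file size.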

Conclusion:

The JPEG encoder is a vital component in the compression of digital images. By utilizing techniques such as Discrete Cosine Transform, quantization, and encoding methods like run-length encoding and Huffman encoding, the JPEG encoder effectively reduces file sizes while maintaining acceptable image quality. Understanding the inner workings of the JPEG encoder helps us appreciate the optimization achieved in image compression and the widespread use of JPEG as a standard format for storing and sharing images.