Quantity Of Data Without Reducing Computer Science Essay



Compression is the technique of reducing the quantity of data without reducing its quality. Compressed multimedia data is much more efficient and much faster to store or use than uncompressed multimedia data. There are various techniques and standards for multimedia data compression; for video compression, the most notable are the MPEG-2 and H.264 standards. These standards consist of different functions such as color space conversion and entropy coding. The goal of this thesis is to compare the two standards and highlight the differences between them. Although both follow the same general framework, there are several fundamental advancements in the H.264 standard, including a new integer transform and advanced arithmetic entropy coding.

Keywords: Compression, MPEG-2 standard, H.264 standard, entropy coding.


We live in a multimedia world. We use videos for entertainment, study, and various other purposes. To save storage space, we need to apply compression techniques. There are two kinds of video compression: lossy and lossless. This research focuses on lossless compression, a technique that does not lose any data in the compression process. In international standards such as JPEG [1], JPEG2000 [4], H.263 [2] and MPEG-4 [3], compression efficiency is achieved by sacrificing the quality of the image and video content. In some applications, the preservation of the original data is more important than compression efficiency; this is the area of lossless compression.

There are three types of redundancy in color video sequences: spatial, spectral and temporal. Spatial redundancy exists among neighboring pixels in a frame. Pixel values usually do not change abruptly in a frame except near edges and highly textured areas. Hence there is significant correlation among neighboring pixels, which can be used to predict pixel values in a frame from nearby pixels. The predicted pixel is subtracted from the current pixel to obtain an error, or residual. The resulting residual frame has significantly lower entropy than the original frame. Lossless JPEG [1] uses simple linear predictors obtained by linear combination of neighboring pixels.

If a frame is in RGB color space, there are redundancies among the three color components. This is referred to as spectral redundancy. It can be reduced by decorrelating the color components using a differential coding scheme, where two of the colors are represented by their differences from the reference color. In the JPEG-2000 standard [4], an RCT (Reversible Color Transform) is used to convert from RGB to YCrCb color space.

Consecutive frames in a video sequence are very similar if no scene change has occurred. This is referred to as temporal redundancy. Temporal redundancy can be used to increase compression efficiency, similar to the way spatial redundancy is used in lossless image compression. Temporal redundancies between frames can be removed by temporal prediction. In lossy compression, block-based temporal prediction by motion compensation provides very high coding gain because the technique decorrelates the frames. The data rate is further reduced by quantization of the residual error frames. Motion vectors, needed to reconstruct the frame, are transmitted as overhead data. In lossless compression, the same technique can be used to predict pixel values. Temporal prediction works better than spatial prediction in static areas where little motion has occurred between frames. It therefore seems logical that coding efficiency could be increased by adaptively selecting the prediction technique. The performance comparisons with respect to the search range in temporal prediction are also described. A measure for video compressibility is defined. If the video compressibility is high for a video sequence, then the contribution of temporal prediction is high and the video sequence can be compressed well by the proposed algorithm.
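The differential idea behind the reversible color transform can be sketched in a few lines. The following is a minimal illustration, assuming the commonly cited integer form of the JPEG-2000 RCT with green as the reference color; the function names are hypothetical and the snippet is not taken from any codec implementation.

```python
def rct_forward(r, g, b):
    """Forward reversible color transform on integer samples.

    Two chroma components are plain differences from the green
    reference; the luma is an integer (floored) weighted average.
    """
    y = (r + 2 * g + b) // 4
    cb = b - g
    cr = r - g
    return y, cb, cr


def rct_inverse(y, cb, cr):
    """Exact inverse of rct_forward; no information is lost."""
    g = y - (cb + cr) // 4
    r = cr + g
    b = cb + g
    return r, g, b


pixel = (200, 180, 170)
transformed = rct_forward(*pixel)
print(transformed)                # (182, -10, 20)
print(rct_inverse(*transformed))  # (200, 180, 170)
```

Because only integer additions and floor divisions are used, the round trip is exact, which is what makes the transform usable for lossless coding.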


Related Work

The goal of image compression is to represent an image with the fewest bits possible. There are lossless and lossy modes in image compression. In lossy compression, the image quality may be degraded in order to meet a given target data rate for storage and transmission. The applications for lossy coding are the transmission of image and video through a band-limited network and efficient storage. The issue in lossy compression is how much the degradation of image quality can be reduced at a given data rate. In lossy compression, most algorithms transform pixels into the transform domain using the DCT (Discrete Cosine Transform) [6] or the DWT (Discrete Wavelet Transform) [8] [5]. The source of the loss is either quantization of the transform coefficients or termination of the encoding at a given data rate. In order to meet a data rate budget, the transform coefficients are explicitly quantized with a given step size, as in ZTE, JPEG [1], H.263 [2] and MPEG-2. Implicit quantization is used in algorithms such as EZW and SPIHT, which can be truncated at any point in the bit stream during encoding.

Consecutive frames in a video sequence are very similar. This redundancy between successive frames is known as temporal redundancy. Video compression methods can exploit temporal redundancy by estimating the displacement of objects between successive frames (motion estimation). The resulting motion information (a motion vector) can then be used for efficient inter-frame coding (motion compensation). The motion information, along with the prediction error, is then transmitted instead of the frame itself.
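Motion estimation as described above can be sketched as an exhaustive block-matching search. This is a minimal illustration, assuming a sum-of-absolute-differences (SAD) cost and small frames stored as lists of rows; the helper names and the toy frames are hypothetical, and real encoders use much faster search strategies.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur`
    at (bx, by) and the block of `ref` displaced by (dx, dy)."""
    return sum(
        abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
        for j in range(n)
        for i in range(n)
    )


def full_search(cur, ref, bx, by, n, search):
    """Try every displacement within +/- search pixels and return
    (dx, dy, cost) for the best match that stays inside the frame."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x, y = bx + dx, by + dy
            if x < 0 or y < 0 or x + n > width or y + n > height:
                continue  # candidate block falls outside the reference
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2], best[0]


def make_frame(size, ox, oy):
    """Black frame with one 4 x 4 textured object at (ox, oy)."""
    frame = [[0] * size for _ in range(size)]
    for j in range(4):
        for i in range(4):
            frame[oy + j][ox + i] = 10 + 4 * j + i
    return frame


ref = make_frame(12, 3, 4)  # previous frame
cur = make_frame(12, 5, 5)  # object moved 2 right, 1 down
print(full_search(cur, ref, bx=5, by=5, n=4, search=3))  # (-2, -1, 0)
```

The returned vector points from the current block back to its match in the reference frame, and a cost of zero means the prediction error for this block is empty, so only the motion vector would need to be sent.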


Video Compression Based on Transform Coding

In order to exploit spatial redundancy, most video coding methods (including MPEG-2 and H.264) incorporate some form of transform coding. To code the spatial data efficiently, a transform is first applied to decorrelate it so that it can be represented using the fewest coefficients. The most common transform in modern video coding methods is the Discrete Cosine Transform (DCT), which is used in MPEG-2. After the transform, the coefficients are quantized and then entropy coded for efficient binary representation. H.264 uses a new integer transform.

3.1 Transform Coding in MPEG-2

The transform specified in the MPEG-2 video coding standard is a floating-point DCT. The same transform is used in many video coding standards, including MPEG-1, H.263, and MPEG-4 Part 2 Visual, as well as the JPEG image coding standard. The main role of the DCT is to decorrelate the spatial data in the image. The DCT is widely used in transform coding because it is a close approximation to the Karhunen-Loeve Transform (KLT). Many DCT-based compression techniques partition the image into 8×8 blocks and apply the DCT to each block.

3.2 Entropy Coding in MPEG-2

To represent the transform coefficients efficiently, the MPEG-2 video coding standard uses a method very similar to the one used in the original JPEG image compression standard [7]. Since adjacent 8×8 blocks are often highly correlated (e.g. sky or a solid color background), the DC coefficient (upper left corner) is differentially coded with respect to the previously coded 8×8 block. The DC coefficient can be encoded with 8 to 11 bits of precision. The coding of the quantized DC coefficient uses a variable-length code (VLC) and a variable-length integer (VLI) specified in the MPEG-2 standard. First, a VLC is coded that specifies the length (and also the dynamic range). The VLI then follows the VLC, specifying the actual value in the given range.

To entropy code the quantized AC coefficients, the coefficients are scanned in the normal zig-zag order in the case of frame-coded pictures, or in an alternate scan order in the case of field pictures. The two different scan orders are shown in Figure 2.2. The resulting [run, level] pairs are then binary coded using one of the two VLC tables specified in the MPEG-2 standard. Each block is terminated with an End Of Block (EOB) symbol.
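The scanning and pairing steps above can be sketched as follows. This is a minimal illustration, assuming the conventional zig-zag scan and a block of already quantized coefficients; the function names, the sample block and the previous DC value are hypothetical, and the actual VLC tables of the standard are not reproduced, so the pairs are left as plain tuples.

```python
def zigzag_order(n=8):
    """Return (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):  # s indexes the anti-diagonal row + col
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        if s % 2 == 0:
            diag.reverse()      # even diagonals run bottom-left to top-right
        order.extend(diag)
    return order


def run_level_pairs(block):
    """Split a quantized block into its DC value and (run, level) pairs.

    `run` counts the zeros that precede each nonzero AC coefficient
    in scan order; trailing zeros are replaced by a single "EOB".
    """
    scan = [block[r][c] for r, c in zigzag_order(len(block))]
    pairs, run = [], 0
    for coeff in scan[1:]:      # the DC coefficient is coded separately
        if coeff == 0:
            run += 1
        else:
            pairs.append((run, coeff))
            run = 0
    pairs.append("EOB")
    return scan[0], pairs


block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0], block[1][1] = 50, 3, -2, 1

previous_dc = 47                # DC of the previously coded block (assumed)
dc, pairs = run_level_pairs(block)
print(dc - previous_dc)         # 3, the differentially coded DC value
print(pairs)                    # [(0, 3), (0, -2), (1, 1), 'EOB']
```

Only three pairs and the end-of-block marker are needed for the 63 AC coefficients, which is why the scan is ordered so that the likely nonzero low-frequency coefficients come first.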


3.3 Undesirable Properties of the DCT

Although the DCT is used in many coding standards, it does possess some undesirable properties:

- The DCT transforms an integer signal into a real-valued signal. This implies the use of floating-point arithmetic, which can be expensive to implement in hardware or software.
- The use of floating-point numbers introduces rounding errors during the transform and inverse transform. The encoder and decoder may mismatch due to rounding, which cannot be completely avoided.

In order to address these shortcomings, the MPEG-2 standard specifies the amount of precision required of a compliant decoder. The propagation of mismatch error from previous frames into frames predicted from them can be limited by the periodic insertion of intracoded frames, which do not rely on previously coded frames. Accumulated drift is terminated by the insertion of an intracoded frame.

3.4 Transform Coding in H.264

For the coding of the quantized transform coefficients, H.264 offers two entropy coding methods: Context-Adaptive Variable Length Coding (CAVLC) and Context-Adaptive Binary Arithmetic Coding (CABAC). CAVLC is related to the entropy coding method in MPEG-2 in the sense that fixed code tables are used, but the table selection adapts to context. After the zig-zag scan, the number of trailing zeros and trailing ones are considered. Depending on the statistics of the neighboring blocks, the encoder uses one of four available VLC tables to encode this data. It then codes the coefficients in reverse order, choosing one of six possible Exp-Golomb code tables defined in the standard.
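To give a feel for the Exp-Golomb codes mentioned above, here is a minimal sketch of the simplest member of the family, the order-0 unsigned code. The function names are hypothetical, bits are represented as text strings for readability, and the specific tables that CAVLC selects among are not reproduced here.

```python
def exp_golomb_encode(k):
    """Order-0 unsigned Exp-Golomb code of a non-negative integer.

    The codeword is the binary form of k + 1, preceded by as many
    zeros as that binary form has bits after its leading one.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    binary = bin(k + 1)[2:]
    return "0" * (len(binary) - 1) + binary


def exp_golomb_decode(bits):
    """Decode one codeword from the front of `bits`.

    Returns the decoded value and the remaining, unread bits.
    """
    zeros = 0
    while bits[zeros] == "0":
        zeros += 1
    value = int(bits[zeros:2 * zeros + 1], 2) - 1
    return value, bits[2 * zeros + 1:]


for k in range(5):
    print(k, exp_golomb_encode(k))
# 0 1
# 1 010
# 2 011
# 3 00100
# 4 00101

stream = "".join(exp_golomb_encode(k) for k in (3, 0, 2))
print(exp_golomb_decode(stream))  # (3, '1011')
```

Small values get short codewords and no table needs to be stored, since the count of leading zeros tells the decoder how many bits follow.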


Motion Compensated Video Coding

Motion compensation is used to exploit temporal redundancy within the video sequence to reduce the data rate. This section provides an overview of the motion compensation tools available in MPEG-2 and H.264; the following two subsections discuss them in more detail.

4.1 Motion Compensation in MPEG-2

Basic Macroblock Types for Forward Prediction: MPEG-2 specifies a 16×16 macroblock, which is the basic unit that is coded based on previous pictures. For P-pictures in progressive video, the standard allows three basic types of macroblocks.

Sub-Pixel Motion Prediction: In order to model the motion vector field better, the motion vectors are not limited to integers. Sub-pixel samples are interpolated for a more accurate motion-compensated prediction, which leads to less energy in the error signal. The sub-pixel samples are created in the decoder by bilinear averaging, which results in motion compensation with 1/2-pixel accuracy. For example, each interpolated half-pixel sample is a linear average of its neighboring integer-position pixels.

Interlaced Motion Compensation in MPEG-2: MPEG-2 supports additional modes of motion compensation to represent the motion present in interlaced video efficiently. Because alternating fields are updated every cycle, adjacent rows in field-pictures are not as correlated as in frame-pictures when motion is present. In addition to standard frame-based prediction, MPEG-2 provides coding tools for field-based prediction.

4.2 Motion Compensation in H.264

The H.264 motion model is much more flexible than that of MPEG-2. The H.264 standard allows a much wider variety of encoding modes than previous video coding standards:

- Multipicture Prediction.
- H.264 Bi-Directional Prediction.


Transmission of Compressed Video

This section provides a short overview of transmitting compressed MPEG-2 and H.264 video over a network.

5.1 Transmission of MPEG-2 Compressed Video

MPEG-2 Systems defines different coding layers to handle several sources of data simultaneously and efficiently. A single source of compressed data, e.g. compressed audio or video, forms what is known as an Elementary Stream in MPEG Systems. The elementary stream is then partitioned into fixed or variable length packets to form a Packetized Elementary Stream (PES). These PES packets are then multiplexed together to form a Program Stream or a Transport Stream.

Program Streams: This is similar to the MPEG-1 Systems layer. It is primarily suitable for environments that are not error-prone. A Program Stream consists of one or more PES packet streams.

Transport Streams: A Transport Stream is designed for error-prone environments. A Transport Stream consists of one or more programs, each of which may have its own independent time base. Each Transport Stream packet is exactly 188 bytes in length.

The Program and Transport Streams were designed with different goals in mind. It should be noted that neither is a subset or superset of the other.
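The effect of the fixed 188-byte packet size can be illustrated with a deliberately simplified packetizer. This sketch assumes a 4-byte header that begins with the sync byte 0x47 and leaves the remaining header fields as zero placeholders; it pads the last packet with trailing 0xFF bytes, whereas a real multiplexer fills real header fields and places stuffing differently. All names are hypothetical.

```python
TS_PACKET_SIZE = 188   # fixed Transport Stream packet length in bytes
TS_HEADER_SIZE = 4     # header length assumed in this simplified model
SYNC_BYTE = 0x47       # first byte of every Transport Stream packet


def packetize(pes):
    """Split PES bytes into fixed-size packets (simplified model).

    The header carries only the sync byte plus three placeholder
    bytes, and the last packet is padded so it is still 188 bytes.
    """
    payload_size = TS_PACKET_SIZE - TS_HEADER_SIZE
    packets = []
    for start in range(0, len(pes), payload_size):
        chunk = pes[start:start + payload_size]
        header = bytes([SYNC_BYTE, 0, 0, 0])
        padding = b"\xff" * (payload_size - len(chunk))
        packets.append(header + chunk + padding)
    return packets


pes = bytes(400)                 # a 400-byte PES packet of dummy data
packets = packetize(pes)
padding_bytes = len(packets) * (TS_PACKET_SIZE - TS_HEADER_SIZE) - len(pes)
print(len(packets))              # 3
print(padding_bytes)             # 152
```

Every packet boundary is known in advance, which helps a receiver resynchronize after an error, at the cost of header overhead and padding whenever the data does not fill the last packet.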


5.2 Transmission of H.264 Compressed Video

Transmission of H.264 Video using MPEG-2 Systems: The flexibility of the standard allows H.264 VCL data to be carried over the popular MPEG-2 Systems layer via Program Streams or Transport Streams. One main issue that has arisen is the mapping of the variable-length NAL units into fixed-length packets, as is the case in MPEG-2 Systems Transport Streams [7]. MPEG-2 TS packets are always 188 bytes in length. A simple approach would align the start of each NAL unit with the start of a new TS packet. This would waste bits, however, since the end of the NAL unit would not always coincide with the end of a TS packet. The TS packet would have to be padded to meet the exact length requirement of 188 bytes, which would waste considerable bandwidth. A solution is to assign a unique start code to each NAL unit. In this way, the start of the NAL unit does not have to be aligned with the start of the TS packet, and the decoder is still able to identify the start of each NAL unit uniquely. Amendment 3 to the MPEG-2 Systems standard makes provisions for the transmission of H.264 video. The Amendment assigns a new stream type to H.264 video.

5.3 Error Resilient Coding Tools in H.264

Slice Coding: H.264 allows macroblocks to be grouped into slices. A slice is a subset of a picture. Analogous to I, P, and B frames, the H.264 standard includes I, P, and B slices. The standard also specifies switching P (SP) and switching I (SI) slices, which aid dynamic switching between bit streams encoded at different data rates. The definitions of the different slice types are very flexible.

I Slice: All of the macroblocks in an I slice are coded using intra prediction.

P Slice: Like an I slice, except that some of the macroblocks can be coded using inter prediction, with a maximum of one prediction signal.

B Slice: Like a P slice, except that some macroblocks can be coded using inter prediction with a maximum of two prediction signals.

Flexible Macroblock Ordering (FMO): The raster scan ordering (left to right, top to bottom) of macroblocks is normally used, but it may not be optimal in terms of error resilience. FMO allows a variety of scanning patterns, including checkerboards or completely random patterns. When the macroblocks are transmitted in a non-raster order, transmission errors are scattered throughout the picture instead of affecting consecutive runs of macroblocks. In turn, the correctly received macroblocks can be used to conceal the errors in the damaged macroblocks efficiently.
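The benefit of a checkerboard pattern for concealment can be sketched with a small macroblock map. This is an illustrative model only, assuming two macroblock groups assigned by the parity of the macroblock coordinates; the function names are hypothetical and this is not the syntax the standard uses to signal the mapping.

```python
def checkerboard_groups(mb_cols, mb_rows):
    """Assign each macroblock to group 0 or 1 in a checkerboard pattern."""
    return [[(x + y) % 2 for x in range(mb_cols)] for y in range(mb_rows)]


def intact_neighbors(groups, x, y, lost_group):
    """Count the 4-connected neighbors of macroblock (x, y) that do
    not belong to the lost group and are therefore still available."""
    count = 0
    for dx, dy in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nx, ny = x + dx, y + dy
        inside = 0 <= ny < len(groups) and 0 <= nx < len(groups[0])
        if inside and groups[ny][nx] != lost_group:
            count += 1
    return count


groups = checkerboard_groups(mb_cols=4, mb_rows=3)
for row in groups:
    print(row)
# [0, 1, 0, 1]
# [1, 0, 1, 0]
# [0, 1, 0, 1]

# Suppose all macroblocks of group 0 are lost in transmission.
lost = [(x, y) for y in range(3) for x in range(4) if groups[y][x] == 0]
print(min(intact_neighbors(groups, x, y, 0) for x, y in lost))  # 2
```

Even when an entire group is lost, every missing macroblock still has at least two correctly received neighbors to interpolate from, whereas a lost run in raster order leaves each missing macroblock next to other missing ones.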



This thesis presents an overview of the differences between the MPEG-2 video coding standard and the recently finalized H.264 video compression standard.

MPEG-2 popularized the general framework of a hybrid coder. The H.264 standard takes the basic building blocks of MPEG-2 and improves on every aspect, from the drift-free transform to the advanced entropy coding.