Motion estimation



Many visual communication applications require a high compression ratio because of limited channel bandwidth and the demands of real-time video playback. Motion estimation is a key element of many video compression schemes. A video sequence consists of a series of frames. To achieve high compression, the temporal redundancy between adjacent frames must be identified and eliminated. Block matching motion estimation is a popular technique for reducing this temporal redundancy in video coding.

This project deals with an efficient algorithm for fast Block Matching Motion Estimation (BMME) called the New Diamond Search Algorithm. The basic idea is to divide the current frame into macroblocks. Each macroblock is compared with candidate blocks in the reference frame using an error measure, and the best matching block is selected. The displacement between the macroblock and its best match is called the motion vector. These vectors are used during the reconstruction process.


Video has been a major part of public consciousness for over 50 years. The emergence of the VCR and the growing use of cable television have created an open environment in which video producers can rapidly distribute information to a large number of consumers. The domains of video, computer systems, and communications services used to be quite distinct. They are now converging, so that information, communications, and entertainment can share a common set of services and equipment for distribution. The development of digital video technology has made it possible to use digital video compression in a variety of telecommunications applications, such as video teleconferencing and digital telephony. Digital video compression plays an important role in multimedia applications. To manage large multimedia data objects efficiently, they need to be compressed to reduce their storage size. Video compression tries to eliminate the temporal redundancy between adjacent frames. Once the redundancy is removed, the object requires less memory and, being smaller, takes less time to transmit over the network. This in turn significantly reduces storage and transmission costs.


Digital video technology has grown steadily, driven by new applications such as video e-mail, mobile phone video communication, and video conferencing on the web. These applications continuously push research in digital video coding forward. To be sent over the internet or over wireless networks, video information needs compression to meet bandwidth requirements. Compression is mainly achieved by exploiting the redundancy present in the data.

A video sequence consists of a series of frames. Two successive frames are usually very similar; this similarity is called temporal redundancy. To achieve a high compression ratio, the temporal redundancy should be identified and eliminated. The principle is as follows. The displacement of objects between successive frames is first estimated. The resulting motion information is then exploited for efficient interframe coding. Consequently, the motion information and the prediction error are transmitted instead of the frame itself. Block matching techniques are generally used for motion estimation in video coding.



Motion estimation has proven to be effective in exploiting the temporal redundancy of video sequences and therefore forms a central part of all hybrid video compression standards. Intuition suggests that moving pictures have a pixel conservation property: pixels in one frame may be translated to form the pixel patterns in a subsequent frame. In other words, images corresponding to objects in one frame move within the frame to form the corresponding objects in the subsequent frame. This temporal redundancy between successive frames may be exploited in a number of ways. The simplest method of exploiting temporal redundancy in a video sequence is frame differencing. This strategy assumes that the average motion is small and simply compresses the pixel differences between two frames. It is useful for fast, low-resolution video compression.
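As a rough illustration of frame differencing, the sketch below subtracts two small grayscale frames pixel by pixel and rebuilds the second frame from the first plus the difference. The function names and the 4×4 sample frames are assumptions made for this example only; they do not come from the project.

```python
def frame_difference(previous, current):
    """Per-pixel difference (current - previous) of two equally sized frames."""
    return [[c - p for p, c in zip(prev_row, cur_row)]
            for prev_row, cur_row in zip(previous, current)]


def rebuild_frame(previous, difference):
    """Recover the current frame from the previous frame and the difference."""
    return [[p + d for p, d in zip(prev_row, diff_row)]
            for prev_row, diff_row in zip(previous, difference)]


# Hypothetical 4x4 grayscale frames; only one pixel changes between them.
previous = [[10, 10, 10, 10],
            [10, 50, 50, 10],
            [10, 50, 50, 10],
            [10, 10, 10, 10]]
current = [[10, 10, 10, 10],
           [10, 50, 52, 10],
           [10, 50, 50, 10],
           [10, 10, 10, 10]]

difference = frame_difference(previous, current)
# Count the pixels that actually changed; only these carry new information.
nonzero = sum(1 for row in difference for d in row if d != 0)
print(nonzero)
```

When little moves between frames, most difference values are zero, which is what makes the difference image cheaper to compress than the frame itself.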

Principles of motion estimation

Motion estimation combined with motion-compensated prediction is by far the most efficient and widely used technique for achieving the high levels of compression typical of modern video compression standards. In this technique, a frame is divided arbitrarily into macroblock (MB) regions. The assumption is that each MB is composed of closely associated pixels. It is therefore highly likely that motion in the frame will cause most pixels within a macroblock to move a consistent distance in a consistent direction. Motion estimation determines a motion vector for each macroblock in the current frame by searching for its best matching macroblock in a previous frame. Motion vectors are therefore a measure of the temporal redundancy between successive frames. Motion compensation then predicts the current frame from the pixel information of the previous frame and the motion vectors.
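The division into macroblocks can be sketched as follows. The 16×16 block size and the QCIF frame size (176×144 pixels) are assumptions chosen for illustration, and `macroblock_origins` and `extract_block` are hypothetical helper names.

```python
def macroblock_origins(height, width, block_size=16):
    """Top-left (row, col) corner of every full macroblock in a frame."""
    return [(row, col)
            for row in range(0, height - block_size + 1, block_size)
            for col in range(0, width - block_size + 1, block_size)]


def extract_block(frame, row, col, block_size):
    """Copy the block_size x block_size block whose top-left corner is (row, col)."""
    return [line[col:col + block_size] for line in frame[row:row + block_size]]


# Assumed example: a QCIF luminance frame, 144 rows by 176 columns.
origins = macroblock_origins(144, 176)
print(len(origins))  # 9 rows of blocks x 11 columns of blocks
```

Each origin identifies one macroblock for which a motion vector is then estimated independently.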

Motion compensation

During reconstruction, the reference frame is used to predict the current frame using the motion vectors. This technique is known as motion compensation. During motion compensation, the macroblock in the reference frame that is referenced by the motion vector is copied into the reconstructed frame.
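A minimal sketch of this copy step is shown below. It assumes that each motion vector is stored as (dy, dx) against the top-left corner of a block and that every vector points to a block lying fully inside the reference frame; the name `motion_compensate` and the sample data are illustrative only.

```python
def motion_compensate(reference, vectors, block_size):
    """Predict a frame by copying motion-shifted blocks out of the reference frame.

    vectors maps each block's top-left (row, col) in the predicted frame to a
    motion vector (dy, dx); the source block sits at (row + dy, col + dx).
    Assumption: all source blocks lie inside the reference frame.
    """
    height, width = len(reference), len(reference[0])
    predicted = [[0] * width for _ in range(height)]
    for (row, col), (dy, dx) in vectors.items():
        for y in range(block_size):
            for x in range(block_size):
                predicted[row + y][col + x] = reference[row + dy + y][col + dx + x]
    return predicted


reference = [[1, 2, 3, 4],
             [5, 6, 7, 8],
             [9, 10, 11, 12],
             [13, 14, 15, 16]]
# Hypothetical vectors for four 2x2 blocks.
vectors = {(0, 0): (0, 0), (0, 2): (0, -2), (2, 0): (0, 0), (2, 2): (-2, 0)}
predicted = motion_compensate(reference, vectors, block_size=2)
for line in predicted:
    print(line)
```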

A great deal of research has been carried out in this area, and work is still ongoing. Summaries of some of the published work and its results are given below.

Stiller and Konrad review the estimation of 2D motion from time-varying images, paying particular attention to estimation criteria and optimization strategies. For a given region of support, the chosen motion model determines the dimensionality of the estimation problem as well as the amount of data that has to be subsequently interpreted or transmitted. The data term of an estimation criterion is usually supplemented with a smoothness term, which may be expressed explicitly or implicitly via a constraining motion model. Since optimizing an estimation criterion typically involves a large number of unknowns, the review also presents several fast search strategies [1].

Video processing involves many techniques for handling motion, covering both motion representation and motion estimation. Block-based motion estimation partitions the frame into fixed blocks and characterizes each with a simple (translational) motion. It assumes that all pixels within a block undergo the same motion, and the motion parameters of each block are searched independently. An exhaustive search guarantees the optimal solution within the search range, and the block-based approach offers a good compromise between accuracy and complexity. The usual block matching error criteria are the mean squared error (MSE) and the sum of absolute differences (SAD) [2].
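The two error criteria can be written down in a few lines. This is a sketch for blocks stored as lists of rows; the function names and the sample 2×2 blocks are assumptions for illustration.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def mse(block_a, block_b):
    """Mean squared error between two equally sized blocks."""
    pixel_count = len(block_a) * len(block_a[0])
    squared = sum((a - b) ** 2
                  for row_a, row_b in zip(block_a, block_b)
                  for a, b in zip(row_a, row_b))
    return squared / pixel_count


block_a = [[10, 12], [14, 16]]
block_b = [[11, 12], [10, 16]]
print(sad(block_a, block_b))  # |−1| + 0 + |4| + 0
print(mse(block_a, block_b))  # (1 + 16) / 4
```

SAD needs no multiplications, which is one reason it is commonly preferred when many candidate blocks must be compared.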

The key step of motion estimation for video coding is determining the motion vector. Wang presents the general concept of the exhaustive block matching algorithm. The basic idea is to divide the frames into macroblocks and compare blocks from two frames of the image sequence using the sum of absolute differences. The displacement of the block in the current frame with respect to the reference frame is calculated. This gives the motion vector, which is used in motion compensation. The pros and cons of the algorithm are also discussed [3].
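A minimal sketch of exhaustive block matching follows, assuming SAD as the error measure and a square search range of ±`search_range` pixels. The helper names and the synthetic 6×6 frames are made up for this example; candidates that fall outside the reference frame are simply skipped.

```python
def block_sad(current, reference, row, col, dy, dx, block_size):
    """SAD between the current block at (row, col) and the reference block displaced by (dy, dx)."""
    return sum(abs(current[row + y][col + x] - reference[row + dy + y][col + dx + x])
               for y in range(block_size) for x in range(block_size))


def full_search(current, reference, row, col, block_size, search_range):
    """Test every displacement in the search range and return (best_vector, best_sad)."""
    height, width = len(reference), len(reference[0])
    best_vector = (0, 0)
    best_cost = block_sad(current, reference, row, col, 0, 0, block_size)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            inside = (0 <= row + dy <= height - block_size
                      and 0 <= col + dx <= width - block_size)
            if not inside:
                continue  # candidate block would leave the reference frame
            cost = block_sad(current, reference, row, col, dy, dx, block_size)
            if cost < best_cost:
                best_vector, best_cost = (dy, dx), cost
    return best_vector, best_cost


def make_frame(size, patch_row, patch_col):
    """Black frame with one distinctive 2x2 patch (synthetic test data)."""
    frame = [[0] * size for _ in range(size)]
    patch = [[9, 8], [7, 6]]
    for y in range(2):
        for x in range(2):
            frame[patch_row + y][patch_col + x] = patch[y][x]
    return frame


reference = make_frame(6, 1, 3)  # patch at row 1, column 3
current = make_frame(6, 2, 2)    # same patch, now at row 2, column 2
print(full_search(current, reference, row=2, col=2, block_size=2, search_range=2))
```

With a search range of ±p the algorithm evaluates up to (2p + 1)² candidates per block, which is the cost that fast search strategies try to reduce.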

Model parameter estimation from block motion vectors can be used to extract accurate motion information. The method extracts motion vectors from a sequence of images using a size-variable block matching algorithm, which dynamically determines the search area and the size of each block. It then uses adaptive robust estimation to filter the extracted motion vectors and estimate the model parameters accurately [4].

Neural fuzzy motion estimation provides new schemes for motion estimation and compensation based on neural fuzzy systems. Conventional motion estimation schemes often neglect the strong temporal correlation between frames, and because the search window remains the same throughout the image sequence, they require heavy computation. This algorithm reduces the search area and assumes that each block of pixels moves with uniform translational motion. A fuzzy system uses a set of if-then rules to map inputs to outputs. Here it uses the motion vectors of neighboring blocks to map the prior frame's pixel values to the current pixel values, using 196 rules. The fuzzy system learns and updates its rules as it decodes the image. This approach improves the compensation accuracy [5].

The four-step search algorithm uses a center-biased search pattern with nine checking points on a 5×5 window. The center of the search window is then shifted to the point with the minimum block distortion measure. If the minimum point is found at the center of the search window, the search window is reduced to 3×3 and the search stops at this small window. Compared with earlier algorithms, this reduces the worst-case computational requirement from 33 to 27 search points [6].

Note: The numbers in square brackets refer to entries in the reference list at the end of this essay.



Diamond Search

Diamond search (DS) is based on block matching motion estimation. The reference frame is divided into 8×8 blocks, and each block is compared with candidate blocks in the next frame using the SAD measure. The search pattern is described below.

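The large diamond search pattern (LDSP) comprises nine checking points: a center point and eight points surrounding it in a diamond shape. These nine points are tested to find the point of minimum block distortion (MBD). A small diamond search pattern (SDSP), consisting of a center point and its four nearest neighbors, is used for the final step.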

Steps of the DS algorithm for finding the motion vector:

  1. Center the LDSP at the origin of the search window and test its nine checking points. If the MBD point is located at the center position, jump to step 3. Otherwise, continue with step 2.
  2. Reposition the MBD point found in the previous search step as the center point of a new LDSP. If the new MBD point is located at the center position, jump to step 3. Otherwise, repeat this step.
  3. Switch the search pattern from LDSP to SDSP. The MBD point found in this step is the final solution: its displacement is the motion vector, which points to the best matching block.
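The three steps can be sketched in Python as below. This is an illustrative sketch rather than the project's implementation: it assumes SAD as the distortion measure, treats candidates outside the frame or the search range as infinitely costly, and recomputes the SAD of points it has already visited, which a practical implementation would avoid. The synthetic ramp frames are chosen only because their true displacement, (2, -1), is known in advance.

```python
# Offsets (dy, dx) of the two patterns; the center comes first so ties keep the center.
LDSP = [(0, 0), (-2, 0), (-1, -1), (-1, 1), (0, -2), (0, 2), (1, -1), (1, 1), (2, 0)]
SDSP = [(0, 0), (-1, 0), (0, -1), (0, 1), (1, 0)]


def diamond_search(current, reference, row, col, block_size, search_range):
    """Return (motion_vector, sad) for the current block at (row, col)."""
    height, width = len(reference), len(reference[0])

    def cost(vector):
        dy, dx = vector
        inside_range = abs(dy) <= search_range and abs(dx) <= search_range
        inside_frame = (0 <= row + dy <= height - block_size
                        and 0 <= col + dx <= width - block_size)
        if not (inside_range and inside_frame):
            return float("inf")  # never selected as the MBD point
        return sum(abs(current[row + y][col + x] - reference[row + dy + y][col + dx + x])
                   for y in range(block_size) for x in range(block_size))

    def best_point(center, pattern):
        candidates = [(center[0] + oy, center[1] + ox) for oy, ox in pattern]
        return min(candidates, key=cost)

    center = (0, 0)
    while True:
        best = best_point(center, LDSP)   # steps 1 and 2: large diamond
        if best == center:
            break                         # MBD at the center: go to step 3
        center = best                     # recenter the LDSP on the MBD point
    vector = best_point(center, SDSP)     # step 3: small diamond
    return vector, cost(vector)


# Synthetic frames: the content of the current frame at (y, x) sits at (y + 2, x - 1)
# in the reference frame, so the expected motion vector is (2, -1).
reference = [[10 * y + x for x in range(12)] for y in range(12)]
current = [[10 * (y + 2) + (x - 1) for x in range(12)] for y in range(12)]
vector, error = diamond_search(current, reference, row=4, col=4, block_size=4, search_range=4)
print(vector, error)
```

Because the MBD point only moves when a strictly smaller distortion is found, the loop over the large pattern always terminates.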


The Diamond Search implementation involves the following steps:

  • Input: the two image frames.

  • Block matching process: The BlockMatching module performs the block matching used to calculate the motion vectors. Two image frames are extracted, and the reference frame is split into 8×8 blocks. Each block in the reference frame is compared with candidate blocks in the current frame by calculating the SAD; the candidate with the smallest SAD has the least matching error. The displacement between the current block and this best matching block is the motion vector to be stored.

  • Image reconstruction process: The ImageReconstruct module performs the reconstruction, using the estimated motion vectors to rebuild the frame.



A video stream consists of bytes representing pixel values. For color video, each pixel is typically represented by three bytes. A collection of pixels makes up a frame, a still image of the scene at a certain time. A sequence of frames makes up the video.
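To give a feel for the data volume this implies, the short calculation below multiplies out the raw size of uncompressed video. The CIF resolution (352×288) and the rate of 25 frames per second are assumptions picked for illustration, and `raw_video_bytes` is a hypothetical helper.

```python
def raw_video_bytes(width, height, frames_per_second, seconds, bytes_per_pixel=3):
    """Size in bytes of uncompressed video with the given dimensions and duration."""
    return width * height * bytes_per_pixel * frames_per_second * seconds


frame_bytes = raw_video_bytes(352, 288, frames_per_second=1, seconds=1)  # one frame
one_second = raw_video_bytes(352, 288, frames_per_second=25, seconds=1)
print(frame_bytes, one_second)
```

Under these assumptions a single second of video already occupies about 7.6 million bytes.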

Digital images and video are demanding in terms of storage and transfer. It is therefore often necessary to compress the data by finding alternative representations. One may take into account the way the human visual system works and remove certain information without making the loss too noticeable to human viewers.

When compressing video, the similarities between nearby frames can be exploited. With motion compensation, the algorithm tries to find the most similar block in an already seen frame by searching a small neighborhood of the current block's position. The current block is then coded using the prediction error from the matching block.
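The benefit of coding the prediction error against a well-matched block can be sketched with a small comparison. The pixel values are invented for this example: `matched_block` stands for the block found by the motion search, and `colocated_block` for the block at the same position in the earlier frame when no motion search is done.

```python
def prediction_error(current_block, predicted_block):
    """Residual that must be coded: current minus prediction, pixel by pixel."""
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(current_block, predicted_block)]


def decode_block(predicted_block, residual):
    """Decoder side: prediction plus residual restores the current block."""
    return [[p + r for p, r in zip(pred_row, res_row)]
            for pred_row, res_row in zip(predicted_block, residual)]


def residual_magnitude(residual):
    """Sum of absolute residual values, a rough proxy for coding cost."""
    return sum(abs(value) for row in residual for value in row)


current_block = [[52, 55], [61, 59]]
colocated_block = [[20, 22], [25, 30]]  # hypothetical: same position, no motion search
matched_block = [[50, 55], [60, 60]]    # hypothetical: best match from the search

residual = prediction_error(current_block, matched_block)
print(residual_magnitude(residual))
print(residual_magnitude(prediction_error(current_block, colocated_block)))
```

In this made-up case the motion-compensated residual is far smaller than the residual against the co-located block, and the decoder can still restore the block exactly from the matched block and the residual.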



  1. Christoph Stiller and Janusz Konrad, "Estimating Motion in Image Sequences".
  2. Gokce Dane, "Overview of Motion in Video Processing".
  3. Yao Wang, "Motion Estimation for Video Coding", Polytechnic University, Brooklyn.
  4. Seok-Woo Jang, Marc Pomplun, and Hyung-Il Choi, "Adaptive Robust Estimation of Model Parameters from Block Motion Vectors", Department of Computer Science, University of Massachusetts at Boston.
  5. Hyun Mun Kim and Bart Kosko, "Neural Fuzzy Motion Estimation and Compensation", IEEE Transactions on Signal Processing, Vol. 45, No. 10, October 1997.
  6. Lai-Man Po and Wing-Chung Ma, "A Novel Four-Step Search Algorithm for Fast Block Motion Estimation", IEEE Transactions on Circuits and Systems for Video Technology, Vol. 6, No. 3, p. 313, June 1996.
  7. Tuukka Toivonen, Janne Heikkilä, and Olli Silvén, "A New Algorithm for Fast Full Search Block Motion Estimation Based on Number Theoretic Transforms".
  8. Ja-Ling Wu, "Motion Estimation for Video Coding Standards", Department of Computer Science and Information Engineering, National Taiwan University.
  9. Deepak Turaga and Mohamed Alkanhal, "Search Algorithms for Block-Matching in Motion Estimation".
  10. Archana Rao and Manasa Raghavan, "Fast Motion Estimation Algorithms: Computation and Performance Trade-Offs".
  11. Markus Kuhn, "Information Theory and Coding: Image, Video and Audio Compression".
  12. Bernd Girod, "Image and Video Compression".