Implementation Of The Kite Cross Diamond Search Algorithm Computer Science Essay

Published: Last Edited:

This essay has been submitted by a student. This is not an example of the work written by our professional essay writers.

To achieve high compression ratio in video coding, a technique known as Block Matching Motion Estimation has been widely adopted in various coding standards. This technique is implemented conventionally by exhaustively testing all the candidate blocks within the search window. The process to estimate the pels or pixels of the current frame from reference frame(s) is called motion estimation (ME). This type of implementation, called Full Search (FS) Algorithm, gives the optimum solution. However, substantial amount of computational workload is required in this algorithm. To overcome this drawback, many fast Block Matching Algorithms (BMA's) have been proposed and developed. Different search patterns and strategies are exploited in these algorithms in order to find the optimum motion vector with minimal number of required search points.

One of these fast BMA's, which is proposed to be implemented in this project, is called Kite Cross Diamond Search (KCDS) Algorithm. The student is required to implement the algorithm in MATLAB and then compare its performance to FS algorithm as well as to other fast BMA's in terms of the peak signal-to-noise ratio (PSNR) produced, number of search points required and computational complexity.

Block matching method is to seek for the best matched block from the previous frame, which is the first single frame, within a fixed-sized of search window (w).It introduce the Full search (FS) method. Full search method matches all possible displaced candidate block within the search area in the reference frame in order to find the block with minimum distortion, so this FS algorithm have large motion and more searching point to do the blocks matching. It introduces high intensive computation. Many fast motion estimation algorithms have been proposed to give a faster estimation with similar block distortion compared to FS over the last two decades. The fast Block matching algorithm are the three step search (TSS), the new three-step search (NTSS), the four-step search (4SS), the diamond search (DS), the cross-diamond search (CDS) and small cross-diamond.

In TSS, NTSS, 4SS, BBGDS algorithms, squared-shaped search patterns of different sizes are employed. On the other hand, the DS algorithm adopts a diamond-shaped search pattern, which has demonstrated faster processing with similar distortion in comparison with NTSS and 4SS.


The scopes of works in this project are:

Background Study:

Study on video or image compression, Motion Estimation, Block Matching Algorithm and Kite Cross Diamond Search (KCDS).

Implementation of the Kite Cross Diamond Search algorithm using MATLAB.

The proposed algorithm is implemented and simulated using MATLAB.

Performance Analysis

The performance of Kite Cross Diamond Search (KCDS) algorithm is compared with existing fast block matching motion estimation algorithms.


The main objective for this project is to implement a one of the available fast Block Matching algorithm Kite Cross Diamond Search (KCDS). The others objectives are :

To implement and develop the of Kite Cross Diamond Search (KCDS) algorithm by using the MATLAB.

To compare the performance of Kite Cross Diamond Search (KCDS) with other algorithm.

To produce working MATLAB program code for the algorithm



2.1 Video Compression

Video compression is to reducing the quantity of data used to represent digital video images, and is a combination of spatial image compression and temporal motion compensation. Video compression algorithms use the fact that there are usually only small changes from one frame to the next so they only need to encode the starting frame and sequence of differences between frames.

Video compression is the concept of the coding of video data in video file formats and streaming video formats. It compressed video can effectively reduce the bandwidth required to transmit video. While lossless video compression is possible, in practice it is virtually never used and all standard video data rate reduction involves discarding data.

2.2 Motion Estimation

Motion estimation is the process of determining motion vectors that describe the transformation from one 2D image to another; usually from adjacent frames in a video sequence. The temporal prediction technique used in MPEG video is based on motion estimation. By sending a encoded difference images which inherently have less energy and can be highly compressed as compared to sending a full frame, it need to save on bits. This is for the idea of motion estimation based video compression. From the video sequence, the motion estimation takes out the motion information.

Motion estimation is to estimate the pels or ixels of the current frame from reference frame. All the video consists of many sequences of frame, when the motion in a sequence, then a pel on the same part of moving object is a better prediction for the current pel. To calculate the low energy residue, the motion information is needed to use in video compression to find best matching block in reference frame. The Motion Compensation is to use of the knowledge of the displacement of an object is better successive frames.

By modifying the reference frames, which is very close match to the current frame, for the current frame will create a model by the motion estimation module. The motion vectors is relate to the whole image or specific parts, such as rectangular blocks, arbitrary shaped patches or even per pixel. Motion estimation plays a key role in many video applications, such as frame-rate video conversion, video retrieval, video surveillance, and video compression.

2.3 Block Matching Algorithm (BMA)

The temporal redundancy remove technique between 2 or more successive frame is called block matching algorithm (BMA). It is an integral part for most of the motion-compensated video coding standards. Marco blocks (MB) is the frames are being divided into regular size blocks. The macro block is compared with corresponding block and its adjacent neighbors in the previous frame to create a vector that stipulates the movement of a macro block from one location to another in the previous frame. This movement calculated for all macro blocks comprising a frame, constitutes the motion estimated in the current frame.

Block matching method is to seek for the best matched block from the previous frame, usually the first single frame, within a fixed sized of search window (w). The displacement of the best matched block will be described as the motion vector (MV) to the block in the current frame by based on a block distortion measure (BDM) or other matching criteria.

Figure 1: Block Matching Process, the two frames used to determine the motion vector of a given block.

Figure 2: Block matching a macro block

The best match is usually evaluated by a cost function based on a BDM such as Mean Square Error (MSE) and the most popular and less computationally expensive is Mean Absolute Difference (MAD).

Many fast block-matching algorithms have been developed, for example, three-step search (TSS), new three-step search (NTSS), four-step search (4SS), Diamond Search (DS), etc. These fast BMA exploit different search patterns and search strategies for finding the optimum motion vector with drastically reduced number of search points as compared with the FS algorithm.

2.4 Full Search Algorithm

Full search (FS) algorithm is a most accurate strategy in BMA which evaluates all possible candidate motion vectors in search window to find the optimum. The candidate that gives the best match for a given block distortion measure is chosen as the estimated motion vector.

Figure 3: Full Search

2.5 Type of Fast Motion Estimation Algorithm

-Three Step Search (TSS) algorithm

- New Three Step Search (NTSS) algorithm

-Four Step Search (FST) algorithm

- Kite Cross Diamond search (KCDS) Algorithm

2.4.1 Three Step Search (TSS) algorithm

The three-step search algorithm has been widely used in block matching motion estimation due to its simplicity and effectiveness. The sparsely distributed checking point's pattern in the first step is very suitable for searching large motion. Among these fast BMAs, the three-step search (3SS) algorithm becomes the most popular because of its simplicity and effectiveness. Searching with a large pattern in the first step, 3SS is more efficient to find the globe minimum especially for those sequences with large motion.

The Step had shown the TSS method.

1. An initial step size is picked. Eight blocks at a distance of step size from the centre (One center point and eight points on the boundary of the search square) are compared in the first step and thereafter only 8 points are searched.

2. At the start of a new step the search point center is moved from the best matching point from the previous step. Step size is reduced by half after each step.

3 . At the end of the search the step size is one pel.

2.4.2 New Three Step Search (NTSS) algorithm

NTSS algorithm is a modification of the well known TSS algorithm, which is widely used as the motion estimation technique in some low bit-rate video compression algorithms. The NTSS modified the checking point's pattern in the first step by searching additional central eight points. At the mean time, N3SS used a halfway-stop technique to speed up the (quasi-) stationary blocks matching.

NTSS differs from

TSS by full search assuming a center biased checking point pattern in the first step. Details are as follows.

1. In addition to the original checking points used in TSS, eight extra points are added, which are the eight extra neighbors of the search window (center-bias).

2. A half-way stop technique is used for stationary and quasi stationary block in order to fast identify and then estimate the motions for these blocks.

a If the minimum BDM in the first step occurs at the search window center, stop search (first-step-stop).

b. If the minimum BDM point in the first step is one of the eight neighbors of the window center, the search in the second step will be performed only for eight neighboring points of the minimum and then stop search (second-step-stop).

Figure 4: NTSS search point

2.4.3 Four Step Search (FST) algorithm

With a search pattern of +/-7, the FSS algorithm utilizes a center-biased search pattern with nine checking points on a 5 X 5 window in the first step, instead of a 9 X 9 window as in the TSS. The search window size of the next two steps depended on the location of the minimum block distortion measure (BDM) points. The FSS algorithm is summarized as follows:

1. A minimum BDM point is found by a nine checking points pattern on 5x5 window located at the center of the 15x15 searching area as shown in Figure 5(a). If the minimum BDM point is found at the center of the search window, go step 4, otherwise go step 2.

2. The search window size is maintained 5 X 5. However search pattern will depend on the position of the previous minimum BDM.

a. If the previous minimum BDM point is located at the comer of pervious search window, five additional checking points shown in Figure 5 (b) are used.

b. If the previous minimum BDM point is located at the middle of the horizontal or the vertical axis of the previous search window, three additional checking points as shown in Figure 5 (c) are used.

c. If the minimum BDM is found at the center of the search window, go to step 4 else, go to step 3.

3. The searching pattern strategy is the same as step 2, but it will finally go to step 4.

4. The search window is reduced to 3 X 3 as shown in Figure 5 (d) and the direction of the overall motion vector is considered as the minimum BDM point among these nine searching points.

Figure 5: search pattern of FSS. (a) first step, (b) second/third step. (c) second/third step, (d) fourth step.

2.4.4 Kite Cross Diamond search (KCDS) Algorithm

The kite-cross-diamond search (KCDS) algorithm adopts a novel asymmetric kite-shaped search patterns in the search step to keep similar distortion or even better in low-motion sequence while the speed of the motion estimation for stationary or quasi-stationary blocks is further boosted. KCDS is particularly faster and more accurate in some kinds of sequences. This algorithm is especially suitable for videoconferencing applications.

The search-point configurations used in the KCDS are divided in 3 different shapes: cross-shaped pattern, diamond-shaped pattern and kite-shaped pattern. The details and the analysis of the algorithm are given below:

Step 1 (Starting - Small Cross-Shape Pattern SCSP): A minimum BDM is found from the 5 search points of the SCSP (Small Cross-Shaped Pattern) [Figure 6] located at the center of search window. If the minimum BDM point occurs at the center of the SCSP (0,0), the search stops (First Step Stop); otherwise, go to Step 2.

Step 2 (KSP): With the vertex (minimum BDM point) in the first SCSP as the center, a particular KSP is formed based on the motion direction in previous step. For example, if the minimum BDM is located in upper vertex in first step, the new KSP will be an up-kite shape described as Figure 7 (a). Thus, there are 4 cases of newly formed KSP in this step: up-kite, down-kite, right-kite and left-kite, depends on the MV motion in step 1. If the minimum BDM point occurs at the center of this KSP, the search stops (Second Step Stop); otherwise go to Step 3.

Step 3 (Diamond Searching): A new Large-Diamond-Shaped Pattern LDSP is formed by repositioning the minimum BDM found in previous step as the center of the LDSP. If the new minimum BDM point is at the center of the newly formed LDSP, then go to Step 4 for converging the final solution; otherwise, this step is repeated.

Step 4 (Ending - Converging step): With the minimum BDM point in the previous step as the center, a SDSP (Small Diamond-

Shaped Pattern) is formed. Identify the new minimum BDM point from the SDSP, which is the final solution for the motion vector.

Figure 6: Search pattern KCDS

Figure 7: Kite Search Patterns: (a) up kite (b) left kite (c) down kite (d) right kite




From the early project start, the time table is set to help this project is complete with systematic and achieve the objective of the PSM I. There are have a guideline and helping from the supervisor to lead the project move forward. With this guideline of project, the problem and trouble will be overcome. The following is the explanation for the methods to achieve the objective of project.

Project plan

The first discussion with the supervisor is to design the project flow for the PSM I. The explanation of the supervisor about the project is useful for understand the concept of the Motion Estimation. Supervisor plays an important role to help to enhance and develop the project.

Literature review

Literature review is important, because it is the part to gain the knowledge and deeply understand the project needed. Literature reviews are secondary sources, which are finding the theoretical and methodological to support the concept of the project. The source of information could be the text book, journal, internet, magazine or the opinion of the supervisor. The technique used for the Kite Cross Diamond Search (KCDS) algorithm is studies and understood.

By refer to the source, the first task to complete is to upload and play the video by using the MATLAB software.

After uploading the video and play it, the video frame is being extracted from the video sequence and displayed by using MATLAB. The frame has to be represented in number in order to implement the algorithm. The frame can divided into square block with the pixel dimension.

Implementation of the Kite Cross Diamond Search (KCDS) Algorithm

The three step search algorithm is studies to learn more about the block matching algorithm. It has advantages to learn the basic Block Matching Algorithm before to start implement the Kite Cross Diamond Search algorithm.

The Kite Cross Diamond Search (KCDS) algorithm is being implemented on block to determine the motion vectors. The predicted frames are reconstruction by using the motion vectors obtained from the previous stage.

Performance Analysis

The result of Kite Cross Diamond Search (KCDS) algorithm is compared to others algorithm and determined the most suitable algorithm to be used in motion estimation in terms of PSNR, number of search points required and computational complexity.

Report Writing and Presentation

The complete report is written and the result of this project is presented.



4.1 Video Sequence Uploading

The video must in the AVI file format to upload into MATLAB. For the information, the coding "mov=aviread(filename)" is used to read the AVI movie filename into MATLAB movie structure mov.

The information about the video can be specific by using fileinfo=aviinfo(filename). The information is shown below:

To view the video from the MATLAB workspace, the "mplay('filename.avi') is used. But the audio is not allowed for the "mplay".

After the coding is debugging, it will come out the window workspace to view the video. Just click the play from the window workspace and it will start play.

4.2 Frame Extraction

The CData is a property of an image object contains the data array. By using the colormap colors or as an RGB image, the dimensionality of the CData array controls whether the image displays. If the CData array is two-dimensional, the image is either an indexed image or an intensity image; in either case, the image is displayed using colormap colors. If, on the other hand, the CData array is m-by-n-by-3, it displays as a truecolor image, ignoring the colormap colors.

From the workspace, it shows:

To convert y to double:

From the workspace, it shows:

Im2double (y) converts the intensity image I to double precision, rescaling the data if necessary.

4.3 Block

To choose the row number and column number, just change the x1 and y1 values. The imshow(E) is to display the image of the block.


From the current result, the video uploading by using MATLAB have been learnt and the basic of the Block Matching Algorithm is studies from the literature review. Besides that, the concept of the frame extraction is learnt and the image can converted into number and the block of the image success to display.




As a conclusion, by refer to the gantt chart table, the progress of the project is under the expectation. So there is no problem for the progress for the project. For the PSM I, the implementation of KCDS algorithm is started, but it still in the literature review, so the implementation of KCDS algorithm will be continue in the PSM II. The project will complete in the expected time.