Object Detection and Tracking for Computer Vision

Abstract - This paper describes the implementation and evaluation of a system that detects and tracks moving foreground objects in MATLAB, using the Image Acquisition Toolbox and the Image Processing Toolbox. The tracking task is split into two sub-tasks: the first is to detect foreground objects in each frame of a sequence as blobs, and the second is to track the detected blobs across frames. To understand the behavior of the object detection algorithm, several of its parameters are modified and evaluated; the analyses indicate that the proposed MATLAB algorithm detects foreground objects well under static conditions. To track individual blobs across frames, a technique based on image properties is used. When multiple objects are tracked simultaneously, ambiguities can arise between blobs that are already tracked and newly detected blobs; this is the data association problem. An association technique that solves this problem in simple instances is developed and tested on several video sequences. These tests also indicate that the overall system can generally detect and track non-occluding objects in the face of some clutter in the scene.

Keywords: detection and tracking, MATLAB Toolbox

INTRODUCTION

Recently, the rapidly increasing use of advanced cameras has led to rapid growth in the amount of video imagery. Alongside these technological advances, political developments in recent years have unfortunately led to a dramatic rise in interest in surveillance, specifically to monitor vulnerable public areas such as underground stations, banks, and shopping centers. In view of the plethora of digital data that is being accumulated, an interesting and challenging problem is its algorithmic interpretation, for example for fault detection in factories or for the recognition of abnormal behavior. For surveillance and other applications, the sheer amount of data is becoming difficult to use and process manually. Not surprisingly, automated approaches are seen as a promising solution to this dilemma, because they can highlight noteworthy events in a sequence. One of the central tasks towards this goal is motion tracking: given a sequence of images from a camera (for instance, one monitoring a street scene), the aim is to detect and track moving foreground objects in such a way that the objects' motion trajectories become evident. Motion tracking thus aims to determine the visual identity of objects at different points in time. Once such information is available, it can be used in later processing steps to reveal further information about the objects, such as their goals and their relation to the environment.

This final year project describes the implementation and analysis of object detection and tracking using the Image Acquisition Toolbox and the Image Processing Toolbox. The algorithm is then compared with the work of the cited authors, and its competence is assessed on different kinds of video.

The first part concerns the technique used for object detection. The algorithm must be robust to changes in the scene geometry; for example, a vehicle that parks for a prolonged time should no longer be a foreground object after sufficient time has passed. Moreover, repetitive motion such as waving leaves should not be treated as a foreground object. The process must also adapt to illumination changes. Because its result is passed to the next stage of the system, which is tracking, some post-processing must be done to ensure the quality of the background and foreground blobs.

The second part concerns data association: how should observations be associated with the true objects that caused them? In other words, how can object identities be maintained across video frames? In addition, there is uncertainty about the true state of each object, which arises from inaccurate observations produced by the background subtraction process (for example, when approximating an object's true position or extracting its rough shape).

The goals of the project were to implement the proposed object detection method as described by its authors and to gain insight into the update equations and the associated parameters. The second aim was to track multiple objects successfully in simple situations without occlusions, and to combine all noisy observations robustly into a more accurate state estimate, a process known as filtering.

RELATED RESEARCH

Numerous researchers have worked on improving image processing algorithms. Cai et al. [1] proposed a background extraction system based on thresholding. Given a reference (dummy) value for the threshold, any pixel whose value is higher than the threshold is identified as part of a foreground object. However, this technique fails when noise appears in the frames or when the camera moves. Wren et al. [2] addressed this challenge with a more complex system called Pfinder, which they claimed can deal with background changes and uses higher-level, more complex algorithms. Fuentes et al. [3] more recently showed the importance of thresholding for identifying background and foreground blobs. Later, Huwer et al. [4] proposed a technique called adaptive background thresholding to improve the efficiency of background detection. As technology has advanced, a technique called Sequential KD approximations [5] has also been introduced. One of its disadvantages is its large memory and CPU requirements, which can prevent older computers from using it.

Spagnolo et al. [6] then presented an algorithm that performs background subtraction using the radiometric similarity of a small neighborhood, or window, within the image. According to the authors, their work uses local rather than global differencing, which improves the processing time of the algorithm. Unfortunately, this technique also needs a dummy or reference value for background pixel selection. Matsuyama et al. [7] then improved on the work of Spagnolo et al. by using the vector size of the background image and the foreground pixels for differencing. Mahadevan et al. [8] proposed a new background differencing method focused on dynamic scenes. The authors compute the saliency of the image to differentiate background and foreground pixels; as with the methods above, this technique also requires a threshold value.

For tracking, Masound et al. [9] have done a lot of work on finding the best matches between two consecutive frames. Another tracker was proposed by Cai et al. [1], who use a so-called state, which is a combination of position, velocity, and coarse features such as ratio. Fuentes et al. [3] also describe the tracking process as making an assumption about each blob, so that blobs can be grouped into entities that are more meaningful to us (for example, humans, vehicles, or trees) and that are no longer separated.

More recently, Song et al. [10] also proposed a tracking method that requires two models for the object to be tracked: a dynamic model and an appearance model. The appearance model is essentially a color histogram of the object, and the dynamic model consists of a Kalman filter used for location prediction. Collins et al. [11] then improved on this tracking technique by using the RGB colors in the frames instead of model comparison. Liang [12] extended the work of Collins et al. by adding adaptive feature selection, scale invariance, and scale variation to the tracking process. The adaptive feature selection is based on the Bayes error rate between two frames, which allows objects to be tracked successfully without being confused with one another.

METHODOLOGY

In this project, I propose a system that robustly differentiates between foreground and background pixels. The "objects" in this project are restricted to humans and vehicles, and only humans in the frames are tracked (an essential application for a home surveillance system). The system is designed to work offline; in other words, the video must be fetched and split into frames before it is fed into the system for object identification and tracking. The camera can be static or PTZ (Pan/Tilt/Zoom), in which case the background changes with the camera position.

Given an image series from a dynamic or stationary camera, the initial step is to differentiate moving foreground blobs from the background. The method used in this project is adaptive thresholding: a reference (dummy) value is given to MATLAB, and pixels whose values are higher than this value are considered part of a foreground object. Adaptive thresholding starts by converting the color frame into a grayscale image, after which the threshold value is applied by the algorithm. Whether a pixel in a given subsequent image belongs to the background or the foreground is then decided independently for each pixel, based on the stored dummy value. Hence, to obtain the best pixel classification in the frames, several experiments are needed to choose the best threshold (dummy) value.
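The grayscale conversion and reference-value thresholding described above can be sketched in a few lines. This is a minimal illustration in plain Python rather than the MATLAB toolbox code used in the project; the frame layout (nested lists of RGB tuples with channels in 0 to 255), the luma weights, and the names `to_gray` and `threshold_frame` are assumptions made for the example.

```python
# Minimal sketch of grayscale conversion followed by thresholding against a
# fixed reference (dummy) value. Assumptions, not taken from the essay:
# frames are nested lists of (r, g, b) tuples with channels in [0, 255],
# and the widely used ITU-R BT.601 luma weights are applied.

def to_gray(frame):
    """Convert an RGB frame to grayscale intensities in [0, 1]."""
    return [
        [(0.299 * r + 0.587 * g + 0.114 * b) / 255.0 for (r, g, b) in row]
        for row in frame
    ]


def threshold_frame(gray, threshold=0.5):
    """Mark each pixel brighter than the reference value as foreground (1)."""
    return [[1 if value > threshold else 0 for value in row] for row in gray]


frame = [
    [(0, 0, 0), (255, 255, 255)],
    [(200, 200, 200), (50, 50, 50)],
]
mask = threshold_frame(to_gray(frame), threshold=0.5)
print(mask)
```

Each pixel is classified on its own, which matches the per-pixel decision described above and also explains why the choice of the reference value matters so much.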

To smooth the frames and increase the number of detected object pixels, some filtering is applied in this project. The filter used is an averaging filter, which smooths the detected object pixels and produces a more connected result after thresholding. In other words, the detected object appears to the user as the true, full object rather than as a partial object. The figure below shows the difference between unfiltered and filtered frames.
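As an illustration of the averaging filter, the sketch below replaces every pixel with the mean of its neighborhood. The window radius, the border handling (averaging only the neighbors inside the image), and the name `box_filter` are assumptions for this example; the essay does not state which settings were used.

```python
# Minimal sketch of an averaging (box) filter on a grayscale frame stored as
# nested lists. The radius and the border handling are assumptions.

def box_filter(gray, radius=1):
    """Replace each pixel by the mean of its (2*radius+1) x (2*radius+1)
    neighborhood; border pixels average only the neighbors that exist."""
    height, width = len(gray), len(gray[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total, count = 0.0, 0
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += gray[ny][nx]
                    count += 1
            out[y][x] = total / count
    return out


# A single bright pixel is spread over its neighbors by the filter.
gray = [
    [0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0],
]
smoothed = box_filter(gray)
print(smoothed)
```

Because isolated bright pixels are spread out and small gaps are filled in, thresholding the smoothed frame tends to give more connected blobs than thresholding the raw frame.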

After the detection process, the frames contain a lot of unwanted noise, so some morphological processing needs to be applied before the final step of object detection. Basically, morphological processing shrinks or grows pixel regions: a detected region that is smaller than the morphological reference value is eliminated in the process, and regions larger than that value stay in the frame. The morphological processing also helps to improve the tracking results in the later stage.
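One possible reading of this clean-up step is an area opening, which removes every connected foreground region smaller than a reference size. The sketch below takes that reading as an assumption; the 4-connectivity, the `min_area` parameter, and the name `remove_small_blobs` are illustrative and not taken from the project.

```python
from collections import deque

# Minimal sketch of noise removal by area opening on a binary mask stored as
# nested lists of 0/1. Connectivity and min_area are assumptions.

def remove_small_blobs(mask, min_area):
    """Keep only 4-connected foreground regions with at least min_area pixels."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    cleaned = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if mask[y][x] != 1 or seen[y][x]:
                continue
            # Flood-fill one connected region and collect its pixels.
            region, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                region.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Regions below the reference size are treated as noise and dropped.
            if len(region) >= min_area:
                for ry, rx in region:
                    cleaned[ry][rx] = 1
    return cleaned


mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
cleaned = remove_small_blobs(mask, min_area=3)
print(cleaned)
```

In the example, the four-pixel block survives while the isolated pixel on the right is discarded as noise.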

The second stage of this project is to track moving objects successfully in real time. To achieve this, a few important parameters need to be taken into account, such as the position of the object, or the location of its pixels in the current and previous frames. The main goal here is to merge the detected blobs into real objects. The distance between blobs is fundamental to forming a new real object in a particular frame: if the distance between blobs is lower than a reference value, they are combined into one blob, generating a new blob or object that can be understood by the user. In addition, the center of mass of each object is an essential parameter for improving the tracking results. The center of mass can identify each object in the frames because it is computed from the location and size of the object. In this way, the system can identify objects across frames, or in real time.
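The merging rule and the center of mass can be illustrated as follows. This is a sketch under stated assumptions: blobs are lists of (x, y) pixel coordinates, the distance between two blobs is measured between their centers of mass, and merging is done greedily. The names `centroid` and `merge_close_blobs` and the parameter `max_distance` are hypothetical.

```python
import math

# Minimal sketch of merging nearby blobs into objects. Measuring blob
# distance between centers of mass is an assumption for this example.

def centroid(pixels):
    """Center of mass of a blob given as a list of (x, y) pixel coordinates."""
    n = len(pixels)
    return (sum(x for x, _ in pixels) / n, sum(y for _, y in pixels) / n)


def merge_close_blobs(blobs, max_distance):
    """Repeatedly merge any two blobs whose centroids are closer than
    max_distance until no such pair remains."""
    merged = [list(blob) for blob in blobs]
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                (ax, ay), (bx, by) = centroid(merged[i]), centroid(merged[j])
                if math.hypot(ax - bx, ay - by) < max_distance:
                    merged[i].extend(merged.pop(j))
                    changed = True
                    break
            if changed:
                break
    return merged


blobs = [[(0, 0), (1, 0)], [(3, 0), (4, 0)], [(20, 20)]]
objects = merge_close_blobs(blobs, max_distance=5)
print(len(objects), centroid(objects[0]))
```

Here the two blobs on the left are combined into one object, while the distant blob stays separate. Recomputing the centroid after each merge keeps the center of mass consistent with the enlarged object.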

RESULT

As stated previously, the threshold value is the most important parameter in the system. To find a suitable dummy value, multiple videos were analyzed in this project. The selected videos can be divided into three types: a static environment with only a few people (no more than 5 objects); an outdoor scene with dynamic changes due to sudden illumination changes; and a video containing the most people together with dynamic environmental changes (for example, rain).

With the proposed algorithm, the results show that the suitable values lie approximately between 0.3 and 0.7, where the dummy value can range from 0 to 1. According to the results, these suitable values account for almost 30% of all dummy values. If dummy values outside this range were used, detection and tracking would be very poor. The figures below show the results for suitable dummy values:

75%    82%    82%    97%

Table 1.0: Analyses used to choose the best dummy value for thresholding.

As can be seen above, the algorithm can fail under certain circumstances. One major failure occurs in scenes where the images are dynamic. The factors behind this challenge vary, for instance with the environmental conditions. Adaptive thresholding cannot cope well with highly dynamic scenes (where the camera moves quickly); this particularly disrupts the algorithm, because the processing time and the camera rotation time are not synchronized. In addition, occlusions and sudden illumination changes in the video also cause detection to fail.

For the tracking results, I analyzed the same three videos as before. Ideally, if the blobs are close to each other and the centroids of the blobs are known, the probability of merging increases. However, if the blobs move so fast that they fail to come within the minimum distance, the likelihood of a successful match decreases dramatically. The results of the tracker are shown below:

Video clip    Object tracked    Number of frames    Performance
rt-1          h1                1000                GOOD
rt-1          h2                1000                GOOD
rt-2          o1                200                 GOOD
rt-2          o2                200                 GOOD
rt-2          o3                200                 FAIR
rt-2          o4                200                 FAIL
rt-3          o1                Inf                 GOOD
rt-3          o2                Inf                 FAIR
rt-3          o3                Inf                 FAIR
rt-3          h1                Inf                 FAIR
rt-3          h2                Inf                 FAIR
rt-3          h3                Inf                 FAIL
rt-3          h4                Inf                 FAIL

Table 1.1: Tracker results for three different types of video

The rt-1 video involves 2 humans in a controlled environment, i.e., my room in Kampung Wai. A "GOOD" performance was achieved over 1000 frames (processed in real time). As can be seen from Table 1.1, the rt-3 video shows major tracking failures, which are due to the highly dynamic scene and the large number of humans in the video. The algorithm failed to differentiate each human within a short time. Hence, separation occurred in the rt-3 video.
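The dependence of matching on a minimum distance, and its failure for fast or crowded objects, can be made concrete with a small association sketch. It assumes a greedy nearest-centroid rule with a distance gate; this is an illustrative simplification and not necessarily the association technique used in the project. The names `associate` and `max_distance` and the example coordinates are hypothetical.

```python
import math

# Minimal sketch of frame-to-frame association by nearest centroid with a
# distance gate. Greedy matching is an assumption for this example.

def associate(tracks, detections, max_distance):
    """Match tracked objects to new detections.

    tracks: dict mapping a track id to its last known centroid (x, y)
    detections: list of centroids (x, y) found in the current frame
    Returns (matches, unmatched): matches maps a track id to a detection
    index, and unmatched lists the detection indices left unassigned."""
    pairs = sorted(
        (math.hypot(tx - dx, ty - dy), track_id, index)
        for track_id, (tx, ty) in tracks.items()
        for index, (dx, dy) in enumerate(detections)
    )
    matches, used = {}, set()
    for distance, track_id, index in pairs:
        if distance > max_distance:
            break  # all remaining pairs are even farther apart
        if track_id in matches or index in used:
            continue
        matches[track_id] = index
        used.add(index)
    unmatched = [i for i in range(len(detections)) if i not in used]
    return matches, unmatched


tracks = {"h1": (10.0, 10.0), "h2": (50.0, 50.0)}
detections = [(52.0, 49.0), (12.0, 11.0), (200.0, 200.0)]
matches, unmatched = associate(tracks, detections, max_distance=15.0)
print(matches, unmatched)
```

An object that moves farther than `max_distance` between frames is left unmatched, which mirrors the drop in successful matches for fast blobs. In crowded scenes several detections fall inside the gate of the same track, so a greedy rule of this kind can easily assign the wrong one.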

CONCLUSION

The aim of this project was to implement and evaluate a system for tracking moving foreground objects in real time. The task was decomposed into two independent problems, namely foreground blob detection and the tracking of those blobs across frames.

Overall, the work carried out so far leads to both positive and negative conclusions. The ideas presented by Cai et al. [1] do not appear to be successful in dynamic scenes, such as those with sudden illumination changes or occlusions. Hence, extensions need to be carried out to increase the detection efficiency, and these can then be applied to the tracking algorithms.

The introduction motivated this project from the perspective of automated sequence analysis for more efficient use of the large amounts of available video data. In this context, the detection of motion trajectories was identified as a key step towards this goal. In simple situations, the results produced by the implementation can often usefully summarize object trajectories in compact form for future use. Although the proposed algorithm can successfully detect and track objects, many interesting challenges still remain.

REFERENCES

[1] Q. Cai, A. Mitiche, and J.K. Aggarwal. Tracking human motion in an indoor environment. Image Processing, 1995. Proceedings., International Conference on, 1:215-218 vol.1, Oct 1995.

[2] Christopher M. Bishop. Pattern Recognition and Machine Learning. Springer-Verlag New York, Inc., Secaucus, NJ, USA, 2006.

[3] S. Julier and J. Uhlmann. A new extension of the Kalman filter to nonlinear systems. In Proceedings of the 11th International Symposium on Aerospace/Defense Sensing, Simulation and Controls, Orlando, FL, 1997.

[4] Y. Cai, N. de Freitas, and J.J. Little. Robust visual tracking for multiple targets. In A. Leonardis, H. Bischof, and A. Pinz, editors, Proceedings of the European Conference on Computer Vision, pages 107-118, 2006.

[5] Brad Schumitsch, Sebastian Thrun, Gary Bradski, and Kunle Olukotun. The information-form data association filter. In Y. Weiss, B. Schölkopf, and J. Platt, editors, Advances in Neural Information Processing Systems 18, pages 1193-1200. MIT Press, Cambridge, MA, 2006.

[6] David A. Forsyth and Jean Ponce. Computer Vision: A Modern Approach, pages 388-392. Prentice Hall Professional Technical Reference, 2002.

[7] Chris Stauffer and W. Eric L. Grimson. Learning patterns of activity using real-time tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):747-757, 2000.

[8] R. Jonker and A. Volgenant. A shortest augmenting path algorithm for dense and sparse linear assignment problems. Computing, 38(4):325-340, 1987.
