Navigation Algorithm For A Mobile Robot Computer Science Essay



Computer vision is one of the main sensing technologies used in mobile robot observation systems. A mobile robot can find its way in a known environment by using visual information about its surroundings to estimate the position and direction of obstacles. The main objective of this thesis is to detect obstacles and build a stereo depth perception system for collision-free, run-time navigation of a mobile robot in a known environment. To this end, robust stereo depth perception and obstacle detection algorithms are developed and applied to the navigation of an autonomous vehicle equipped with stereo vision. The distance to an obstacle is estimated from the disparity map produced by the stereo matching process. The disparity map is produced by an area-based matching algorithm using the Sum of Absolute Differences (SAD) and gradient information. Camera calibration, rectification, stereo correspondence and disparity calculation are all considered in the thesis. In addition, a navigation algorithm is developed so that the mobile robot can move safely within its environment and travel reliably from a start point to a destination point. With the help of the developed algorithms, the robot was able to reach the target point without colliding with any obstacle. The project is implemented in Matlab.




Mobile robots differ from industrial robots in that they are capable of moving around their environment. Mobile robotic systems are applied in many areas, including people tracking, mobile robot navigation, and mining. They are also used in industrial automation and 3D machine vision applications to execute tasks such as bin picking, volume measurement, automotive part measurement, and 3D object location and recognition. In all these fields, the mobile robot must have a sensing capability, which is achieved with a sensor system that allows the vehicle to obtain information from the environment. The data provided by the sensor system must be useful for the control and planning of the mobile robot.

Range sensors such as ultrasonic and infrared sensors can support accurate navigation of mobile robots, but when features such as the size and complete position of an obstacle are needed, range systems alone are not enough to sense the robot's environment. In that case, three-dimensional perception systems are applied to mobile robots.

Three Dimensional Robot Perception

Spatial sensing is one of the techniques that can be applied to get information from the environment.

The three main types of spatial sensors are:

Laser Range Finders

Sonar Sensors

Stereo Vision

1.2.1 Laser Range Finders

Laser range finders compute the distance to objects in the near surroundings by estimating the time of flight of a laser pulse. Laser rays are reflected back by the objects in front of the laser source. The view angle of the optical sensor is kept slightly different from that of the laser source, as shown in Figure 1.1. The depth information is obtained from the difference between the measured and the expected point of incidence of the laser ray on the optical sensor. Laser scanners emit trillions of photons in the direction of an object and receive only a small percentage of those photons back. The reflectivity of an object depends on its color [1]. For example, black, which has the maximum absorption, reflects only a small amount of the incident rays. Transparent objects such as glass only refract the light and give incorrect dimensional information. A typical laser range finder is shown in Figure 1.2.
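The time-of-flight principle mentioned above can be illustrated with a short Matlab sketch. The round-trip time used here is a hypothetical value chosen for illustration and is not a measurement from this project.

```matlab
% Time-of-flight ranging: the pulse travels to the object and back,
% so the one-way distance is half of (speed of light * round-trip time).
c = 299792458;            % speed of light in m/s
t_round_trip = 20e-9;     % HYPOTHETICAL measured round-trip time (20 ns)
distance_m = c * t_round_trip / 2;
fprintf('Estimated distance: %.3f m\n', distance_m);
```

A round trip of 20 ns corresponds to an object roughly 3 m away, which shows why laser range finders need very precise timing electronics.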

Figure 1.1 Schematics of a Laser Scanning Operation

Figure 1.2 Sick Laser Scanner

1.2.2 Sonar Sensors

Sonar sensors are used for obstacle avoidance because they are small and have low power consumption. Unfortunately, their resolution is too low to reconstruct a dense depth map of the surrounding environment. To obtain 3D information about the surroundings with ultrasonic sensors, a scanning head composed of multiple sensors and motors is needed. A simple ultrasonic transmitter-receiver pair is shown in Figure 1.3.

Figure 1.3 Simple Sonar Sensors

1.2.3 Stereo Vision

Stereo vision is an extensively used technique for recovering the three-dimensional positions of objects from two or more simultaneous views of a scene. It is of great advantage for mobile robots to use stereo vision to acquire range information about the environment, as it is a very reliable method. The precision of the results is usually very important for applications such as depth perception and obstacle avoidance. In addition, stereo vision is a passive sensing technique.

Figure 1.4 Parallel Stereo Vision camera

The advantage of stereo vision systems is that they do not require complex hardware: two cameras mounted at the same level are sufficient, so it is possible to build a custom binocular vision system. The depth information is not measured directly by the hardware; it has to be derived from the binocular images with the help of disparity.

Although sonar sensors are fast and laser scanners are accurate, each has its own disadvantages. The advantage of stereo vision over laser scanning and sonar sensors is that stereo systems are passive and provide a high-resolution depth map.

When mobile robots work in a real-time environment, object detection and identification is the major problem encountered. To deal with this problem, visual information is extensively used for navigation and obstacle detection of mobile robots. In our project we used a stereo vision sensor to navigate our robot. Since stereo vision gives dense depth maps, it helps us to identify small or fine objects significantly better than sonar sensors.

Obstacle avoidance is an important task for mobile robots and has been a major research challenge for many years; many obstacle detection and avoidance systems have been designed so far. Nevertheless, designing a fully robust and consistent system for real environments remains a challenging task.

1.3 Research Objectives

The main purposes of this thesis are described below.

The aim of the study was to develop an autonomous robot that is able to navigate visually in a known environment and to avoid obstacles during navigation. Obstacle detection and avoidance are performed by our robot using a computer stereo vision system.

A navigation algorithm for a mobile robot is developed using stereo vision in MATLAB. In the navigation algorithm, the starting and destination coordinates are specified and the robot travels from the starting point to the target point while avoiding static obstacles.

Hence the thesis explains the idea of obstacle avoidance by detecting obstacles using the stereo vision technique implemented on a mobile robot.


1.4 Format of Thesis

This section gives a chapter-wise outline of the thesis.

The thesis is organized in the following manner.

Chapter 1 provides basic information about mobile robot navigation using the stereo vision technique and its advantages over other spatial sensing techniques. It also explains the main objective of the research and describes the format of the thesis.

Chapter 2 is a brief literature survey of stereo vision and of how stereo vision is used in the navigation of mobile robots.

Chapter 3 discusses the stereo correlation algorithms, the stereo vision depth perception system, disparity maps, and the steps involved in stereo vision, such as camera calibration and image processing and enhancement techniques.

Chapter 4 covers the navigation techniques used in the project and explains some navigation algorithms.

Chapter 5 is the conclusion of the thesis, in which the major issues faced during the work are discussed. In addition, the various contributions are mentioned and ideas for future work are explained.



2.1. Introduction

The two major necessities in mobile robot navigation are



In this research stereo vision is used for perception, which is presented in Section 2.2. Studies relating to mobile robot navigation using stereo vision are also reviewed in this chapter.

2.2 Stereo Vision

Computer vision systems are currently gaining considerable attention from researchers because of their ability to generate accurate three-dimensional information about the environment, which plays a key role in the development of intelligent robots. Three-dimensional features are extracted using passive sensing of the surroundings [2, 3].

Normally, a stereo vision system uses a combination of two cameras. Each camera provides a two-dimensional representation of the environment in the form of an image. The three-dimensional information is then extracted by interpreting two or more two-dimensional images of the environment. The result is a map that gives the points of correspondence between the images and the real three-dimensional scene. A detailed explanation of stereo vision is given in [4, 5].

In recent times, numerous stereo algorithms have been designed to find the correspondence between the two images or, in other words, to calculate the disparity map. Two simple and fast methods are the "Sum of Absolute Differences" (SAD) and the "Sum of Squared Differences" (SSD), which sum the absolute or squared differences of the pixel intensities, respectively, to determine the similarity between image regions [6].
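As a minimal illustration of these two similarity measures, the following Matlab sketch computes SAD and SSD for a pair of small image patches. The patch values are hypothetical and chosen only to make the arithmetic easy to follow; lower values indicate more similar patches.

```matlab
% Two HYPOTHETICAL 3x3 grayscale patches (values chosen for illustration).
left_patch  = [10 12 11; 13 15 14; 12 11 10];
right_patch = [11 12 13; 13 14 14; 10 11 12];

% Convert to double so that differences are not clipped by integer types.
diff_patch = double(left_patch) - double(right_patch);

sad = sum(abs(diff_patch(:)));   % Sum of Absolute Differences
ssd = sum(diff_patch(:) .^ 2);   % Sum of Squared Differences
fprintf('SAD = %d, SSD = %d\n', sad, ssd);
```

SSD penalizes large intensity differences more strongly than SAD, while SAD is cheaper to compute, which is one reason it is popular in run-time systems.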

An area-based algorithm performs best for run-time applications. The core of the algorithm in [7] relies on the uniqueness constraint and on a matching process that allows previous matches to be rejected as soon as more reliable ones are found. This approach is compared with the left-right consistency constraint. The algorithm has been carefully optimized to attain a very fast implementation on a personal computer [7].

A technique for effective distance detection of obstacles in stereo vision applications is presented in [8]. The application is useful for the navigation of a stereo vision autonomous vehicle. The distance is detected and examined from the disparity map produced by the stereo matching process. During navigation, extracted pixel values are passed to the curve fitting tool (cftool), which is available in Matlab, to determine the distance of obstacles. The disparity map is produced by a block matching algorithm using the Sum of Absolute Differences (SAD). The range determined by cftool is used as a reference for the navigation of the autonomous vehicle [8].

A new approach to stereo matching for autonomous robot applications is presented in [9]. In this framework a precise but slow reconstruction of the 3D scene is not required; rather, fast localization of the obstacles in order to avoid them is more important. Existing techniques rely on point-wise correspondence, but they are inefficient in practical contexts because of uniform patterns or perturbations between the two images of the stereo pair. The approach treats the stereo correspondence problem as a matching between homogeneous regions. The technique is fast and robust to variations in the stereo pair and to homogeneous and repetitive areas [9].

The paper proposes a joint scheme in which motion and stereo vision are used to infer scene structure and determine free space areas. Binocular disparity, computed on several stereo images over time, is combined with optical flow from the same sequence to obtain a relative-depth map of the scene. Both the time-to-impact and depth scaled by the distance of the camera from the fixation point in space are considered as good, relative measurements which are based on the viewer, but centered on the environment.

The feasibility of the approach in real robotic applications is demonstrated by several experiments performed on real image data acquired from an autonomous vehicle and a prototype camera head.[10]

Camera Models and Parameters [11]

Pinhole cameras

Analytical Euclidean geometry

The intrinsic parameters of a camera

The extrinsic parameters of a camera

Camera calibration

Least-squares techniques


When no prior information about camera calibration is available (uncalibrated cameras), the RANSAC algorithm can be applied to corner features extracted from the images to calculate the fundamental matrix, which involves the camera models.

RANSAC randomly chooses the minimal subset of data points necessary to fit the model (a sample).

RANSAC divides the data into inliers and outliers and yields an estimate computed from a minimal set of inliers [12].
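The sample-and-score loop described above can be sketched in Matlab as follows. For brevity this sketch fits a straight line (minimal sample of two points) instead of a fundamental matrix; the synthetic data, the iteration count and the inlier threshold are all assumptions made for illustration. The final least-squares refit on all inliers is a common refinement and is not part of the description above.

```matlab
% Synthetic data: 20 points on y = 2x + 1 plus 5 gross outliers (HYPOTHETICAL).
x = (1:25)';
y = 2 * x + 1;
y(21:25) = [100; -50; 80; -30; 120];   % outliers

num_iters = 200;      % ASSUMED number of random minimal samples
threshold = 0.5;      % ASSUMED max vertical residual for an inlier
best_inliers = false(size(x));

for k = 1:num_iters
    idx = randperm(numel(x));          % first two entries form the minimal sample
    i1 = idx(1);
    i2 = idx(2);
    slope = (y(i2) - y(i1)) / (x(i2) - x(i1));
    intercept = y(i1) - slope * x(i1);
    residuals = abs(y - (slope * x + intercept));
    inliers = residuals < threshold;   % split data into inliers and outliers
    if sum(inliers) > sum(best_inliers)
        best_inliers = inliers;        % keep the model with the largest consensus
    end
end

% Optional refinement: least-squares refit on all inliers of the best model.
coeffs = polyfit(x(best_inliers), y(best_inliers), 1);
fprintf('Inliers: %d, slope = %.3f, intercept = %.3f\n', ...
        sum(best_inliers), coeffs(1), coeffs(2));
```

The same structure applies to fundamental matrix estimation, where the minimal sample is a set of point correspondences and the residual is a distance to the epipolar line.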

2.3 Mobile Robot Navigation

A few significant studies in the field of mobile robot navigation using stereo vision are discussed in this section.

Raw stereo images cannot be used directly for obstacle detection and navigation, because image processing and enhancement techniques must first be applied to them to obtain the disparity information.

Real-time or near real-time stereo images are available from different stereo hardware systems. These images need substantial post-processing to extract three-dimensional range information, so raw stereo images alone are ineffective for robot obstacle avoidance and navigation tasks. For this reason, researchers have normally applied image processing and image enhancement methods.

A new obstacle detection technique was introduced by Kumano and Ohya [13]. In this technique stereo depth is calculated without matching corresponding points in the images. The technique is fast enough for real-time mobile robot navigation, but it is not robust enough for obstacle detection, because some objects could not be detected during navigation.

The distance transform (DT) technique was first introduced by Chin et al. [14]. DT can be used for path planning and object detection at the same time, but its performance degrades when it is used for both purposes simultaneously, so it is mostly used for navigation. The DT technique in [9] can be further improved by optimizing the DT algorithm.

The method for object detection and path planning proposed by Sabe et al. [14] is based on floor extraction using the Hough transform. In that work a humanoid robot was used.

Navigation environments can be split into one of the following categories



Partially known

Chapter 3

Stereo Vision & Depth Estimation

3.1 Introduction

Stereo vision is modeled on the human visual system, which uses two eyes to gain depth information.

Stereo vision is the visual perception process that calculates depth from two different projections of the world onto the image planes of two cameras.

Stereo vision is a part of the field of computer vision. It is commonly used in mobile robotics for obstacle detection and depth measurement.

In computer stereo vision, two cameras separated by a certain distance take pictures of the same scene.

Figure (3.1)

From a computational standpoint, a stereo system must solve two problems.

The first, known as correspondence, consists of determining which item in the left image corresponds to which item in the right image.

The difficulty in correspondence is that some parts of the scene are visible to one camera only. A stereo algorithm must be able to handle parts of the image that cannot be matched.

A stereo matching algorithm compares the two images by shifting one over the other to find the parts that match. In other words, for every pixel in the left image it tries to find the corresponding pixel in the right image. The correct corresponding pixel is the pixel representing the same physical point in the scene. The distance between two corresponding pixels in image coordinates is called the disparity and is inversely proportional to depth. The disparity at which objects in the two images match best is used by the stereo triangulation algorithm to calculate their distance.
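The inverse relation between disparity and depth can be made concrete with the standard triangulation formula for parallel cameras, Z = f * B / d. The sketch below uses the focal length in pixels from the intrinsic matrix K reported later in this chapter; the baseline B is an assumed value, since the camera separation is not stated in the text, and the disparities are example values.

```matlab
f_px = 349.7388;          % focal length in pixels (entry K(1,1) of the intrinsic matrix)
baseline_m = 0.10;        % ASSUMED camera separation in metres (not given in the text)
disparity_px = [5 10 20 40];   % example disparities in pixels

% Triangulation for a parallel stereo rig: depth is inversely proportional to disparity.
depth_m = f_px * baseline_m ./ disparity_px;

for k = 1:numel(disparity_px)
    fprintf('disparity %2d px -> depth %.3f m\n', disparity_px(k), depth_m(k));
end
```

Doubling the disparity halves the estimated depth, so depth resolution is best for nearby objects and degrades with distance.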


[Block diagram of the stereo vision system. Pre Processing Operations: left frame and right frame (grabber), camera calibration (intrinsic and extrinsic parameters, offline), rectified left and right images, cropping after rectification. Stereo Match (i.e. stereo corresponding): stereo matching algorithm applied. Post Processing Operations: filter applied to smooth the result of stereo matching, obstacle detection (nearest) and localization, depth calculation for navigation.]




3.2 Pre Processing

Pre-processing operations are performed before stereo matching and play a vital role in stereo vision depth estimation. Images acquired directly from the cameras are inadequate for depth perception and must be prepared for the matching operation.

Steps for Pre Processing

Image Grabbing

Stereo Rectification

Image Crop

3.2.1 Image Grabbing

We used the Image Acquisition Toolbox to interface with the cameras. The steps are previewing the two cameras at the same time, acquiring frames, and importing the left and right frames into the workspace for the subsequent operations.

%% Live video acquisition using Logitech cameras

vid = videoinput('winvideo', 1, 'RGB24_320x240');%right cam

vid1 = videoinput('winvideo', 2, 'RGB24_320x240');%left cam



2-Set frame rate

3-Importing data to workspace



The binocular vision hardware (i.e. the stereo cameras) has to be interfaced through a driver in the programming language in order to get synchronized images. The process uses two parallel cameras aligned at a known distance of separation. Each camera captures an image, and these images are analyzed for common features.

In stereo vision, image grabbing is performed with a stereo rig. There are many professional stereo rigs on the market, but expensive hardware is not needed for our robot. Today most webcams are of reasonable quality and have a sufficiently high frame rate to be practical on slow-moving robots. For this reason we decided to construct our own stereo rig.

Constructing Stereo Hardware

The stereo vision hardware consists of two USB webcams of the same model. Using the same model ensures that the optics of both cameras are as similar as possible, with the same field of view and focal length.

In our project we used Logitech cameras (Logitech Webcam C120). This is a CMOS webcam with up to 30 frames per second and a wide field of view.

This specification is sufficient for good stereo matching results.

To make a stereo rig, a physical support is necessary to mount the cameras. The stability and precision of the physical support are very important, because without correct image alignment the correspondence algorithm cannot work properly.

Figure (3.3)

3.2.2 Stereo Rectification

Stereo rectification is the process of projecting two or more images onto a common image plane and aligning their coordinate systems. It is a useful procedure in many computer vision applications. Camera calibration and rectification are the most important pre-processing steps for the calculation of disparity, which is inversely proportional to depth.

Epipolar Geometry

The epipolar geometry is the intrinsic projective geometry between two views. It does not depend on scene structure and depends only on the cameras' internal parameters and relative pose.


Epipole

The point of intersection of the line joining the camera centres (the baseline) with the image plane.

The two cameras are indicated by their centers C and C' and their image planes. The camera centers, the 3-space point X, and its images x and x' lie in a common plane.

Epipolar plane

A plane containing the baseline.


Epipolar line

The intersection of an epipolar plane with the image plane. An epipolar plane intersects the left and right image planes in epipolar lines and defines the correspondence between the lines.

Epipolar Constraint

The epipolar constraint establishes a mapping between points in the left image and lines in the right image, and vice versa. This constraint reduces the search area for a correspondence to a single line.

Consider two points P and Q on the same line of sight of the reference image R (both points project onto the same image point p ≡ q on the image plane π R of the reference image).

The epipolar constraint states that the correspondence for a point belonging to the (red) line of sight lies on the green line on the image plane π T of the target image.
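The epipolar constraint is usually written as x'^T F x = 0, where F is the fundamental matrix and x, x' are corresponding points in homogeneous pixel coordinates. The Matlab sketch below checks this for the special case of an ideal rectified pair, for which F takes the simple form shown and the constraint reduces to "corresponding points lie on the same image row". The pixel coordinates are hypothetical.

```matlab
% Fundamental matrix of an ideal rectified stereo pair (pure horizontal shift).
F = [0 0 0; 0 0 -1; 0 1 0];

x_left      = [120; 80; 1];   % HYPOTHETICAL pixel (u, v) in the reference image
x_right_ok  = [105; 80; 1];   % candidate on the same row
x_right_bad = [105; 83; 1];   % candidate three rows lower

% Epipolar line in the target image: [a; b; c] with a*u + b*v + c = 0.
epiline = F * x_left;

residual_ok  = x_right_ok'  * epiline;   % zero when the constraint holds
residual_bad = x_right_bad' * epiline;   % nonzero when it is violated
fprintf('residual (same row) = %d, residual (other row) = %d\n', ...
        residual_ok, residual_bad);
```

Because only candidates with zero residual need to be considered, the correspondence search shrinks from the whole image to one line.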

3.2.3 Camera Parameters

Pinhole Camera model

Every camera maps the three dimensional environment to a two dimensional image. The simplest camera model that models this mapping is the pinhole camera model.

Figure 3.5 Pinhole Camera Model

A pinhole camera consists of two planes, the retinal plane and the focal plane, with the optical center C in the middle, as shown in Figure 3.5. The image is formed on the retinal plane; the focal plane is parallel to the retinal plane at a distance called the focal length. A three-dimensional point P in the real world is mapped to the two-dimensional image via a perspective projection. Pinhole cameras are characterized by two sets of parameters. Internal or intrinsic parameters describe the internal geometry and the optical characteristics of the camera. Extrinsic or external parameters describe the camera position and orientation in the real world. To relate two images captured by two different cameras, both intrinsic and extrinsic parameters are essential.

Figure 3.6 Intrinsic and Extrinsic Camera Parameters

As shown in Figure 3.6, the system for modeling two cameras consists of three different coordinate systems: the world reference frame (xw, yw, zw), the camera frame (x, y, z) with the optical centre as origin, and the image frame (u, v). A three-dimensional point given in homogeneous world coordinates can be converted into the camera frame by a rotation rij and a translation tj, which are expressed by the extrinsic parameters Te.

The point is then projected onto the two-dimensional image plane using the intrinsic parameters.

The intrinsic parameters are as follows:

f = focal length

(u0, v0) = principal point (center of the image plane)

(k0, k1) = the pixel size in mm

α = f / k0 (3.3)

β = f / k1 (3.4)

The transformation using the intrinsic parameters is as follows:

Points on the focal plane, where z = 0 and s = 0 respectively, cannot be transformed to image plane coordinates, because division by zero is not defined: the straight line through such a point and the optical centre is parallel to the image plane and does not intersect it. In summary, a point given in world coordinates is transformed onto the two-dimensional image plane using the following equation
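For reference, the standard pinhole relations consistent with the symbols defined above (Te, rij, tj, α, β, u0, v0 and the scale factor s) can be written as follows. The name T_i for the intrinsic matrix and the assumption of zero skew are choices made for this sketch.

```latex
\begin{aligned}
\begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}
&= T_e \begin{pmatrix} x_w \\ y_w \\ z_w \\ 1 \end{pmatrix},
\qquad
T_e = \begin{pmatrix}
r_{11} & r_{12} & r_{13} & t_1 \\
r_{21} & r_{22} & r_{23} & t_2 \\
r_{31} & r_{32} & r_{33} & t_3 \\
0 & 0 & 0 & 1
\end{pmatrix} \\[4pt]
s \begin{pmatrix} u \\ v \\ 1 \end{pmatrix}
&= T_i \begin{pmatrix} x \\ y \\ z \end{pmatrix},
\qquad
T_i = \begin{pmatrix}
\alpha & 0 & u_0 \\
0 & \beta & v_0 \\
0 & 0 & 1
\end{pmatrix} \\[4pt]
u &= \alpha \frac{x}{z} + u_0, \qquad
v = \beta \frac{y}{z} + v_0, \qquad s = z
\end{aligned}
```

The last line makes the division by z explicit, which is why points with z = 0 (and therefore s = 0) have no image coordinates.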

Knowledge of the intrinsic and extrinsic camera parameters allows the images to be rectified, and after rectification the epipolar constraint is ensured. These parameters are obtained through camera calibration.

Camera Calibration

To align the epipolar lines (rectification), i.e. to enforce the epipolar constraint, the webcams need to be calibrated. Calibrating a camera means estimating the parameters of the camera model, namely the intrinsic and extrinsic parameters.

Both the intrinsic and the extrinsic parameters of a camera can be calculated using the Camera Calibration Toolbox for Matlab by Jean-Yves Bouguet.

Steps In camera calibration full Guide(


Acquire images from both cameras.

Use a chessboard pattern.

Extract the corners from the chessboard.

The result is a .mat file containing the intrinsic and extrinsic parameters of the stereo camera.

Fig(3.7).LEFT & RIGHT Camera Calibration images from different views and distance.

Focal Length: fc = [ 353.13743 352.32978 ] ± [ 8.08085 8.23615 ]

Principal point: cc = [ 167.26989 120.14406 ] ± [ 7.71296 10.28340 ]

Skew: alpha_c = [ 0.00000 ] ± [ 0.00000 ] => angle of pixel axes = 90.00000 ± 0.00000 degrees

Distortion: kc = [ 0.14315 -0.96487 0.00008 -0.00188 0.00000 ] ± [ 0.14581 1.34769 0.01398 0.00889 0.00000 ]

Pixel error: err = [ 0.16729 0.18380 ]


Extrinsic parameters:

Translation vector: Tc_ext = [ -78.231335 -53.225431 584.101418 ]

Rotation vector: omc_ext = [ 2.169627 2.155013 0.116551 ]

Rotation matrix: Rc_ext = [ 0.006944 0.993779 0.111155

0.999970 -0.006531 -0.004076

-0.003325 0.111180 -0.993795 ]

Pixel error: err = [ 0.10002 0.13720 ]

Calibration results for left camera.

Image Rectification

The results of camera calibration (intrinsic and extrinsic parameters) are used to satisfy the epipolar constraint and subsequently to rectify the left and right images.

Using the calibration information, rectification:

Removes the lens distortion

Brings the left and right images into standard form.

Figure (3.9)

Bouguet's algorithm pre-computes the necessary indices and blending coefficients to enable quick rectification. This computation is done offline, once, before image acquisition; the same transforms are then used to rectify all subsequent images grabbed from the stereo rig. In our case the rectified composition of the left and right images is shown below.


3.2.4 Image Crop

Rectification causes the left and right camera images to be slightly tilted or shifted up or down. As a result, some regions of the images are empty and are padded with 0s or 1s. To eliminate this padding we crop the images. In our case the acquired images are of size 320x240 and are cropped to 300x210.
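The cropping step can be sketched in Matlab with plain array indexing. The text does not say where the crop window is placed, so this sketch assumes a centred window; the synthetic frame below is only a stand-in for a real rectified image.

```matlab
% Stand-in for a rectified 320x240 frame (240 rows by 320 columns in Matlab).
img = zeros(240, 320, 'uint8');
img(16:225, 11:310) = 255;      % pretend the valid (non-padded) region is centred

crop_h = 210;                   % target height from the text (300x210)
crop_w = 300;                   % target width from the text

% ASSUMPTION: the crop window is centred in the frame.
row0 = floor((size(img, 1) - crop_h) / 2) + 1;   % first kept row
col0 = floor((size(img, 2) - crop_w) / 2) + 1;   % first kept column
cropped = img(row0:row0 + crop_h - 1, col0:col0 + crop_w - 1);

fprintf('Cropped size: %d x %d\n', size(cropped, 2), size(cropped, 1));
```

The same row and column ranges must be applied to both the left and the right image so that corresponding rows stay aligned after cropping.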

Cropped images of size 300x210

Figure (3.11)(a) (b)

3.3 Stereo Match

The algorithm tries to find, for each pixel of the left image (reference), the corresponding pixel in the right image (target).

The correspondence problem is commonly solved in four steps:

Matching cost computation

Cost aggregation

Disparity optimization

Disparity refinement

Algorithms for stereo are classified as

Local (WTA(strategy))


Local Approach

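In the local approach, a pixel from the left image is compared with all pixels in the same row of the right image by shifting. This pixel-wise comparison is the simplest approach but is rarely used on its own.

Figure (3.12)

Local methods aggregate the matching cost over a support window to increase the signal-to-noise ratio (SNR) and use WTA to select the disparity.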

Matching cost

Absolute Difference

Squared Difference

Area based Matching cost

Sum of absolute difference

Sum of squared difference

Gradient based

Disparity Computation

Winner takes all


Disparity is larger for points near the camera and decreases as distance increases.
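The local approach listed above (area-based SAD cost, fixed support window, winner-takes-all selection) can be sketched in Matlab as follows. This is a minimal, unoptimized illustration on a synthetic image pair, not the implementation used in the project; the test pattern, window size and disparity range are assumptions.

```matlab
% Synthetic pair: the right image is the left image shifted by 3 pixels,
% so the true disparity is 3 wherever the search window fits.
true_disp = 3;
[col_idx, row_idx] = meshgrid(1:40, 1:12);
left_img  = mod(7 * col_idx.^2 + 13 * row_idx + 3 * col_idx .* row_idx, 97);
right_img = zeros(size(left_img));
right_img(:, 1:end - true_disp) = left_img(:, 1 + true_disp:end);

half_win = 2;          % ASSUMED 5x5 support window
max_disp = 8;          % ASSUMED disparity search range 0..max_disp
[h, w] = size(left_img);
disp_map = zeros(h, w);

for r = 1 + half_win : h - half_win
    for c = 1 + half_win + max_disp : w - half_win
        ref_block = left_img(r - half_win:r + half_win, c - half_win:c + half_win);
        best_cost = inf;
        for d = 0:max_disp
            % Candidate block in the target image, shifted left by d pixels.
            cand = right_img(r - half_win:r + half_win, ...
                             c - d - half_win:c - d + half_win);
            cost = sum(abs(ref_block(:) - cand(:)));   % SAD matching cost
            if cost < best_cost                        % winner takes all
                best_cost = cost;
                disp_map(r, c) = d;
            end
        end
    end
end

valid = disp_map(1 + half_win:h - half_win, 1 + half_win + max_disp:w - half_win);
fprintf('Recovered disparities in valid region: min %d, max %d\n', ...
        min(valid(:)), max(valid(:)));
```

In a run-time system the triple loop would be replaced by box filtering so that the window sums are updated incrementally instead of being recomputed for every pixel.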

Figure (3.14)

A disparity map stores the disparity, and hence the depth, of each pixel in an image. Each pixel in the disparity map corresponds to the same pixel in the reference image. The disparity of a physical point is determined from its projections in the multiple views. By repeating this process for all points of the 3D scene, the correspondence phase computes a disparity map. Stereo images and their corresponding disparity results, before and after filtering, are shown below.

Figure (3.15) (original size 300x210)

% With Camera intrinsic matrix

K = [349.7388 0 163.2014

0 348.5764 112.4836

0 0 1.0000 ];

Single Matching Phase Algorithm

Image type: grayscale

Preprocessing: subtraction of mean values

Matching cost (Step 1): Absolute Differences

Aggregation strategy (Step 2): FW

Disparity selection (Step 3): WTA

Outlier detection: efficient strategy (later, Step 4)

Discards uniform areas: yes, analyzing image variance

Optimizations: box-filtering

Runs in real-time on a standard PC

3.4. 3D reconstruction depth

After obtaining a successful disparity result, we can reconstruct a 3D scene from the 2D images of both cameras using the camera calibration results (intrinsic parameters) and the image color information. The resulting 3D reconstructions are shown below.
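The reconstruction of a single 3D point from a pixel and its disparity can be sketched in Matlab as follows. The intrinsic matrix K is the one reported earlier in this chapter; the baseline, the pixel location and the disparity are assumed values, since they are not given in the text.

```matlab
K = [349.7388 0 163.2014; 0 348.5764 112.4836; 0 0 1];   % intrinsic matrix from the text
baseline_m = 0.10;      % ASSUMED baseline in metres (not stated in the text)

u = 200;                % HYPOTHETICAL pixel column in the reference image
v = 100;                % HYPOTHETICAL pixel row in the reference image
d = 12;                 % HYPOTHETICAL disparity at that pixel, in pixels

Z = K(1, 1) * baseline_m / d;        % depth along the optical axis
X = (u - K(1, 3)) * Z / K(1, 1);     % lateral offset from the optical axis
Y = (v - K(2, 3)) * Z / K(2, 2);     % vertical offset from the optical axis
point_3d = [X; Y; Z];

% Sanity check: projecting the point back through K recovers the pixel.
reproj = K * point_3d;
reproj = reproj(1:2) / reproj(3);
fprintf('3D point: (%.3f, %.3f, %.3f) m\n', X, Y, Z);
```

Applying the same three lines to every pixel of the disparity map, and attaching the pixel color to each point, yields a colored point cloud of the scene.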



4.1. Introduction

In robotics applications, navigation of the robot in its environment is an important task. A mobile robot with a vision system can navigate in an environment by using visual knowledge of its surroundings to find the location of obstacles. An autonomous robot basically has two important tasks: first, obstacle avoidance, and second, movement from a starting point to a final point, also called the destination point. Robot navigation can also be defined as the task of finding the robot's own position in its coordinate system and then planning a way to reach its goal location while avoiding obstacles.

Within the scope of our project, robot navigation is accomplished by combining two tasks:

Perception

Localization

Perception is the process of converting the stereo image data into meaningful data such as a depth map. Localization refers to the robot's capability to find its own position and orientation relative to the starting point in an environment. This requires the position and orientation to be expressed in the same reference and coordinate frame as the starting point.

4.2. Perception

One of the main tasks of an autonomous robot is to acquire meaningful knowledge about its environment. This is done by taking measurements from various sensors and then extracting meaningful information from those measurements.

Details of perception are given in the previous chapter. Visual information alone is not sufficient for robot movement. For safe navigation, obstacle detection is also necessary in addition to perception. Moreover, obstacle detection is not the only requirement; position measurement is also important for navigation.

The perception system of the autonomous robot can be divided into two parts:

Depth perception and obstacle avoidance

Position measuring system

Obstacle detection involves a large amount of computation. The developed stereo vision algorithm produced high-quality results using two calibrated cameras. However, depth information alone is not enough for navigation. An obstacle detection algorithm is also necessary and was developed in the previous chapter. The following paragraphs discuss the position measuring system.

To navigate accurately, an autonomous robot needs an odometry system. A broad selection of sensors is used in autonomous robots, such as servo potentiometers and encoders for measuring wheel position and tachometers for measuring wheel velocity. GPS (Global Positioning System) gives the longitude, latitude and altitude of the robot, but unfortunately GPS does not work properly in indoor applications.

4.3 Localization

A mobile robot can move safely in its environment and reach its destination position.

During autonomous navigation it is important to know the location of the robot relative to its surroundings. With this information we can find the location of obstacles.

The flow chart for navigation is as follows:

[Flow chart for navigation: Stereo Activation; decision (No / Yes); Linear Forward Motion; Linear Backward Motion; Stereo Deactivate]




Navigation can be defined through this flow chart, first of all robot start navigation with the stereo vision activation. We should try to navigate with obstacle avoidance. In our project scope environment are known. Robot movement like whether he is move forward or backward depend upon object distance from robot. For Example object distance is about 1m from robot then robot stopped navigation or move backward. This all work can be done on a single line; robot can be able to move in a single line. If robot find any obstacle on a line nearby 1m then he stopped, or the user remove the obstacle from their position then robot continue to navigate in an environment.

The robot can also move backward, for example when the user deliberately brings an object to within about 1 m of it. All of this is done using the stereo vision operation.

During navigation the robot checks whether the stereo vision algorithm is still working. If the algorithm has stopped, the robot stops navigating; otherwise it continues.
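The decision logic described above can be sketched as a single function. This is a minimal Python sketch under stated assumptions, not the Matlab code used in the project. The text says the robot "stops or moves backward" at about 1 m without giving the exact rule, so the sketch assumes a hypothetical tolerance band: inside the band the robot stops, and closer than the band it moves backward. The names `decide_motion`, `stop_distance_m` and `tolerance_m` are introduced only for illustration.

```python
def decide_motion(stereo_ok, obstacle_distance_m, stop_distance_m=1.0, tolerance_m=0.05):
    """Choose a motion command for a robot moving along a single line.

    stereo_ok: False if the stereo vision algorithm has stopped working.
    obstacle_distance_m: distance to the nearest obstacle on the line,
        or None if no obstacle was detected.
    stop_distance_m, tolerance_m: assumed values; the text only says "about 1 m".
    """
    if not stereo_ok:
        # Without stereo vision the robot has no depth information, so it stops.
        return "stop"
    if obstacle_distance_m is None:
        return "forward"
    if obstacle_distance_m < stop_distance_m - tolerance_m:
        # Obstacle is closer than the allowed distance: back away from it.
        return "backward"
    if obstacle_distance_m <= stop_distance_m + tolerance_m:
        # Obstacle is at roughly the stop distance: wait until it is removed.
        return "stop"
    return "forward"
```

Calling the function once per processed stereo frame gives the loop in the flow chart: the robot keeps moving forward while the path is clear, waits at about 1 m, and resumes when the obstacle is removed.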



5.1. Conclusion and Discussion of Present Work

This thesis concentrated on stereo vision depth estimation and object detection from the depth map for autonomous robotic applications. The algorithms developed were used for safe navigation in a known environment. The thesis also examined different camera calibration algorithms, to find which is fastest for the present work, as well as image rectification. Without camera calibration and image rectification, a stereo vision depth map cannot be produced.

Within the scope of this work, three main tasks have been accomplished:

Developing stereo vision depth perception and obstacle detection algorithms

Constructing and calibrating the stereo hardware

Developing a navigation and obstacle avoidance strategy

In this work, stereo vision depth perception for autonomous robot navigation is divided into three parts. The first is pre-processing, whose techniques are described in the third chapter. The second is stereo matching (i.e. stereo correspondence). The last is post-processing, in which different improvement techniques are applied to obtain a better disparity map. The interconnections between all of these functions are given in Chapter 3.

The pre-processing operation consists of three main functions: image grabbing, in which frames are grabbed from the live video; stereo rectification; and image cropping. All of these pre-processing operations are meant to improve stereo matching quality, which matters a great deal for good navigation.

To move safely in a known environment, an autonomous robot needs not only depth information but also localization, that is, the present position and orientation of the robot and the position, size and depth of each obstacle. The stereo vision perception system developed here gives a high-quality depth map using two calibrated cameras and a Sum of Squared Differences (SSD) matching algorithm. Depth information alone, however, is not sufficient for autonomous robot navigation. The obstacle itself, and its distance from the robot, must also be determined.
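To make the SSD matching step concrete, the following minimal Python sketch finds the disparity of one pixel on a single rectified scanline and converts it to depth with the standard relation Z = f·B/d for a rectified stereo pair. It is an illustration only, not the Matlab implementation from this project. The function names, the window size and the focal length and baseline values used in the example are assumptions.

```python
def ssd(a, b):
    """Sum of squared differences between two equal-length intensity windows."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def best_disparity(left_row, right_row, x, half_window, max_disparity):
    """Disparity at column x of the left row that minimises the SSD cost.

    Assumes rectified images, so the match for left column x lies at
    column x - d on the same row of the right image.
    """
    lo, hi = x - half_window, x + half_window + 1
    if lo < 0 or hi > len(left_row):
        raise ValueError("window does not fit in the row")
    patch = left_row[lo:hi]
    best_d, best_cost = 0, None
    for d in range(max_disparity + 1):
        if lo - d < 0:
            break  # the shifted window would leave the right image
        cost = ssd(patch, right_row[lo - d:hi - d])
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d

def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth in metres from disparity in pixels: Z = f * B / d."""
    if disparity_px <= 0:
        return float("inf")  # zero disparity corresponds to a point at infinity
    return focal_px * baseline_m / disparity_px
```

A full disparity map repeats this search for every pixel using two-dimensional windows. The one-row version is kept here only to show the cost function and the search over candidate disparities.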

To acquire images of its surroundings sequentially, the robot needs a stereo hardware system. For this purpose, stereo hardware was designed and constructed from two USB webcams with CCD sensors. The resulting stereo hardware is cheap compared with professional stereo hardware.

Firstly, the stereo hardware assembly must be designed and built carefully; otherwise even a slight misalignment between the two cameras will result in a low-quality stereo depth map.

Secondly, our system does not require a frame grabber, so it is a very inexpensive solution. However, obtaining a synchronized stereo image pair is very hard, and the time needed for image grabbing is longer than with professional solutions. In contrast to professional stereo image grabbing hardware, the stereo rig built here must be calibrated offline by the user.

In this work, the autonomous robot navigates safely, without collisions.

To accomplish this task we designed algorithms for autonomous robot navigation. With the help of the obstacle avoidance algorithm, the robot tries to avoid obstacles according to their position and size.

The developed stereo perception and navigation algorithms were then modified and improved so that they could be executed on a real autonomous robot.

After optimizing and combining all the algorithms and code, the behavior of the robot was observed. Using the developed algorithms, the robot is able to reach the destination point without colliding with any obstacles.

5.2. Future Work

Our project is a continuation of the Wanderbots project. Developing the stereo system for Wanderbots in Matlab was the basic step toward understanding how a stereo rig system works and what its limitations are.

Future work on the stereo vision part of the Wanderbots project would be to make the algorithm faster for real-time operation.

Further work will be to build the project in the OpenCV environment and implement it on an FPGA.

A second task would be to shift the project from a known environment to a partially unknown environment.

A third task, related to autonomous navigation, would be to create navigation maps using stereo vision.

Another important task would be to pick up detected objects and move them to a specified location autonomously.


A.1 Digital image

An image may be defined as a two-dimensional function f(x, y), where x and y are the coordinates along the two image axes. The amplitude of f at a given pair of coordinates (x, y) is called the intensity of the image at that point. A digital image is stored as an array in which each element corresponds to a single pixel. A pixel represents a single dot on a computer display.

A.1.1 Pixel Coordinates

Normally, pixel coordinates are used to define locations in an image. In this coordinate system, the image is treated as a grid of discrete elements, ordered from top to bottom and from left to right.

Figure A.1

In pixel coordinates, rows increase downward and columns increase to the right. Pixel coordinates are integer values and range between one and the length of the row or column.

Generally, four types of images are used in image processing:

True Color Images

Grayscale Images

Indexed Images

Binary Images

A.1.2 True Color Images

True color images are also called RGB images. They are defined by three color planes: red, green and blue.

The eight corners of the color cube (see Figure A.2) correspond to the three primary colors (red, green and blue), the three secondary colors (cyan, magenta and yellow), and black and white. All the neutral grays lie on the diagonal of the cube that joins the black and white vertices.

Figure A.2

Color image

A.1.3 Grayscale Image

A grayscale image is made up of pixels, each of which holds a single number corresponding to the gray level of the image at a particular location. These gray levels span the full range from black to white in a series of very fine steps, normally 256 different grays.

A.1.4 Indexed Image

Some color images are created using a limited palette of colors, normally 256 different colors. These images are referred to as indexed color images because the data for each pixel consists of a palette index indicating which of the colors in the palette applies to that pixel.
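The palette lookup can be illustrated with a minimal Python sketch that expands an indexed image into a true color image. The function name `indexed_to_rgb` and the three-entry palette are assumptions made for the example. Note that indices here start at 0, as is usual in Python, whereas Matlab palettes are indexed from 1.

```python
def indexed_to_rgb(index_image, palette):
    """Replace every palette index with the (R, G, B) color it refers to.

    index_image: list of rows, each a list of integer palette indices.
    palette: list of (R, G, B) tuples; 0-based indexing is assumed.
    """
    return [[palette[i] for i in row] for row in index_image]

# Hypothetical three-color palette: black, red, white.
palette = [(0, 0, 0), (255, 0, 0), (255, 255, 255)]
index_image = [[0, 1],
               [2, 1]]
rgb_image = indexed_to_rgb(index_image, palette)
```

Each pixel stores only a small index rather than three color values, which is why indexed images need less storage than true color images of the same size.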

A.1.5 Binary Image

In a binary image, each pixel assumes one of only two discrete values, 1 or 0, which are interpreted as white and black respectively.