Optical Flow Computation Methods
Optical flow, an approximation of local image motion, is defined as the distribution of apparent velocities of brightness patterns moving in a sequence of images. It provides important information about the motion and the spatial and structural arrangement of the objects viewed, as well as the rate of change of this arrangement, so its reliable, stable and fast computation is important. Optical flow has been used in the computer vision and artificial intelligence communities since the late 1970s, and its estimation methods have been proposed as a pre-processing step for many high-level vision algorithms. It is a convenient and useful representation of image motion, applicable in fields as diverse as robot vision and video coding.
1.2 Optical Flow
If a scene is being viewed by a digital camera, movement of objects within the scene, or of the camera itself, will generally result in changes to the intensity values in the resulting viewed image. Motion tracking algorithms attempt to compute the actual motion of objects in a scene, or of the camera itself, from the intensity changes in a sequence of images; motion itself is a purely geometric concept (Horn and Schunck, 1981). The computational analysis of motion uses a process known as optical flow. Horn and Schunck (1993, p.82) describe optical flow as:
"... a velocity field in the image that transforms one image into the next image in a sequence ..."
In other words, optical flow techniques computationally model the biological perception of image motion that we experience in everyday life. For example, to someone sitting in a moving vehicle looking out of the window, the objects they see appear to move in the direction opposite to the vehicle's travel. Optical flow can also be used to estimate the distance to objects within the scene: distant objects observed from the moving vehicle move so little within the scene that they appear to have almost no motion, whereas closer objects move much more; the closer the object, the greater its apparent motion (Burton and Radford, 1978). Mathematically, the relationship between the magnitude of the optical flow and the position of the object relative to the observer is well defined. The magnitude of the optical flow is directly proportional to the speed at which the vehicle is travelling; if the vehicle doubles its speed, the optical flow magnitude also doubles. Similarly, the magnitude is inversely proportional to the distance from the object to the observer; if the distance is halved, the observed magnitude doubles. Optical flow also depends on the angle between the direction of travel and the direction of the observed object: the magnitude is greatest when the object is at the extremity of the field of vision, and lowest when the object is viewed directly in the centre of the field of vision.
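The proportionalities described above can be sketched numerically. The following is a minimal illustration only, assuming a simple translating-observer model in which angular flow magnitude scales as speed over distance times the sine of the viewing angle; the helper name `flow_magnitude` is hypothetical, not taken from the cited literature.

```python
import math

def flow_magnitude(speed, distance, angle_rad):
    """Angular optical-flow magnitude for a point viewed by a translating
    observer (illustrative model): proportional to speed, inversely
    proportional to distance, scaled by sin(viewing angle)."""
    return (speed / distance) * math.sin(angle_rad)

# Doubling the vehicle speed doubles the flow magnitude.
base = flow_magnitude(10.0, 5.0, math.pi / 2)
assert math.isclose(flow_magnitude(20.0, 5.0, math.pi / 2), 2 * base)
# Halving the distance to the object also doubles it.
assert math.isclose(flow_magnitude(10.0, 2.5, math.pi / 2), 2 * base)
# An object dead ahead (angle 0) produces no flow.
assert flow_magnitude(10.0, 5.0, 0.0) == 0.0
```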
1.3 Optical flow methods
Usually image motion and optical flow are treated as continuous functions, as in Horn and Schunck's (1981) work, although in real-world images this is rarely the case. Optical flow techniques are based on the idea that, for most points in the image, neighbouring points have approximately the same brightness. The optical flow cannot be computed at a point in the image independently of neighbouring points without introducing additional constraints, because the velocity field at each image point has two components while the change in image brightness at a point in the image plane yields only one constraint.
Optical flow is calculated by sampling the image intensity at a point $(x, y)$ in the image plane at time $t$; this is denoted by $E(x, y, t)$. Horn and Schunck (1981) initially considered an element of the intensity pattern which has moved a distance of $\delta x$ and $\delta y$ in the $x$ and $y$ directions respectively in time $\delta t$. Horn and Schunck (1981) assumed that the image intensity of the element remains constant, which provides the following:

$$E(x, y, t) = E(x + \delta x, y + \delta y, t + \delta t)$$

Where $E$ is the image intensity, $x$ is the x coordinate of the pixel within the image plane, $y$ is the y coordinate of the pixel within the image plane, and $t$ is the time of the frame. $\delta x$ and $\delta y$ are the changes in the $x$ and $y$ directions respectively in time $\delta t$.
Expanding Equation [eq:Ch2_Horn_and_Schunck Optical Flow Equation] about the point $(x, y, t)$ in the image plane gives:

$$E(x, y, t) = E(x, y, t) + \delta x \frac{\partial E}{\partial x} + \delta y \frac{\partial E}{\partial y} + \delta t \frac{\partial E}{\partial t} + \epsilon$$
Where $\epsilon$ captures the second and higher order terms in $\delta x$, $\delta y$ and $\delta t$. Subtracting $E(x, y, t)$ from both sides of the equation and dividing through by $\delta t$ gives:

$$\frac{\delta x}{\delta t}\frac{\partial E}{\partial x} + \frac{\delta y}{\delta t}\frac{\partial E}{\partial y} + \frac{\partial E}{\partial t} + \mathcal{O}(\delta t) = 0$$
Thus, by considering what happens when the pattern moves and using the chain rule, a single linear optical flow equation with two unknowns ($u$ and $v$, the velocity components in the $x$ and $y$ directions respectively) is obtained. If $u = \frac{dx}{dt}$ and $v = \frac{dy}{dt}$, then in the limit $\delta t \to 0$ the two-dimensional optical flow constraint equation (Equation [eq:Ch2_H&S brought to equal zero]) becomes:

$$E_x u + E_y v + E_t = 0$$
Where $E_x$, $E_y$ and $E_t$ denote the partial derivatives of image intensity with respect to $x$, $y$ and $t$ respectively. Equation [eq:Ch2_optical flow constrain equation] relates the change over time in image intensity at a point to the motion of the intensity pattern in space. The optical flow cannot be determined at an image point independently of neighbouring points without introducing an additional constraint, since from the optical flow constraint equation [eq:Ch2_H&S brought to equal zero] it is only possible to measure the component of the flow that is in the direction of the intensity gradient.
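Since only the component of flow along the intensity gradient is measurable, that "normal flow" can be computed directly from image derivatives. The following is a minimal sketch assuming simple finite-difference derivative estimates; the function name `normal_flow` is illustrative.

```python
import numpy as np

def normal_flow(frame1, frame2, eps=1e-8):
    """Normal flow: the flow component along the intensity gradient,
    the only component the constraint equation determines."""
    Ey, Ex = np.gradient(frame1.astype(float))        # spatial derivatives
    Et = frame2.astype(float) - frame1.astype(float)  # temporal derivative
    grad_mag = np.sqrt(Ex**2 + Ey**2)
    return -Et / (grad_mag + eps)

# A horizontal intensity ramp translating one pixel right per frame:
x = np.tile(np.arange(16, dtype=float), (16, 1))
f1, f2 = x, x - 1.0                # E(x, t+1) = E(x-1, t)
nf = normal_flow(f1, f2)
# The normal-flow magnitude recovers the 1 pixel/frame motion.
assert np.allclose(nf[4:-4, 4:-4], 1.0, atol=1e-6)
```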
Optical flow methods are generally applied to a well-known set of synthetic sequences used within the area to test and compare the performance of new or modified algorithms. The largest comparison test to date was carried out by Barron et al. (1992), in which all the well-established algorithms were tested for accuracy, speed and error. Figure [fig:Ch2_Rotating-Sphere]a shows a typical synthetic frame of a rotating sphere, and Figure [fig:Ch2_Rotating-Sphere]c is the next frame of the sequence. Optical flow is computed by calculating velocities from the motion difference between the two frames. Figure [fig:Ch2_Rotating-Sphere]b shows the velocities calculated by the optical flow algorithm, in this case the well-cited Horn and Schunck algorithm.
Approximations to motion have been used in many different and diverse applications, such as the estimation of motion parameters from a moving visual sensor (Baratoff et al., 1998), the measurement of biological parameters in medical imagery (Aisbett et al., 1989), tracking aeroplanes at airports (Gandhi et al., 2000) and obstacle detection for a motor vehicle (Enkelmann et al., 1994). The two main types of motion which can be measured are the movement of objects in the scene and the movement of the camera itself (also called egomotion), with a stationary scene or with moving objects. Many authors from different backgrounds and with different objectives have pursued research into motion (Chivilo et al., 2004; Silva and Santos-Victor, 1997; Kuno et al., 2003; Tsukune, 1990; Negahdaripour and Madjidi, 2003; Yamamoto et al., 1995).
The main optical flow techniques used within the image processing community are correlation-type approaches, energy-based approaches and gradient-based approaches. Some are well established, while numerous other approaches are currently in development (Denman et al., 2005; Mignosi et al., 2005).
1.4 Correlation-type techniques
Correlation-type (also known as patch-matching) methods are perhaps the most straightforward and reliable approaches to computing optical flow fields in those spatial regions of the image that are "well-structured", having distinct features such as intensity corners or edges (Fogel, 1988). Essentially, correlation methods match these distinct features between two consecutive images: patches are found in spatial regions of the image where the autocorrelation approximates the delta function, and these patches are then sought in the next image. Motion generated by fast-moving objects, camera motion and image noise (a real-world scenario) may require the search to be carried out over a very large region (Camus, 1995). Since larger search areas are needed to find the match in the next image, the execution time of these techniques increases, and accuracy decreases because finding the matching patch may become impossible. Correlation techniques have in the past mainly been applied to the processing of stereo images, obtained from moving cameras, for features which are far apart (Bülthoff et al., 1989).
Accuracy can be improved with the use of tracking algorithms such as Kalman filters (Kalman, 1960) and recursive least-squares filters (Hayes, 1996); however, this further increases execution time, and therefore real-time implementation of high-accuracy correlation methods is impractical (Liu et al., 1998). Correlation methods assume that the time period between two sequential images is very small and that the textures and shapes in any one part of an image change minimally, i.e. that the optical flow is locally uniform. Generally these approaches are only reliably correct at a small percentage of locations in an image, as establishing the correct correspondences can be problematic when multiple potential matches exist within the search window of any pixel.
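The patch-matching idea can be sketched as follows, assuming a sum-of-absolute-differences (SAD) similarity measure and an exhaustive search window; the helper name `block_match` and the parameter choices are illustrative.

```python
import numpy as np

def block_match(f1, f2, y, x, patch=3, search=4):
    """Estimate the displacement of the patch centred at (y, x) in f1
    by exhaustive SAD search over a (2*search+1)^2 window in f2."""
    r = patch // 2
    ref = f1[y - r:y + r + 1, x - r:x + r + 1].astype(float)
    best, best_d = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cand = f2[y + dy - r:y + dy + r + 1,
                      x + dx - r:x + dx + r + 1].astype(float)
            sad = np.abs(ref - cand).sum()   # similarity: lower is better
            if best is None or sad < best:
                best, best_d = sad, (dy, dx)
    return best_d

# A textured image shifted by (2, 3) pixels between frames:
rng = np.random.default_rng(0)
f1 = rng.random((32, 32))
f2 = np.roll(f1, shift=(2, 3), axis=(0, 1))
assert block_match(f1, f2, 15, 15) == (2, 3)
```

Note the weakness discussed above: enlarging `search` grows the work quadratically, which is why large inter-frame motions make correlation methods slow.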
1.4.1 The Camus method
Some of the limitations of correlation-based techniques were overcome by Camus (1995; 1997), who realised that the simple relationship between velocity, distance and time could be exploited to improve the execution time of the correlation. In this method, rather than searching for a distinct feature over a wide range of possible pixel displacements, Camus limited the displacement to one pixel in each direction. He then calculated the correlation over several time-delayed images and used the time-delayed image with the best matching score to calculate the optical flow. The execution time is therefore dramatically decreased, since the search space is very small. Using this technique it was possible, at 4-5 frames per second (fps), to manoeuvre a robot travelling at 4-5 cm per second.
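The trade of space for time can be sketched as follows: the spatial search is restricted to one pixel in each direction, but the correlation is taken against several time-delayed frames, and velocity is recovered as shift divided by delay. This is a simplified illustration of the idea, not Camus's exact implementation.

```python
import numpy as np

def camus_velocity(frames, y, x, patch=3):
    """Camus-style estimate: search only +/-1 pixel displacements, but
    over several time delays; velocity = displacement / best delay."""
    r = patch // 2
    ref = frames[0][y - r:y + r + 1, x - r:x + r + 1].astype(float)
    best = None
    for delay in range(1, len(frames)):
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                cand = frames[delay][y + dy - r:y + dy + r + 1,
                                     x + dx - r:x + dx + r + 1].astype(float)
                sad = np.abs(ref - cand).sum()
                if best is None or sad < best[0]:
                    best = (sad, dy / delay, dx / delay)
    return best[1], best[2]

# A pattern drifting half a pixel per frame: after a delay of two
# frames it has moved exactly one full pixel.
rng = np.random.default_rng(1)
f0 = rng.random((32, 32))
frames = [f0, None, np.roll(f0, shift=(0, 1), axis=(0, 1))]
frames[1] = 0.5 * (f0 + frames[2])   # crude half-pixel intermediate
vy, vx = camus_velocity(frames, 15, 15)
assert (vy, vx) == (0.0, 0.5)
```

The sub-pixel velocity (0.5 pixels/frame) is found without ever searching beyond a one-pixel displacement, which is the source of the method's speed.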
1.4.2 Fast cross correlation
Finding similar features between two images is the most essential part of correlation-based optical flow techniques. Many correlation measures with varying performance and computational cost have been used to assess similarity when matching features between images; however, the most commonly utilised measure is cross correlation (Sun, 1999). In an attempt to improve the computation speed of the cross-correlation measure, Sun (2002) applied a box-filtering technique to the zero-mean normalised cross correlation (ZNCC), which improved its computational efficiency. Box filtering replaces each pixel of an image with the average pixel value calculated from the pixels within the box (McDonnell, 1981); this effectively produces a smaller image, reduced in proportion to the size of the box filter. Further to this, Sadykhov and Lamovsky (2006) improved the speed of the fast correlation method by using the parallel processing capability found in most modern computers.
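The two ingredients can be sketched separately: a running-sum (integral-image) box filter, and the ZNCC similarity measure. This is a simplified illustration under those assumptions; Sun's formulation applies box filtering inside the ZNCC computation itself, which is not reproduced here.

```python
import numpy as np

def box_filter(img, k):
    """Replace each pixel with the mean of the k*k box around it, using
    running (integral-image) sums so the cost is independent of k."""
    pad = np.pad(img.astype(float), ((1, 0), (1, 0)))
    ii = pad.cumsum(0).cumsum(1)      # integral image
    # Valid region only: output shrinks by k-1 in each dimension.
    return (ii[k:, k:] - ii[:-k, k:] - ii[k:, :-k] + ii[:-k, :-k]) / (k * k)

def zncc(a, b):
    """Zero-mean normalised cross correlation of two equal-size patches."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

rng = np.random.default_rng(2)
img = rng.random((8, 8))
# The box-filter output for the top-left 3x3 block matches a direct mean.
assert np.isclose(box_filter(img, 3)[0, 0], img[:3, :3].mean())
# ZNCC is invariant to gain and offset changes between the patches,
# which is why it is favoured for matching under lighting changes.
assert np.isclose(zncc(img, 2.0 * img + 5.0), 1.0)
```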
1.5 Energy-based techniques
Energy-based approaches extract velocities by applying band-pass filters that have been optimised for velocity and orientation (Barman et al., 1991). These techniques are also known as frequency-based, because the tuned filters are implemented in the Fourier domain (Fleet, 1992). The Fourier transform of a translating 2-D pattern $I(\mathbf{x}, t) = I(\mathbf{x} - \mathbf{v}t, 0)$ is given by (Barron et al., 1992):

$$\hat{I}(\mathbf{k}, \omega) = \hat{I}_0(\mathbf{k})\,\delta(\omega + \mathbf{v} \cdot \mathbf{k})$$

where $\hat{I}_0(\mathbf{k})$ is the Fourier transform of $I(\mathbf{x}, 0)$, $\delta$ is a Dirac delta function, $\omega$ is the temporal frequency, and $\mathbf{k}$ is the spatial frequency. This implies that all power associated with the translation is mapped to a plane, $\omega + \mathbf{v} \cdot \mathbf{k} = 0$, that passes through the origin of frequency space.
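This plane property can be checked numerically. The sketch below uses a 1-D translating pattern (so the "plane" becomes the line $\omega + vk = 0$) and verifies that essentially all FFT power lies on that line; it is an illustration of the property, not an optical flow method.

```python
import numpy as np

N = 32
rng = np.random.default_rng(3)
s = rng.random(N)
v = 1  # pixels per frame
# Space-time image f[t, x] of a 1-D pattern translating at v pixels/frame.
f = np.stack([np.roll(s, v * t) for t in range(N)])

F = np.fft.fft2(f)                       # axis 0: temporal freq w, axis 1: spatial freq k
power = np.abs(F) ** 2
w, k = np.meshgrid(np.fft.fftfreq(N), np.fft.fftfreq(N), indexing="ij")
residue = (w + v * k) % 1.0              # zero (mod 1) exactly on the line w + v*k = 0
on_line = np.isclose(residue, 0.0) | np.isclose(residue, 1.0)

# Exactly one temporal frequency per spatial frequency lies on the line,
# and all signal power concentrates there (off-line power is FFT noise).
assert on_line.sum() == N
assert power[~on_line].max() < 1e-12 * power.max()
```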
A selection of Gabor filters with different spatial resolutions can be used to extract velocity information from a sequence of images. This was demonstrated by Heeger (1988), who used a group of twelve Gabor filters to extract velocity information from a selection of image sequences. Using Gabor filters, Fleet (1992) was able to obtain a band-pass representation localised in both space and frequency, and then applied a least-squares fit to the distribution of frequencies so that the optical flow could be calculated.
1.6 Gradient techniques
Gradient-based approaches are among the most common methods used to solve the optical flow problem. These methods differentiate the image intensity values to compute the optical flow, and have frequently been used to compute velocities from spatio-temporal derivatives of image intensity, or of filtered versions of the image (Horn and Schunck, 1981; Fennema and Thompson, 1979; Nagel, 1989). Most gradient solutions of the optical flow problem assume that the intensity values of image objects remain constant while in motion for a short period of time, as assumed by the optical flow constraint equation ([eq:Ch2_optical flow constrain equation]).
1.6.1 Horn and Schunck
Horn and Schunck (1981) presented a gradient-based approach in the form of an iterative algorithm for finding the optical flow pattern, in which the derivatives of image intensity are approximated using finite differences. It is one of the most powerful algorithms for optical flow computation, making the simple assumption that image brightness itself does not change during motion. The method requires an additional constraint, which is formed by minimising the square of the magnitude of the gradient of the optical flow velocity.
The optical flow calculation was treated by Horn and Schunck (1981) as a minimisation problem, combining the error in the intensity values of equation [eq:Ch2_optical flow constrain equation] with a measure of departure from smoothness in the velocity field, given by:

$$\epsilon_c^2 = \left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial u}{\partial y}\right)^2 + \left(\frac{\partial v}{\partial x}\right)^2 + \left(\frac{\partial v}{\partial y}\right)^2$$
Therefore a combination of equation [eq:Ch2_optical flow constrain equation] and equation [eq:Velocity field smoothness distance] solves the optical flow problem by finding values of $u$ and $v$ that minimise:

$$\epsilon^2 = \iint \left(\alpha^2 \epsilon_c^2 + \epsilon_b^2\right)\, dx\, dy$$

where $\epsilon_b = E_x u + E_y v + E_t$ is the error in the optical flow constraint equation and $\epsilon_c$ is the smoothness measure of equation [eq:Velocity field smoothness distance].
Where $\alpha$ is a constant which determines how strongly equation [eq:the combination equation] depends on the smoothness function, equation [eq:Velocity field smoothness distance]. Horn and Schunck (1981) state that the smoothing factor $\alpha^2$ plays a significant role only in sections of the image where the intensity gradient is small, which prevents noisy derivative calculations from causing spurious changes to the optical flow. Horn and Schunck (1981) suggest that a good estimate for $\alpha$ is approximately the same as the expected noise in the optical flow velocities.
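Minimising this functional leads to Horn and Schunck's iterative update, in which the flow at each point is corrected away from local neighbourhood averages (`ubar`, `vbar`). The following is a simplified sketch: the derivative estimates and boundary handling (periodic wrap-around here) differ from the original paper, and a plain four-neighbour average replaces their weighted kernel.

```python
import numpy as np

def horn_schunck(f1, f2, alpha=1.0, iters=200):
    """Simplified Horn-Schunck iteration: repeatedly correct the flow
    from neighbourhood averages so it satisfies the constraint equation
    while remaining smooth."""
    f1, f2 = f1.astype(float), f2.astype(float)
    Ey, Ex = np.gradient(f1)           # spatial derivatives
    Et = f2 - f1                       # temporal derivative
    u = np.zeros_like(f1)
    v = np.zeros_like(f1)
    avg = lambda a: (np.roll(a, 1, 0) + np.roll(a, -1, 0) +
                     np.roll(a, 1, 1) + np.roll(a, -1, 1)) / 4.0
    for _ in range(iters):
        ubar, vbar = avg(u), avg(v)
        t = (Ex * ubar + Ey * vbar + Et) / (alpha**2 + Ex**2 + Ey**2)
        u = ubar - Ex * t
        v = vbar - Ey * t
    return u, v

# A smooth ramp translating 1 pixel/frame in x: E(x, y, t) = x - t.
x = np.tile(np.arange(32, dtype=float), (32, 1))
u, v = horn_schunck(x, x - 1.0)
assert np.allclose(u, 1.0, atol=1e-6)   # recovered horizontal flow
assert np.allclose(v, 0.0)              # no vertical flow
```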
1.6.1.1 Aperture Problem
The formulation described in [sub:Horn-and-Schunck] provides only a 1-D optical flow solution, since only one constraint is available at any position within the image; this is known as the aperture problem. The problem arises because the optical flow constraint equation cannot determine both components of the velocity vector: only motion in the direction of the local gradient is detectable. In other words, "mathematically, a line with no identifiable features on it does not possess a well-defined direction of motion. However, perceptually such a line is always seen to move in the direction perpendicular to its orientation" (Wuerger et al., 1996, p. 1). To provide a fully constrained estimate of the optical flow, Horn and Schunck (1981) assumed that the flow varies smoothly everywhere in the image, so that the optical flow field can achieve a global minimum solution, which overcomes this problem.
1.6.2 Lucas and Kanade
Lucas and Kanade (1981) placed a further constraint, assuming that the optical flow varies smoothly, so that neighbouring points on an object have very nearly the same velocity. Within a small local neighbourhood $\Omega$ the flow is treated as constant, and a least-squares estimator is used to minimise the squared error in the constraint equation over the neighbourhood, expressed as:

$$\epsilon^2 = \sum_{\mathbf{x} \in \Omega} W^2(\mathbf{x})\left[E_x u + E_y v + E_t\right]^2$$

where $W(\mathbf{x})$ is a window function that gives more weight to constraints near the centre of the neighbourhood.
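The window-based least-squares solution can be sketched by stacking the constraint equation over a small neighbourhood and solving for a single $(u, v)$. This is an unweighted illustration ($W \equiv 1$); Lucas and Kanade's formulation includes the weighting window, and the function name is illustrative.

```python
import numpy as np

def lucas_kanade(f1, f2, y, x, win=7):
    """Simplified Lucas-Kanade: assume one velocity (u, v) across a small
    window and solve the stacked constraint equations by least squares."""
    f1, f2 = f1.astype(float), f2.astype(float)
    Ey, Ex = np.gradient(f1)           # spatial derivatives
    Et = f2 - f1                       # temporal derivative
    r = win // 2
    sl = np.s_[y - r:y + r + 1, x - r:x + r + 1]
    A = np.stack([Ex[sl].ravel(), Ey[sl].ravel()], axis=1)
    b = -Et[sl].ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v

# A smooth quadratic pattern translated by (u, v) = (1, 0.5) pixels:
Y, X = np.mgrid[0:32, 0:32].astype(float)
f1 = (X - 16) ** 2 + (Y - 16) ** 2
f2 = (X - 1 - 16) ** 2 + (Y - 0.5 - 16) ** 2
u, v = lucas_kanade(f1, f2, 16, 16)
assert np.isclose(u, 1.0, atol=1e-6) and np.isclose(v, 0.5, atol=1e-6)
```

Note that the least-squares system is only well conditioned when the gradient directions within the window vary, which is the aperture problem reappearing in matrix form.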
When comparing the results of correlation-type and differential approaches, Barron et al. (1992) reported that the results produced by correlation-type approaches were generally poorer than those from differential approaches. They found that the error depended on how close the motion was to an integer number of pixels per frame; these techniques therefore had difficulty in measuring sub-pixel motions, and seemed more suitable for measuring larger motions and aliased image sequences.