In this era, digital media technology has become part of our daily lives, and its applications are spreading across many industries. For instance, it is used in medicine for X-ray generators and in geography for landmark identification. In the computing domain, its applications include fingerprint recognition, special visual and audio effects, facial expression synthesis, animation, and object tracking and recognition. All of these applications aim to improve human-machine interaction. However, some tasks remain difficult, such as face recognition, which requires the computer to understand human emotion. Gavrila (1999, p82) claimed that, for humans and computers to interact without input devices, we need to extract information about who the people are and what they are doing.
Our objective is to use the hand, captured by a video camera, as an input device in place of the mouse. There are two main problems. First, gestures are hard for the computer to recognize because the shape of the hand is complicated. For example, the camera captures the scene in front of it, and the hand in that scene is detected and treated as a mouse. If the user plays a game with different hand gestures that our application must recognize, the response time must be short; otherwise the user will not accept it. However, Mas et al. (2005, p97) have argued that the delay grows as the complexity increases. Most known hand tracking and recognition systems cannot meet this requirement. For instance, tracking systems based on particle filtering algorithms can achieve real-time tracking, but they are limited by their computational demands. Second, the hand must also be usable to play a simple game or to paint a pattern on a drawing board. We propose a solution to both problems. In the recognition process, we use object tracking to locate the hand. For mouse control, we use the color difference between the foreground and the background to distinguish hand directives.
The first project, by Mas et al. (2005, p97-100), concerns hand tracking and mouse input. It demonstrates a hand tracking and gesture recognition system that is divided into two parts. The first part is hand segmentation, whose criterion is based on a probabilistic model of the distribution of skin-color pixels. The pixels inside a learning square are used to train the model. The selected pixels are then transformed from RGB to HSL, and the chromatic color information is taken. Two problems appeared: errors occurred because human skin color is close to red, and color is unstable. These problems can be solved by applying a connected components algorithm to the probability image. The second part is gesture recognition. They used a tracking process and an association method, both standard computer vision techniques, to build five hand gesture models. These models allow the system to follow the defined gestures, which makes hand control possible.
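The RGB to HSL step mentioned above can be sketched with the standard conversion formula. This is only an illustration of the color-space change, not the implementation used by the cited authors; the names `Hsl` and `rgbToHsl` are chosen here for the example.

```cpp
#include <algorithm>
#include <cmath>

// Hue in degrees [0, 360); saturation and lightness in [0, 1].
struct Hsl {
    double h;
    double s;
    double l;
};

// Convert one 8-bit RGB pixel to HSL using the standard formula.
// (Illustrative helper; the name and layout are assumptions.)
Hsl rgbToHsl(int r8, int g8, int b8) {
    const double r = r8 / 255.0;
    const double g = g8 / 255.0;
    const double b = b8 / 255.0;
    const double maxC = std::max({r, g, b});
    const double minC = std::min({r, g, b});
    const double delta = maxC - minC;

    Hsl out{0.0, 0.0, (maxC + minC) / 2.0};
    if (delta == 0.0) {
        return out;  // grey pixel: hue is undefined, reported as 0
    }
    out.s = delta / (1.0 - std::fabs(2.0 * out.l - 1.0));

    if (maxC == r) {
        out.h = 60.0 * std::fmod((g - b) / delta, 6.0);
    } else if (maxC == g) {
        out.h = 60.0 * ((b - r) / delta + 2.0);
    } else {
        out.h = 60.0 * ((r - g) / delta + 4.0);
    }
    if (out.h < 0.0) {
        out.h += 360.0;  // wrap negative hues into [0, 360)
    }
    return out;
}
```

Working with hue and saturation rather than raw RGB separates the chromatic information from brightness, which is why a skin-color model is usually built on those channels.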
Arora et al. (2008, p504-505) developed a system that implements panel detection, hand detection, and fingertip detection and tracking. In the panel detection process, the camera captured images and the system converted them into pixels. A number of captured images were analyzed in order to check their status. For hand detection, once the hand was placed on top of the panel, the system began a second analysis by comparing the hand image area with the white region.
The third step is fingertip detection and tracking. They set a pixel value for the fingertip area, and the fingertip had a minimum value that is associated with the tip position. The coordinates of the tip position are then calculated and updated for the next image.
Nikel et al. (2004, p2) presented a comprehensive 3D tracking system for pointing gesture recognition, in which they combined color and range information in order to improve system performance. They used a stereo camera connected to a computer to capture the video, and the computer processed the incoming information. In the next step, they applied a skin-color distribution model to locate the head and hands.
This application uses the hand as an input device that performs mouse functions by acquiring images from a web camera, which makes the interaction between human and computer more natural. Because the color of the hand is hard to distinguish from the background color, we use a glove with a unique color in place of the bare hand. The glove approach locates the hand position and state more efficiently. In addition, using the hand-driven mouse to play a maze game or paint a picture on a board lets us check the precision of hand tracking and hand directive detection. This application provides users with the basic mouse functions they need.
For this application, a low-cost web camera placed in front of the computer is the only necessary equipment. The application acquires images of the hand and the background environment from the web camera. The user requirements of the application are as follows:
1. Real-time performance is provided
This application tries to let hand motions interact with the computer in a natural way. The player can use the hand as a mouse to play a video game or paint a picture. Therefore the system should run in real time to avoid long response times. A long delay between the user's action and the computer's response is not acceptable.
Acceptance Criteria:
Hand movement at an average, regular speed is supported. The application guarantees that the player can use the hand to control a ball in a maze game or paint a simple picture. During the process, the interaction is not interrupted or lost.
2. Hand recognition
In the starting phase, an image is acquired from the web camera. The application should separate the hand from the background. To facilitate this, the system relies on gloves of a specified color or on markers on the hand. Moreover, the hand should be recognized in different background environments, for example when the lighting conditions or the objects in view change.
Acceptance Criteria:
The hand can be recognized in the same background environment under different lighting conditions. Darkness is not supported. In addition, the recognition process should succeed against a colored background, such as a lab. The specified color of the gloves or markers can be red or green, as chosen by the user.
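One simple way to recognize a red or green glove is to test how strongly one color channel dominates the other two. The sketch below is a minimal illustration in plain C++ rather than OpenCV; the type names, the function names and the `margin` value of 60 are assumptions made for the example and would need tuning on real camera images.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Rgb {
    std::uint8_t r;
    std::uint8_t g;
    std::uint8_t b;
};

enum class GloveColor { Red, Green };

// A pixel counts as "glove" when the chosen channel exceeds both other
// channels by at least `margin` (hypothetical default, not a tuned value).
bool isGlovePixel(const Rgb& p, GloveColor glove, int margin = 60) {
    const int r = p.r;
    const int g = p.g;
    const int b = p.b;
    if (glove == GloveColor::Red) {
        return r - g > margin && r - b > margin;
    }
    return g - r > margin && g - b > margin;
}

// Build a binary mask (1 = glove, 0 = background) for a whole image
// stored as a flat array of pixels.
std::vector<std::uint8_t> gloveMask(const std::vector<Rgb>& image,
                                    GloveColor glove) {
    std::vector<std::uint8_t> mask(image.size(), 0);
    for (std::size_t i = 0; i < image.size(); ++i) {
        if (isGlovePixel(image[i], glove)) {
            mask[i] = 1;
        }
    }
    return mask;
}
```

Comparing channel differences instead of absolute values makes the test somewhat less sensitive to overall brightness, which matters for the requirement that recognition works under different lighting conditions.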
3. Hand tracking
Once the application recognizes the hand or the glove, a square appears on the sub-screen to mark the hand. The square should follow the location of the glove at all times.
Acceptance Criteria:
The square should mark the location of the glove as soon as the glove is recognized. During tracking, the square must not disappear and must always point to the location of the glove.
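The tracking square can be derived from a binary glove mask by taking the smallest rectangle that contains every glove pixel. The following is a minimal sketch under the assumption that the mask is a flat, row-major array with 1 for glove pixels; `Box` and `boundingBox` are illustrative names.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Box {
    bool found;  // false when the mask contains no glove pixel
    int left;
    int top;
    int right;
    int bottom;
};

// Smallest axis-aligned rectangle covering all non-zero mask pixels.
// The mask is assumed to hold width * height entries in row-major order.
Box boundingBox(const std::vector<std::uint8_t>& mask, int width, int height) {
    Box box{false, 0, 0, 0, 0};
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::size_t i = static_cast<std::size_t>(y) *
                                      static_cast<std::size_t>(width) +
                                  static_cast<std::size_t>(x);
            if (mask[i] == 0) {
                continue;
            }
            if (!box.found) {
                box = Box{true, x, y, x, y};
            } else {
                box.left = std::min(box.left, x);
                box.top = std::min(box.top, y);
                box.right = std::max(box.right, x);
                box.bottom = std::max(box.bottom, y);
            }
        }
    }
    return box;
}
```

When `found` is false the glove has been lost in that frame, which is exactly the situation the acceptance criteria rule out, so a real system would need a recovery step at that point.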
4. Mouse cursor movement
The image acquired from the web camera is displayed on a sub-screen in the bottom right corner of the computer screen. The application should control the mouse movement by tracking the path of the glove.
Acceptance Criteria:
On the screen, the movement of the mouse cursor can be controlled by the user's glove. The mouse and the glove follow the same path in real time.
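To make the cursor follow the glove, the glove position in the camera image has to be scaled to screen coordinates. The sketch below shows one possible linear mapping; the `mirror` option is an assumption added here, because a camera facing the user sees left and right swapped.

```cpp
#include <algorithm>

struct Point {
    int x;
    int y;
};

// Map a camera pixel to a screen pixel by linear scaling.
// Requires camW > 1 and camH > 1. Out-of-range inputs are clamped.
// `mirror` flips the horizontal axis (an illustrative design choice).
Point cameraToScreen(Point cam, int camW, int camH,
                     int screenW, int screenH, bool mirror) {
    const int cx = std::min(std::max(cam.x, 0), camW - 1);
    const int cy = std::min(std::max(cam.y, 0), camH - 1);
    const int mx = mirror ? (camW - 1 - cx) : cx;

    Point out;
    out.x = mx * (screenW - 1) / (camW - 1);
    out.y = cy * (screenH - 1) / (camH - 1);
    return out;
}
```

For a 640 x 480 camera and a 1920 x 1080 screen, the camera corner (639, 479) maps to the screen corner (1919, 1079) when mirroring is off.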
5. Hand gesture operator
The application should recognize hand gestures in order to distinguish hand directives. Basically, three operators need to be implemented. One is the right-button operation: the system treats the fist gesture as a right mouse click. Another is the left-click operation: the system recognizes a 90-degree rotation of the hand as a left click, and this function can be used to paint a line.
Acceptance Criteria:
These two hand gesture directives accomplish the basic mouse functions. Hand rotation and making a fist need to be performed in turn during testing. After this phase is implemented, the user can use the application to paint a simple picture on a blank board.
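The mapping from gestures to mouse events can be sketched as a small classifier. Everything about how the gestures are measured is an assumption here: the sketch supposes that an earlier stage supplies the glove area relative to an open hand and the rotation angle of the hand, and the two thresholds are hypothetical values. Treating "no gesture" as plain cursor movement is also an assumption of the example.

```cpp
#include <cmath>

enum class MouseAction { Move, LeftClick, RightClick };

// Hypothetical per-frame measurements produced by the tracking stage.
struct HandObservation {
    double areaRatio;  // glove area divided by the open-hand area
    double angleDeg;   // rotation of the hand from its upright pose
};

// Fist -> right click, 90-degree rotation -> left click, otherwise move.
MouseAction classifyGesture(const HandObservation& o) {
    const double kFistAreaRatio = 0.6;       // assumed: a fist covers less area
    const double kRotationTolerance = 20.0;  // assumed tolerance in degrees

    if (o.areaRatio < kFistAreaRatio) {
        return MouseAction::RightClick;
    }
    if (std::fabs(std::fabs(o.angleDeg) - 90.0) <= kRotationTolerance) {
        return MouseAction::LeftClick;
    }
    return MouseAction::Move;
}
```

Checking the fist first means that a rotated fist is reported as a right click; a different priority could be chosen if testing shows that the other order feels more natural.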
6. Maze game is provided
Besides drawing pictures, the system is expected to offer a simple maze game for users. The maze game should be operated with the hand-driven mouse. The maze puzzle has a starting point and an ending point; a ball must avoid several blocks in the maze and travel from the start to the end.
Acceptance Criteria:
The player can control the ball successfully in the maze environment. The path of the ball should be the same as the path of the hand. The user can easily guide the ball through the game. During the process, the ball must not go out of control or lose its connection with the user's hand.
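The maze logic itself is independent of the graphics library. A minimal sketch, assuming a grid maze in which '#' marks a block and 'E' marks the ending point (both conventions are chosen for this example), could look like this:

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct Cell {
    int row;
    int col;
};

// Grid maze: '#' is a block, 'E' is the ending point, anything else is free.
class Maze {
public:
    explicit Maze(std::vector<std::string> rows) : rows_(std::move(rows)) {}

    // A cell is free when it lies inside the grid and is not a block.
    bool isFree(Cell c) const {
        if (c.row < 0 || c.row >= static_cast<int>(rows_.size())) {
            return false;
        }
        const std::string& line = rows_[static_cast<std::size_t>(c.row)];
        if (c.col < 0 || c.col >= static_cast<int>(line.size())) {
            return false;
        }
        return line[static_cast<std::size_t>(c.col)] != '#';
    }

    bool isGoal(Cell c) const {
        return isFree(c) &&
               rows_[static_cast<std::size_t>(c.row)]
                    [static_cast<std::size_t>(c.col)] == 'E';
    }

private:
    std::vector<std::string> rows_;
};

// Try to move the ball by one step; a blocked move leaves it in place.
Cell moveBall(const Maze& maze, Cell ball, int dRow, int dCol) {
    const Cell next{ball.row + dRow, ball.col + dCol};
    return maze.isFree(next) ? next : ball;
}
```

In the full application the step direction would come from the tracked hand movement, so the ball never enters a block even when the hand path crosses one.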
Our system tracks the hand and recognizes gestures, and we use the hand to control the computer in place of the mouse. To begin with, we place a low-end camera between the computer and the user. It captures the video and sends it to the system, which processes the incoming images with the model that we build. The tracking model rests on two main supports. First, we use the frame difference method to detect the target. If a match is found, we use CamShift to track the object. Second, we wear a single-color glove so that the foreground color differs from the background color. The glove is mainly for gesture recognition rather than tracking, because hand gestures change too quickly and are too complex for the system to respond in time otherwise. Finally, we use the virtual mouse to play a 3D game built with OpenGL. Across these three main stages, we will use C++, OpenCV and OpenGL.
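The frame difference method compares two consecutive frames and marks the pixels whose intensity changed by more than a threshold. The sketch below works on greyscale frames stored as flat arrays and uses only the standard library; the function names and the threshold are illustrative, and the CamShift tracking step is not covered.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Mark pixels whose grey value changed by more than `threshold`
// between two consecutive frames (1 = moving, 0 = static).
// Frames of different sizes yield an all-zero result.
std::vector<std::uint8_t> frameDifference(
    const std::vector<std::uint8_t>& previous,
    const std::vector<std::uint8_t>& current,
    int threshold) {
    std::vector<std::uint8_t> motion(current.size(), 0);
    if (previous.size() != current.size()) {
        return motion;
    }
    for (std::size_t i = 0; i < current.size(); ++i) {
        const int diff = std::abs(static_cast<int>(current[i]) -
                                  static_cast<int>(previous[i]));
        if (diff > threshold) {
            motion[i] = 1;
        }
    }
    return motion;
}

// Number of moving pixels, usable as a cheap "is there a target?" test.
std::size_t countMoving(const std::vector<std::uint8_t>& motion) {
    std::size_t count = 0;
    for (std::uint8_t m : motion) {
        count += m;
    }
    return count;
}
```

The region with the most moving pixels is a natural initial search window to hand over to a tracker such as CamShift.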
C++ is a widely used programming language. It is statically typed and supports multiple programming paradigms, including procedural programming, data abstraction, object-oriented programming, generic programming and other programming styles. Its modular features can handle many tasks, which helps improve efficiency.
OpenCV is an open source computer vision library that runs on Linux, Windows and Mac OS. It consists of a series of C functions and a small number of C++ classes, which keeps it efficient. It implements many common algorithms in image processing and computer vision. OpenCV improves execution time through optimized C code.
OpenGL (Open Graphics Library) defines a cross-language, cross-platform programming interface for both three-dimensional and two-dimensional graphics. It is a professional graphics programming interface, and its library is easy to call: it provides basic drawing functions for points, lines and polygons, as well as complex three-dimensional objects, curves and surfaces. It also supports special image effects such as blending, anti-aliasing and fog. Furthermore, an important feature is double buffering, which uses a front buffer and a back buffer.
Work activities are divided into four main parts: hand recognition and tracking, hand directives, the maze game, and system testing. For each part, we list the tasks that will be used to build the whole program. The project will be developed from week 3 to week 12.
1. Hand recognition and tracking
Access video using OpenCV
Learn to access video and the web camera using OpenCV, and become familiar with the Visual Studio environment.
Time: week 4
Recognize the color of the glove
Find the RGB values of the glove color from the web camera. This accomplishes requirement 2.
Time: week 5
Track the location of the hand or glove, obtain the correct paths and display them on the screen. This accomplishes requirements 3 and 4.
Time: week 6 to week 7
2. Hand directives
Accomplish requirement 5, which is to recognize hand gestures and complete the basic mouse functions. In this phase, the application can be used to draw a picture on a board with hand directives.
Time: week 8 to week 9
3. Maze game
Implement a maze game using the OpenGL or OpenCV library and try to accomplish requirement 6.
Time: week 9 to week 10
4. Testing phase and preparation of the final presentation
Test the system in different environments and with different users. Testing against the requirements aims to make sure the system performs successfully. Prepare the final presentation and report in advance.
Time: week 11 to week 12
We have presented an interaction system that uses a camera to detect and track an object. In addition, the hand can be used as a virtual mouse to operate the system. We proposed a method that exploits the color difference between the foreground and the background to make the object easier to capture. Our system still has disadvantages: for instance, the system response can lag behind the moving object, and users cannot play complex games. We are going to improve these aspects of our system. In conclusion, our system aims to provide high-performance human-machine interaction.