Developing Humanoid Robot Animations in Motion Capture

Disclaimer: This dissertation has been submitted by a student.

Introduction (Chapter 1)

This research describes a framework in which human movements recorded with motion capture are animated, setting a direction for studying digital character models and their locomotion in a virtual environment. It also offers a feasible approach to understanding walking gait patterns in that environment. The framework further leads to the study of issues related to safety engineering.


The analysis of human locomotion and its research area have changed since they began with the cave drawings of the Paleolithic Era. Early studies of human locomotion were driven by the need to survive by moving efficiently from place to place, escaping predators and hunting for food (Alexander, 2000). Modern-day human locomotion studies have contributed to a wide range of applications, including military use, sport, ergonomics, and health care. According to (Hall, 1991), the term biomechanics became accepted during the early 1970s as the internationally recognized descriptor of the field concerned with the mechanical study of living organisms. In sport, human locomotion studies aim to extend the limits of an athlete when even the smallest improvement in performance is pursued eagerly (J. A., 1984). However, the development of human locomotion studies remains reliant on the improvement of new tools for observation. According to (Alexander, 2000), instrumentation and computer technology have recently provided opportunities to improve the study of human locomotion.

(J. A., 1984) illustrates frequent techniques for measuring motion and mentions the co-ordinate analyzer (motion capture device) as a major advance in movement study. According to (Furniss, 2000), motion capture, or mocap, was initially created for military use before it was adapted for the entertainment industry in the mid 1980s. (Dyer, 1995) defines motion capture as measuring an object's location and direction in physical space, then recording that sequence in a computer-usable form. According to (Micheal, 2003) and (Suddha Basu, 2005), motion capture is the fastest way to produce rich, realistic animation data. (James F O'Brien, 2000) illustrates that mocap can also be useful in several additional fields such as music, fine art, dance, sign language, motion recognition, rehabilitation and medicine, biomechanics, special effects for live-action films and computer animation of all types, as well as in defense and athletic analysis and training. There are basically three types of motion capture systems available: mechanical, electromagnetic and optical. All three systems go through the same basic process shown in the figure. The first step is the input, where the movement of live actors, either human or animal, is recorded using various methods depending on the type of motion capture system used. Next, the information is processed to identify the corresponding markers of the live actor and then transferred into virtual space using specialized computer software. Finally, the output is where the information is translated into 3D trajectory data that contains translation and rotation information, known as a motion capture file.


Producing realistic character animation remains one of the great challenges in computer graphics. At present, there are three methods by which this animation can be produced. The first is key framing, in which the animator specifies important key poses for the character at specific frames. The second uses physical simulation to drive the character's motion; its results can be good, but the lack of control makes it difficult to use, it is costly, and it has not been very successful with characters. The last, motion capture, has been widely used to animate characters. It uses sensors placed on a person to collect data that describes their motion while they perform the desired movement. As the technology for motion capture has improved and its cost has decreased, interest in using this approach for character animation has increased. The main challenge that an animator is confronted with is to generate character animation with a realistic appearance. Human pose reconstruction for humanoid robots is a popular research area, since it can be used in various applications to understand the emerging field of robotics and other digital animation fields; currently, most of the methods work only in controlled environments. Motion capture and motion synthesis are expensive and time-consuming tasks for articulated figures such as humans. Human pose estimation based on computer vision principles is an inexpensive and widely applicable approach. In the computer vision literature, the term human motion capture is usually used in connection with large-scale body analysis that ignores the fingers, hands and facial muscles, which is the case in this research. Motion capture is fairly involved, because it must calculate a 3D skeletal representation of the motion of sufficient quality to be useful for animation. Animation generation is an application of motion capture where the required accuracy is not as high as in some other applications, such as medicine (Ferrier, June 2002).

Problem Context

1) Even though motion capture is applied in many fields because it creates physically accurate motions, it has a few significant weaknesses. According to (Lee, MCML: Mocap, 2004), first, it has low flexibility; second, the captured data can have different formats depending on the motion capture system employed; and third, commercially available motion capture libraries are difficult to use as they often include hundreds of examples. (Shih-Pin Chao, 2003) states that motion capture sessions are not only costly but also labor intensive, which makes the usability of existing motion data all the more important.

2) In the animation and gaming industries, it is common for motion information to be captured for a particular project or stored as mocap data. This data can be used either as a whole motion sequence or as part of a motion synthesis. In sport science, mocap data is used for analyzing and perfecting the sequencing mechanics of premier athletes, as well as for monitoring the recovery progress of physical therapies. This simply means that the vast collection of motion capture data models is split across different, limited sets. Currently, motion data are often stored in small clips to allow easy hand sequencing for describing behavior (Jernej Barbic, 2004) (Tanco L. M., 2000). However, according to (Lee, MCML: Mocap, 2004) (Morales, 2001) (Tanco L. M., 2000), motion capture data models lack interoperability. This calls for tools that synchronize these datasets (Feng Liu, 2003).

3) In light of the recent surge of interest in virtual environment applications, much research has been devoted to solving the problems of manipulating humans in 3-D simulated worlds, and especially to human locomotion. However, most of the animation approaches based on these studies generate only limited motion and lack locomotion capabilities such as walking, so their application in virtual environments is inevitably limited.

Project Objective

The objective of this project is to create a framework, based on motion capture data techniques, that sets the direction for studying 3D articulated figures and humanoid robot locomotion in a virtual environment by understanding human walking gait patterns. This framework also leads to the study of issues related to safety engineering.

The other objective of this project is to capture, process, and examine the feasibility of locomotion in a virtual environment and to analyze different tasks in that environment.

The system overview diagram describes all the steps. It starts with the mocap suit worn by the subject; data from the subject's random movements is transferred to the computer, where motion analysis is performed. After motion analysis, the motion is retargeted and, together with an avatar model, used to create the final output scene. Then, with a software development kit, a program is created to handle the different information in that scene.

Project Scope

The scope is to capture human motion with motion capture technology, use the captured data to animate different motions, and then refine the animated data. Using the technology called Motion Builder, we can simulate and study the effects of walking and falling in the virtual environment. After the captured data is mapped onto the animated character, called the digital humanoid robot, an application is built to study the nature of the animated scene; this is called an enhanced framework. The other technology used is Mathematica, which serves to study the factors in mathematical terms, because Motion Builder is a simulation technology whereas Mathematica is a dynamic solver engine. Under some assumptions, this leads towards the study of a digital humanoid robot walking and falling in virtual environments.


This part outlines the general structure of the thesis, with a short explanation of each chapter:

Chapter 1: deals with the introduction, scope and objectives, together with the problem context.

Chapter 2: introduces human motion capture techniques and different work on the animation of human walking in virtual environments, and gives a summary of the related work in this area.

Chapter 3: deals with the system structure, describing the hardware and software technologies involved in the research, and illustrates the framework model, which helps exploit the behavior of the humanoid and sets up the framework.

Chapter 4: describes the framework analysis based on the study of articulated animation models in a virtual environment and walking gait patterns with the Bezier curve algorithm.

Chapter 5: presents the techniques extracted from the different software packages and how they are used to set up the whole framework, and evaluates the results, which are categorized in three phases: the application, which represents the coordinate system and structure; walking gait patterns using Bezier curves; and the falling effect by visual aid.

Chapter 6: is the conclusion, which summarizes the outcome of the project and discusses future work.


This chapter introduces motion capture and how it will be utilized to improve the study of human locomotion. The project scope and objectives are also elaborated and listed in this chapter.

Literature Review (Chapter 2)

Motion capture system

Motion capture is an attractive way of creating the motion parameters for computer animation. It can provide realistic motion parameters. It permits an actor and a director to work together to create a desired pose that may be difficult to describe with enough specificity for an animator to recreate manually (Ferrier, June 2002). The application areas of motion capture techniques can be summarized as follows (Perales, 2001):

Virtual reality: interactive virtual environments, games, virtual studios, character animation, film, advertising

Smart surveillance systems: access control, parking lots, supermarkets, vending machines, traffic.

Advanced user interfaces: advanced user interfaces.

Motion analysis and synthesis: annotations of videos, personalized training, clinical studies of medicine.

Understanding how a humanoid robot works has always depended on the study of human locomotion. This literature review discusses human motion control techniques, general and advanced motion capture techniques, non-vision based motion capture techniques, vision-based motion capture techniques with and without markers, and other enhanced techniques, which are covered in detail so that the framework can be understood easily.

Properties of Tracking Systems

This section lists properties of tracking systems and discusses the relationships between the various properties.


Accuracy

Accuracy can be defined as the agreement between the results measured by a tracking technology and the actual position of the object. Because the true value is unknown, tracking technologies can only be evaluated in terms of relative accuracy. For a given tracking system, accuracy is limited by its operating principle and affected by noise and interference from the environment. The sources of noise depend on the tracking technology in use. For example, optical motion tracking is disturbed by lighting and AC current, while in magnetic tracking ferrous objects distort the magnetic field and cause errors. If the model or mechanism of the noise is quantitatively known, it is a systematic error and can be compensated by post-processing after tracking or eliminated by pre-filtering before tracking.


Robustness

Robustness defines the system's ability to continue to function in adverse conditions or with missing or incorrect measurements. Some systems make assumptions about the surrounding environment during operation. Also, a system may be unable to take a measurement at a particular time. Related to the robustness is repeatability in the reported data. If the reported values are consistent over time and over operating conditions and environment, then measuring the accuracy (or the lack thereof) is possible, and corrective algorithms can be applied.

Tracking range

The range is the space in which the system can measure sufficient and accurate data for the application. For some systems, the range can be reduced by noise from the environment or limited by the hardware of the system itself. For example, a magnetic system cannot track accurately when the tracked object is at the margin of the magnetic field, because the field distribution is inhomogeneous there.

Tracking speed

Tracking speed is the frequency at which the measurement system can obtain updated tracking data. Two numbers are significant for the system: update rate and latency. Update rate is the frequency at which the tracking system generates tracking data; latency is the delay between the moment tracking data is generated and the moment the host computer receives it in real-time mode.
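To make the relationship between the two numbers concrete, the following minimal sketch estimates how old the newest sample can be when the host reads it. It assumes a simple model that is not stated in the source: samples arrive at a fixed period of 1 / update rate, and each one is delayed by a constant latency. The function name and the example values (120 Hz, 10 ms) are illustrative only.

```python
def data_age_bounds(update_rate_hz, latency_s):
    """Return (best, worst) age in seconds of the newest sample at the host.

    Assumed model: a sample is generated every 1/update_rate_hz seconds and
    reaches the host latency_s seconds after it was generated.
    """
    period_s = 1.0 / update_rate_hz
    best = latency_s              # read immediately after a sample arrives
    worst = latency_s + period_s  # read just before the next sample arrives
    return best, worst


# Hypothetical tracker: 120 Hz update rate, 10 ms latency.
best, worst = data_age_bounds(update_rate_hz=120.0, latency_s=0.010)
print(f"newest sample is between {best * 1000:.1f} ms and {worst * 1000:.1f} ms old")
```

Under this model a high update rate does not help much if latency is large, and vice versa, which is why both figures are usually quoted together.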


Hardware

The hardware means the physical realization of the components of the tracking system. It includes the number of components, and the size and weight of those components, especially those that the user is required to carry (or wear). Some systems may have a significant amount of hardware that must be set up in the environment, although it may need no further attention from the user once in position. Ideally, the application would like to give the user complete freedom of movement. Some devices tether the user to a fixed object. Some systems may have a heavy or unwieldy device which the user must manipulate in order to move. Some devices have a tendency to pull the user back to a “resting position” for the device. The hardware also determines the biggest part of the costs and therefore is very often a decisive factor for the choice of the applied motion tracking system.

Non-vision Based Motion Capture

In non-vision based systems, sensors are attached to the human body to collect movement information. Some of them have such a small sensing footprint that they can detect small changes such as finger or toe movement (Hu, A survey - human movement tracking and stroke rehabilitation, 1996). Each kind of sensor has advantages and limitations (Hu, A survey - human movement tracking and stroke rehabilitation, 1997).

Advantages of magnetic trackers:

  • real-time data output can provide immediate feedback
  • no post processing is required
  • they are less expensive than optical systems
  • no occlusion problem is observed
  • multiple performers are possible

Disadvantages of magnetic trackers:

  • the trackers are sensitive to metal objects
  • cables restrict the performers
  • they provide a lower sampling rate than some optical systems
  • the marker configurations are difficult to change

Advantages of electromechanical body suits:

  • they are less expensive than optical and magnetic systems
  • real-time data is possible
  • no occlusion problem is observed
  • multiple performers are possible

Disadvantages of electromechanical body suits:

  • they provide a lower sampling rate
  • they are difficult to use due to the amount of hardware
  • configuration of sensors is fixed

Vision-Based Motion Capture with Markers

In 1973, Johansson conducted his famous Moving Light Display (MLD) psychological experiment on the perception of biological motion (Johansson). In the experiment, small reflective markers were attached to the joints of human performers. When the patterns of movement were observed, the integration of the signals coming from the markers resulted in recognition of actions. Although the method faces challenges such as errors, non-robustness and expensive computation due to environmental constraints, mutual occlusion and complicated processing, many marker-based tracking systems are available on the market. This technique uses optical sensors, e.g. cameras, to track human movements, which are captured by placing markers on the human body. The human skeleton is a highly articulated structure that moves in three dimensions. For this reason, each body part continuously moves in and out of occlusion from the view of the cameras, resulting in inconsistent and unreliable motion data. One major drawback of optical sensors and markers is that they cannot sense joint rotation accurately, which is a serious limitation in representing a real 3D model (Hu, A survey - human movement tracking and stroke rehabilitation, 1997). Optical systems have advantages and limitations (Perales, 2001).

Advantages of optical systems are as follows:

  • they are more accurate
  • larger numbers of markers are possible
  • no cables restrict the performers
  • they produce more samples per second

Disadvantages of optical systems:

  • they require post-processing
  • they are expensive (between 100,000 and 250,000)
  • occlusion is a problem in these systems
  • the capture environment must be free of yellow light and reflective noise

Vision-Based Motion Capture without Markers

As a less restrictive motion capture technique, markerless systems are capable of overcoming the mutual occlusion problem, as they are only concerned with boundaries or features on human bodies. This has been an active and promising but also challenging research area in the last decade, and research in this area is still ongoing (Hu, A survey - human movement tracking and stroke rehabilitation, 1996). The markerless motion capture technique exploits external sensors such as cameras to track the movement of the human body. A camera can have a resolution of a million pixels, which is one of the main reasons that optical sensors have attracted attention. However, such vision-based techniques require intensive computational power (Bryson, 1993). As a commonly used framework, 2D motion tracking only concerns human movement in an image plane, although sometimes a 3D structure is projected into its image plane for processing purposes. This approach can be categorized as with or without explicit shape models (Hu, A survey - human movement tracking and stroke rehabilitation, 1996). The creation of motion capture data from a single video stream seems like a plausible idea. People are able to watch a video and understand the motion, but computing human motion parameters from a video stream is clearly a challenging task (Ferrier, June 2002). Vision-based motion capture techniques usually include initialization and tracking steps.


Initialization

A system starts its operation with a correct interpretation of the current scene. Initialization requires camera calibration, adaptation to scene characteristics and model initialization. Camera calibration determines the parameters required for translating a point in a 3D scene to its position in the image. Some systems find an initial pose and update it incrementally from frame to frame, whereas in other systems the user specifies the pose in every single frame. Some systems have a special initialization phase in which the start pose is found automatically, whereas in others the same algorithm is used for both initialization and pose estimation (Granum, 2001).
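As an illustration of what calibration parameters are used for, the sketch below projects a 3D scene point into pixel coordinates with an ideal pinhole camera model. This model is an assumption made for the example and is not taken from the cited systems: it uses a single focal length in pixels, a principal point, a rotation matrix and a translation vector, and ignores lens distortion. All names and numbers are hypothetical.

```python
def project_point(point_world, rotation, translation, focal_px, principal_point):
    """Project a 3D world point to (u, v) pixel coordinates (ideal pinhole model)."""
    # Extrinsic parameters: move the point into the camera frame, Xc = R * Xw + t.
    xc = [
        sum(rotation[i][j] * point_world[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]
    if xc[2] <= 0:
        raise ValueError("point is behind the camera")
    # Intrinsic parameters: perspective division, then scale and shift to pixels.
    u = focal_px * xc[0] / xc[2] + principal_point[0]
    v = focal_px * xc[1] / xc[2] + principal_point[1]
    return u, v


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Camera at the world origin looking along +Z, 800 px focal length, 640x480 image.
u, v = project_point([0.5, -0.25, 5.0], IDENTITY, [0.0, 0.0, 0.0],
                     focal_px=800.0, principal_point=(320.0, 240.0))
print(u, v)
```

Calibration is the inverse task: estimating the rotation, translation, focal length and principal point from known correspondences so that a projection of this kind matches the real images.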


Tracking

The tracking phase extracts specific information, either low level, such as edges, or high level, such as head and hands. Tracking consists of three parts (Granum, 2001):

  • Figure-ground segmentation: the human figure is extracted from the rest of the image.
  • Representation: segmented images are converted to another representation to reduce the amount of information.
  • Tracking over time: how the subject should be tracked from frame to frame.
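
The first two parts can be sketched with a deliberately simple technique. The example below uses background subtraction with a fixed threshold for figure-ground segmentation and a bounding box as the reduced representation. These are stand-in choices for illustration, not the methods of the systems surveyed by Granum, and the tiny grayscale images are made up.

```python
def segment_figure(frame, background, threshold=30):
    """Return a binary mask: 1 where a pixel differs from the background image."""
    return [
        [1 if abs(p - b) > threshold else 0 for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def bounding_box(mask):
    """Reduce a mask to (top, left, bottom, right), or None if it is empty."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not rows:
        return None
    return min(rows), min(cols), max(rows), max(cols)


# Hypothetical 4x5 grayscale images: a static background and a frame with a figure.
background = [[10] * 5 for _ in range(4)]
frame = [row[:] for row in background]
frame[1][2] = 200
frame[2][2] = 180
frame[2][3] = 190

mask = segment_figure(frame, background)
box = bounding_box(mask)
print(box)
```

Tracking over time would then compare the boxes, or a richer representation, between consecutive frames.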


Mechanical tracking

Mechanical measurement is the oldest form of location; rulers and tape measures provide a simple method of locating one item with reference to another. More sophisticated mechanical techniques have been developed. Nowadays measurements of the angles of the body joints with potentiometers or shaft encoders combined with knowledge of the dimensions of the rigid components allow accurate calculations of the position of different body parts (Beresford, 2005).
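
The calculation described here, joint angles plus known segment dimensions giving body-part positions, is a forward kinematics computation. A minimal planar sketch follows. It assumes a chain of rigid segments in a single plane with each measured angle given relative to the previous segment; the segment lengths and angles are hypothetical values, not data from the source.

```python
import math


def joint_positions(segment_lengths, joint_angles_deg, origin=(0.0, 0.0)):
    """Return the 2D position of every joint along a planar chain of rigid segments.

    Each angle is measured relative to the previous segment, as a joint-mounted
    potentiometer or shaft encoder would report it (assumption of this sketch).
    """
    x, y = origin
    heading = 0.0
    positions = [(x, y)]
    for length, angle_deg in zip(segment_lengths, joint_angles_deg):
        heading += math.radians(angle_deg)
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        positions.append((x, y))
    return positions


# Hypothetical leg: 0.45 m thigh pointing straight down (-90 degrees at the hip),
# 0.40 m shank bent 90 degrees forward at the knee.
hip, knee, ankle = joint_positions([0.45, 0.40], [-90.0, 90.0])
print(knee, ankle)
```

A full body suit applies the same idea in three dimensions with rotation matrices or quaternions at each joint.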

Today mechanical position tracking devices can be separated into body-based and ground-based systems.

Body based systems are those which are mounted on, or carried on, the body of the user and are used to sense either the relative positions of various parts of the user's body or the position of an instrument relative to a fixed point on the user's body. Body-based systems are typically used to determine either the user's joint angles for reproduction of their body in the synthetic environment, or to determine the position of the user's hand or foot, relative to some point on the user's body. Since the body based systems are used to determine the relative position between two of the user's body parts, the devices must somehow be attached to the user's body. This particular issue has raised many questions: How is the device attached to the body in a way which will minimize relative motion between the attachment and the soft body part? How are the joints of the device aligned with the user's joints to minimize the difference in the centers of rotation? Some other problems associated with body-based tracking systems are specifically caused by the device being attached to the user's body. These systems are typically very obtrusive and encumbering and therefore do not allow the user complete freedom of movement. Body-based systems are, however, quite accurate and do not experience problems like measurement drift (the tendency of the device's output to change over time with no change in the sensed quantity), interference from external electromagnetic signals or metallic devices in the vicinity, or shadowing (loss of sight of the tracked object due to physical interference of another object)(Frey, 1996).

Ground-based systems are not carried by the user but are mounted on some fixed surface (e.g. the user's desk or the floor) and are used to sense the position of an implement relative to that fixed surface. Ground-based systems are typically used to determine the position and orientation of an implement manipulated by the user relative to some fixed point which is not on the user's body. Like body-based mechanical systems, they are very accurate and are not plagued by measurement drift errors, interference or shadowing. Ground-based systems do suffer from one thing which the body-based systems do not: they confine the user to work within the space allowed by the device. Usually this means that the user is confined to work in a space the size of a large desk. If the application does not require the user to move around much throughout the task (e.g. the user remains seated), this is not considered a problem.

Mechanical tracking systems are the best choice for force-feedback (haptic) devices since they are rigidly mounted to either the user or a fixed object. Haptic devices are used to allow the user a 'sense of touch'. The user can feel surfaces in the synthetic environment or feel the weight of an object. The device can apply forces to the user's body so that the user can experience a sense of exertion. Mechanical tracking systems also typically have low latencies (the time required to receive useful information about a sensed quantity) and high update rates (the rate at which the system can provide useful information). Therefore these systems have found a good commercial niche as measurement devices and hand tracking systems.


Advantages of mechanical tracking systems:

  • high update rate
  • low latency
  • accurate
  • no blocking problem, no interference from environment
  • best choice for force feedback


Disadvantages of mechanical tracking systems:

  • restricted movement from mounted device


Acoustic tracking

Acoustic tracking systems utilize high frequency sound waves to track objects by either the triangulation of several receivers (time-of-flight method) or by measuring the signal's phase difference between transmitter and receiver (phase-coherence method).

Generally the user carries the transmitter, and a series of sensors around the room determine the linear distance to the transmitter. Some systems have the user carry a receiver and listen to a series of transmitters positioned around the volume.

The 'time-of-flight' method of acoustic tracking uses the speed of sound through air to calculate the distance between the transmitter of an acoustic pulse and the receiver of that pulse. The use of one transmitter on a tracked object and a minimum of three receivers at stationary positions in the vicinity allow an acoustic system to determine the relative position of the object via triangulation. This method limits the number of objects tracked by the system to one. An alternative method has been devised in which several transmitters are mounted at stationary positions in the room and each object being tracked is fitted with a receiver. Using this method, the positions of numerous objects may be determined simultaneously. Note that the use of one transmitter (or one receiver) attached to an object can resolve only position. The use of two transmitter (receiver) sets with the same object can be used to determine the position and orientation (6 DOF) of the object. The desire to track more than just the position of an object suggests that the second method (multiple stationary transmitters with body mounted receivers) may be preferable.
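
The time-of-flight calculation can be sketched as follows. To keep the example short it works in a plane, where three non-collinear receivers determine a unique position; the three-dimensional case in the text follows the same principle. The speed of sound (343 m/s, roughly air at 20 °C), the receiver layout and the function names are assumptions of this sketch.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees Celsius


def tof_distance(flight_time_s, speed=SPEED_OF_SOUND):
    """Distance travelled by an acoustic pulse during the measured flight time."""
    return speed * flight_time_s


def trilaterate_2d(receivers, distances):
    """Locate a transmitter in the plane from its distances to three receivers."""
    (x1, y1), (x2, y2), (x3, y3) = receivers
    d1, d2, d3 = distances
    # Subtracting the first circle equation from the other two leaves two
    # linear equations of the form a*x + b*y = c.
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    a2, b2 = 2 * (x3 - x1), 2 * (y3 - y1)
    c2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        raise ValueError("receivers must not be collinear")
    return (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det


# Hypothetical room: three fixed receivers, transmitter actually at (1, 2) metres.
receivers = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
true_position = (1.0, 2.0)
flight_times = [math.dist(true_position, r) / SPEED_OF_SOUND for r in receivers]

estimate = trilaterate_2d(receivers, [tof_distance(t) for t in flight_times])
print(estimate)
```

Because the speed of sound changes with temperature, an error in the assumed speed scales every distance and shifts the estimate, which is one source of the environmental interference discussed for acoustic systems.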

The other method of acoustic tracking is the phase-coherent tracking. It may be used to achieve better accuracies than the time-of-flight method. The system does this by sensing the signal phase difference between the signal sent by the transmitter and that detected by the receiver. If the object being tracked moves farther than one-half of the signal wavelength in any direction during the period of one update, errors will result in the position determination. Since phase coherent tracking is an incremental form of position determination, small errors in position determination will result in larger errors over time (drift errors), which may be the reason why only few phase-coherent systems have been implemented successfully.
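
The half-wavelength limit can be illustrated numerically. The sketch below converts a measured phase change into a displacement along the line between transmitter and receiver, and shows that a movement larger than half a wavelength between two updates is misread. The 40 kHz carrier, the 343 m/s speed of sound and the function names are assumptions chosen for the example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def wavelength(frequency_hz, speed=SPEED_OF_SOUND):
    return speed / frequency_hz


def displacement_from_phase(phase_delta_rad, frequency_hz):
    """Displacement implied by a phase change between two consecutive updates.

    A receiver can only observe phase modulo one cycle, so the phase is wrapped
    to the interval [-pi, pi] before it is converted to a distance.
    """
    wrapped = math.atan2(math.sin(phase_delta_rad), math.cos(phase_delta_rad))
    return wrapped / (2 * math.pi) * wavelength(frequency_hz)


FREQ = 40_000.0                 # hypothetical ultrasonic carrier
lam = wavelength(FREQ)          # about 8.6 mm


def true_phase(distance_m):
    """Phase change actually produced by a movement of distance_m."""
    return 2 * math.pi * distance_m / lam


small = displacement_from_phase(true_phase(0.002), FREQ)  # 2 mm, below lam / 2
large = displacement_from_phase(true_phase(0.006), FREQ)  # 6 mm, above lam / 2
print(f"2 mm move read as {small * 1000:.3f} mm")
print(f"6 mm move read as {large * 1000:.3f} mm")
```

The 2 mm movement is recovered correctly, while the 6 mm movement is reported as a small movement in the opposite direction. Since each update is added to the previous position, such an error persists in all later estimates.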

Some problems associated with both acoustic tracking methods result from the line-of-sight required between transmitter and receiver. This line of sight requirement obviously plagues the devices with shadowing problems. It also limits their effective tracking range, although they have better tracking ranges than electromagnetic systems. Unlike electromagnetic systems, they do not suffer from metallic interference, but they are susceptible to interference caused by ambient noise sources, by reflections of the acoustic signals from hard surfaces, and environmental interference (e.g. temperature variations).


Advantages of acoustic tracking systems:

  • Very high freedom of movement



Disadvantages of acoustic tracking systems:

  • Line-of-sight problems
  • Either high range or high accuracy (not both!)
  • Environmental interference (e.g. temperature variations, other noise sources)
  • Drift errors (phase-coherent)
  • High latency, low update rates


Electromagnetic tracking

Electromagnetic tracking systems are currently the most widely used systems for human body tracking applications. They use artificially generated electromagnetic fields to induce voltages in detectors attached to the tracked object. A fixed transmitter and the sensors each consist of three coils mounted in mutually orthogonal directions. The sensors range in size, but tend to be around a few cubic centimeters. The transmitters range in size with the power of the field they are expected to generate, from several cubic inches to a cubic foot. There are four magnetic fields that have to be measured: the environmental field (including the Earth's magnetic field), and three orthogonal fields in the transmitter's coordinate directions, as shown in the figure. Each of these fields is measured in the sensor's three coordinate dimensions, for a total of twelve measurements for each sensor. From this information, the position and orientation of the sensor with respect to the transmitter can be computed.

These tracking systems are robust, fast, and fairly inexpensive, and can be used to track numerous objects (body parts) with acceptable position and orientation accuracies (on the order of 0.1 inches and 0.5 degrees). Unlike electric fields, magnetic fields are unaffected by the presence or absence of human bodies and other non-metallic objects in the environment. This offers a tremendous opportunity, because it enables magnetic trackers to overcome the line-of-sight requirement that plagues acoustic, optical, and externally connected mechanical tracking systems. On the other hand, magnetic systems suffer from sensitivity to background magnetic fields and interference caused by ferrous metal devices in the vicinity, and are therefore inaccurate in practical environments. Because of this and the limited range of the generated magnetic field, magnetic tracking systems are restricted to a small, dedicated area.


Advantages of electromagnetic tracking systems:

  • High update rates
  • Very low latency
  • High robustness
  • No shadowing
  • Rather cheap
  • Acceptable accuracy in artificial environment


Disadvantages of electromagnetic tracking systems:

  • High sensitivity to background magnetic fields
  • Inaccurate in practical environments due to interference caused by ferrous metal devices
  • Low range of the magnetic field; tracking scope is also limited by cables


Inertial tracking

An inertial sensor contains three gyroscopes, to determine angular rate, and three accelerometers, to determine linear acceleration. Originally, they were mounted on orthogonal axes on a gimbaled platform, as can be seen in the figure. After removing the effect of gravity from the vertical accelerometer, the data has to be double-integrated to provide a measure of the offset between the initial and the current position. This combination of sensors has been used successfully for inertial navigation systems in ships, airplanes and spacecraft since the 1950s, but such systems are too cumbersome and heavy to mount on a human body. Also, human motion tracking requires precision on the order of centimeters or less, whereas in navigation hundreds of meters are often sufficient.
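
The double integration, and the reason small sensor errors matter at centimeter precision, can be shown with a short sketch. It assumes a single vertical accelerometer axis, simple fixed-step Euler integration, and a constant bias of 0.01 m/s² on a sensor that is actually at rest; all of these values are hypothetical.

```python
GRAVITY = 9.81  # m/s^2, removed from the vertical accelerometer reading


def integrate_position(readings, dt, gravity=GRAVITY):
    """Double-integrate vertical accelerometer readings into a position offset."""
    velocity = 0.0
    position = 0.0
    for reading in readings:
        acceleration = reading - gravity   # remove the effect of gravity
        velocity += acceleration * dt      # first integration: velocity
        position += velocity * dt          # second integration: position
    return position


DT = 0.01        # 100 Hz sampling
SAMPLES = 1000   # 10 seconds
BIAS = 0.01      # constant sensor error in m/s^2

# The sensor is at rest, so the true offset is zero; only the bias is integrated.
at_rest = [GRAVITY + BIAS] * SAMPLES
drift = integrate_position(at_rest, DT)
print(f"apparent movement after 10 s at rest: {drift:.3f} m")
```

A bias b integrated twice grows as b·t²/2, so even this small error produces roughly half a meter of apparent movement in ten seconds, far beyond the centimeter-level precision that human motion tracking requires.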

Now several companies have begun producing tiny micro-machined accelerometers which, given the time for development of application specific integrated circuits, would be small and light enough for use in inertial human body tracking.

Inertial body tracking has the potential to be the best method of human body tracking. This is due primarily to the fact that no artificial source signal is required for operation. The only true signal sensed is the earth's gravitational field, which does not change enough to affect the operation of an inertial tracking device.

The lack of an artificial source means that inertial tracking methods do not suffer from signal or metallic object interference, or from shadowing. Transmitting the body tracking data wirelessly would also eliminate tethers. The user would be free to move around in the real world with no restrictions other than the range of the data transmitter (which can be very long, considering pocket radio or satellite communications). Also, the use of advanced micro-machined inertial sensors and application specific integrated circuits would minimize the encumbrance of the user. The only major drawback of inertial sensors is their tendency to drift, and therefore to accumulate measurement errors over time. Techniques are available that minimize the effects of drift. The most widely used is a combination of angular rate and linear acceleration sensors that compensates for drift by using the earth's magnetic and gravitational fields as a directional reference.
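One common way to realize this kind of drift compensation is a complementary filter. The sketch below is an assumption-laden illustration for a single tilt angle, not the algorithm of any specific product: the function name, the blending factor `alpha` and the simulated bias are all hypothetical. It blends the integrated gyroscope rate with the tilt angle implied by the direction of gravity in the accelerometer reading.

```python
import math


def complementary_filter(gyro_rates, accel_pairs, dt, alpha=0.98):
    """Estimate one tilt angle (rad) from gyro rates (rad/s) and (ax, az) accelerometer pairs."""
    angle = 0.0
    history = []
    for rate, (ax, az) in zip(gyro_rates, accel_pairs):
        gyro_angle = angle + rate * dt        # short-term estimate: integrate angular rate
        accel_angle = math.atan2(ax, az)      # long-term reference: direction of gravity
        angle = alpha * gyro_angle + (1.0 - alpha) * accel_angle
        history.append(angle)
    return history


# Hypothetical case: a level, stationary sensor whose gyroscope has a 0.01 rad/s bias.
dt = 0.01
n = 5000
rates = [0.01] * n
accels = [(0.0, 9.81)] * n
filtered = complementary_filter(rates, accels, dt)
unfiltered_drift = sum(r * dt for r in rates)   # pure integration of the biased rate
```

With these assumed values, pure integration drifts to roughly 0.5 rad after 50 s, while the filtered estimate stays within a few thousandths of a radian of the true angle of zero.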

The current rate of advance in inertial sensor technology has the potential of making inertial technology the primary means of human body tracking within the next few years.


  • Very high range
  • High freedom of movement (if wireless)
  • No shadowing
  • No environmental interference


  • Long term drift of orientation
  • Expensive


In optical technologies, motion capture is realized either by tracking marker points on the human body (marker-based tracking) or by processing the image to obtain the contour of the human body without markers (marker-less tracking).

Generally speaking, optical tracking methods are based on the principles of computer vision and image processing.

Theoretically, if one point can be captured by two cameras simultaneously, we can fix the position of the spatial point. If the cameras capture successive images, we can get the trace of the point from the image sequences with pattern recognition and image processing techniques.

To simplify post-processing, synthesis and analysis, the subject is required to wear single-color clothes and to perform the actions against a simple background during tracking. The markers are attached to the important joints or positions of the human body, so that the motion can be tracked for real-time analysis and for later data processing.

The advantage of this technology is that the moving space is very large; without cable, sensors and other mechanical mechanism, there is no constraint to the human body. It can track several markers simultaneously without increasing the cost.

The disadvantages are that post-processing of the data is difficult and that the requirements on image quality are high. Especially when the movement is very complex, occlusions and confusion between different markers occur, and the correct tracking result cannot be obtained.

Vicon, Motion Analysis and Meta Motion are the three best-known companies in the optical motion tracking domain. Take Motion Captor, developed by Meta Motion, as an example. The software runs under Windows NT and Windows 2000. It can connect to 4 to 9 cameras, the working range is a circle with a diameter of up to 6 meters, and there is no limit on the number of tracked objects.

Today probably the simplest and most common method is outside-in marker tracking. In this arrangement at least two cameras are mounted on the walls or ceiling looking in on the workspace, so that each camera has a unique perspective of the targets. The sensors of each camera detect the direction to the targets or markers attached to the object being tracked and generate a 2-D picture of the scene. A computer then triangulates the 3-D position of the markers using the bearing angles from the two nearest cameras (see figure below). While only two cameras are required to achieve three-dimensional tracking, more are typically used to provide redundancy in an effort to prevent shadowing of the markers. Image-based tracking systems can be divided into two broad categories: those that use passive markers (or no markers) and those that use active markers. Markers are devices which, when placed on the object to be tracked, are visible to the tracking system. Passive markers reflect light and active markers emit light. In both systems, cameras are used to record the object being tracked and detect the motion of the markers on the object.
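The triangulation step can be illustrated with a simplified planar case. The sketch below is an assumption for illustration only: it works in 2-D rather than 3-D, the function name is hypothetical, and it ignores lens calibration and measurement noise. Each camera contributes a ray defined by its position and the bearing angle to the marker, and the marker lies where the two rays intersect.

```python
import math


def triangulate_2d(cam_a, bearing_a, cam_b, bearing_b):
    """Intersect two bearing rays (angles in radians from the x axis) to locate a marker."""
    ax, ay = cam_a
    bx, by = cam_b
    dax, day = math.cos(bearing_a), math.sin(bearing_a)   # unit direction of ray A
    dbx, dby = math.cos(bearing_b), math.sin(bearing_b)   # unit direction of ray B
    denom = dax * dby - day * dbx                          # 2-D cross product of the directions
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the marker position cannot be triangulated")
    # Solve cam_a + t * dir_a = cam_b + s * dir_b for t.
    t = ((bx - ax) * dby - (by - ay) * dbx) / denom
    return ax + t * dax, ay + t * day


# Hypothetical setup: two cameras 6 m apart observing a marker at (2, 3).
camera_a, camera_b = (0.0, 0.0), (6.0, 0.0)
marker = triangulate_2d(camera_a, math.atan2(3.0, 2.0), camera_b, math.atan2(3.0, -4.0))
```

The parallel-ray check reflects why each camera needs a distinct perspective: two cameras looking along the same direction cannot resolve depth.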

Materials used for passive markers should have good retroreflective characteristics within the measurable light spectrum. An often used material consists of many tiny glass balls, which capture the light and reflect it back in the direction it came from. These passive markers, which are themselves shaped as small balls, are attached to the human body. Because the light source is placed very close to the camera, the camera can capture the bright light reflected from the flashing light sources. Again, triangulation is used to track the markers in three dimensions. But this technique reveals the first major problem with passive optical tracking systems, which is determining the correspondence of markers in each of the camera views. In order to use several views of the same marker to triangulate its position, a marker must be distinguishable from the other markers around it.

One method of distinguishing the markers is to use active ones. The markers used in active optical tracking systems are typically infrared light-emitting diodes (IREDs), which emit light visible to the system but not to the user. These diodes are also attached to the human body and emit light that is pulsed in sequence with camera detection. This way the markers can be distinguished. The remaining question is how many markers may be used simultaneously. If the orientation of the object is desired in addition to its position, at least two targets must be placed on the same object and their difference in position used to determine the orientation of the object.
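As a small illustration of recovering orientation from two targets, the sketch below computes the planar heading of a segment from the positions of two markers attached to it. The function name and the 2-D simplification are assumptions; a full 3-D orientation would need at least a third, non-collinear marker.

```python
import math


def segment_heading(marker_a, marker_b):
    """Heading (rad, measured from the x axis) of the line from marker_a to marker_b."""
    dx = marker_b[0] - marker_a[0]
    dy = marker_b[1] - marker_a[1]
    if dx == 0.0 and dy == 0.0:
        raise ValueError("markers coincide; orientation is undefined")
    return math.atan2(dy, dx)


# Hypothetical forearm with one marker at the elbow and one at the wrist.
heading = segment_heading((0.0, 0.0), (0.0, 2.0))
```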

Image-based systems rely on the cameras being able to detect the targets at any given instant in time. If an object passes between a marker and a camera during the detection interval, the camera will fail to detect the marker. If this condition persists for long enough, tracking of the object will fail. Failure in tracking a human body may be caused by something as simple as one body part obscuring another from all of the camera viewpoints. This effect is called shadowing. As mentioned before, shadowing may be minimized by the use of multiple, redundant cameras, but it cannot be totally eliminated. As would be expected, multiple-source image processing requires a level of computational complexity not required by the other methods of motion capture. The combination of the computational requirements and the use of multiple high-resolution cameras makes image-based tracking one of the most expensive body tracking solutions available.

Advantages and disadvantages of passive optical motion tracking system


  • High accuracy
  • High freedom of movement
  • High update rate
  • Low latency
  • Multiple markers possible
  • High capturing volume


  • High costs
  • Shadowing of markers
  • High sensitivity to reflecting objects
  • Difficult marker distinction when two markers are too close together

Advantages and disadvantages of active optical motion tracking system


  • High accuracy
  • No correspondence problems
  • High update rate
  • Low latency
  • Multiple markers possible
  • High capturing volume


  • Extremely high costs
  • Shadowing of markers
  • Reduced freedom of movement because of marker cables

Animation Methods

There are three main methods by which computer animations are created: (1) key frame interpolation; (2) physical simulation; and (3) motion capture. Each of these methods has its advantages and disadvantages and is appropriate in different situations. In this research we are mainly concerned with using motion capture data.

Motion Capture from animation method

The use of live data for animations has a long history. Traditional animators (animators who draw each frame by hand) often carefully study motion by looking at movie frames in slow motion to see exactly how a live person executes a motion. Taking this idea even further, they may resort to “rotoscoping”, in which a film of a live actor is traced frame by frame. In this case they often go back and accentuate the motion beyond what was originally there, to make it more extreme and give even more personality to the characters.

With the advent of computers and 3D models of articulated figures, animators have another tool at their disposal in the form of “motion capture data”. In motion capture, the joint configurations of a live person are detected by sensors and these angles are then read into the computer model to create the animation. Originally such data was extremely difficult to obtain, as the sensor technology was costly and few animators had access to it. In recent years the technology has improved and motion capture data has become much more readily available. As a result, there has been increased interest in using it for creating computer animations. The advantages of using motion capture data are clear. It immediately provides motion data for all degrees of freedom at a very high level of detail. If the motion is unusual or if extreme realism is required, it may be difficult for an animator to create the motion accurately. As a result, it may appear much easier to simply capture the actions of a live actor performing the motions, and then map the data onto a model. However, there are some distinct disadvantages to this method. One of the largest is that the data may not be exactly what the animator wants. Motion capture sessions are still costly and labor intensive, so it is best not to have to repeat them. Yet it is often difficult to know exactly what motions are desired before the session.

As a result, a great deal of research in recent years has been aimed at devising methods for editing motion capture data after it has been collected. Many of these methods allow one to vary the motions to adapt to different constraints while preserving the style of the original motion. For example, Witkin and Popovic (Popovic, 1995) developed a method in which the motion data is warped between keyframe-like constraints set by the animator. In other work, Rose and his colleagues developed a method to use radial basis functions and low order polynomials to interpolate between example motions, while maintaining inverse kinematic constraints (C. Rose M. C., September 1998).

A number of techniques for editing motion capture data have been based on the method of spacetime constraints (Kass A. W.). For example, Gleicher (Gleicher, 1999) developed a method which allows one to begin with an animation and interactively reposition the character. A spacetime constraints solver then finds a new motion by minimizing the distance from the old motion subject to constraints specified by the animator. A similar method was used to allow retargeting of motions to characters of different dimensions (Gleicher, 1999). To make the solution tractable, dynamics were not taken into consideration at all. Dynamics were added in the method of Popovic and Witkin (A.Witkin, 1999), in which the editing is performed in a space of reduced dimensionality.

Video Motion Capture

This approach is an extension of the video motion capture method developed by Bregler (Bregler, 1997). That technique was created to allow recovery of the motion of a high degree-of-freedom articulated figure from video sequences, without requiring the subject to wear any special markers or suits. Given a high-speed video sequence of a human and an accurate model of the subject, meaning all of the joint locations and angles for some initial configuration, the method returns the change in angle for each joint for each frame of the video sequence. It makes use of twists and the product of exponential maps, which allow for a linear approximation that can be used to find a robust solution for the kinematic degrees of freedom.

Motion in Animation

Human Motion Control Techniques

There are mainly two motion control techniques for animating articulated figures: kinematics and dynamics. Kinematics methods use time-based joint rotation values to control the animation while dynamics methods use force-based simulation of movements (Kiss., November 2002). Creating data for these techniques can be done manually by talented animators or can be captured automatically by different types of devices (Thalmann D. , 1996)(Thalmann N. M., 1996). Motion control techniques are summarized in the following sections.

Previous work

Control of articulated figures has been studied in computer animation for years, and many studies have focused on the problem of locomotion control, with emphasis on algorithms that have been applied to human walking, especially those used for motion control of limb trajectories. The techniques fall into the following five categories: forward kinematics, inverse kinematics, dynamics, constraint optimization, and genetic programming. The discussion focuses on the different algorithms used for human walking.


Kinematics approaches obtain the motion parameters by considering position, velocity, and acceleration without being concerned with the forces that cause the motion. Kinematics approaches for simulating human locomotion have been described by many researchers over the years (Boulic, 1990) (Calvert, Knowledge-Driven, Interactive Animation of Human Running, 1996). Kinematics methods can be classified into two groups: Forward Kinematics and Inverse Kinematics. Forward Kinematics directly sets the motion parameters, such as the position and orientation of each joint at specific times, whereas Inverse Kinematics uses nonlinear programming techniques and the positions of the end-effectors to determine the position and orientation of the joints in the hierarchy.
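Forward kinematics can be sketched for a planar chain of segments. This is a minimal illustration under stated assumptions: the chain is 2-D, each joint angle is relative to the previous segment, and the function name is hypothetical. The end-effector position follows from accumulating the joint rotations from the root outward.

```python
import math


def forward_kinematics(lengths, angles):
    """Return joint positions of a planar chain, from the root (0, 0) to the end-effector."""
    x = y = 0.0
    total_angle = 0.0
    joints = [(x, y)]
    for length, angle in zip(lengths, angles):
        total_angle += angle                 # relative joint angles accumulate along the chain
        x += length * math.cos(total_angle)
        y += length * math.sin(total_angle)
        joints.append((x, y))
    return joints


# Hypothetical two-segment limb: the first segment points straight up, the second bends back to horizontal.
joints = forward_kinematics([1.0, 1.0], [math.pi / 2, -math.pi / 2])
end_effector = joints[-1]
```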


Dynamics approaches obtain the motion parameters by using dynamic motion equations considering the forces, the torques, and the physical properties of the objects. The dynamic methods can be classified into two parts: Forward Dynamics and Inverse Dynamics. Forward Dynamics computes the motion parameters by applying forces on the objects, whereas Inverse Dynamics computes the necessary forces for a specific motion.
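The difference between the two can be illustrated with a single point mass moving along one axis. The sketch below is only an assumption-based toy: real articulated figures involve torques and coupled equations of motion, and the function names and the semi-implicit Euler integrator are choices made here for brevity. Forward dynamics turns forces into a trajectory, and inverse dynamics recovers the forces from a trajectory.

```python
def forward_dynamics(mass, forces, dt, x0=0.0, v0=0.0):
    """Integrate applied forces (N) into a list of positions (m)."""
    x, v = x0, v0
    positions = [x]
    for force in forces:
        a = force / mass      # Newton's second law
        v += a * dt
        x += v * dt
        positions.append(x)
    return positions


def inverse_dynamics(mass, positions, dt, v0=0.0):
    """Recover the forces (N) that produce the given positions under the same integrator."""
    forces = []
    v_prev = v0
    for x_prev, x_next in zip(positions, positions[1:]):
        v = (x_next - x_prev) / dt                 # velocity over this step
        forces.append(mass * (v - v_prev) / dt)    # force needed for the change in velocity
        v_prev = v
    return forces


# Hypothetical round trip: forces -> trajectory -> forces.
applied = [1.0, 0.0, -2.0, 0.5]
trajectory = forward_dynamics(2.0, applied, 0.1)
recovered = inverse_dynamics(2.0, trajectory, 0.1)
```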

Motion capture is an effective way of creating the motion data for an animation (Ferrier, June 2002). The quality demanded of animations places challenging requirements on capture systems. To date, capture systems that meet these requirements have required specialized, expensive equipment. Computer vision can make animation data easier to obtain. Ideally, the capture of motion data should be easily available and inexpensive. Using standard video cameras is an attractive way of providing these features, and the use of a single camera is particularly attractive. It offers the lowest cost, simplified setup, and the potential use of legacy sources such as films (Ferrier, June 2002). Human motion capture systems generate data that represents measured human movement, based on different technologies. According to the technology used, human motion capture systems can be classified as non-vision based, vision based with markers, or vision based without markers.

Forward kinematics

Forward kinematics provides motion control by specifying the joint angles. The motion of the end-effector is calculated as the composition of all transformations from the root of the chain to the end-effector. The advantage of forward kinematics approaches is that they give the animator direct control of the motion at low computational cost. Some of the difficulties faced by animators are given below:

With forward kinematics, the constraints imposed on the motion may be violated. For example, in animating human motion, the most essential constraints are that the supporting foot should not leave the ground and that the global motion should be continuous. Special handling is required to enforce these constraints. One solution to this problem is to make the supporting foot the root of the hierarchy. The motions produced by this method look realistic, but the technique is largely manual and time consuming. It also becomes less realistic as the movement becomes more complex, because the approach is directed at high-level motion rather than at a specific desired motion. (D.Zelter, 1982) used structured motor control methods to animate the motion of a human skeleton with a straight-ahead gait over level, unobstructed terrain. This requires the user to have detailed knowledge of the skeleton animation system as well as programming experience. A further drawback of this approach is that the animator must give up artistic control in return for automatic motion synthesis.

(Calvert, Interactive Animation of Personalized Human Locomotion, 1993) proposed animation techniques for human locomotion. In their system, three locomotion parameters, step length, step frequency and velocity, are used to specify the basic locomotion step. Supplementary locomotion attributes are added at different levels of the motion control hierarchy to individualize the locomotion. Their model is mainly focused on normal walking on flat ground; without further adjustment of the model, its application in virtual environments is very limited.
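The three parameters are not independent: walking velocity is the product of step length and step frequency, so fixing any two determines the third. The helper below is a minimal sketch of that relation; the function names and example values are assumptions and are not taken from the cited system.

```python
def walking_velocity(step_length, step_frequency):
    """Velocity (m/s) from step length (m) and step frequency (steps/s)."""
    return step_length * step_frequency


def step_frequency_for(velocity, step_length):
    """Step frequency (steps/s) needed to reach a velocity (m/s) with a given step length (m)."""
    if step_length <= 0.0:
        raise ValueError("step length must be positive")
    return velocity / step_length


# Hypothetical gait: 0.7 m steps at 2 steps per second.
speed = walking_velocity(0.7, 2.0)
```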

Inverse kinematics

Inverse kinematics for goal-directed positioning of the end-effector is adopted from robotics. It calculates the joint angles for each segment from the position and orientation of the end of the limb. The advantage of inverse kinematic approaches is that the animator sets the configuration of the end-effector only, and inverse kinematics solves for the configurations of all joints in the link hierarchy. (Boulic, 1990) used a generalization of experimental data based on the normalized velocity of walking. This generalization, when applied, could produce undesired results, such as violation of some of the kinematic constraints imposed on walking. Inverse kinematics was applied to correct these problems. Among the multiple inverse kinematic solutions, the one closest to the original motion is selected in order to maintain the original characteristics of the walking data. In the Jack system (Phillips C. B., 1993), developed at the University of Pennsylvania, Phillips and Badler (Badler P. C., 1991) applied an inverse kinematics algorithm to generate motions. Users have to select the end-effectors precisely and then define sets of constraints that drive the limbs to move in the desired patterns. (Koga Y., 1994) used a path planner to compute collision-free trajectories for cooperating arms to manipulate a movable object between two configurations. An inverse kinematics algorithm was used by the path planner to generate forearm and upper arm postures that match the hand position. After that, the joint angles of the wrist were calculated to match the hand orientation.
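For a planar two-segment limb the inverse kinematics problem has a closed-form solution, which makes the idea of multiple solutions concrete. The sketch below is an illustration under assumptions: it is a textbook two-link solver rather than the algorithm of any system cited above, the function names are hypothetical, and angle wrap-around is ignored when comparing poses. It returns both elbow configurations and then picks the one closest to a reference pose.

```python
import math


def two_link_ik(l1, l2, x, y):
    """Return both (shoulder, elbow) angle pairs that place the limb end at (x, y)."""
    cos_elbow = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if not -1.0 <= cos_elbow <= 1.0:
        raise ValueError("target is out of reach")
    solutions = []
    for elbow in (math.acos(cos_elbow), -math.acos(cos_elbow)):
        shoulder = math.atan2(y, x) - math.atan2(l2 * math.sin(elbow), l1 + l2 * math.cos(elbow))
        solutions.append((shoulder, elbow))
    return solutions


def closest_solution(solutions, reference):
    """Pick the solution with the smallest squared joint-angle distance to a reference pose."""
    return min(solutions, key=lambda s: sum((a - b) ** 2 for a, b in zip(s, reference)))


def limb_end(l1, l2, shoulder, elbow):
    """Forward kinematics of the same limb, used to check a solution."""
    return (l1 * math.cos(shoulder) + l2 * math.cos(shoulder + elbow),
            l1 * math.sin(shoulder) + l2 * math.sin(shoulder + elbow))


# Hypothetical target for a limb with two unit-length segments.
candidates = two_link_ik(1.0, 1.0, 1.2, 0.5)
chosen = closest_solution(candidates, reference=(0.0, 1.0))
```

Selecting by distance to a reference pose mirrors the idea of keeping the solution closest to the original motion, so that consecutive frames do not jump between elbow configurations.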

Genetic programming

Genetic programming uses genetic algorithms to write programs. It has been applied to a range of problems in computer animation. It searches a hyperspace containing an unlimited number of possible motions for an articulated figure. Walking, running and jumping are natural human motions, yet not much work has been done on them in this field. (K, 1994) and (Hahn G. L., pp. 129-142) have developed systems to animate the behaviors and arrangements of articulated figures in a virtual world. Users have to sacrifice some control when using these approaches. Similar to the objective functions in spacetime approaches, the fitness functions are the determining factor in genetic programming. For articulated figure motion, particularly purposeful movement of a complex articulated figure such as human locomotion, determining and elaborating the proper fitness measures is a major task for the animator.
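The role of the fitness function can be shown with a small genetic algorithm. Note that this sketch evolves a fixed list of joint angles rather than programs, so it is a plain genetic algorithm and not genetic programming proper. The target position, population size, mutation strength and function names are all assumptions made for illustration.

```python
import math
import random


def fitness(angles, target=(1.0, 1.0), lengths=(1.0, 1.0)):
    """Higher is better: negative distance from the end of a planar two-segment limb to the target."""
    total = x = y = 0.0
    for length, angle in zip(lengths, angles):
        total += angle
        x += length * math.cos(total)
        y += length * math.sin(total)
    return -math.hypot(x - target[0], y - target[1])


def evolve(generations=60, pop_size=30, sigma=0.2, seed=0):
    """Evolve joint angles; returns the best individual and the best fitness per generation."""
    rng = random.Random(seed)
    population = [[rng.uniform(-math.pi, math.pi) for _ in range(2)] for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        best_per_generation.append(fitness(population[0]))
        parents = population[: pop_size // 2]                       # selection: keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(mother, father)]   # uniform crossover
            child = [gene + rng.gauss(0.0, sigma) for gene in child]     # Gaussian mutation
            children.append(child)
        population = parents + children                             # parents survive unchanged
    population.sort(key=fitness, reverse=True)
    return population[0], best_per_generation


best_angles, history = evolve()
```

Because the fitter half of each generation survives unchanged, the best fitness never gets worse from one generation to the next. Everything the search "knows" about a good motion is encoded in `fitness`, which is why designing that function is the hard part for the animator.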

Constraint optimization

Constraint optimization methods produce animation by optimizing an objective subject to constraints specified by the animator. Because of the nonlinear relation between joint and limb motion and the lack of constraints on the movement trajectory, modeling an articulated figure is a fundamental problem. In addition, experimental studies of coordinated animal motion suggest that limb trajectory and body movement appear to be determined by optimization of performance, such as minimization of jerk at the end of the limb (Badler N. I., 1991). (Kass W. A., 1988) used spacetime constraints to control the motion of a jumping Luxo lamp. The performance of spacetime constraints in Witkin and Kass's work was limited by the fact that the objective functions had to be optimized over the whole duration of an animation. To reduce the computational complexity of the optimization and give the user more control over the motions, (F, 1992) separated the original spacetime problem into subsets, or smaller spacetime windows, over which subproblems are formulated and solved. These spacetime approaches are capable of giving realistic results. However, they all suffer from computational difficulties as the complexity of the character or animation increases, and they are not suited to interactive human figure simulation.

(M, 1997) proposed a locomotion system that uses footprints as the basis for generating animated locomotion. The basis of this approach is to simulate the motion exclusively in terms of a center of mass trajectory, which is itself produced from the footprint information. The footprint planning algorithm is designed for bipedal characters and uses some timing information in addition to the footprint position and direction. (Hahn C. S., 1999) introduced a hierarchical motion control system for animating human walking along predetermined paths over uneven terrain. Their method guarantees that the foot remains in contact with the ground during stance and avoids collision during swing. The joint angles of the lower limbs and the trajectory of the pelvis are calculated by inverse kinematics and optimization methods. Using the proposed control algorithms, their walking model can be adapted to climbing slopes and stairs. Constraint optimization techniques are able to automatically produce lively and natural limb motions that satisfy quite a few of the fundamental principles of animation. Improved spacetime methods, such as (F, 1992) (Laszlo J., 1996), are robust for complex motions; for example, locomotion on irregular terrain can be broken into several spacetime windows to satisfy the constraints and animation goals. The motions produced depend on the animator's ability to program the mathematical functions that meet the goals of a desired animation, which is a difficult task for animators.

Further studies in human gait analysis

Research in biomechanics and human gait analysis (Chao, 1985)(F, 1992) (Holt K. G., 1991) has produced general studies of human body motion during normal level walking. Standard results have come from investigation of the motion patterns of different body positions, of body joints, of muscle behavior and of the reaction of the foot with the floor. They provide a resource for simulating human locomotion, although the focus is on level walking. (Chang, 1968) studied the energy expended in walking by examining the motion of the leg and foot in the swing period of a stride. The energy used is obtained by calculating the work done in traveling a given distance. The results fit natural gait reasonably well, and indicate that for a given individual there is a natural gait at which he can cover a given distance with minimum effort.

This suffers from the necessity of maintaining balance during gait. (Tsai, 1976) proposed a bipedal robot model for walking on uneven terrain. This approach uses a common locomotion algorithm and varies the coefficients and initial conditions to generate a certain range of gaits. The ascending and descending gaits were produced according to general postural stability and other feasibility requirements for a kinematically constrained, articulated walking model. Although their studies focus on biped machines, many of the features are taken from human motion attributes, and the result focuses on the swing leg cycle in the gait algorithm. The system kinematics is therefore an iteration that uses the initial and terminal pattern data to define successive steps, and a variety of gaits can be obtained by adjusting the same basic gait algorithm and varying the initial conditions.


Based on the discussion of the motion capture technologies above, the suitable motion technology should be selected to fit the requirements of this project.

According to the criteria of this project, the motion capture system is to be used to record movements in order to study them. In a real environment there are all kinds of interference, which limit performance and make some motion tracking technologies unsuitable. Given the limitations and requirements of this project, inertial motion capture technology is the most suitable one.

System Design (Chapter 3)

This chapter describes the overall structure of the designed system and introduces the model of the framework. In the previous chapter we selected the most suitable approach for designing the model of the framework; this chapter builds on that choice to give a better understanding of the whole system and describes the procedure from a software perspective.

Hardware design

The figure shows the overall layout of the hardware system. All parts are wired together and fitted to the body of the subject.

System Introduction

The Xsens MVN motion capture suit is a cost-efficient system for full-body human motion capture. MVN is based on inertial sensors, biomechanical models and sensor fusion algorithms (Bachmann, 2000). MVN does not need external cameras, emitters or markers. It can thus be used outdoors as well as indoors. A distinctive feature of inertial motion capture technology is that the sensor suit captures any type of movement, including running, jumping, crawling and cartwheels. (Zhou, pp. 295-302)

The use of inertial sensors has become common practice in ambulatory motion analysis (Morris, 1973) (Bonato, 2003). For accurate and drift-free orientation estimation, several methods have been reported that combine the signals from 3D gyroscopes, accelerometers and magnetometers (Bachmann, 2000) (Foxlin, 1996). Accelerometers are used to determine the direction of the local vertical by sensing acceleration due to gravity. Magnetic sensors provide stability in the horizontal plane by sensing the direction of the earth's magnetic field, like a compass.

To provide full six-degree-of-freedom tracking of body segments with connected inertial sensor modules, each body segment's orientation and position can be estimated by, respectively, integrating the gyroscope data and double integrating the accelerometer data in time. However, due to the inherent integration drift, these uncorrected estimates are only accurate within a few seconds (D. Giansanti, 2005). By combining the inertial estimates with other body-worn aiding systems, such as an acoustic (D. Vlasic, 2007) or a magnetic tracker (D. Roetenberg, 2007), unbounded integration drift can be prevented.

System Calibration

The MVN motion capture system has 17 sensors and two primary units (Technologies, 2011). Each sensor unit contains a 3D gyroscope, a 3D accelerometer and a 3D magnetometer (38 × 53 × 21 mm, 30 g). The sensor units are connected to the primary unit, which means that only one cable is attached to them. The primary unit synchronizes all the sensors, supplies power, and handles the wireless communication with the computer system.

Including the AA batteries, the total weight of the system is 1.9 kg. Sensors are attached to the feet, lower legs, upper legs, shoulders, forearms, sternum and head. Calibration is an important part of the whole procedure because of the stochastic nature of the biomechanical model and of the sensors.

The whole procedure must be set up in proper steps, since the model has to be processed in the right order so that it can be integrated into a sensor fusion scheme.

In the prediction step, sensor kinematics are computed via inertial navigation system (INS) algorithms from the measured accelerations and angular velocities (Technologies, 2011). By means of the biomechanical model, the sensor kinematics are translated to segment kinematics. In the correction step, joint updates are applied to the segments, followed by the detection of contacts of points with the external world (D. Vlasic, 2007). Aiding sensors can be incorporated in the sensor fusion scheme. After all correction steps, the estimated kinematics are fed back to the appropriate prediction step (Koopman, 1989) (Gelb, 1974).
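The prediction and correction cycle has the same shape as a Kalman filter. The sketch below is a deliberately simplified scalar version for illustration only; it is not the MVN fusion scheme, and the function names, noise values and the idea of a single scalar state are assumptions. Prediction integrates a measured rate and lets the uncertainty grow, and correction pulls the estimate toward an external measurement and shrinks the uncertainty.

```python
def predict(estimate, variance, rate, dt, process_noise):
    """Prediction step: integrate the measured rate; uncertainty grows."""
    return estimate + rate * dt, variance + process_noise


def correct(estimate, variance, measurement, measurement_noise):
    """Correction step: blend in a measurement, weighted by relative uncertainty."""
    gain = variance / (variance + measurement_noise)
    new_estimate = estimate + gain * (measurement - estimate)
    new_variance = (1.0 - gain) * variance
    return new_estimate, new_variance


# One hypothetical cycle: predict from an inertial rate, then correct with an aiding measurement.
x, p = predict(0.0, 1.0, rate=1.0, dt=0.1, process_noise=0.01)
x, p = correct(x, p, measurement=0.5, measurement_noise=1.01)
```

In this example the predicted variance equals the measurement noise, so the gain is 0.5 and the corrected estimate lands midway between the prediction and the measurement. The corrected state would then be fed back as the starting point of the next prediction, as the text describes.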

As explained above, inertial navigation system (INS) algorithms are used to process the data from all the sensors. The sensor fusion scheme is used mainly to correct estimated quantities such as velocities. For sensor calibration, different methods are used, such as the T-pose, i.e. standing upright with the arms horizontal and the thumbs forward.

After the T-pose, the person is asked to perform a certain movement that is assumed to correspond to a certain axis; e.g. the arm axis is defined by rotating the hand and forearm so that the palm faces upwards. The measured orientation and angular velocity are used to find the sensor orientation with respect to the segment's functional axes (H. Luinge, 2007).

Initial estimates of the joint positions are obtained by measuring several body dimensions: body height, arm span and foot size. For more accurate scaling, these estimates can be refined by measuring several anatomical landmarks. With each additional dimension provided, the scaling model is adjusted. These dimensions can include the greater trochanter, the lateral epicondyle of the femur, the lateral malleolus, the anterior superior iliac spine and the acromion. Other dimensions are obtained using regression equations based on anthropometric models (Koopman, 1989) (Morlock, 1989) (Contini, 1963).

In the last step of the calibration method, the sensor-to-segment alignment and the segment lengths can be re-estimated using information about the distance between two points in a kinematic chain. For instance, when the person holds his hands together while moving them around, the distance between the left and the right palm is zero for each pose. This closed kinematic chain can be solved, which improves the calibration values.

Motion Simulation software

MVN Studio

MVN Studio is a graphical user interface that also allows the user to monitor the movements of the person in real time. The data can be stored in different formats, including the MVN format. Mocap data can be transferred to standard a
