Multimodal Biometric Authentication Using Kinect Sensor



Security and privacy play an important role in any network, whether in the IT industry, a smart home, or a social network. Experts have developed many techniques to secure networks using various authentication mechanisms, each with its own strengths and weaknesses. To enhance security, this research proposes multimodal user authentication, which combines more than one authentication mechanism. The multimodal biometrics are fused together to produce strong and universal authentication. The proposed system authenticates the user with the Microsoft Kinect sensor, which captures user information through skeleton tracking. The tracked skeleton points are run through several computations to determine whether the person matches an existing data record. When the observed data differ from the stored data, authentication fails and the system sends a notification message to the known user.

Index Terms: Multiple Authentication, Multimodal Biometrics, Kinect Sensor.


A Home Area Network (HAN) is a residential local area network (LAN) used for communication between digital devices typically deployed in the home, including multiple PCs, smartphones, and other networked home appliances such as smart televisions, refrigerators, washing machines, and air conditioners. These devices communicate within the home network on the basis of trust and store or share private information. The network can also be reached remotely, so that the devices are accessible from anywhere on the internet. The most accessible devices, such as personal computers and laptops used for online banking, online shopping, and sharing private information with friends' networks or other networks, are susceptible to attack by hackers and intruders. Most home networking technologies focus on device authentication rather than user authentication, which is not reliable. User authentication checks the user's credentials whenever the user makes a request.

Multifactor authentication provides more privacy for user information in the home network. Biometric authentication also satisfies the regulatory definition of true multi-factor authentication [7]. Users may authenticate biometrically via fingerprint, voiceprint, palm print, signature, voice recognition, and so on, and these methods are interoperable with standard authentication mechanisms such as passwords, PINs, and security tokens. Unimodal or static biometrics reduce the error rate but cannot resolve the problem completely [6]. Some of the limitations of unimodal biometrics can be resolved by multimodal biometrics. Here, hard biometric traits such as fingerprint, face, iris, and ear are combined with soft biometric traits such as voice, weight, and colour to enhance the security level.

Kinect is a human motion tracking sensor that detects a person's height, movements, colour, and audio using an infrared projector and camera. Microsoft released a Software Development Kit (SDK) for Windows that exposes Kinect capabilities to developers building applications in C++ or C# with Microsoft Visual Studio. The measurements provided by Kinect fluctuate between frames, and the accuracy of the system varies by 5% to 10% as a result. Users find the system easy to use and universally applicable.

Kinect includes features such as raw sensor streams, skeleton tracking, and advanced audio capabilities. Skeleton tracking captures information about the users standing in front of the Kinect, which an application needs in order to respond to human body motion [12]. Up to six users can be tracked simultaneously, two of them in detail, meaning the sensor can return all 20 tracked joint positions for those two. The Kinect sensor can also recognize the user's gestures, voice, and facial expressions, and the Kinect SDK can track more than 40 facial landmarks [5].

Fig: 1 The Twenty Joint Positions in the Kinect[12].


2.1 Security with Visual Understanding (SVU)

This paper presents an effective security scheme based on the SVU client system. A Kinect camera watches for the human skeleton form, and when a person is detected, SVU validates the person by tracking the skeleton from head to toe, taking nine measurements and comparing them with data stored in a database. A voice command is used to enter the name of a new person. If a known person is detected, authentication is deemed successful; otherwise an audible alarm is raised and the known person is notified by SMS on their mobile phone. The measurements provided by the Kinect fluctuate between frames, since the Kinect considers each frame independently of the last. The SVU system adds a filter to smooth these fluctuations, which reduces false matches [1].
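The kind of smoothing described above can be sketched as follows. The source does not say which filter SVU uses, so the moving average, the `MovingAverage` class name, and the five-frame window are all assumptions chosen only for illustration.

```python
from collections import deque


class MovingAverage:
    """Smooths a per-frame measurement by averaging the last `window` samples.

    Hypothetical helper: the filter actually used by SVU is not specified.
    """

    def __init__(self, window=5):
        # deque with maxlen silently drops the oldest sample when full.
        self.samples = deque(maxlen=window)

    def update(self, value):
        """Add the newest measurement and return the smoothed value."""
        self.samples.append(value)
        return sum(self.samples) / len(self.samples)


# Usage: feed one height estimate (in metres, made-up values) per frame.
height_filter = MovingAverage(window=5)
for raw_height in [1.71, 1.69, 1.74, 1.70, 1.72]:
    smoothed = height_filter.update(raw_height)
print(round(smoothed, 3))
```

A larger window suppresses more frame-to-frame noise but reacts more slowly when a different person steps in front of the sensor.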

2.2 Dynamic time warping for gesture-based user identification and authentication with Kinect

The Kinect has primarily been used as a gesture-driven device for motion-based controls. To date, Kinect-based research has predominantly focused on improving tracking and gesture recognition across a wide base of users. In this paper, the authors propose to use the Kinect for biometrics: rather than accommodating a wide range of users, the approach exploits each user's uniqueness in terms of gestures. Unlike pure biometrics such as iris scanners, face detectors, and fingerprint recognition, which depend on irrevocable biometric data, the Kinect can provide additional revocable gesture information. The paper proposes a dynamic time warping (DTW) based framework applied to the Kinect's skeletal information for user access control. The approach is validated in two scenarios, user identification and user authentication, on a dataset of 20 individuals performing 8 unique gestures. The authors report an overall Equal Error Rate (EER) of 4.14% in user identification and 1.89% in user authentication for a gesture, and consistently outperform related work on this dataset. Given the natural noise present in the real-time depth sensor, these results are promising [2].
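As a rough illustration of the DTW idea, the sketch below computes the classic dynamic-programming DTW distance between two gesture sequences. The frame representation (one tuple of coordinates per frame) and the function name `dtw_distance` are assumptions; the features and distance measure of the cited paper are not reproduced here.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Classic DTW distance between two sequences of equal-dimension frames.

    Each frame is a tuple of floats (for example, flattened joint coordinates).
    """
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = best alignment cost of the first i frames of seq_a
    # with the first j frames of seq_b.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            frame_cost = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = frame_cost + min(
                cost[i - 1][j],      # seq_b frame repeated
                cost[i][j - 1],      # seq_a frame repeated
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# The same gesture performed more slowly aligns with zero cost.
enrolled = [(0.0,), (1.0,), (2.0,)]
slower = [(0.0,), (1.0,), (1.0,), (2.0,)]
print(dtw_distance(enrolled, slower))
```

Because DTW lets frames repeat, a gesture performed at a different speed can still match its enrolled template, which is why it suits gesture data captured at varying pace.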

2.3 Tracking of Fingertips and Centers of Palm using KINECT

This paper presents a novel method for detecting the fingertips and the centers of the palms of both hands separately, in 3D, from an input image captured with the MS KINECT. The KINECT helps by providing depth information for foreground objects. The hands are segmented using the depth vector, and the centers of the palms are detected by applying a distance transform to the inverse image. The result is intended to be fed as input to robotic hands so that they can emulate the operation of human hands [3].
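The palm-center step can be illustrated with a small sketch: the palm center is taken as the hand pixel that lies farthest from any background pixel. This is a brute-force stand-in for a real distance transform; the binary mask input, the function name `palm_center`, and the tie-breaking rule are assumptions and not the cited paper's implementation.

```python
import math


def palm_center(mask):
    """Return the (row, col) of the hand pixel farthest from the background.

    `mask` is a list of rows; 1 marks a hand pixel and 0 marks background.
    The mask must contain at least one background pixel.
    """
    background = [
        (r, c)
        for r, row in enumerate(mask)
        for c, value in enumerate(row)
        if value == 0
    ]
    best_pixel, best_distance = None, -1.0
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if value != 1:
                continue
            # Distance-transform value: distance to the nearest background pixel.
            nearest = min(math.hypot(r - br, c - bc) for br, bc in background)
            if nearest > best_distance:  # first maximum wins on ties
                best_pixel, best_distance = (r, c), nearest
    return best_pixel


# A 3x3 "hand" blob surrounded by background (made-up data).
hand_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(palm_center(hand_mask))
```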


3.1 Biometrics

In biometric authentication, human physical and behavioral characteristics are used to verify the identity of the user. It is relatively easy to calculate the strength of a password from its length, but the strength of a biometric is difficult to quantify [5]. Physiological biometric identifiers include fingerprints, hand geometry, ear patterns, eye patterns (iris and retina), facial features, and other physical characteristics. Behavioral identifiers include voice, signature, typing patterns, and others [2]. Multimodal biometric systems address the problem of non-universality, since multiple traits ensure sufficient population coverage. Further, multimodal biometric systems provide anti-spoofing measures by making it difficult for an intruder to spoof multiple biometric traits of a legitimate user simultaneously [9].

3.2 Kinect Sensor

Kinect is a line of motion-sensing devices that can interpret specific gestures using an infrared projector and camera, making completely hands-free control of electronic devices possible [10]. Microsoft released a Software Development Kit (SDK) for Windows that exposes Kinect capabilities to developers building applications in C++ or C# with Microsoft Visual Studio. Kinect includes features such as raw sensor streams, skeleton tracking, and advanced audio capabilities. Skeleton tracking captures information about the users standing in front of the Kinect, which an application needs in order to respond to human body motion. Up to six people can be tracked at a time, two of them in detail, meaning the sensor can return all 20 tracked joint positions for those two. To capture skeleton data, the application first checks that the sensor is connected, enables the skeleton stream, attaches an event handler for the skeleton data, and starts the sensor. Once the sensor returns skeleton data, the application reads the skeleton frame and maps it to UI elements [5].


To explain the Kinect system application in detail, consider home appliances and mobile devices connected through a wired or wireless LAN, where the data transferred to control home appliances from outside the home using mobile devices must remain confidential. The system secures each device within the home by admitting only authenticated users. It provides authorization by assigning access rights and mapping users to roles; for example, the family head acts as the administrator who controls which appliances and network resources are available to children and visitors. Users can temporarily share data and access data remotely using the cloud. Single-factor authentication increases the risk of hacking: a mobile device such as a PDA that is used to control home appliances can be stolen by an intruder at any time. Similarly, an intruder can get into the home and try to access private data and devices. In multiple biometric authentication, each user is identified by providing multiple pieces of evidence. Obtaining input or data from the user in more than one way is defined as multimodal biometrics. In multimodal biometrics, the user's credentials are verified on the user's request by tracking the user's posture or gesture and checking it against the collected data. To avoid the inherent problems that consistently occur in single biometrics, we extend our approach to multimodal fusion biometrics. Fusion techniques are used to achieve a higher recognition and performance rate.

The proposed system has been developed using the Microsoft Kinect sensor camera and the Kinect Software Development Kit (SDK). It acts as an intrusion-detector camera that captures users, identifies them from their body dimensions, and authenticates them. It replaces the watchdog at home by sensing any unidentified person visiting the home. Initially, the Kinect recognizes any human standing in the camera's field of view, as distinct from other objects, and readies itself to track that user. The Kinect tracks the skeleton by identifying its joints or dimensions: it can track 20 points over the whole body and 40 map points on the face alone. Each pixel in the image is assigned to a human body part. More than one camera can be used, managed by the Kinect Manager. All skeleton joints are given as three-dimensional X, Y, Z coordinates, where Z is the distance from the sensor. Lines drawn between the joints give a picture of the human skeleton in 3D space, and the length of each line is the square root of the sum of the squared differences of the coordinates.

Fig: 2 Multimodal biometric Authentication using Kinect (MBAuK) Model.

The distance between two points in two-dimensional space is defined by

c = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}

To calculate the distance in 3D space, the X, Y, and Z coordinates are all considered, where Z is the distance between the Kinect and the user.

d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2}
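The two formulas above translate directly into code. The following is a minimal sketch; the function name `joint_distance` and the sample coordinates are illustrative assumptions, and the same function covers both the 2D and the 3D case because it sums over however many coordinates each point has.

```python
import math


def joint_distance(p, q):
    """Euclidean distance between two points given as coordinate tuples.

    Works for (x, y) pairs and for (x, y, z) triples alike.
    """
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Made-up joint positions in metres; z is the distance from the sensor.
head = (0.0, 0.9, 2.0)
shoulder_center = (0.0, 0.6, 2.0)
print(joint_distance(head, shoulder_center))
```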

The distances between skeleton points are calculated to view the whole skeleton. The system is designed to authenticate the user or the members of a family by maintaining a database in which the skeleton records of all family members are stored. The application uses several computations to identify and verify the user. To identify a user, it periodically measures the person's height, shoulder width, upper body (body frame), and leg length, and compares these with the data already recorded in the database for verification.


For example, to find the height of the person, consider the joint pairs listed below:

Head – Shoulder Center

Shoulder Center – Spine

Spine – Hip Center

Hip Center – Knee Left or Knee Right

Knee Left / Knee Right – Ankle Left / Ankle Right

Ankle Left / Ankle Right – Foot Left / Foot Right

The distances between these skeleton joints are calculated, and the total difference in the height measurement is compared with the database to identify the closest person in the database.
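The height computation above can be sketched as a sum of segment lengths followed by a nearest-value lookup. The joint names, the choice of the left leg, the dictionary layout, and the stored heights are assumptions made for illustration only.

```python
import math

# Joint pairs from the list above, using the left leg (an arbitrary choice).
HEIGHT_SEGMENTS = [
    ("head", "shoulder_center"),
    ("shoulder_center", "spine"),
    ("spine", "hip_center"),
    ("hip_center", "knee_left"),
    ("knee_left", "ankle_left"),
    ("ankle_left", "foot_left"),
]


def estimate_height(joints):
    """Sum the 3D lengths of the head-to-foot segments."""
    return sum(math.dist(joints[a], joints[b]) for a, b in HEIGHT_SEGMENTS)


def closest_by_height(joints, stored_heights):
    """Return (name, measured_height) for the stored height nearest the measurement."""
    height = estimate_height(joints)
    name = min(stored_heights, key=lambda n: abs(stored_heights[n] - height))
    return name, height


# Made-up skeleton standing upright, 2 m from the sensor (units: metres).
joints = {
    "head": (0.0, 1.70, 2.0),
    "shoulder_center": (0.0, 1.45, 2.0),
    "spine": (0.0, 1.20, 2.0),
    "hip_center": (0.0, 1.00, 2.0),
    "knee_left": (0.0, 0.50, 2.0),
    "ankle_left": (0.0, 0.10, 2.0),
    "foot_left": (0.0, 0.00, 2.0),
}
print(closest_by_height(joints, {"alice": 1.62, "bob": 1.71}))
```

Summing segment lengths rather than subtracting the foot position from the head position keeps the estimate stable when the person is not standing perfectly straight.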

Similarly, the hand length is measured from the following skeleton dimensions:

Wrist Left – Elbow Left

Elbow Left – Shoulder Left

Shoulder Left – Shoulder Center

Shoulder Center – Shoulder Right

Shoulder Right – Elbow Right

Elbow Right – Wrist Right

The distances between these skeleton joints are calculated, and the total difference in the hand length measurement is compared with the database, where

Delta = total difference

Hl1s = skeleton data of the person being tracked

Hl2s = skeleton data of the detected person
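A minimal sketch of the hand length measurement and its Delta follows. The text does not give the formula for Delta, so treating it as the absolute difference between Hl1s and Hl2s is an assumption, as are the joint names and the sample coordinates.

```python
import math

# Joint pairs from the hand length list above (wrist to wrist across the shoulders).
HAND_SEGMENTS = [
    ("wrist_left", "elbow_left"),
    ("elbow_left", "shoulder_left"),
    ("shoulder_left", "shoulder_center"),
    ("shoulder_center", "shoulder_right"),
    ("shoulder_right", "elbow_right"),
    ("elbow_right", "wrist_right"),
]


def hand_length(joints):
    """Sum the 3D lengths of the wrist-to-wrist segments."""
    return sum(math.dist(joints[a], joints[b]) for a, b in HAND_SEGMENTS)


def hand_length_delta(hl1s, hl2s):
    """Assumed reading of Delta: absolute difference of the two hand lengths."""
    return abs(hl1s - hl2s)


# Made-up skeleton with arms stretched out sideways (units: metres).
joints = {
    "wrist_left": (-0.8, 1.4, 2.0),
    "elbow_left": (-0.5, 1.4, 2.0),
    "shoulder_left": (-0.2, 1.4, 2.0),
    "shoulder_center": (0.0, 1.4, 2.0),
    "shoulder_right": (0.2, 1.4, 2.0),
    "elbow_right": (0.5, 1.4, 2.0),
    "wrist_right": (0.8, 1.4, 2.0),
}
measured = hand_length(joints)
print(hand_length_delta(measured, 1.55))  # 1.55 is a made-up stored value
```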

Similarly, the body frame is measured from the following skeleton dimensions:

Shoulder Left – Shoulder Center

Shoulder Center – Shoulder Right

Shoulder Right – Hip Right

Hip Right – Hip Center

Hip Center – Hip Left

Hip Left – Shoulder Left

Similarly, the leg length is measured from the following skeleton dimensions:

Hip Right – Knee Right

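Knee Right – Foot Right

The distances between these skeleton joints are calculated, and the total difference in the leg length measurement is compared with the database, where

Delta = total difference

Ll1s = skeleton data of the person being tracked

Ll2s = skeleton data of the detected person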
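The body frame and leg length measurements follow the same pattern as the others: sum the segment lengths along each list of joint pairs. The sketch below assumes hypothetical joint names and sample coordinates, and again assumes that Delta is an absolute difference.

```python
import math

# Closed loop around the torso, from the body frame list above.
BODY_FRAME_SEGMENTS = [
    ("shoulder_left", "shoulder_center"),
    ("shoulder_center", "shoulder_right"),
    ("shoulder_right", "hip_right"),
    ("hip_right", "hip_center"),
    ("hip_center", "hip_left"),
    ("hip_left", "shoulder_left"),
]

# Right leg, from the leg length list above.
LEG_SEGMENTS = [
    ("hip_right", "knee_right"),
    ("knee_right", "foot_right"),
]


def path_length(joints, segments):
    """Sum the 3D lengths of the given joint pairs."""
    return sum(math.dist(joints[a], joints[b]) for a, b in segments)


def leg_length_delta(ll1s, ll2s):
    """Assumed reading of Delta: absolute difference of the two leg lengths."""
    return abs(ll1s - ll2s)


# Made-up joint positions in metres.
joints = {
    "shoulder_left": (-0.2, 1.4, 2.0),
    "shoulder_center": (0.0, 1.4, 2.0),
    "shoulder_right": (0.2, 1.4, 2.0),
    "hip_right": (0.2, 0.9, 2.0),
    "hip_center": (0.0, 0.9, 2.0),
    "hip_left": (-0.2, 0.9, 2.0),
    "knee_right": (0.2, 0.5, 2.0),
    "foot_right": (0.2, 0.0, 2.0),
}
body_frame = path_length(joints, BODY_FRAME_SEGMENTS)
leg_length = path_length(joints, LEG_SEGMENTS)
print(body_frame, leg_length_delta(leg_length, 0.95))  # 0.95 is a made-up stored value
```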

Each measurement is compared with the database to find the best matches among the existing users. All four measurements are checked periodically, narrowing the shortlist of best matches to a minimum. When the data meet the threshold value or come close to it, the known person is recognized and allowed to access any device.

Closest person = lowest Difference among the best matches
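The matching rule can be sketched as follows: compute a total difference per stored user across the four measurements, pick the lowest, and accept it only if it is within a threshold. The sum of absolute differences, the threshold value, and the database layout are assumptions; the text does not specify them.

```python
def closest_person(observed, database, threshold):
    """Return the name with the lowest total difference, or None if no match.

    `observed` maps measurement names to values; `database` maps each known
    user to a dict with the same measurement names.
    """
    best_name, best_delta = None, float("inf")
    for name, stored in database.items():
        # Total difference across all measurements (assumed: sum of absolute differences).
        delta = sum(abs(observed[key] - stored[key]) for key in observed)
        if delta < best_delta:
            best_name, best_delta = name, delta
    # Reject the best candidate if even it is too far from the observation.
    return best_name if best_delta <= threshold else None


# Made-up enrolment records in metres.
database = {
    "alice": {"height": 1.62, "hand_length": 1.55, "body_frame": 1.70, "leg_length": 0.85},
    "bob": {"height": 1.80, "hand_length": 1.75, "body_frame": 1.90, "leg_length": 0.95},
}
observed = {"height": 1.79, "hand_length": 1.73, "body_frame": 1.91, "leg_length": 0.96}
print(closest_person(observed, database, threshold=0.10))
```

Returning `None` when the best difference exceeds the threshold is what lets the system treat an unknown visitor as unauthenticated rather than assigning them to the nearest family member.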

With these Kinect authentication mechanisms, the accuracy with which a person is identified and authorized is improved as far as possible. The Multimodal Biometric Authentication using Kinect system collected real data (shown in Fig: 3) and produced a statistical report showing the measurements of each user and the differences between them.

Fig: 3 MBA using Kinect raw data graph analysis structure.

The Multimodal Biometric Authentication using Kinect system provides a real-time application that identifies the user and authenticates accordingly.


The system has difficulty tracking the skeleton through traditional clothes such as saris and nightwear.


This paper describes multifactor authentication for home network devices using multimodal biometrics captured with the Kinect sensor. Strong identification, verification, and authorization can prevent threats to home networks from intruders, friends, and visitors. Home users can use this system in a convenient and non-intrusive way. The system focuses on strong authentication as well as ease of use.


[1] S. Joseph Fluckiger, “Security with Visual Understanding: Kinect Human Recognition Capabilities Applied in a Home Security System”, White Paper, 2012.

[2] J. Wu, J. Konrad, and P. Ishwar, Dept. of Electr. & Comput. Eng., Boston Univ., Boston, MA, USA, “Dynamic time warping for gesture-based user identification and authentication with Kinect”, IEEE International Conference on May 2013.

[3] J. L. Raheja, A. Chaudhary, and K. Singal, “Tracking of Fingertips and Centers of Palm using KINECT”, Third International Conference on 20-22 Sep 2011.

[4] A. Sinha and K. Chakravarty, Innovation Lab., Tata Consultancy Services Ltd., Kolkata, India, “Pose Based Person Identification Using Kinect”, IEEE International Conference on Oct 2013.

[5] Abhijit Jana, “Kinect for Windows SDK Programming Guide”, Ch. 6: Skeleton Tracking.

[6] Wikipedia []: Home


[7] Wikipedia []: Biometrics.

[8] Wikipedia []: Multi-factor authentication.

[9] A. Ross and A. K. Jain, “Information fusion in biometrics,” Pattern Recogn. Lett., vol. 24, no. 13, pp. 2115–2125, Sep. 2003.

[10] S. Imran Ansari and S. Ahmad Qutbuddin, “Biometrics for home security”, Electrical and Computer Engineering, Proceedings of 2009 IEEE Student Conference on Research and Development.

[11] Heikki Kalviainen, Jussi Parkkinen, and Arto Kaarna (Eds.), “Image Analysis”, Springer, 14th Scandinavian Conference, SCIA 2005 Proceedings.