Study On Behaviour Based User Identification Computer Science Essay


Abstract- The central idea of this paper is that, just as a mother can identify which of her children made a particular mistake by recognizing the behaviour pattern he/she left at the scene, a computer system can recognize its user from his/her behaviour. When a user logs onto a system, the system can determine whether he/she is the actual user or whether someone else has intruded into the system. A behaviour based approach can therefore be used to build algorithms that identify a user by the way he/she interacts with the computer every day. To make the authentication more robust, mouse movement based identification can be added as an additional parameter. The algorithm can also assist the user with activities he/she performs every day, such as reading the newspaper or listening to music, so that the next time the user logs into the system these frequent activities are carried out automatically by the computer.

Keywords- Behaviour Model, User Identification, Mouse movement, Log history, Authentication


In recent years there has been considerable interest in exploring and exploiting the concepts of behaviour arbitration for application in computer science and engineering. This paper attempts to model a system inspired by various aspects of behaviour arbitration, such as the way a user uses his/her computer, i.e. his/her activity history, mouse movement, etc. Earlier work in this field has used the mouse movement mechanism and behaviour biometrics [4]. Our proposed algorithm takes the system log file into consideration and is based entirely on the analysis of the activities recorded in the log file generated for the user. We also propose detecting the user by means of mouse biometrics [1]. The algorithm gives the machine a means of training itself to recognize the user and of validating the actual user. It is intended for re-authenticating the user, so as to make the system more secure.


Behaviour based identification is the term used for identifying a user based on the recent history of his/her interaction with the computer system. The following model is based on identifying the activities that the user usually performs whenever he/she logs onto the system. These activities usually remain the same for a particular user. If the user deviates from his/her previous behaviour, then his/her whole past history is checked to see whether the user has ever shown these traits. For example, a particular user may show the following traits: listening to a particular piece of music whenever he/she logs onto the system, then opening MS Word to type a message, and then reading a newspaper. The way he/she reads a newspaper can also help identify the user, as it varies from person to person: one user may start with the headlines, then the business news and then the sports section, while another user may do the reverse. Our proposed algorithm therefore keeps track of the various user activities so as to accurately predict the behaviour the user is likely to show in future. The algorithm thus works on a probabilistic model, predicting the user's behaviour in order to validate the user. The paper proposes two different but related approaches for identifying the user, so as to detect any unauthorized user and restrict him/her from accessing the system. The whole model is divided into two parts:

Log history based identification

Mouse movement based identification

In the first approach, log history based identification, the user's interaction with the computer is recorded at runtime and various parameters are calculated for training the system and validating the user.

In the second approach, mouse movement based identification, various parameters such as the speed, acceleration and deceleration of the mouse, the length of the mouse curve, etc. are calculated for the user and saved as a reference for identifying the user. Both approaches can be implemented independently as modules, but their results should agree, because a user who shows the same behavioural pattern is expected to show similar mouse movements as well.


The proposed algorithm will work on the following assumptions:

The user must perform a minimum of 3 activities to be identified.

If the user remains idle for some time (here assumed to be 100 s), he/she is automatically logged out. The next time a user logs in, he/she is treated as a new user and is expected to meet the basic requirements of the algorithm.
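The two assumptions above can be expressed as simple checks. The following is only a minimal sketch: the function names and the use of seconds as the time unit are illustrative assumptions, and only the values 3 and 100 s come from the text.

```python
IDLE_TIMEOUT_S = 100  # idle limit assumed in the text (100 s)
MIN_ACTIVITIES = 3    # minimum number of activities needed for identification


def is_idle_timeout(last_activity_s: float, now_s: float) -> bool:
    """Return True when the user has been idle for at least the idle limit."""
    return (now_s - last_activity_s) >= IDLE_TIMEOUT_S


def can_identify(activities: list) -> bool:
    """Return True when enough activities were observed in this session."""
    return len(activities) >= MIN_ACTIVITIES
```

A session that fails `can_identify` simply has too little behaviour to compare against the log history, so no identification is attempted.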

This algorithm consists of 2 modules:

Training the system

Validating the user


The training part consists of a user giving his/her name, after which the computer guesses the pattern he/she is most likely to show now, based on the user's log history and the number of times the user has shown each specific pattern, which is calculated at run time. An average is then calculated for the user's patterns. Based on the average and the number of occurrences (Fig. 2), the likely pattern is predicted (Fig. 3). The system maintains a log file (Fig. 1) for the user with his/her name and the patterns he/she shows. The activities on the computer are mapped to numerical digits from 0 to 9.

If the user agrees with the pattern predicted by the computer, then the computer just increments its success count for the given pattern.

If the user does not agree with the pattern, then he/she is asked to enter the pattern and the log file is updated. Later, when the user shows the same pattern, he/she is correctly recognized by the system.

Fig. 1: The system log file, which consists of the user and his/her activities mapped to numerical digits from 0 to 9.

Fig. 2: Occurrences calculated for each pattern.

Fig. 3: The user gives his/her name and the corresponding guess is made.

Generalized algorithm:

input: Uname = name of the user

output: R = system reaction

memory: Lf = history of patterns for the user (log file)

Check for Uname in the log file Lf.

Calculate the occurrence of each pattern and store it.

Increment the count of occurrences for a pattern by 1 until the next different pattern is encountered.

Calculate the average number of occurrences for the patterns.

Find the pattern corresponding to Uname, based on the average and the number of occurrences.

If a pattern is found

Then find the pattern the user is most likely to show and present it to the user.

If Uname agrees with the pattern

Then Do

    increment the count corresponding to the pattern by 1.

Else Do

    ask Uname for the pattern and update the log file.
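The training steps above can be sketched in Python as follows. This is an illustrative sketch, not the authors' implementation: the in-memory log layout, the function names and the rule that only patterns seen at least as often as the average compete for the guess are all assumptions, since the text does not spell out how the average is used.

```python
from collections import Counter
from typing import Optional

# Hypothetical in-memory log: (user name, activity pattern as a digit string).
LogFile = list[tuple[str, str]]


def predict_pattern(log: LogFile, uname: str) -> Optional[str]:
    """Guess the pattern the user is most likely to show next."""
    counts = Counter(pattern for user, pattern in log if user == uname)
    if not counts:
        return None  # unknown user: nothing to predict
    average = sum(counts.values()) / len(counts)
    # Assumption: only patterns seen at least as often as the average compete.
    candidates = {p: c for p, c in counts.items() if c >= average}
    return max(candidates, key=candidates.get)


def train(log: LogFile, uname: str, actual: str) -> bool:
    """Record the pattern the user really showed; return True if the guess matched."""
    guess = predict_pattern(log, uname)
    log.append((uname, actual))  # raises the occurrence count of `actual` by 1
    return guess == actual
```

Appending the confirmed or corrected pattern to the log covers both branches of the final step: an accepted guess increases its count, and a rejected guess records the pattern the user entered instead.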



In the recognition part the user is asked to enter the pattern (Fig. 4). The system then scans through the history in the log file, identifying the user by a probabilistic approach.

All users whose percentage match with the given pattern is greater than the threshold value, here set to 0.5, are taken into consideration.

Then the number of occurrences of the activities is calculated, and the user with the highest number of occurrences in his/her history is identified (Fig. 5).

The algorithm makes the system intelligent, so that it can check for variation shown by the user and also predict the correct user when a conflict arises because more than one user shows the same degree of match. In this case the algorithm checks each user's history to find the number of times he/she has shown that pattern (Fig. 6).

Fig. 4: The user is asked to enter the pattern in case of training.

Fig. 5: The pattern from history, the entered sequence, the user name and the % match are calculated at run time.

Fig. 6: Prediction being made in case of chaos.

Generalized algorithm:

input: Ppattern = pattern for recognition

output: Uname = name of the user

memory: Lf = log file

Check for Ppattern.

Match the pattern against each pattern in the log file and find the users who show the maximum match.

If the user is unique

Then Do

    print Uname

Else Do

    check for the maximum number of times each user showed Ppattern in the recent past and in his/her whole past.

    The user satisfying the above criteria is given as the valid user for the given pattern.
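The recognition steps above can be sketched as follows. The text does not define how the percentage match is computed, so the position-wise comparison in `match_fraction` is an assumption, as are the function names and the log layout. The sketch resolves conflicts using the whole history only and does not weight the recent past separately.

```python
from collections import Counter
from typing import Optional

THRESHOLD = 0.5  # match threshold stated in the text


def match_fraction(a: str, b: str) -> float:
    """Fraction of positions at which two digit patterns agree (assumed metric)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / longest


def identify_by_pattern(log: list[tuple[str, str]], ppattern: str) -> Optional[str]:
    """Return the user whose history best matches ppattern, or None."""
    best = {}               # user -> best match fraction over his/her history
    times_shown = Counter()  # user -> number of times ppattern itself was shown
    for user, pattern in log:
        best[user] = max(best.get(user, 0.0), match_fraction(pattern, ppattern))
        if pattern == ppattern:
            times_shown[user] += 1
    candidates = {u: s for u, s in best.items() if s > THRESHOLD}
    if not candidates:
        return None
    top = max(candidates.values())
    tied = [u for u, s in candidates.items() if s == top]
    # Conflict: prefer the user who showed this exact pattern most often.
    return max(tied, key=lambda u: times_shown[u])
```

When only one user reaches the maximum match, `tied` has a single entry and the tie-break has no effect, which corresponds to the "user is unique" branch.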


3.2. Mouse Movement based identification mechanism

We designed the mouse movement based recognition algorithm to be simple, efficient and fast, so that it can be applied in real time and can detect a user from dynamic mouse movements. Earlier studies in this field were mostly based on a predefined pattern which the users were asked to follow [9]. Our study tries to identify the user based on parameters of his/her mouse movement that are independent of the mouse movement path. The mechanism is divided into three parts: mouse data collection, feature extraction and user identification.

3.2.1 Data Collection

A mouse movement recording program, Recorder.exe, is used to record mouse movement into a log file [10]. When a user starts using the mouse, this program runs in the background, captures the mouse movement data and stores it in the log file. The raw data collected by the application is stored in a .txt file and contains the following information for each entry:

• Time of the event in milliseconds

• X and Y co-ordinates of the mouse pointer on the user's screen.

A sample log file is given in Figure 7.

Figure 7. Log file generated by the Recorder.exe software.
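A log of this kind could be read as sketched below. The exact layout written by Recorder.exe is not given in the text, so the sketch assumes one event per line with three whitespace-separated integers (time in milliseconds, X, Y). The `MouseEvent` type and `parse_log` function are hypothetical names.

```python
from typing import NamedTuple


class MouseEvent(NamedTuple):
    t_ms: int  # time of the event in milliseconds
    x: int     # X co-ordinate of the mouse pointer
    y: int     # Y co-ordinate of the mouse pointer


def parse_log(text: str) -> list[MouseEvent]:
    """Parse lines of the assumed form 'time_ms x y' into mouse events."""
    events = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank lines and lines with the wrong field count
        t_ms, x, y = (int(field) for field in fields)
        events.append(MouseEvent(t_ms, x, y))
    return events
```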

A graph of the user's actions was generated from the log file. The user was asked to perform a particular task five times. The resulting graph is shown in Figure 8.

Figure 8. Graph showing the similarity in the mouse movements of a user.

3.2.2 Feature Extraction

We believe that, as with keystroke biometrics, there is enough individuality in mouse usage to identify the user. In our initial investigations we identified several features that can be used to create patterns, and these patterns can then be used to create a profile. For our preliminary experiments we considered the mouse movement speed, the ratio of the number of lines to the number of curves, and the ratio of the average length of a line to the average length of a curve.

Feature extraction involves taking the raw data collected in the data collection phase and applying calculations to extract characteristics that signify user behaviour. From those measurements we create a feature vector, which in turn represents a user profile or user signature. The following section describes the feature definitions and how they are computed.

Size of the curve and line

Size of a curve or line is defined as the total number of continuous points that constitute the curve or the line.

$$n = \sum_{i} P_i$$

where P is the pixel number.

Length of the mouse curve

Length of the mouse curve is defined as the sum of the distances between all adjacent curve co-ordinates.

A mouse curve c with n points $(x_i, y_i)$, recorded at times $t_i$, has a length of:

$$\mathrm{Len}(c) = \sum_{i=1}^{n-1} \sqrt{(x_{i+1} - x_i)^2 + (y_{i+1} - y_i)^2}$$

Total time of the mouse curve

$$\mathrm{Total\ time}(c) = \sum_{i=1}^{n-1} (t_{i+1} - t_i)$$

Average speed

$$\mathrm{Average\ speed}(c) = \frac{\mathrm{Len}(c)}{\mathrm{Total\ time}(c)} = \frac{\sum_{i=1}^{n-1} \sqrt{(x_{i+1} - x_i)^2 + (y_{i+1} - y_i)^2}}{\sum_{i=1}^{n-1} (t_{i+1} - t_i)}$$

Ratio of number of lines and number of curves, and ratio of length of line and length of curve

The number of lines and the number of curves vary greatly with the duration of the mouse movement, so the ratio of these two parameters was used as an identification parameter, as our experiments showed that this ratio remained similar for a particular user. We also found that the ratio of the length of lines to the length of curves remained similar.

Jitter ratio

The number of line segments and the number of curve segments of length two pixels were calculated separately, and the ratio of these two quantities was also included as an identification parameter.
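The length, total time and average speed of a mouse curve described in this section can be computed as sketched below. The sample layout `(t_ms, x, y)` and the function names are assumptions, and the average speed is taken here as curve length divided by total time, which is one reading of the definitions above.

```python
import math

# Hypothetical sample layout: (time in ms, x, y) for each recorded point.
Point = tuple[int, int, int]


def curve_length(points: list[Point]) -> float:
    """Sum of Euclidean distances between adjacent points, in pixels."""
    return sum(
        math.hypot(x2 - x1, y2 - y1)
        for (_, x1, y1), (_, x2, y2) in zip(points, points[1:])
    )


def total_time(points: list[Point]) -> int:
    """Sum of time differences between adjacent points, in milliseconds."""
    return sum(t2 - t1 for (t1, _, _), (t2, _, _) in zip(points, points[1:]))


def average_speed(points: list[Point]) -> float:
    """Curve length divided by total time, in pixels per millisecond."""
    elapsed = total_time(points)
    return curve_length(points) / elapsed if elapsed > 0 else 0.0
```

For example, the three points `(0, 0, 0)`, `(10, 3, 4)` and `(30, 3, 10)` form segments of 5 and 6 pixels over 30 ms, giving a length of 11 pixels and an average speed of 11/30 pixels per millisecond.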

3.2.3 User Identification

In the user identification part, a program reads the user's mouse movement data from the log file and stores it in a current database. It then reads the input values from the database and calculates the results using the formulas given above [5] [6]. The result is then compared with the signatures of the different users. Each user's signature is built using the standard deviation of each parameter for that user (Fig. 9). The user is identified on the basis of the closest match found. If no match is found, a new profile is created with a new user id. The program has been implemented in Java [7] [8].
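The matching step can be illustrated with the sketch below, written in Python for brevity rather than the Java of the original program. The text does not say how closeness is measured or when a match counts as "not found", so the following are assumptions: a signature stores the mean and standard deviation of each feature, closeness is the largest deviation from the mean in units of standard deviation, and a match is rejected beyond `MAX_DEVIATION`. All names and the id scheme are hypothetical.

```python
import statistics

MAX_DEVIATION = 2.0  # assumed cut-off, in standard deviations
MIN_STD = 1e-9       # floor that avoids division by zero


def build_signature(samples: list[dict]) -> dict:
    """Map each feature name to (mean, standard deviation) over the samples."""
    signature = {}
    for name in samples[0]:
        values = [sample[name] for sample in samples]
        signature[name] = (statistics.mean(values), statistics.pstdev(values))
    return signature


def deviation(signature: dict, features: dict) -> float:
    """Largest per-feature distance from the mean, in standard deviations."""
    return max(
        abs(features[name] - mean) / max(std, MIN_STD)
        for name, (mean, std) in signature.items()
    )


def identify_by_mouse(profiles: dict, features: dict) -> tuple:
    """Return (user id, is_new); enrol a new profile when nothing is close."""
    best_id, best_dev = None, float("inf")
    for user_id, signature in profiles.items():
        dev = deviation(signature, features)
        if dev < best_dev:
            best_id, best_dev = user_id, dev
    if best_id is not None and best_dev <= MAX_DEVIATION:
        return best_id, False
    new_id = f"user{len(profiles) + 1}"  # hypothetical id scheme
    # A profile built from one sample has zero spread until more data arrives.
    profiles[new_id] = {name: (value, 0.0) for name, value in features.items()}
    return new_id, True
```

A profile enrolled from a single sample is only a placeholder. It would need to be rebuilt with `build_signature` once several samples for that user are available.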

Figure 9. The profile created for a particular user.

The flowchart of the user identification program is given in Figure 10.

Figure 10. Flowchart for the user identification program.


As discussed, the algorithm is a means of re-authenticating the user so that any form of intrusion is detected and stopped from proceeding. Since every person shows unique behaviour, this algorithm can be deployed to make a system secure and trustworthy. The algorithm can also be deployed to assist the user by automatically carrying out the activities he/she frequently performs on the computer. For example, the algorithm can be deployed with the operating system to check for duplicate files. Whenever the user tries to create a copy of a file, the computer checks at runtime whether the file already exists in the system and flags the copy as a duplicate if it does. Storage space can thus be used for more valuable data rather than for multiple copies of the same data.

It can also be implemented in a net banking system or in web based accounts, where the user's pattern is sent to the server for identification. It can act as an additional safety measure in case of password theft, which is the most common cyber threat.


This research sheds light on a new field of behaviour biometrics, which gives an individual computer system the ability to learn and identify its user, and also helps in providing a user friendly environment for him/her.

Behaviour based user identification is a new field to be explored.

The algorithm can find a wide variety of applications, ranging from making a secure connection to assisting humans in computational activities.

The algorithm can be extended to include more parameters, such as mouse click pressure, facial response to a particular situation, keystroke dynamics, etc., for measuring human behaviour more precisely.