Like any biometric system, a mouse dynamics (mouse biometric) system follows the basic biometric system model. A simple mouse biometric system usually consists of three modules:
i) Data Capture Module, where an application collects mouse events or mouse movements while the user performs a specific task or other relevant tasks.
ii) Feature Extraction Module, where the raw data from the previous module are processed and calculations are applied to obtain the characteristic features of each user.
iii) Classifier Module, where the extracted features are used to identify or authenticate a user.
The Data Capture Module consists of an application that collects all data on a user's mouse behaviour or mouse events while he or she interacts with a GUI. The application can be developed in Java, Visual Basic, C# or many other languages, and some of the mouse characteristics that can be collected are mouse clicks (right or left), mouse drags and mouse moves. The mouse behaviour and mouse characteristics are very important in determining the accuracy of the results of any experiment. The GUI presents pre-determined tasks, and some can be randomized with or without the user's knowledge. Randomizing the GUI is important to prevent the user from becoming familiar with the tests, so that the researcher obtains more precise results.
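As a toolkit-agnostic sketch of what such a capture application stores, the following Python fragment records timestamped mouse events into a session log. It is a minimal illustration only: a real system would receive these events from an OS or GUI hook, whereas here they are fed in manually, and the event names and field layout are assumptions rather than any particular study's format.

```python
import time
from dataclasses import dataclass, field

@dataclass
class MouseEvent:
    kind: str      # e.g., "move", "drag", "click_left", "click_right"
    x: int
    y: int
    t: float       # timestamp in seconds

@dataclass
class CaptureSession:
    """Accumulates raw mouse events for one user session."""
    events: list = field(default_factory=list)

    def record(self, kind, x, y, t=None):
        # An OS-level hook would normally call this for each event.
        self.events.append(MouseEvent(kind, x, y, time.time() if t is None else t))

    def to_rows(self):
        # Flatten to (kind, x, y, t) tuples for storage in a raw-data file.
        return [(e.kind, e.x, e.y, e.t) for e in self.events]

# Simulated interaction with explicit timestamps:
session = CaptureSession()
session.record("move", 10, 20, t=0.00)
session.record("move", 15, 24, t=0.05)
session.record("click_left", 15, 24, t=0.10)
print(session.to_rows())
```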
Among the earliest researchers to study mouse dynamics were Hayashi et al. (1997), who proposed a scheme to identify a user through the mouse. The authors conducted two sets of experiments to capture mouse movement data. In the first experiment, the users were asked to draw a circle inside a given circle on the screen, while the second experiment allowed the users to draw any shape (e.g., a triangle), still inside the given circle. The authors used the event processing facilities of Xlib to record the (X, Y) coordinates and the elapsed time, and these raw data were stored in a file for later use.
Everitt and McOwan (2003) proposed authenticating a user with signatures written with a mouse. To capture data, the authors asked the users to choose a username and a password and to write their own signature on the screen. The application was web-based and developed as a Java applet. The captured raw data were the length of the signature trace, its spatial size and its temporal information.
In research conducted by Pusara and Brodley (2004), data capture was performed while the users worked within Internet Explorer. The characteristics collected were the cursor movements and the mouse events (e.g., mouse wheel movements, single clicks and double clicks). From the recorded cursor movements, the authors computed the distance, angle and speed between pairs of data points to obtain the raw features. The mouse events were grouped into a hierarchical structure, after which the same distance, angle and speed calculations between pairs of data points were applied. These raw features would be used in the next module.
Hashia et al. (2005), researching an authentication method based on mouse movement, asked the users enrolling in their system to position the mouse cursor over a set of randomized dots. The 10 dots were randomized because the authors wanted to observe the behaviour of the mouse movements as a user moved the cursor from one dot to another. From these movements, the (X, Y) coordinates were captured every 50 ms and stored in an online analysis system, to be used by the feature extraction module.
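The distance, angle and speed between consecutive data points, as computed by Pusara and Brodley, can be sketched as follows. This is an illustrative reconstruction, assuming (x, y, t) samples in pixels and seconds; the actual formulas and units used in the study may differ.

```python
import math

def pairwise_features(points):
    """Compute distance, angle and speed between consecutive
    (x, y, t) mouse samples."""
    feats = []
    for (x0, y0, t0), (x1, y1, t1) in zip(points, points[1:]):
        dx, dy, dt = x1 - x0, y1 - y0, t1 - t0
        dist = math.hypot(dx, dy)
        angle = math.atan2(dy, dx)          # radians, relative to the x-axis
        speed = dist / dt if dt > 0 else 0.0
        feats.append((dist, angle, speed))
    return feats

points = [(0, 0, 0.00), (3, 4, 0.05), (6, 4, 0.10)]
print(pairwise_features(points))
# First pair: distance 5.0 px, speed 100.0 px/s
```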
In Schulz's (2006) research on mouse curves, data capture began as soon as the volunteer users started their typical workday activities (e.g., surfing, playing games) under the Microsoft Windows operating system. An application developed with Microsoft DirectX's DirectInput library ran after the users successfully logged in with their username and password, capturing mouse activity such as the X and Y velocities and a timestamp for every mouse movement.
In the research on mouse movement biometric systems by Weiss et al. (2007) and Ajufor et al. (2008), the authors used a standalone Java application to capture mouse movement data whenever the users clicked or moved the mouse. The users had to click on 25 pre-determined buttons. The application collected the type of each mouse event (move, drag or left click), the time of the event and the X and Y coordinates.
Gamboa and Jain (2007) used a web-based data capture module built on the WIDAM (Web Interaction Display and Monitoring) system, which stores all user interactions in a file. The data were collected while the users interacted with a GUI, in this research a memory game. The raw data collected were the (X, Y) coordinates of the mouse cursor position, the mouse clicks and the timing associated with these activities.
Kaminsky et al. (2008) were interested in testing a mouse biometric system for identifying the players of certain games. To capture data, the authors developed a C# application connected to the Windows API to monitor low-level mouse events such as mouse movements and the pressing and releasing of the left and right mouse buttons. The users were asked to perform three tasks: the first required clicking between two targets as accurately as possible, the second asked the users to drag a circular shape to a target destination, and the third was to double-click on a target.
In research on actively authenticating users by their mouse movements (Aksari and Artuner, 2009), the authors created an application to capture data. After a user filled in his or her username, a blank screen with a square appeared. The user was required to click on the first square, after which a second square emerged at another location on the screen. The path was randomized so that the users could not guess the position of the next square, ensuring that a comparable dataset was collected each time the application was used. As raw data, the application recorded the cursor coordinates and the time at which each event happened.
Bours and Fullu (2009) created an application that required the users to perform a specific task: navigating the mouse cursor between two lines with a given start point and end point. The application captured mouse event data such as the X and Y position values, and the authors recorded 100 samples per second.
The Feature Extraction Module runs after the Data Capture Module finishes. Its purpose is to synthesize and analyse all the raw data from the previous module and to generate user feature vectors. For example, the raw data captured during data capture may include mouse clicks (right/left, single/double), elapsed times, mouse wheel events and more. During feature extraction, these raw data are processed into mouse features. These features differ from person to person, which is why they are useful in the classifier module for identifying or authenticating users. Examples of user features are speed, angle and acceleration (positive/negative). In this module, the feature vectors can be generated using common numerical programming environments such as MATLAB.
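Several of the studies below summarize per-sample measurements into the same handful of statistics (average, standard deviation, minimum, maximum). A minimal sketch of that step, in Python rather than MATLAB, with hypothetical speed values:

```python
import statistics

def feature_vector(values):
    """Summarize a list of per-sample measurements (e.g., speeds)
    into the summary statistics used by several of the studies."""
    return {
        "mean": statistics.mean(values),
        "std": statistics.pstdev(values),   # population standard deviation
        "min": min(values),
        "max": max(values),
    }

# Hypothetical per-segment mouse speeds in px/s
speeds = [80.0, 100.0, 90.0, 110.0]
print(feature_vector(speeds))
```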
According to Hayashi et al. (1997), during feature extraction the (X, Y) coordinates and elapsed times stored in the file were processed and put into a database together with a newly derived value: the length from each coordinate to the center of the circle. The feature extraction of Everitt and McOwan (2003) was more complex. From the raw data (the length of the signature trace, its spatial size and its temporal information), the authors used the Euclidean distance to find the angle and distance relationships between the internal points. From these two features, they then used two approaches, a Ranking Approach and a Genetic Approach, to extract the characteristics of the signature written by the users. The final outputs were the salient angle and distance relationships of any input signature. In their feature extraction module, Pusara and Brodley (2004) extracted the mean, standard deviation and third moment of the raw features (distance, angle and speed between pairs of data points) over a window of N data points. In Hashia et al.'s (2005) research on mouse movement, the recorded (X, Y) coordinates were transformed in the feature extraction module into speed, deviation from a straight line and angle (positive and negative). From these outputs the authors computed the average, standard deviation, minimum and maximum. Those parameters were then normalized, averaged and saved in a file called Vector.txt. The authors further analysed the parameter averages to produce a standard deviation for each parameter, which was saved in an AvSd.txt file.
In Schulz's (2006) research on mouse curve biometrics, feature extraction of the raw data started with breaking the data into mouse curves. The grouping was based on the timestamps of each mouse movement, and the X and Y velocities were converted to coordinate data to construct approximate mouse curve coordinates. Weiss et al. (2007) and Ajufor et al. (2008) explained that in their feature extraction module, calculations were applied to the raw data to produce feature attributes of mouse curves and mouse clicks. Each attribute corresponded to one of the following features: size of a mouse curve, length of a mouse curve, total time of the mouse curve, mouse speed over a pre-defined action, angle of mouse movement, acceleration and mouse click duration. To create a profile for each mouse user, the attributes were further analysed to find their mean, average and standard deviation. According to Gamboa and Jain (2007), the gathered raw data were processed to extract the appropriate features. The authors cleansed the data using a cubic spline to remove abnormalities, then extracted the spatial and temporal information that formed the input vector, and finally applied statistical approaches to extract the salient features.
From the raw data of Kaminsky et al.'s (2008) research (mouse moves, left and right drags, and left and right clicks), a set of features was extracted using MATLAB: the mean of the path length, click length and velocity; the standard deviation of the path length and velocity; the number of clicks per minute; and the percentage of each type of click. These features would later be used to classify the users. According to Aksari and Artuner (2009), after the raw data were acquired, a set of feature vectors (speed, deviation from a straight line, angle and positive acceleration) was obtained using statistical methods. To be able to differentiate each user, the authors further analysed these path features by calculating their average, standard deviation, maximum and minimum, and normalized them so that better comparisons could be made. Bours and Fullu's (2009) research on a login method using mouse dynamics also performed feature extraction on the raw data: the X and Y position values were categorized into two transition vectors, horizontal and vertical tracks, which would later be normalized for use in the classifier module.
In the Feature Extraction Module, the raw data are converted into feature vectors that can distinguish each user's behaviour through their mouse movements. These feature vectors are then used in the Classifier Module, whose purpose is generally to identify or authenticate each user. Specifically, the Classifier Module normalizes and classifies the feature vectors so that each person or user has their own pattern. The researchers can then compare each extracted feature vector against the data in the database and decide whether to accept or reject the identity the user claims. Many methods can be applied in this module, each with its own strengths and weaknesses, such as Neural Networks, Decision Trees, k-Nearest Neighbour and many more.
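To make the classification step concrete, here is a minimal k-Nearest Neighbour sketch over hypothetical per-user feature vectors (mean speed, mean click duration). The feature choice, the labels and the use of Euclidean distance are illustrative assumptions, not any particular study's configuration.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Classify a query feature vector by majority vote among its
    k nearest training vectors (Euclidean distance).
    `train` is a list of (feature_vector, user_label) pairs."""
    dists = sorted((math.dist(vec, query), label) for vec, label in train)
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical feature vectors: (mean speed in px/s, mean click duration in s)
train = [
    ((95.0, 0.12), "alice"), ((92.0, 0.13), "alice"),
    ((140.0, 0.08), "bob"),  ((138.0, 0.09), "bob"),
]
print(knn_classify(train, (94.0, 0.12)))   # nearest neighbours are alice's
```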
In the classifier module, Hayashi et al. (1997) used a compare-and-verify method. Whenever input data were presented, they were compared with the database using certain procedures. At the end a match rate was calculated, and the authors concluded that if a user's match rate was higher than the value stored in the database, he or she would be identified as an authorized user. In their research on authenticating users with signatures written with a mouse, Everitt and McOwan (2003) used three neural networks to perform user verification in the Classifier Module. The three networks held different information and were trained using the back-propagation algorithm: the first network used hold and latency times to test typing style, the second employed angle information, and the third used distance information to test the input signature. For a user to authenticate himself, his input data had to pass all three tests. Pusara and Brodley (2004) adopted a different Classifier Module from the other research. The authors first built a standard model, which they defined as normal behaviour, using a supervised learning algorithm, and concluded that a distinct model of normal behaviour exists for each user. Once the standard model was developed, the authors implemented a Decision Tree method in the Classifier Module. The method was applied to a windowed dataset, each point in the dataset was evaluated, and if a data point was not consistent with the current user, an alarm was raised. Hashia et al. (2005) performed classification by comparing the current user's counter value to the counter value stored for that user during the registration phase; the user was authenticated if the current counter value lay within the user's range.
In the research conducted by Schulz (2006), the classifier performed two types of classification: the first classified a single mouse curve, while in the second a histogram was generated from the single mouse curves. According to the author, each mouse curve feature was tracked with its own histogram, and from these histograms a mouse signature was produced. The author created two sets of signatures: the first contained the signature produced during registration, which described a user's identity, and the second consisted of the test signature. The two signatures were compared, and their closeness was measured using the Euclidean distance. Weiss et al. (2007) and Ajufor et al. (2008) stated that the purpose of the classifier module was to find patterns. The authors used the k-Nearest Neighbour method to match an unknown user profile to a known user profile, and declared that the classifier system in this research could classify the already normalized features using two different techniques:
i) the identification technique, whereby an unknown test was classified against a set of known profiles;
ii) the leave-one-out technique, whereby one file was compared to the rest of the files and the process was repeated for all files.
At the end, performance measures were produced as the results of the experiment. Gamboa and Jain (2007) concluded from their research that the classifier verified the identity of a user based on the patterns produced while the user interacted with the computer. From the feature vectors obtained in the second module, the authors used the Sequential Forward Selection (SFS) algorithm to add one feature at a time. The classifier then performed authentication based on two distributions, the genuine distribution and the impostor distribution, using a decision rule; from this information, performance measures were produced as the results of the experiment.
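The leave-one-out technique described above can be sketched as follows: each sample is held out in turn, classified against all remaining samples, and the fraction of correct classifications is reported. The 1-nearest-neighbour rule and the two-user toy data are assumptions made for illustration.

```python
import math

def leave_one_out_accuracy(samples):
    """Leave-one-out evaluation with a 1-nearest-neighbour rule:
    each (vector, label) sample is classified against all the others,
    and the fraction of correct classifications is returned."""
    correct = 0
    for i, (vec, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        _, predicted = min((math.dist(v, vec), lab) for v, lab in rest)
        correct += predicted == label
    return correct / len(samples)

# Hypothetical normalized feature vectors for two users
samples = [
    ((0.10, 0.20), "u1"), ((0.12, 0.19), "u1"),
    ((0.80, 0.90), "u2"), ((0.82, 0.88), "u2"),
]
print(leave_one_out_accuracy(samples))   # → 1.0
```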
According to Kaminsky et al. (2008), during the classification phase the authors chose the three best features from the feature extraction module: left click length mean, clicks per minute and left drag velocity mean. From these distinctive features, the authors tried to find the patterns using SVM, 1-Nearest Neighbour and 7-Nearest Neighbour methods, then attempted to identify each user through ten-fold cross-validation; finally, the accuracy was calculated as the result. Aksari and Artuner (2009) performed a training phase before the verification phase as their classification method. In the training phase, the authors extracted path features from the mouse movement attributes, then calculated the differences of the path features; the outputs were the average and standard deviation of those differences. To verify a user, the application extracted the path features as in the training phase after the user used the application; counter values were then calculated for each user, and a high counter value meant that he or she was the intended user. In the classifier module, Bours and Fullu (2009) applied three different distance metrics to the feature vectors: the Euclidean distance, the Manhattan distance and the edit distance. The metrics were used to calculate the averages over the horizontal and vertical tracks to obtain the results for the experiments.
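The three distance metrics used by Bours and Fullu can be sketched as below. The Euclidean and Manhattan distances apply to numeric feature vectors, while the edit distance (shown here as the classic Levenshtein distance) applies to discrete sequences; the example inputs are illustrative, not the study's data.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def edit_distance(a, b):
    """Classic Levenshtein distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (x != y)))   # substitution
        prev = cur
    return prev[-1]

u, v = (1.0, 2.0, 3.0), (4.0, 6.0, 3.0)
print(euclidean(u, v))                    # → 5.0
print(manhattan(u, v))                    # → 7.0
print(edit_distance("track", "tracks"))   # → 1
```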
For example, in supervised learning the profile of the current user must be matched against one of the user profiles stored in the database.
Some researchers also use the term enrolment for the registration phase.