A research methodology provides a general framework for system development. Its main purpose is to accomplish the research objectives in a steady, orderly manner. This chapter discusses the methodology framework of the mouse movement study, from the planning phase to the testing phase, which eventually produces the results that are important to this study. As mouse movement falls under the behavioural sciences, the methodology focuses on developing mathematical or statistical models for understanding human behaviour (Yutaka, 2005). The chapter consists of the planning phase, the analysis phase, and the design, development and testing phase. Figure 3.1 below shows the overall framework of this research.
Phase 1: Planning
Phase 2: Analysis
Mouse Biometric System
Phase 3: Design Phase, Development and Testing
The Data Capture Module is an application that collects data on a user's mouse behaviour while the user interacts with the Graphical User Interface (GUI) using the mouse.
The Feature Extraction Module applies calculations to the raw data to extract feature characteristics, resulting in a mouse movement profile and measurement for each user.
The Classifier Module verifies the feature profiles and classifies the patterns that differentiate each user, using nearest-neighbour classification with Euclidean distance.
The classifier results can be further analysed by finding the success rate of the matching algorithm, that is, the accuracy of the matching process.
Figure 3.1: Project Methodology Framework
3.2 Research Methodology Phases
The research is conducted in a well-managed manner. The research methodology workflow is shown in Figure 3.2.
Figure 3.2: Project Methodology Phases
Yampolskiy and Govindaraju (2010) proposed a general algorithm for behavioural biometrics:
Pick any relevant behaviour.
Break up the behaviour into component actions.
Find the frequencies of the component actions for each user.
Combine the results into a feature vector profile.
Apply a calculation or similarity measure function to the stored template and the current behaviour.
Conduct experiments to determine a threshold value.
Verify or reject the user by comparing the similarity score with the threshold value.
Figure 3.3: General Algorithm in Behavioural Biometrics
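The steps of the general algorithm above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the use of cosine similarity as the similarity measure and the threshold of 0.9 are all hypothetical choices, not taken from Yampolskiy and Govindaraju (2010).

```python
import math
from collections import Counter

def build_profile(actions):
    """Relative frequency of each component action (steps 2 to 4)."""
    counts = Counter(actions)
    total = sum(counts.values())
    return {action: n / total for action, n in counts.items()}

def similarity(template, current):
    """Cosine similarity between two frequency profiles (assumed measure for step 5)."""
    keys = set(template) | set(current)
    dot = sum(template.get(k, 0.0) * current.get(k, 0.0) for k in keys)
    norm_t = math.sqrt(sum(v * v for v in template.values()))
    norm_c = math.sqrt(sum(v * v for v in current.values()))
    if norm_t == 0 or norm_c == 0:
        return 0.0
    return dot / (norm_t * norm_c)

def verify(template, current, threshold):
    """Accept the user when the similarity score reaches the threshold (steps 6 and 7)."""
    return similarity(template, current) >= threshold

# Illustrative action sequences, not real measurements.
stored = build_profile(["move", "move", "click", "drag", "move"])
same_user = build_profile(["move", "click", "move", "move", "drag"])
other_user = build_profile(["drag", "drag", "drag", "click"])
```

In practice the threshold would come from the experiments mentioned in the algorithm rather than being fixed in advance.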
The planning phase starts with a discussion with the supervisor to select an appropriate title for the research. After full consideration, mouse movement biometrics is chosen because current biometric research focuses on behavioural biometric systems, and mouse behavioural biometric systems show promise of being inexpensive yet reliable.
After the problem statements are identified, the objectives of the research have to be defined. The objectives are important for any research because they provide a guideline for how the research is conducted. Next, the research scope is decided to make sure the research is neither too broad nor too narrow.
3.4.1 Literature Review
Conducting a literature review provides a better understanding of mouse dynamics. The review begins with an investigation of general biometric systems: a little of their history, their components and the biometric measurements involved. In general, the basic components of a biometric system are the sensor, feature extraction, matching and decision-making modules. The important results of the matching and decision-making modules are the False Rejection Rate (FRR), the rate at which a rightful user is rejected, and the False Acceptance Rate (FAR), the rate at which a non-rightful user is allowed to use the system. The matching results can also produce a success rate as an indication of the accuracy of the matching process, in other words, how often the system identifies a user correctly. The review then turns to the types of biometrics, physical and behavioural. Physical biometrics concern human physiology, whilst behavioural biometrics concentrate on human behaviour. The review then concentrates on mouse biometric systems. The most important part of this section is the previous research that has been conducted, in particular the aims and the results of each study. Each study has contributed to the development of mouse dynamics.
After that, the investigation focuses on each module in mouse biometric systems, namely the data capture module, the feature extraction module and the classifier module, and especially on the techniques and methods used to obtain a fair result. From the literature review, I conclude that mouse biometric systems have considerable room for improvement, because this type of biometric is still new but shows a lot of promise of becoming one of the major biometrics.
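The two error rates defined in the paragraph above can be made concrete with a short sketch. The function name and the sample decisions are hypothetical, chosen only to show the arithmetic.

```python
def error_rates(genuine_accepted, impostor_accepted):
    """Return (FAR, FRR) from lists of accept/reject decisions.

    genuine_accepted: True/False decisions for attempts by rightful users.
    impostor_accepted: True/False decisions for attempts by non-rightful users.
    """
    frr = genuine_accepted.count(False) / len(genuine_accepted)
    far = impostor_accepted.count(True) / len(impostor_accepted)
    return far, frr

# Illustrative decisions: one rightful user rejected out of four,
# one impostor accepted out of five.
far, frr = error_rates(
    [True, True, False, True],
    [False, False, True, False, False],
)
```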
3.5 Design Phase, Development and Testing
User is identified or not identified
Figure 3.4: Identification Modules in Behavioural Biometrics 
3.5.1 User Interface Module
The User Interface Module is a module in which a user is asked to perform activities so that his/her behaviour while using a mouse can be recorded. This mouse movement behaviour produces raw data that is used to characterize each user in an identification behavioural biometric system.
3.5.1.1 Graphical User Interface (GUI)
This module, also known as the Data Capture Module, consists of an application that collects all data regarding the mouse behaviour or mouse events of a user. These events are captured while he/she interacts with a specific GUI that suits the objective of the research. The application can be developed in Java, Visual Basic, C# or many other languages, depending on the suitability of the language for the overall structure of the project.
The development of the application, which includes a program and a GUI, is very important in achieving a good result. The GUI tasks can be either pre-determined or random, and the capture can take place with or without the user's knowledge. For example, a user can be asked to click on pre-determined buttons or to play a game such as Solitaire.
3.5.1.2 Raw Data
While the user is interacting with the GUI, a program captures the user's mouse movements. Many attributes of the mouse movement can be chosen as the characteristics or behaviour of a user. Nazar et al. (2008) considered the following characteristics as the raw data that will be used in the next module:
Type of action: mouse click (right or left), mouse drag or mouse move.
Travelled distance: computed from the xi and yi position values of the mouse pointer.
Time: the time that is taken by the user to complete certain tasks.
Direction: angle of the mouse movement.
At the end of the interaction with the GUI, the raw data obtained is stored in text (.txt) format, comma-separated values (.csv) format or any other data format.
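As an illustration of how such raw data might be laid out, the sketch below stores one record per mouse event and writes the records as CSV. The record fields, the class name `MouseEvent` and the function name `write_events` are assumptions for this example. Direction is not stored because it can be derived later from consecutive positions.

```python
import csv
import io
from dataclasses import astuple, dataclass, fields

@dataclass
class MouseEvent:
    action: str  # "move", "drag", "left_click" or "right_click"
    x: int       # pointer x position in pixels
    y: int       # pointer y position in pixels
    t_ms: int    # time in milliseconds since the session started

def write_events(events, stream):
    """Write a header row followed by one CSV row per captured event."""
    writer = csv.writer(stream)
    writer.writerow([f.name for f in fields(MouseEvent)])
    for event in events:
        writer.writerow(astuple(event))

# Illustrative events; a real capture program would append these as they occur.
events = [
    MouseEvent("move", 10, 10, 0),
    MouseEvent("move", 13, 14, 20),
    MouseEvent("left_click", 13, 14, 95),
]
buffer = io.StringIO()  # stands in for a .csv file opened with newline=""
write_events(events, buffer)
csv_text = buffer.getvalue()
```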
3.5.2 Feature Extraction Module
The purpose of this module is to process and analyse all the raw data from the previous module and generate user feature vectors. These feature vectors differ from person to person and can be organised into patterns, which later form a profile or signature for each user.
3.5.2.1 Mouse Movement Profile
On its own, the raw data mentioned above carries no significance and does not represent anything about a user's behaviours or characteristics. There are many ways to process the raw data into a signature for each person. The usual ways are to use statistical graphs such as histograms, or to apply calculations to the raw data to extract the features that are important for distinguishing each user. Most commonly, the feature vectors are obtained by applying calculations to the raw data. The resulting measurements form a feature vector, or mouse movement profile, that represents the user's signature.
Figure 3.5: Histograms of time for two different users 
As mentioned above, one way to process the raw data is to use statistical graphs such as histograms. Figure 3.5 shows the time that two users took to complete an application task. The histogram shows that the two users behave differently; hence each user has his/her own profile or signature.
3.5.2.1.2 Calculation or Formulas
Hashia (2004) and Aksari and Artuner (2009) give examples of feature definitions and the formulas involved. Any one of the features in this list may turn out to be the best at distinguishing the users.
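A histogram of this kind can be built by counting how many durations fall into each time bin. The sketch below is a minimal example. The bin width and the sample durations are made up for illustration and are not the data behind Figure 3.5.

```python
from collections import Counter

def time_histogram(times_ms, bin_width_ms):
    """Map the lower edge of each time bin to the number of durations in it."""
    bins = Counter((t // bin_width_ms) * bin_width_ms for t in times_ms)
    return dict(sorted(bins.items()))

# Hypothetical task durations in milliseconds for two users.
user_a = time_histogram([120, 140, 310, 330, 350], 100)
user_b = time_histogram([510, 540, 620, 130], 100)
```

Two users whose counts concentrate in different bins would produce visibly different histograms.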
Length of a mouse movement:
This formula (equation 1) defines the length as the sum of the distances between consecutive coordinates.
Time of the mouse movement:
This formula (equation 2) gives the time taken to complete the mouse movement from one coordinate to another.
Mouse movement speed:
The mouse movement speed is described in terms of the length and the time involved when a user moves the mouse between two consecutive points. The speed may not be consistent during the user's interaction with the GUI, as the movement can be slow at first and faster later. This can produce a distinctive profile of his/her mouse movements.
The speed between two points is computed as the distance travelled divided by the time taken, as in equation (3).
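The equations for length, time and speed are not reproduced in this text. The sketch below therefore assumes the standard definitions: Euclidean distance between consecutive coordinates, elapsed time between their timestamps, and distance divided by time. The function names and sample values are illustrative only.

```python
import math

def segment_length(p, q):
    """Euclidean distance between two (x, y) pointer positions."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def path_length(points):
    """Total length: sum of the distances between consecutive points."""
    return sum(segment_length(p, q) for p, q in zip(points, points[1:]))

def segment_speeds(points, times):
    """Speed on each segment (distance / elapsed time); zero-time segments are skipped."""
    speeds = []
    segments = zip(points, points[1:], times, times[1:])
    for p, q, t0, t1 in segments:
        dt = t1 - t0
        if dt > 0:
            speeds.append(segment_length(p, q) / dt)
    return speeds

# Illustrative movement: positions in pixels, timestamps in seconds.
points = [(0, 0), (3, 4), (6, 8)]
times = [0.0, 0.5, 1.5]
total_length = path_length(points)   # 10.0 pixels
total_time = times[-1] - times[0]    # 1.5 seconds
speeds = segment_speeds(points, times)  # [10.0, 5.0] pixels per second
```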
Angle of deviation
Another feature that can be considered is the angle of deviation. It describes the location of the mouse coordinates relative to the straight line between two given points. Here a is the length between the current coordinate and the start coordinate, b is the length between the current coordinate and the end coordinate, and c is the length between the start and end coordinates.
Acceleration shows how a user speeds up or slows down during a mouse movement. Like speed, the acceleration or deceleration can vary from the beginning to the end of a movement and from user to user. Acceleration is computed as the change in velocity divided by the time taken.
Deviation is the orthogonal distance of a mouse movement point from the straight line between the two points. The result of this calculation shows whether the user tends to follow a straight path or to deviate from the line.
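The original formula for the angle is not shown in this text. Given the three side lengths a, b and c, one plausible reading is the law of cosines applied to the triangle formed by the start, current and end coordinates. The sketch below assumes the angle is measured at the start coordinate, between the straight line to the end coordinate and the line to the current coordinate. That choice is an assumption, not something the cited sources confirm here.

```python
import math

def angle_of_deviation(start, current, end):
    """Angle in degrees at the start point, from the law of cosines (assumed definition)."""
    a = math.dist(current, start)  # current to start
    b = math.dist(current, end)    # current to end
    c = math.dist(start, end)      # start to end
    if a == 0 or c == 0:
        return 0.0  # angle is undefined for a degenerate triangle
    cos_theta = (a * a + c * c - b * b) / (2 * a * c)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding error
    return math.degrees(math.acos(cos_theta))
```

A point lying on the straight line gives an angle of zero, and the angle grows as the pointer strays further from that line.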
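Acceleration and orthogonal deviation can be sketched as follows. The code takes acceleration as the change in speed divided by the elapsed time, and deviation as the perpendicular point-to-line distance. The function names and the way speed samples are paired with timestamps are assumptions of this example.

```python
import math

def accelerations(speeds, times):
    """Change in speed divided by elapsed time between consecutive speed samples."""
    return [
        (v1 - v0) / (t1 - t0)
        for v0, v1, t0, t1 in zip(speeds, speeds[1:], times, times[1:])
        if t1 > t0
    ]

def orthogonal_deviation(start, end, point):
    """Perpendicular distance from point to the straight line through start and end."""
    (x1, y1), (x2, y2), (x0, y0) = start, end, point
    line_length = math.hypot(x2 - x1, y2 - y1)
    if line_length == 0:
        # Start and end coincide, so fall back to the distance from that point.
        return math.hypot(x0 - x1, y0 - y1)
    return abs((x2 - x1) * (y1 - y0) - (x1 - x0) * (y2 - y1)) / line_length
```

Negative acceleration values correspond to deceleration.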
3.5.2.2 Mouse Movement Profile Measurement
After all the computations are done and a mouse profile is created for each user, the number of mouse movement points differs from one user to another, which makes it difficult to compare the values directly.
Therefore, the average and standard deviation are calculated from the mouse profile to obtain values that differ from one user to another. These calculations give a compact way to describe a user. For example:
Average and standard deviation of speed.
Average and standard deviation of acceleration.
Average and standard deviation of deviation.
The averages and standard deviations produce the mouse profile measurement, which is saved for future reference.
The average is defined as the sum of the data divided by the total number of data points.
3.5.2.2.1 Standard Deviation
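The original formulas for the average and the standard deviation are not reproduced in this text. The sketch below uses Python's `statistics` module and assumes the population standard deviation (`pstdev`). The study may equally have used the sample form (`stdev`). The function names are illustrative.

```python
import statistics

def summarize(values):
    """Return (average, population standard deviation) of one feature."""
    return statistics.mean(values), statistics.pstdev(values)

def profile_measurement(speeds, accelerations, deviations):
    """Flatten the three (average, standard deviation) pairs into one measurement vector."""
    measurement = []
    for feature in (speeds, accelerations, deviations):
        measurement.extend(summarize(feature))
    return measurement

# Illustrative values only.
mean_value, std_value = summarize([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

The resulting six-element vector has the same length for every user, regardless of how many mouse movement points were recorded.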
3.5.3 Classifier Module
The Classifier Module is crucial to any biometric system. Its main function is to verify the validity of the feature vectors from the previous modules and to classify the patterns that differentiate each user. The classifier module works according to the objectives and the scope of the research. Kasprowski (2004) stated that a classifier program can function in two ways, namely the authorization technique or the identification technique.
In this research, the identification technique is chosen for the classifier module. The identification process begins when the system gathers test data and matches it with the data in the database. The classification is considered a successful attempt if the Euclidean distance (equation (10)) between the test data and the data in the database is small, given that the data comes from the same user. Unlike the authorization technique, which needs authorization attributes such as a username or password, identification can identify the user without such attributes by using two methods, namely the Leave One Out method and the K Nearest Neighbour Identification method.
3.5.3.1 Normalization of Data
Before the identification process begins, the mouse movement profile should be normalized. The purpose of normalization is to give each calculation a balanced and equal weight. According to Weiss et al. (2007), during normalization the mouse profile of a user is taken as the input and the following formula is applied:
Min and max are the minimum and maximum values, over all users, of the calculations/measurements described in 3.5.2.1.2 (Calculation or Formulas).
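The normalization formula itself is not reproduced in this text. The description of min and max suggests min-max scaling, (value - min) / (max - min), applied to each measurement across all users. The sketch below assumes that form. The function name is hypothetical.

```python
def min_max_normalize(profiles):
    """Scale each measurement column to [0, 1] using the min and max over all users."""
    columns = list(zip(*profiles))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    normalized = []
    for profile in profiles:
        row = []
        for value, low, high in zip(profile, lows, highs):
            span = high - low
            # A constant column carries no information, so map it to 0.0.
            row.append((value - low) / span if span else 0.0)
        normalized.append(row)
    return normalized

# Three illustrative users, two measurements each on very different scales.
normalized = min_max_normalize([[2.0, 100.0], [4.0, 300.0], [6.0, 200.0]])
```

After scaling, a measurement with a large numeric range no longer dominates the distance calculation.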
3.5.3.2 Euclidean Distance
Many methods can be considered for measuring the distance. In this research, Euclidean distance is chosen (Eusebi et al., 2008). The formula in equation (10) computes the distance between two points/data, A and B. A match is successful when the Euclidean distance between the data being tested and the data in the training file is smallest for data from the same author.
3.5.3.3 Identification Technique
Jain et al. (2004) stated that identification works by comparing the normalized measurements with the biometric templates of all the users in order to either accept or reject them. This method can be used when a person needs to be identified without any other identifier such as a password.
The identification technique can be implemented using two different methods:
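For two measurement vectors A and B of equal length, the Euclidean distance is the square root of the sum of the squared differences between corresponding entries. A minimal sketch follows. The function name is illustrative.

```python
import math

def euclidean_distance(a, b):
    """Square root of the sum of squared differences between two profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of measurements")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

For example, `euclidean_distance([0.0, 0.0], [3.0, 4.0])` evaluates to 5.0.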
Leave One Out method
The Leave One Out method is used for cross-validation with all the data files in the database. A set of data files is selected as the training data set. Then one file from the training data set is selected as the test file and tested against the rest of the files in the training data set.
K Nearest Neighbour (KNN) Identification method
In the K Nearest Neighbour Identification method, a data file is tested against all the data files in the database. The method attempts to find the match between the unknown data file and the known data files.
In both methods, using the Euclidean distance in equation (10), the classification is considered successful if the candidate with the highest probability, which in this case is the one with the smallest distance, comes from the same user as the test data.
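The two methods can be sketched together: a nearest-neighbour lookup by Euclidean distance, and a leave-one-out loop that holds out each profile in turn and checks whether its nearest remaining neighbour belongs to the same user. The function names, the use of a single nearest neighbour (k = 1) and the sample database are assumptions of this example.

```python
import math

def nearest_user(test_profile, labelled_profiles):
    """Return the user label of the stored profile closest to test_profile."""
    best_label, _ = min(
        ((label, math.dist(test_profile, profile)) for label, profile in labelled_profiles),
        key=lambda pair: pair[1],
    )
    return best_label

def leave_one_out_success_rate(labelled_profiles):
    """Hold out each profile in turn and test it against all the others."""
    hits = 0
    for i, (label, profile) in enumerate(labelled_profiles):
        rest = labelled_profiles[:i] + labelled_profiles[i + 1:]
        if nearest_user(profile, rest) == label:
            hits += 1
    return hits / len(labelled_profiles)

# Illustrative normalized profiles; user names are placeholders.
database = [
    ("alice", [0.10, 0.20]), ("alice", [0.12, 0.22]),
    ("bob", [0.80, 0.90]), ("bob", [0.82, 0.88]),
]
```

Leave-one-out needs at least two profiles per user, since a held-out profile can only be matched to its owner if another profile from that user remains in the set.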
3.5.3.4 Success Statistics
In this phase, the identification is determined from the distances obtained with the Euclidean distance method described above. For a user to be identified, the distance between two data files from the same user must be small. It can then be concluded that the data comes from the same user, and thus he/she is identified. The success percentage is the proportion of users that the experiment identifies correctly. The success statistics can be summarized into four descriptions, as below:
Matching first choice
Matching second choice
Matching first and second choice
Matching third choice
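One way to read the four descriptions above is as ranks: a test file "matches first choice" when its true user is the nearest candidate, "second choice" when the true user is the second nearest, and so on. The sketch below assumes that interpretation, which the source does not spell out. The function names and sample data are hypothetical.

```python
import math
from collections import Counter

def rank_of_true_user(test_profile, true_label, templates):
    """Position of the true user when candidates are sorted by distance (1 = nearest)."""
    ranked = sorted(templates, key=lambda label: math.dist(test_profile, templates[label]))
    return ranked.index(true_label) + 1

def success_statistics(tests, templates):
    """Fraction of test files whose true user is the first, second or third choice."""
    ranks = Counter(rank_of_true_user(profile, label, templates) for label, profile in tests)
    total = len(tests)
    return {
        "first": ranks[1] / total,
        "second": ranks[2] / total,
        "first_or_second": (ranks[1] + ranks[2]) / total,
        "third": ranks[3] / total,
    }

# Illustrative templates (one per user) and labelled test profiles.
templates = {"alice": [0.1, 0.2], "bob": [0.8, 0.9], "carol": [0.5, 0.5]}
tests = [
    ("alice", [0.12, 0.22]),
    ("bob", [0.6, 0.6]),
    ("carol", [0.52, 0.5]),
    ("bob", [0.79, 0.9]),
]
stats = success_statistics(tests, templates)
```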
3.6 Chapter Summary
This chapter summarized the methodology used throughout this project. It discussed three phases, namely the planning phase, the analysis phase, and the design, development and testing phase. In the planning phase, discussion was held to find a suitable research topic. Then the problem statements and objectives were identified. In the analysis phase, a literature review was done on both biometric systems and mouse biometric systems.
In the last phase, namely the design, development and testing phase, many processes took place. These included three important modules and the algorithms involved. The Data Capture Module captured the user's behaviour and produced the raw data. Then, in the Feature Extraction Module, the raw data was converted into a mouse movement profile of features such as speed and acceleration. After that, the average and standard deviation of each profile were calculated. Lastly, in the Classifier Module, the identification was considered successful if the Euclidean distance between the test data and the data in the database was small, using either the Leave One Out method or the K Nearest Neighbour Identification method.
An Introduction to Biometric Recognition
Anil K. Jain, Arun Ross and Salil Prabhakar
Inverse Biometrics for Mouse Dynamics
Akif Nazar, Issa Traoré and Ahmed Awad E. Ahmed
 Synthesis & Simulation of Mouse Dynamics
Active Authentication by Mouse Movements
Yiğitcan Aksarı, Harun Artuner
Human identification using eye movements
Authentication by Mouse Movements
Shivani Hashia