Abstract: The objective of this paper is to develop a Fuzzy aided Application layer Semantic Intrusion Detection System (FASIDS) that works in the application layer of the network stack. FASIDS consists of a semantic IDS and a fuzzy-based IDS. A rule-based IDS looks for specific patterns that are defined as malicious. However, a regular, non-intrusive pattern can be malicious if it occurs several times within a short time interval. FASIDS is proposed in this paper to detect such malicious activity. At the application layer, the header and payload of HTTP traffic are analyzed for possible intrusion. In the proposed misuse detection module, the semantic intrusion detection system works on the basis of rules that define various application layer misuses found in the network. An attack is identified by the IDS on the basis of a corresponding rule in the rule base. An event that does not make a 'hit' on the rule base is passed to a Fuzzy Intrusion Detection System (FIDS) for further analysis.
In a rule-based intrusion detection system (RIDS), an attack is detected if a matching rule is found in the rule base and goes undetected otherwise. When the RIDS is combined with the FIDS, intrusions that went undetected by the RIDS can still be detected. The patterns that the RIDS classifies as non-intrusive are checked by the fuzzy IDS for a possible attack. These patterns are normalized and converted into linguistic variables in fuzzy sets. The resulting values are given to Fuzzy Cognitive Mapping (FCM). If there is any suspicious event, an alarm is generated to the client/server. FASIDS results show better performance in terms of the detection rate and the time taken to detect. The detection rate increases, with a reduction in the false positive rate, for a specific attack.
Keywords: Semantic Intrusion detection, Application Layer misuse detector, Fuzzy Intrusion detection, Fuzzy Cognitive Mapping, HTTP intrusion detection.
Most commercially available intrusion detection systems work in the network layer of the network stack, which leaves the way open for hackers to intrude at other layers, especially the application layer. Misuse detection uses rule-based IDSs that follow a signature-match approach, in which attacks are identified by matching each input text or pattern against predefined signatures that model malicious activity. The pattern matching process is time consuming. Nowadays, hackers are continuously creating new types of attacks. Because of the continuously changing nature of attacks, signatures should be updated periodically as new threats are discovered. A rule-based intrusion detection system looks for specific patterns that are defined as malicious. However, a regular, non-intrusive pattern can be malicious if it occurs several times within a short time interval. Such non-intrusive patterns are checked by the fuzzy IDS for a possible attack, and this additional check increases the detection rate.
2. ARCHITECTURE OF THE FASIDS
The architecture of the system is shown in Figure 1. The block diagram shows the order in which the different modules process the incoming payload. The HTTP data capture block collects the application-layer traffic from the network. Captured data is then separated into header and payload parts, which are forwarded to separate buffers.
Figure 1: Block diagram view of integrated FASIDS
3. FUZZY COMPONENT FOR NON-INTRUSIVE TRAFFIC
Parts of traffic that get past the rule-based intrusion detection system with no matches of intrusion are fed into the fuzzy component for further analysis. A functional block diagram of the fuzzy component is shown in Figure 5.1.
Figure 3.1 Functional blocks of FIDS
The traffic is first given to a text processor such as awk, which finds the number of occurrences of a specific pattern in it, for example a pattern associated with a Denial of Service attack. These counts are then normalized to the range 0.0 to 1.0 to aid relative comparison. Fuzzification converts each normalized value into linguistic terms of fuzzy sets. The fuzzified output is then fed to the Fuzzy Cognitive Mapper (FCM) (Brubaker 1996), which makes use of Fuzzy Associative Memory (FAM).
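The counting and normalization steps described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the awk script used in the system: the log lines, the search pattern and the function names are all hypothetical.

```python
import re

def count_occurrences(lines, pattern):
    """Count the lines that match a regular expression, as an awk one-liner might."""
    regex = re.compile(pattern)
    return sum(1 for line in lines if regex.search(line))

def min_max_normalize(values):
    """Scale values linearly into [0.0, 1.0]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical log excerpts, one list per observation window.
windows = [
    ["GET /index.html 200", "POST /login 401"],
    ["POST /login 401", "POST /login 401", "POST /login 401"],
    ["GET /about.html 200"],
]
counts = [count_occurrences(w, r"POST /login 401") for w in windows]
normalized = min_max_normalize(counts)
```

The normalized list, not the raw counts, is what would be passed on to fuzzification.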
3.1 WORKING OF FUZZY COGNITIVE MAPPER IN IDS
Fuzzy rules are constructed as a map from multiple inputs to a single output. Examples of inputs are the number of login failures, the time interval between any two login failures, and the time duration of a login session. Malicious activities that are defined by one or more fuzzy rules are mapped using the FCM. The FCM uses a Fuzzy Associative Memory (FAM) to evaluate the fuzzy rules and generate an alert that falls into one of the very high, high, medium, low or very low categories, based on the severity of the attack.
The following example demonstrates the sequence of events by which the fuzzy component identifies a brute-force attack, in which an intruder tries to log in with several users' passwords and fails. This attack can be identified by observing the number of login failures and the time interval between successive failures.
The FCM for login_failure is shown in Figure 5.2. It shows that if login_failure is very high within a small interval of time and for the same machine, then there is a suspicious event. The symbols ++, +, - and -- represent very high, high, low and very low respectively, and a further symbol represents medium. In Figure 5.3, the time interval for login failure is small, which is represented by '-', and the number of login failures is high, which is represented by '+'.
The fuzzy rule 'number of login failures is very high AND time interval is small' is triggered, which indicates that the scenario may be due to a brute-force attack. The FAM table for a brute-force attack, shown in Table 5.1, is used to evaluate this rule. The details of the FAM table are presented in section 5.3.
Figure 5.2 FCM for login_failure
3.2 FUZZY ASSOCIATIVE MEMORY BY FUZZY RULES
Fuzzy Associative Memory (FAM) is used to map fuzzy rules in the form of a matrix. These rules take two variables as input and map them into a two-dimensional matrix. The rules in the FAM follow a simple if-then-else format. Fuzzy Associative Memory makes it possible to assess the rate of false negatives for a few attacks such as Denial of Service (DoS) and brute-force attacks; the behavior of the FCM for these attacks was explained in section 5.2.
Table 5.1 Fuzzy Associative Memory for a Brute force attack
Table 5.1 shows the FAM table for a brute-force attack in matrix format.
Rows in this table represent the number of login failures and columns represent the time interval between successive failures. A linguistic representation of the same is shown below in Figure 5.5.
Figure 5.5 Linguistic representation of time interval during Brute force attack
The time interval between login failures is plotted on the X axis as a normalized value, and the degree of membership is plotted on the Y axis. The min-max normalization scheme is used to normalize the time interval between login failures to a common range, i.e., between 0 and 1. Figure 5.5 shows the time values assigned to the linguistic variables (very small, small, medium, high and very high). Figure 5.6 shows the login_failure values assigned to the linguistic variables.
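Fuzzification of a normalized value into the five linguistic terms can be illustrated with a short Python sketch. The shape and breakpoints of the membership functions are assumptions: five evenly spaced, overlapping triangles on [0, 1] are used here for illustration only, and the actual curves are those of Figures 5.5 and 5.6.

```python
def triangular(x, left, peak, right):
    """Degree of membership of x in a triangular fuzzy set."""
    if x == peak:
        return 1.0
    if x <= left or x >= right:
        return 0.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

# Assumed breakpoints (left, peak, right) for each linguistic term.
TERMS = {
    "very small": (-0.25, 0.00, 0.25),
    "small":      (0.00, 0.25, 0.50),
    "medium":     (0.25, 0.50, 0.75),
    "high":       (0.50, 0.75, 1.00),
    "very high":  (0.75, 1.00, 1.25),
}

def fuzzify(x):
    """Map a normalized value to its membership degree in every linguistic term."""
    return {term: triangular(x, *abc) for term, abc in TERMS.items()}

degrees = fuzzify(0.125)
```

With these assumed triangles, a value halfway between two peaks belongs equally to both neighbouring terms, which is the kind of overlap that lets more than one fuzzy rule fire for the same input.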
Figure 5.6 Linguistic representation of no. of login failures during Brute force attack
Consider a scenario in which the time interval between login failures is very small and no. of login failures is very high. From Table 5.1, we can conclude that the possibility of such a scenario being detected as an intrusion is very high.
ALGORITHM FOR FUZZY INTRUSION DETECTION SYSTEM
The following algorithm presents the steps in the Fuzzy Intrusion Detection System.
Step 1: Let x = set of number of login failures
t= time interval
Step 2: x = normalization of (x)= (x-min)/(max-min)
min is the minimum value of x
max is the maximum value of x
Step 3: Give x and t to FCM to select the appropriate fuzzy rules (Refer Table 5.3) from FAM table which has the following format:
IF condition AND condition THEN consequent
where, condition is a complex fuzzy expression that uses fuzzy logic operators (Refer Table 5.4), consequent is an atomic expression.
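Step 4: Perform Mean of Maxima defuzzification
$\mathrm{DMM} = \frac{1}{|X|} \sum_{x_i \in X} x_i$
where x_i belongs to X, the set of values at which the output membership is maximal, and |X| is the number of such values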
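The Mean of Maxima computation in Step 4 can be sketched in Python as follows. The sampled output values and membership degrees are made up for illustration, and the function name is hypothetical.

```python
def mean_of_maxima(xs, memberships):
    """Return the mean of the x values at which the membership degree is maximal."""
    peak = max(memberships)
    maxima = [x for x, m in zip(xs, memberships) if m == peak]
    return sum(maxima) / len(maxima)

# Hypothetical sampled output fuzzy set: two points share the maximum degree.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
mu = [0.1, 0.4, 0.8, 0.8, 0.2]
crisp = mean_of_maxima(xs, mu)
```

When a single point carries the maximum degree, the result is simply that point; when several do, as here, the result is their average.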
Table 5.3 Fuzzy rules for detecting intrusions
IF (x==very small) AND (t ==very small) THEN (I==very small);
IF (x==very small) AND (t ==high) THEN (I==small);
IF (x==medium) AND (t ==high) THEN (I==high);
IF (x==very high) AND (t ==very small) THEN (I==medium);
IF (x==very high) AND (t == very high) THEN (I==very high);
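One way to evaluate the rules of Table 5.3 is shown in the Python sketch below. The rule entries are copied from the table, but the evaluation strategy is an assumption: AND is taken as the minimum of the two membership degrees, and rules with the same consequent are aggregated with the maximum. The input degrees are hypothetical.

```python
# Rules copied from Table 5.3: (x term, t term) -> intrusion term I.
RULES = {
    ("very small", "very small"): "very small",
    ("very small", "high"): "small",
    ("medium", "high"): "high",
    ("very high", "very small"): "medium",
    ("very high", "very high"): "very high",
}

def fire_rules(x_degrees, t_degrees):
    """Return the firing strength of each consequent (AND = min, aggregation = max)."""
    fired = {}
    for (x_term, t_term), consequent in RULES.items():
        strength = min(x_degrees.get(x_term, 0.0), t_degrees.get(t_term, 0.0))
        if strength > 0.0:
            fired[consequent] = max(fired.get(consequent, 0.0), strength)
    return fired

# Hypothetical fuzzified inputs for login failures (x) and time interval (t).
x_degrees = {"very high": 0.8, "medium": 0.2}
t_degrees = {"very high": 0.6, "high": 0.4}
fired = fire_rules(x_degrees, t_degrees)
```

The resulting strengths per consequent are what a defuzzification step would then reduce to a single crisp value.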
Table 5.4 Fuzzy logic operators
x AND t
x OR t
1.0 - x
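The operators of Table 5.4 can be written out in Python as a brief sketch. Only the complement, 1.0 - x, is given explicitly in the table; the minimum and maximum definitions of AND and OR used here are the common Zadeh operators and are an assumption about the system.

```python
def fuzzy_and(x, t):
    """Fuzzy conjunction, assumed to be the minimum of the two degrees."""
    return min(x, t)

def fuzzy_or(x, t):
    """Fuzzy disjunction, assumed to be the maximum of the two degrees."""
    return max(x, t)

def fuzzy_not(x):
    """Fuzzy complement, as listed in Table 5.4."""
    return 1.0 - x

both = fuzzy_and(0.75, 0.25)
either = fuzzy_or(0.75, 0.25)
negated = fuzzy_not(0.25)
```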
Several methods are available in the literature for defuzzification. Some of the widely used methods are the centroid method, centre of sums, and mean of maxima. In the mean of maxima method, the crisp value is the mean of the variable values at which the fuzzy subset attains its maximum. Based on the FAM table, the defuzzification graph is obtained and is shown in Figure 5.10.
Figure 5.10 Defuzzification
In many situations, for a system whose output is fuzzy, it is easier to take a crisp decision if the output is represented as a single scalar quantity. For this reason, a defuzzification value is calculated. Based on this value, a decision is made as to whether the traffic contains an intrusive pattern.
The defuzzification value thus calculated for Brute Force attack is 40%.
6.1.1 Cross site scripting attack
A web site may unintentionally include malicious HTML tags or scripts in a dynamically generated page based on unvalidated input from untrustworthy sources. This can be a problem when a web server does not adequately ensure that generated pages are properly encoded to prevent unintended execution of scripts, and when input is not validated to prevent malicious HTML from being passed to the user. Using the cross-site scripting technique, an attacker can insert malicious script or HTML into a web page. The aim of cross-site scripting is for an intruder to cause a trusted web server to send a page to a victim's browser that contains malicious script or HTML chosen by the intruder. The malicious script then runs with the privileges of a trusted script originating from the trusted web server.
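The role of output encoding can be illustrated with a small Python sketch. It uses the standard library function html.escape; the render_comment helper and the payload are hypothetical, and the sketch shows only the encoding step, not the detection performed by the IDS.

```python
import html

def render_comment(user_input):
    """Build an HTML fragment with the user's text encoded as data, not markup."""
    return "<p>" + html.escape(user_input) + "</p>"

# Hypothetical malicious input submitted through a form field.
payload = "<script>alert('x')</script>"
safe = render_comment(payload)
```

After encoding, the angle brackets of the script tag reach the browser as character entities, so the text is displayed rather than executed.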
6.1.2 SQL injection attack
Many web pages take parameters from a web user and query the database using SQL. For instance, when a user logs in, the web page asks for a user name and password and queries the database to check whether the user has supplied a valid name and password. With SQL injection, an intruder can send a crafted user name and/or password field that modifies the SQL query and thus grants access or returns data that the application did not intend.
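The login scenario above can be reproduced with a self-contained Python sketch that uses an in-memory SQLite database. The table layout, the credentials and the function names are hypothetical; the point is only to show how a crafted password changes the meaning of a concatenated query, and how bound parameters avoid this.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

def login_vulnerable(name, password):
    """Builds the query by string concatenation, so input can rewrite it."""
    query = ("SELECT COUNT(*) FROM users WHERE name = '" + name +
             "' AND password = '" + password + "'")
    return conn.execute(query).fetchone()[0] > 0

def login_parameterized(name, password):
    """Passes input as bound parameters, so it is treated purely as data."""
    query = "SELECT COUNT(*) FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchone()[0] > 0

# Crafted password that closes the string literal and appends an always-true clause.
crafted = "' OR '1'='1"
bypassed = login_vulnerable("alice", crafted)
rejected = not login_parameterized("alice", crafted)
```

In the concatenated query the injected OR clause makes the WHERE condition true for every row, so the login check succeeds without the real password.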
6.1.3 Denial of service attack
When a server is intentionally overloaded with many requests from an intruder, it is forced to deny normal access to legitimate users. This attack can also take the form of an infinite loop that gets executed in the client's browser. The malicious scripts are separated and saved in a text file, which can be given as structured input to the yacc code for signature comparison.
6.1.4 Brute force attack
This attack tries all (or a large fraction of all) possible values until the right value is found; it is also called an exhaustive search. A brute force attack is a method of defeating a cryptographic scheme by trying a large number of possibilities, for example by exhaustively working through all possible keys in order to decrypt a message. In most schemes, the theoretical possibility of a brute force attack is recognized, but the scheme is set up in such a way that the attack would be computationally infeasible to carry out.
The rule-based intrusion detection module reports traffic as non-intrusive for a few attacks such as DoS and repeated login failures. In a DoS attack, for example, instead of using an infinite loop, the intruder may execute the loop a large number of times. A bigger class of attacks that do not have a clear entry in the rule base can also be detected. These patterns are checked by the fuzzy IDS for a possible attack. Fuzzy Cognitive Mapping is used to capture different types of intrusive behavior as suspicious events.
7. RESULTS AND ANALYSIS
The time taken for the IDS interpreter to understand the semantics of the HTTP request or response is used to analyze performance. The exact time taken for the complete analysis of a single atomic HTTP transaction (request and response) is measured. Timestamps are stored in a structure of type struct timeval, and the time taken for HTTP parsing and intrusion detection is noted. The time needed for each HTTP header varies due to several factors, such as processor usage by other programs, the different sizes of the headers, and the different contents of the headers, which imply matching of different objects in the IDS interpreter. The time needed for the IDS to analyze the packets also includes the time taken for message exchange between individual blocks. Because the processing time differs between HTTP packets, the time taken for a large number of HTTP packets is measured and the average elapsed time is taken. This gives the approximate time needed to analyze a single packet.
The time needed to analyze a single packet also depends on the number of rules that are defined in the IDS interpreter, and on the number of objects that are considered by the interpreter. To calculate the average time taken to scan a single HTTP request, the scan times of about 100 successive individual HTTP requests from random internet traffic are averaged.
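The averaging procedure can be sketched in Python as follows. This is an analogue under stated assumptions rather than the measurement code of the system, which records times in a struct timeval: here time.perf_counter is used instead, and toy_scan is a hypothetical stand-in for the IDS interpreter.

```python
import time

def average_scan_time(scan, requests):
    """Return the mean wall-clock time, in seconds, of scan() over the requests."""
    total = 0.0
    for request in requests:
        start = time.perf_counter()
        scan(request)
        total += time.perf_counter() - start
    return total / len(requests)

def toy_scan(request):
    """Hypothetical stand-in for the IDS interpreter: a single substring check."""
    return "<script>" in request

# 100 identical requests, mirroring the sample size mentioned in the text.
sample = ["GET /index.html HTTP/1.1"] * 100
avg = average_scan_time(toy_scan, sample)
```

Averaging over many requests smooths out the per-request variation caused by header size, header content and competing processor load.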
A graph is plotted of the average time taken to scan a single HTTP request (response time) versus the number of objects incorporated in the IDS interpreter. As the number of objects increases, the number of ways in which the text can be matched increases, and hence the time taken also increases. Figure 6 shows the performance analysis of the system.
Figure 6: Performance analysis chart
From Figure 6 it can be inferred that the response time increases linearly and then begins to saturate as the number of objects to be matched increases. When the number of objects increases beyond 80, the response time increases at a very slow rate. Hence the implemented IDS performs well when the number of objects is more than 80. Normally, the number of objects required for proper intrusion detection will be greater than this mark, so the system remains efficient in practical settings.
The number of objects to be searched in each protocol field is plotted in Figure 7. It is observed that as the number of objects to be matched in each protocol field increases, the response time increases linearly. However, the response time tends to saturate after a specific number of rules. This is expected because the rules contain some common objects that need to be checked only once, which improves the response time.
Figure 7: Response time vs. Rules with different number of maximum objects for each protocol field
As the payload size increases, the amount of the text that needs to be matched increases, and so the processing time also increases. Figure 6.7 shows the performance analysis for payload.
Figure 6.7 Performance analysis of payload
Figure 6.8 shows the detection rate with various components of the IDS. As Figure 6.8 shows, the detection rate increases when the HTTP header and payload (HTML and scripts) are combined.
Figure 6.8 Detection ratio with various components of IDS
Figure 6.9 Comparison of Fuzzy based Misuse Detection and Regular Misuse Detection
Figure 6.9 compares Fuzzy based Misuse Detection and Regular Misuse Detection for various attacks. The detection rate of fuzzy based misuse detection is higher than that of regular misuse detection for some attacks, such as DoS, brute force, and directory traversal attacks.
The rule-based semantic intrusion detection system proposed in this thesis uses memory efficiently, since the amount of memory needed by the IDS depends on the rule table size. Because of the continuously changing nature of attacks, the IDS updates its signatures and rules automatically, thereby keeping the rule base up to date with newly discovered attack patterns. The fuzzy component added to this rule-based semantic IDS uses Fuzzy Cognitive Mapping (FCM) in order to obtain an accurate prediction. Thus, the proposed system, the Fuzzy aided Application layer Semantic Intrusion Detection System, draws advantages from two different concepts. The semantic rule base keeps the rules updated for detecting newer intrusions by semantically matching the patterns. The fuzzy component improves the detection rate by scanning the traffic for attacks that go undetected by a typical rule-based IDS. The results show better performance in terms of the detection rate and the time taken to detect an intrusion.
The Fuzzy aided Application layer Semantic Intrusion Detection System can be extended in more than one direction. More semantic parameters can be added to the semantic rule base, which could improve the accuracy of attack detection. The Fuzzy Associative Memory drawn up for the IDS can be fine-tuned for such changes. In addition, more application layer protocols, such as FTP and SMTP, can be considered for implementation, and the concept of application layer semantic intrusion detection can be validated with these protocols.