Fuzzy Intrusion Detection System On Application Layer Computer Science Essay

Abstract: The objective of this paper is to develop a Fuzzy aided Application layer Semantic Intrusion Detection System (FASIDS) that works in the application layer of the network stack. FASIDS consists of a semantic IDS and a fuzzy-based IDS. A rule-based IDS looks for specific patterns that are defined as malicious. However, a regular, non-intrusive pattern can be malicious if it occurs several times within a short time interval. FASIDS is proposed in this paper to detect such malicious activities. At the application layer, the header and payload of HTTP traffic are analyzed for possible intrusion. In the proposed misuse detection module, the semantic intrusion detection system works on the basis of rules that define various application layer misuses found in the network. An attack is identified by the IDS when it matches a corresponding rule in the rule base. An event that does not make a 'hit' on the rule base is given to a Fuzzy Intrusion Detection System (FIDS) for further analysis.

[Figure 1 block labels: Client, Server, HTTP Sniffer, Session Dispatcher, Header Analyzer, Payload Analyzer, IDS Interpreter, FIDS; outputs Intrusive and Non-Intrusive; return the response/request/alarm to the server/client]

An object is defined as an occurrence of an elementary pattern, represented by a regular expression, which by itself may not be malicious. A combination of such objects may represent malicious behavior by the user. A rule is defined as a set of objects in a specific sequence. A set of input data that fulfills the constraints given in a rule is detected as a malicious event. A BNF grammar for HTTP in the application layer is designed to cater to the needs of semantic intrusion detection. The payload part requires JavaScript to be extracted from real-time traffic and parsed by a JavaScript parser. The JavaScript parser analyzes the HTML and scripts in the payload, and its output is given to the IDS Interpreter, which checks the input pattern for maliciousness.

In a rule-based intrusion detection system (RIDS), an attack is detected if a matching rule is found in the rule base and goes undetected otherwise. If the RIDS is combined with FIDS, intrusions that went undetected by the RIDS can still be detected. Patterns that the RIDS reports as non-intrusive are checked by the fuzzy IDS for a possible attack. These patterns are normalized and converted into linguistic variables in fuzzy sets. The values are given to Fuzzy Cognitive Mapping (FCM). If there is a suspicious event, an alarm is raised to the client/server. FASIDS results show better performance in terms of the detection rate and the time taken to detect. The detection rate increases and the false positive rate decreases for a specific attack.

Keywords: Semantic Intrusion detection, Application Layer misuse detector, Fuzzy Intrusion detection, Fuzzy Cognitive Mapping, HTTP intrusion detection.

1. INTRODUCTION

Most commercially available intrusion detection systems work in the network layer of the network stack, which paves the way for hackers to intrude at other layers, especially the application layer. Misuse detection uses rule-based IDSs that follow a signature-match approach, in which attacks are identified by matching each input text or pattern against predefined signatures that model malicious activity [2]. This pattern matching process is time consuming. Nowadays hackers continuously create new types of attacks. Because of the continuously changing nature of attacks, signatures should be updated whenever a new threat is discovered. A rule-based intrusion detection system looks for specific patterns that are defined as malicious. However, a regular, non-intrusive pattern can be malicious if it occurs several times within a short time interval. Such non-intrusive patterns are checked by the fuzzy IDS for a possible attack, which increases the detection rate.

2. ARCHITECTURE OF THE FASIDS

The architecture of the system is shown in Figure 1. The block diagram shows the order in which the different modules process the incoming payload. The HTTP data capture block collects the application-layer traffic from the network. Captured data is then separated into header and payload parts, which are forwarded to separate buffers.

Figure 1: Block diagram view of integrated FASIDS

The Header parser [6] module reads the header and prepares a list of the objects in the HTTP packets. Each object represents a field of the HTTP protocol and is a five-tuple <message-line, section, feature, operator, content>. This sequence of objects is given to the IDS interpreter, which refers to the rule base and correlates the different objects to trigger one or more of the rules. Simultaneously, the HTML parser parses the HTML data, searches for inappropriate use of tags and attributes, and watches for JavaScript-based attacks injected into the HTTP traffic [5]. State transition analysis is done by defining states for every match (Sangeetha et al 2008). The incoming pattern is semantically looked up only in specified states, which increases the efficiency of the IDS pattern-matching algorithm. If the pattern matches a predefined pattern, an intrusion alert is generated to the client/server. If the traffic is found non-intrusive, the output of the rule-based IDS goes to the fuzzy IDS for further analysis (Susan M. Bridges 2002). Fuzzy Cognitive Mapping captures different types of intrusive behavior as suspicious events and generates an alert to the server/client if there are any attacks.
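The five-tuple object and its use in a rule can be illustrated with a minimal Python sketch. The field names in the tuple follow the text; everything else (the `HttpObject` class, the `contains` and `regex` operator names, the dictionary layout of the parsed fields, and the example rule) is hypothetical and chosen only for illustration. The sketch also simplifies a rule to "all objects must match" and does not model the state transition analysis.

```python
import re
from typing import NamedTuple


class HttpObject(NamedTuple):
    """One object of a rule: <message-line, section, feature, operator, content>."""
    message_line: str
    section: str
    feature: str
    operator: str   # hypothetical operator names: "contains" or "regex"
    content: str


def object_matches(obj, fields):
    """Check one object against parsed HTTP fields.

    `fields` maps (message_line, section, feature) to the observed string value.
    """
    value = fields.get((obj.message_line, obj.section, obj.feature), "")
    if obj.operator == "regex":
        return re.search(obj.content, value) is not None
    return obj.content in value


def rule_fires(rule, fields):
    """A rule fires when every object in its sequence matches (simplification)."""
    return all(object_matches(obj, fields) for obj in rule)


# Hypothetical rule: a path that climbs directories and ends in etc/passwd.
traversal_rule = [
    HttpObject("request", "uri", "path", "contains", "../"),
    HttpObject("request", "uri", "path", "regex", r"etc/passwd$"),
]

suspicious = {("request", "uri", "path"): "/cgi-bin/../../etc/passwd"}
benign = {("request", "uri", "path"): "/index.html"}

print(rule_fires(traversal_rule, suspicious))
print(rule_fires(traversal_rule, benign))
```

Each object on its own is harmless (a "../" can be legitimate); only the combination triggers the rule, which is the idea behind defining rules over sequences of objects.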

3. FUZZY COMPONENT FOR NON-INTRUSIVE TRAFFIC

Parts of the traffic that get past the rule-based intrusion detection system without matching any intrusion are fed into the fuzzy component for further analysis. A functional block diagram of the fuzzy component is shown in Figure 3.1.

[Figure 3.1 block labels: Non-Intrusive traffic, Text processor, Normalization, Fuzzification, FCM, FAM, Defuzzification, Intrusive, Raise alarm]

Figure 3.1 Functional blocks of FIDS

The traffic is first given to a text processor such as awk, which counts the number of occurrences of a specific pattern in it. These counts are then normalized to keep the values in a specific range and aid relative comparison. The normalized values are fuzzified into linguistic terms of fuzzy sets before being fed to the Fuzzy Cognitive Mapper (FCM) (Brubaker 1996). For a Denial of Service attack, for example, the output of the text processor is normalized between 0.0 and 1.0 and then goes for fuzzification, which converts each normalized value into linguistic terms of fuzzy sets. The output of fuzzification is given to the Fuzzy Cognitive Mapper (Brubaker 1996), which makes use of a Fuzzy Associative Memory (FAM).
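The counting and normalization stages can be sketched in a few lines of Python. This is only an illustration: Python's `re` module stands in for the awk text processor, and the log lines, the pattern, and the function names are made up for the example.

```python
import re


def count_occurrences(pattern, lines):
    """Count how often a pattern occurs in a block of traffic (the awk-like step)."""
    regex = re.compile(pattern)
    return sum(len(regex.findall(line)) for line in lines)


def min_max_normalize(values):
    """Scale values into the range 0.0 to 1.0 using (v - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal; map them to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical traffic, split into three observation windows.
log_windows = [
    ["GET /index.html", "login failed for alice"],
    ["login failed for bob", "login failed for bob", "login failed for bob"],
    ["GET /about.html"],
]

counts = [count_occurrences(r"login failed", window) for window in log_windows]
normalized = min_max_normalize(counts)

print(counts)
print(normalized)
```

The normalized values, not the raw counts, are what the fuzzification stage receives.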

3.1 WORKING OF FUZZY COGNITIVE MAPPER IN IDS

Fuzzy rules are constructed as a map from multiple inputs to a single output. Examples of inputs are the number of login failures, the time interval between any two login failures, and the duration of a login session. Malicious activities that are defined by one or more fuzzy rules are mapped using the FCM. The FCM uses a Fuzzy Associative Memory (FAM) to evaluate the fuzzy rules and generate an alert that falls into one of the categories very high, high, medium, low or very low, based on the severity of the attack.

The following example demonstrates how the sequence of events in the fuzzy component identifies a brute-force attack, in which an intruder tries to log in with several users' passwords and fails. This attack can be identified by observing the number of login failures and the time interval between failures.

The FCM for login_failure is shown in Figure 5.2. It shows that if login_failure is very high within a small interval of time and for the same machine, there is a suspicious event. The symbols ++, +, , - and -- represent very high, high, medium, low and very low, respectively. In Figure 5.3, the time interval between login failures is small, which is represented by '-', and the number of login failures is high, which is represented by '+'.

The fuzzy rule 'number of login_failure is very high AND time interval is small' is triggered, which indicates that the scenario may be due to a brute-force attack. The FAM table for a brute-force attack, shown in Table 5.1, is used to evaluate this rule. The details of the FAM table are presented in section 5.3.

[Figure 5.2 labels: the concepts login failure (+), time interval (-) and same machine (+) are linked to the concept suspicious event]

Figure 5.2 FCM for login_failure
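A small numerical sketch may help to show how the signed links of such a map combine. Only the signs (+, -, +) come from the figure; the weight magnitudes, the sigmoid squashing function, and the input values below are assumptions taken from the common textbook formulation of fuzzy cognitive maps, not from the system described here.

```python
import math


def sigmoid(z):
    """Squash a weighted sum into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-z))


# Signed influence of each concept on "suspicious event".
# Signs follow the figure; the magnitudes are hypothetical.
WEIGHTS = {
    "login_failure": 0.8,    # more failures: more suspicious
    "time_interval": -0.8,   # longer gaps between failures: less suspicious
    "same_machine": 0.6,     # failures from one machine: more suspicious
}


def suspicious_event_activation(concepts):
    """Activation of the 'suspicious event' concept for normalized inputs in [0, 1]."""
    total = sum(WEIGHTS[name] * concepts[name] for name in WEIGHTS)
    return sigmoid(total)


# Many failures, short intervals, same machine.
burst = suspicious_event_activation(
    {"login_failure": 0.9, "time_interval": 0.1, "same_machine": 1.0}
)
# Few failures, long intervals, different machines.
sparse = suspicious_event_activation(
    {"login_failure": 0.1, "time_interval": 0.9, "same_machine": 0.0}
)

print(round(burst, 3), round(sparse, 3))
```

With these assumed weights the burst scenario activates the suspicious event concept much more strongly than the sparse one, which matches the qualitative reading of the map.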

3.2 FUZZY ASSOCIATIVE MEMORY BY FUZZY RULES

A Fuzzy Associative Memory (FAM) is used to map fuzzy rules in the form of a matrix. These rules take two variables as input and map them into a two-dimensional matrix. The rules in the FAM follow a simple if-then-else format. The Fuzzy Associative Memory makes it possible to draw conclusions about the rate of false negatives for a few attacks such as Denial of Service (DoS) and brute force attacks; the behavior of the FCM for these attacks was explained in section 5.2.

Table 5.1 Fuzzy Associative Memory for a Brute force attack

x \ t    VS    S     M     H     VH
VS       VS    VS    S     M     M
S        VS    M     S     S     M
M        H     H     M     M     S
H        VH    H     H     M     VS
VH       VH    VH    H     VS    VS

(x = number of login failures, t = time interval between failures; VS = very small, S = small, M = medium, H = high, VH = very high)

Table 5.1 shows the FAM table for a brute force attack in matrix format.

Rows in this table represent the level of the number of login failures, and columns represent the level of the time interval between failures. A linguistic representation of the same is shown below in Figure 5.5.

Figure 5.5 Linguistic representation of time interval during Brute force attack

The time interval between login failures is plotted on the X axis as a normalized value, and the degree of membership is plotted on the Y axis. The min-max normalization scheme is used to normalize the time interval between login failures to a common range, i.e., between 0 and 1. Figure 5.5 shows the time values assigned to the linguistic variables (very small, small, medium, high and very high). Figure 5.6 shows the login_failure values assigned to the linguistic variables.

Figure 5.6 Linguistic representation of no. of login failures during Brute force attack
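The mapping from a normalized value to the five linguistic terms can be sketched with triangular membership functions. The shape of the functions and their breakpoints are assumptions made for this illustration (evenly spaced triangles over the range 0 to 1); the actual curves are the ones in Figures 5.5 and 5.6.

```python
def triangular(x, left, peak, right):
    """Degree of membership of x in a triangle with the given corners."""
    if x == peak:
        return 1.0
    if x <= left or x >= right:
        return 0.0
    if x < peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical, evenly spaced terms over the normalized range [0, 1].
TERMS = {
    "very small": (-0.25, 0.0, 0.25),
    "small": (0.0, 0.25, 0.5),
    "medium": (0.25, 0.5, 0.75),
    "high": (0.5, 0.75, 1.0),
    "very high": (0.75, 1.0, 1.25),
}


def fuzzify(x):
    """Return the degree of membership of x in every linguistic term."""
    return {term: triangular(x, *corners) for term, corners in TERMS.items()}


def best_term(x):
    """Linguistic term with the highest degree of membership."""
    memberships = fuzzify(x)
    return max(memberships, key=memberships.get)


print(fuzzify(0.1))
print(best_term(0.1), best_term(0.8), best_term(1.0))
```

A value such as 0.1 belongs partly to "very small" and partly to "small", which is what distinguishes fuzzification from a hard threshold.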

Consider a scenario in which the time interval between login failures is very small and no. of login failures is very high. From Table 5.1, we can conclude that the possibility of such a scenario being detected as an intrusion is very high.
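The lookup described above amounts to indexing a two-dimensional table. The following Python sketch transcribes Table 5.1 as read from this document, with rows for the number of login failures and columns for the time interval, and with M written for the medium term; the dictionary layout and the function name are illustrative choices.

```python
LEVELS = ["VS", "S", "M", "H", "VH"]  # very small, small, medium, high, very high

# Row key: level of the number of login failures (x).
# Column order: level of the time interval (t), following LEVELS.
FAM = {
    "VS": ["VS", "VS", "S", "M", "M"],
    "S": ["VS", "M", "S", "S", "M"],
    "M": ["H", "H", "M", "M", "S"],
    "H": ["VH", "H", "H", "M", "VS"],
    "VH": ["VH", "VH", "H", "VS", "VS"],
}


def fam_lookup(login_failures, time_interval):
    """Return the intrusion level for the given linguistic input levels."""
    return FAM[login_failures][LEVELS.index(time_interval)]


# Very many failures within a very small interval.
print(fam_lookup("VH", "VS"))
# Very many failures spread over a very long interval.
print(fam_lookup("VH", "VH"))
```

The first lookup reproduces the scenario in the text: very high failures with a very small interval give a very high possibility of intrusion.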

ALGORITHM FOR FUZZY INTRUSION DETECTION SYSTEM

The following algorithm presents the steps in the Fuzzy Intrusion Detection System.

Step 1: Let x = the set of numbers of login failures and t = the time interval.

Step 2: Normalize x:

$x = \dfrac{x - \min}{\max - \min}$

where min is the minimum value of x and max is the maximum value of x.

Step 3: Give x and t to the FCM to select the appropriate fuzzy rules (refer to Table 5.3) from the FAM table, which have the following format:

IF condition AND condition THEN consequent

where condition is a complex fuzzy expression that uses fuzzy logic operators (refer to Table 5.4) and consequent is an atomic expression.

Step 4: Perform Mean of Maxima defuzzification:

$D_{MM} = \dfrac{\sum_{x_i \in X} x_i}{|X|}$

where X is the set of output values at which the membership function reaches its maximum.

Table 5.3 Fuzzy rules for detecting intrusions

Rule No.    Rule
Rule 1      IF (x == very small) AND (t == very small) THEN (I == very small);
Rule 2      IF (x == very small) AND (t == high) THEN (I == small);
Rule 3      IF (x == medium) AND (t == high) THEN (I == high);
Rule 4      IF (x == very high) AND (t == very small) THEN (I == medium);
Rule 5      IF (x == very high) AND (t == very high) THEN (I == very high);

Table 5.4 Fuzzy logic operators

Logical operator    Fuzzy operator
x AND t             min{x, t}
x OR t              max{x, t}
NOT x               1.0 - x
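The operators in Table 5.4 translate directly into code. The short Python sketch below applies them to the condition part of a rule of the form used in Table 5.3; the membership degrees 0.8 and 0.3 are made-up inputs for the example.

```python
def fuzzy_and(a, b):
    """x AND t is the minimum of the two membership degrees."""
    return min(a, b)


def fuzzy_or(a, b):
    """x OR t is the maximum of the two membership degrees."""
    return max(a, b)


def fuzzy_not(a):
    """NOT x is the complement of the membership degree."""
    return 1.0 - a


# Hypothetical degrees: x is "very high" to degree 0.8, t is "very high" to degree 0.3.
x_very_high = 0.8
t_very_high = 0.3

# Firing strength of a rule "IF (x == very high) AND (t == very high) THEN ...".
firing_strength = fuzzy_and(x_very_high, t_very_high)

print(firing_strength)
print(fuzzy_or(x_very_high, t_very_high))
print(fuzzy_not(t_very_high))
```

Because AND is a minimum, a rule can never fire more strongly than its weakest condition.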

Several defuzzification methods are available in the literature. Some of the widely used ones are the centroid method, centre of sums, and mean of maxima. In the mean of maxima method, the crisp value is chosen from the variable values at which the fuzzy subset reaches its maximum membership; when there are several such values, their mean is taken. According to the FAM table, the defuzzification graph is obtained and is shown in Figure 5.10.

Figure 5.10 Defuzzification

In many situations, for a system whose output is fuzzy, it is easier to take a crisp decision if the output is represented as a single scalar quantity. For this reason, a defuzzification value is calculated. Based on this value, a decision is taken on whether or not the traffic contains an intrusive pattern.

The defuzzification value thus calculated for the brute force attack is 40%.
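Mean of maxima defuzzification can be sketched over a sampled output membership function. The sample points and membership degrees below are invented for illustration and are not the data behind the 40% figure; the function name and the tolerance parameter are likewise assumptions.

```python
def mean_of_maxima(xs, memberships, tol=1e-9):
    """Average of the sample points at which the membership is maximal."""
    peak = max(memberships)
    maxima = [x for x, mu in zip(xs, memberships) if abs(mu - peak) <= tol]
    return sum(maxima) / len(maxima)


# Hypothetical aggregated output: sample points and their membership degrees.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
mu = [0.1, 0.6, 0.6, 0.2, 0.0]

# The maximum 0.6 is reached at 0.25 and 0.5, so the crisp value is their mean.
crisp = mean_of_maxima(xs, mu)
print(crisp)
```

The resulting scalar can then be compared with a decision threshold to classify the traffic as intrusive or not.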

6.1 Attacks

6.1.1 Cross site scripting attack

A web site may unintentionally include malicious HTML tags or scripts in a dynamically generated page based on unvalidated input from untrustworthy sources. This can be a problem when a web server does not adequately ensure that generated pages are properly encoded to prevent unintended execution of scripts, and when input is not validated to prevent malicious HTML from being passed to the user. With the cross-site scripting technique, an attacker can insert malicious script or HTML into a web page. The purpose of cross-site scripting is to cause a trusted web server to send a victim's browser a page that contains malicious script or HTML chosen by the intruder. The malicious script then runs with the privileges of a trusted script originating from the trusted web server.
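The role of output encoding can be shown with a short Python sketch that uses the standard library function `html.escape`. The page fragment, the function names, and the payload are hypothetical; the sketch only contrasts a page built from raw input with one built from encoded input.

```python
import html


def render_comment_unsafe(user_input):
    """Insert user input into the page as-is (vulnerable to script injection)."""
    return "<p>" + user_input + "</p>"


def render_comment_encoded(user_input):
    """Encode HTML special characters before inserting the input into the page."""
    return "<p>" + html.escape(user_input) + "</p>"


payload = "<script>alert('x')</script>"

unsafe_page = render_comment_unsafe(payload)
encoded_page = render_comment_encoded(payload)

print(unsafe_page)
print(encoded_page)
```

In the first page the browser would see a real script element; in the second the same characters arrive as inert text.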

6.1.2 SQL injection attack

Many web pages take parameters from a web user and query a database using SQL. For instance, when a user logs in, the web page asks for a user name and password and queries the database to check whether the user has a valid name and password. With SQL injection, an intruder can send a crafted user name and/or password field that modifies the SQL query and thus grants the intruder access to something other than what was intended.
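The effect of a crafted password field can be reproduced with Python's built-in `sqlite3` module and an in-memory database. The table, the account, and the function names are made up for this sketch; the parameterized variant is included only as a contrast to show that the same input is harmless when it is not spliced into the query text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")


def login_concatenated(name, password):
    """Build the query by string concatenation (vulnerable)."""
    query = (
        "SELECT COUNT(*) FROM users WHERE name = '" + name
        + "' AND password = '" + password + "'"
    )
    return conn.execute(query).fetchone()[0] > 0


def login_parameterized(name, password):
    """Pass the values as bound parameters, so they cannot change the query."""
    query = "SELECT COUNT(*) FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchone()[0] > 0


# Crafted password: closes the string literal and appends a condition that is always true.
crafted = "' OR '1'='1"

print(login_concatenated("alice", crafted))
print(login_parameterized("alice", crafted))
print(login_parameterized("alice", "s3cret"))
```

With concatenation, the WHERE clause becomes `name = 'alice' AND password = '' OR '1'='1'`, which is true for every row, so the login check passes without the real password.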

6.1.3 Denial of service attack

When a server is intentionally overloaded with many requests from an intruder, it is forced to deny normal access to legitimate users. This attack can also take the form of an infinite loop that gets executed in the client's browser. The malicious scripts are separated and saved in a text file, which can be given as structured input to the yacc code for signature comparison.

6.1.4 Brute force attack

This attack tries all (or a large fraction of all) possible values until the right value is found; it is also called an exhaustive search. A brute force attack is a method of defeating a cryptographic scheme by trying a large number of possibilities, for example by exhaustively working through all possible keys in order to decrypt a message. In most schemes, the theoretical possibility of a brute force attack is recognized, but the scheme is set up in such a way that the attack would be computationally infeasible to carry out.

The output of the rule-based intrusion detection module is non-intrusive for a few attacks such as DoS and repeated login failures. In a DoS attack, for example, instead of using an infinite loop, the intruder may execute a loop a very large number of times. A bigger class of attacks that do not have a clear entry in the rule base can also be detected. These patterns are checked by the fuzzy IDS for a possible attack. Fuzzy Cognitive Mapping is used to capture different types of intrusive behavior as suspicious events.

7. RESULTS AND ANALYSIS

The time taken by the IDS interpreter to understand the semantics of an HTTP request or response is used to analyze performance. The exact time taken for the complete analysis of a single atomic HTTP transaction (request and response) is measured. Timestamps are stored in a timeval structure, and the time spent on HTTP parsing and intrusion detection is noted. The time needed for each HTTP header varies due to several factors, such as processor usage by other programs, different header sizes, and different header contents, which imply matching of different objects in the IDS interpreter. The time needed for the IDS to analyze the packets also includes the time taken for message exchange between individual blocks. Because the processing time differs between HTTP packets, the time taken is measured for a large number of HTTP packets and the average elapsed time is taken. This gives the approximate time needed to analyze a single packet.

The time needed to analyze a single packet also depends on the number of rules defined in the IDS interpreter and on the number of objects considered by the interpreter. To calculate the average time taken to scan a single HTTP request, the scan times of about 100 successive individual HTTP requests from random internet traffic are averaged.
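The averaging procedure can be sketched as follows. Python's `time.perf_counter` is used here as a stand-in for the timeval-based timestamps mentioned above, and `toy_scan` is a placeholder for the real IDS interpreter; both are assumptions made for this illustration.

```python
import time


def average_scan_time(scan, requests):
    """Time each request scan individually and return the mean elapsed time in seconds."""
    elapsed = []
    for request in requests:
        start = time.perf_counter()
        scan(request)
        elapsed.append(time.perf_counter() - start)
    return sum(elapsed) / len(elapsed)


def toy_scan(request):
    """Placeholder scanner: flags requests that contain a directory traversal sequence."""
    return "../" in request


# About 100 successive requests, as in the measurement described in the text.
requests = ["GET /index.html HTTP/1.1"] * 99 + ["GET /../etc/passwd HTTP/1.1"]
average = average_scan_time(toy_scan, requests)

print(f"average scan time: {average:.9f} s")
```

Averaging over many requests smooths out the variation caused by other processes and by differing header sizes.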

A graph is plotted of the average time taken to scan a single HTTP request (response time) versus the number of objects incorporated in the IDS interpreter. As the number of objects increases, the number of ways in which the text can be matched increases, and hence the time taken also increases. Figure 6 shows the performance analysis of the system.

Figure 6: Performance analysis chart

From Figure 6 it can be inferred that the response time increases linearly and then begins to saturate as the number of objects to be matched increases. When the number of objects increases beyond 80, the response time increases at a very slow rate. Hence the implemented IDS performs well when the number of objects is more than 80. Normally the number of objects required for proper intrusion detection will be greater than this mark, and hence the system is shown to be efficient.

The number of objects to be searched in each protocol field is plotted in Figure 7. It is observed that as the number of objects to be matched in each protocol field increases, the response time increases linearly. However, the response time tends to saturate after a specific number of rules. This is because rules are expected to share some common objects, which need to be checked only once, thus improving the response time.

Figure 7: Response time vs. Rules with different number of maximum objects for each protocol field

As the payload size increases, the amount of the text that needs to be matched increases, and so the processing time also increases. Figure 6.7 shows the performance analysis for payload.

Figure 6.7 Performance analysis of payload

Figure 6.8 shows the detection rate with various components of the IDS. As Figure 6.8 shows, the detection rate increases when analysis of the HTTP header and the payload (HTML and scripts) is combined.

Fig 6.8 Detection Ratio with various component of IDS

Fig 6.9 Comparison of Fuzzy based Misuse Detection and Regular Misuse Detection

Figure 6.9 compares fuzzy-based misuse detection and regular misuse detection for various attacks. It shows that the detection rate of fuzzy-based misuse detection is higher than that of regular misuse detection for some attacks, such as DoS, brute force and directory traversal attacks.

8. CONCLUSION

The rule-based semantic intrusion detection system proposed in this thesis uses memory efficiently, since the amount of memory needed by the IDS depends on the size of the rule table. Because of the continuously changing nature of attacks, the IDS updates its signatures and rules automatically, thereby keeping the rule base up to date with newly discovered attack patterns. The fuzzy component added to this rule-based semantic IDS uses Fuzzy Cognitive Mapping (FCM) in order to make accurate predictions. Thus, the system proposed in this thesis, the Fuzzy aided Application layer Semantic Intrusion Detection System, draws advantages from two different concepts. The semantic rule base keeps the rules updated for detecting newer intrusions by semantically matching the patterns. The fuzzy component contributes to improving the detection rate by scanning the traffic for attacks that go undetected by a typical rule-based IDS. The results show better performance in terms of the detection rate and the time taken to detect an intrusion.

The Fuzzy-aided Application layer Semantic Intrusion Detection System can be extended in more than one of the concepts presented. More semantic parameters can be added to the semantic rule base, which would make it possible to improve the accuracy of attack detection. The Fuzzy Associative Map drawn for the IDS can be fine-tuned for such changes. In addition, more application layer protocols such as FTP and SMTP can be considered for implementation, and the concept of application layer semantic intrusion detection can be validated with these protocols.