The Phishing Email Analysis Computer Science Essay
The primary objective of phishing e-mails is to obtain sensitive user data, such as online banking user IDs and passwords or credit card information, which ‘phishers’ then use for their own personal gain. With the growth of online trading, phishing scams have increased sharply and have now reached alarming proportions. This paper explains the most popular methods used for phishing and the PhishFinder algorithm developed to detect it. PhishFinder is a heuristic-based algorithm that detects phishing emails and alerts users to them. Its phishing filters and rules were formulated after extensive research into phishing methodologies and tactics. The approach used in developing the algorithm, the implementation details and the testing results are discussed in this paper. According to one survey, 3.6 million people lost money in phishing attacks over the period from August 2006 to August 2007.
Phishing is a term coined by computer hackers, who use email to fish the internet hoping to hook users into supplying them with logins, passwords and/or credit card information. Phishing is a form of online identity theft which employs both social engineering and technical subterfuge to steal consumers' personal and financial information. With the rapid growth of internetworking around the globe and the phenomenal popularity the Internet has gained since the early 1990s, the going has never been easier for such criminals, or phishers as this community has come to be known. The preferred strategy of phishers today is to send out millions of spam mails to potential targets around the globe, masquerading as genuine institutions such as banks and insurance companies. These mails urge the recipients to click on embedded URLs, which lead to fraudulent but official-looking phishing websites where gullible users are made to divulge personal information such as passwords and account numbers. By one estimate, almost 5% of the recipients of phishing e-mails give away their personal information to these phishing sites while the sites are in operation.
PHISHING EMAIL ANALYSIS
Given the enormous financial gains that phishing scammers are reaping, it can easily be concluded that the motivation behind phishing is almost always financial. Although financial gain is the major motivating factor, identity theft, industrial espionage, malware distribution and similar aims also motivate phishers. A root cause analysis was done to identify the motivation for phishing and the following factors were identified:
Financial Gain – almost always the most important motivating factor for phishing.
Identity Theft – Identities stolen from unsuspecting victims can be used by the phisher for financial gain, for criminal activities, to commit fraud or to launch more phishing attacks under the stolen identity.
Identity Trafficking – Phishers who engage in identity theft sometimes sell the stolen identities to interested parties on online forums for a premium.
Industrial Espionage – Highly sophisticated phishing attacks are launched to spy on a victim and to gather comprehensive information about the victim's browsing patterns, product loyalties, etc. These details are used by the phishers directly or sold to interested parties. Using this information, the victim can be targeted to shift loyalty from one product to another, or a brand can be tarnished. The monetary losses due to industrial espionage run into billions of dollars.
Malware Distribution – Phishing attacks can be launched with the intent of distributing malware, i.e., malicious software. Phishing mails are usually sent in bulk, so zombie networks are well suited to launching large phishing attacks. Phishers send unsolicited phishing emails with malware attachments so that when an unsuspecting user opens the attachment, the malware is installed on the victim's machine, converting it into a zombie. The phisher may distribute software such as Trojans, key loggers, browser overlays, fake browsers, etc., for use in later scams, such as harvesting further information that users unknowingly enter into infected machines over weeks or months.
Password Harvesting – Phishers harvest passwords using various methods such as key loggers and other malware. The harvested user information is then used for financial gain, fraud or identity theft, or is sold to interested parties.
The analysis also identified the following causes that make phishing attacks easier to carry out:
1. Weak authentication schemes – many websites and mail servers have weak authentication schemes, which make it easier for fraudsters to launch phishing attacks. Insufficient use of digital signatures for authentication makes applications more susceptible to phishing attacks.
2. Browser vulnerabilities – attackers use browser vulnerabilities such as address bar spoofing, cross-site scripting, HTML frame injection, script injection, browser proxy configuration, and multimedia auto-play and auto-execute extensions to launch phishing attacks.
3. Security flaws – port redirection, man-in-the-middle attacks, session hijacking and client-side vulnerability exploitation, all of which are very difficult to detect, are used to launch phishing attacks.
4. Insecure desktop tools – ineffective anti-phishing toolbars, antivirus software, spam filters, pop-up blockers, firewalls and spyware detectors make it easier for a phishing attack to succeed.
5. Lack of user awareness
6. Ease of impersonating a trusted source
Based on the extensive analysis of the motivation, causes and techniques of phishing and of the phishing emails themselves, we propose an algorithm: PhishFinder (Detect, Defend and Deter). The focus of this algorithm is to detect phishing emails and alert the user to them. Using the hyperlinks collected from the phishing emails, the algorithm also gathers valuable information about the phishing hyperlinks, such as the country of origin of the attack and the brand being phished, and populates the details into a data warehouse. The characteristics of the phishing emails, for example the subject of the email, are stored in the data warehouse as well. This data repository will be a valuable source of information for deriving the latest trends in phishing and for analyzing the changing methods of phishing attacks, which can help in preventing further attacks.
PhishFinder is a heuristic-based algorithm that relies on a set of phishing rules to classify phishing emails. These rules are formulated from a detailed analysis of phishing emails and various phishing methodologies. After extensive analysis of phishing emails, we formulated categories under which phishing emails can be classified. A unique filter is associated with each category, and rules are formulated from various combinations of these filters. Each filter also carries a weight, which we derived from the significance of the filter and from regression testing of the algorithm. The weight associated with each filter plays a very important role in determining whether an email is a phishing email, as explained in the following sections. Each rule was tested extensively for effectiveness, and we finally derived a set of rules chosen to maximize detection while lowering the number of false positives. The various filters used in the algorithm are discussed in the next section.
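As a rough illustration of how weighted filters could combine into a verdict, the sketch below adds up the weights of the filters that fired and compares the total with a cut-off. The essay does not give its actual rules, weights or decision threshold, so the filter names, the summation and `ALERT_THRESHOLD` are all assumptions made for this example.

```java
import java.util.Map;

/** Hypothetical filter categories, named after the filters the essay describes. */
enum PhishFilter {
    TEXT_MATCH, RECEIVED_DOMAIN_MISMATCH, LINK_ENCODING, LINK_LENGTH, LINK_MISMATCH
}

class PhishScore {
    // Assumed cut-off; the essay does not state the value it uses.
    static final int ALERT_THRESHOLD = 5;

    /** Adds up the weights of all filters that fired for one email. */
    static int total(Map<PhishFilter, Integer> firedWeights) {
        int sum = 0;
        for (int weight : firedWeights.values()) {
            sum += weight;
        }
        return sum;
    }

    /** Flags the email when the combined weight reaches the assumed threshold. */
    static boolean isPhishing(Map<PhishFilter, Integer> firedWeights) {
        return total(firedWeights) >= ALERT_THRESHOLD;
    }
}
```

A caller would fill a map (for instance an `EnumMap`) with one entry per filter that matched and pass it to `PhishScore.isPhishing`. A simple sum is only one way to combine weights; rules built from specific filter combinations, as the text describes, would need a richer structure.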
Block Diagram Of PhishFinder Algorithm
We implemented the PhishFinder algorithm in Java. The following block diagram gives the overall picture of PhishFinder. The basic architecture of PhishFinder consists of the following modules:
a module to fetch emails,
a module to filter emails and classify them as phishing,
an Alerter to issue an alert to the user, and
a data warehouse that stores all the information related to phishing emails.
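The module list above can be pictured as four small interfaces wired into one loop. The essay names the modules but not their interfaces, so every type and method name below is hypothetical, and emails are represented as plain strings only to keep the sketch short.

```java
import java.util.List;

// All names are hypothetical; the essay describes the modules but not their APIs.
interface MailFetcher { List<String> fetchNew(); }              // returns raw new emails
interface EmailClassifier { boolean isPhishing(String rawEmail); }
interface Alerter { void alert(String rawEmail); }
interface Warehouse { void store(String rawEmail); }

class PhishFinderPipeline {
    private final MailFetcher fetcher;
    private final EmailClassifier classifier;
    private final Alerter alerter;
    private final Warehouse warehouse;

    PhishFinderPipeline(MailFetcher fetcher, EmailClassifier classifier,
                        Alerter alerter, Warehouse warehouse) {
        this.fetcher = fetcher;
        this.classifier = classifier;
        this.alerter = alerter;
        this.warehouse = warehouse;
    }

    /** Runs one pass over the new mail and returns how many emails were flagged. */
    int runOnce() {
        int flagged = 0;
        for (String email : fetcher.fetchNew()) {
            if (classifier.isPhishing(email)) {
                alerter.alert(email);    // warn the user
                warehouse.store(email);  // keep details for later trend analysis
                flagged++;
            }
        }
        return flagged;
    }
}
```

Keeping each module behind its own interface lets the fetcher be swapped between POP and IMAP back ends, or the warehouse replaced by an in-memory list during testing, without touching the classification logic.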
THE PHISHFINDER ALGORITHM – TO DETECT, DEFEND AND DETER
The PhishFinder algorithm is explained below:
if new mail found
    for each message
        get recvdFrom from header
        get From from header
        if message encoded
            decode message
        check for text_filters match
        if recvdFrom != From
            set weight for received domain mismatch filter
        find_links in each_email
        if link found
            compare anchor_tags
        check whois(recvfrom IP)
        Alert the user
        Connect to database
        Insert the phishing email details in db
        ….go to next email
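The `recvdFrom != From` comparison in the pseudocode can be sketched in Java as follows. This is a minimal illustration, not the essay's implementation: the class name, the regular expressions and the choice to return no verdict for unparsable headers are assumptions, and the header values are expected without their `Received:` or `From:` prefixes.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReceivedDomainCheck {
    // Host name that follows "from" at the start of a Received header value.
    private static final Pattern RECEIVED_HOST =
        Pattern.compile("^\\s*from\\s+([A-Za-z0-9.-]+)", Pattern.CASE_INSENSITIVE);
    // Domain part of the e-mail address in a From header value.
    private static final Pattern FROM_DOMAIN = Pattern.compile("@([A-Za-z0-9.-]+)");

    static String receivedHost(String receivedValue) {
        Matcher m = RECEIVED_HOST.matcher(receivedValue);
        return m.find() ? m.group(1).toLowerCase(Locale.ROOT) : null;
    }

    static String fromDomain(String fromValue) {
        Matcher m = FROM_DOMAIN.matcher(fromValue);
        return m.find() ? m.group(1).toLowerCase(Locale.ROOT) : null;
    }

    /**
     * True when the relaying host neither equals the From domain nor is a
     * subdomain of it. Returns false when either header cannot be parsed
     * (an assumption chosen here to avoid false positives).
     */
    static boolean mismatch(String receivedValue, String fromValue) {
        String host = receivedHost(receivedValue);
        String domain = fromDomain(fromValue);
        if (host == null || domain == null) {
            return false;
        }
        return !(host.equals(domain) || host.endsWith("." + domain));
    }
}
```

Legitimate mail is often relayed through third-party hosts, so a mismatch found this way is only one signal to be weighed with the other filters rather than proof of spoofing.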
WARNING MESSAGE TO USER
In the PhishFinder algorithm we first fetch the new e-mails from the mail server. The algorithm is designed to work with POP and IMAP mail servers. When a new email arrives, it is retrieved and split into headers and body. The body of an email is sometimes HTML encoded, and the type of encoding is indicated in the “Content-Type” field of the email header. If the email is encoded, we decode it so that the phishing filters work correctly on it.
Once the email has been retrieved and split into its component parts, the next step is to apply the phishing filters to it. First, the email is scanned for the text filters defined in the algorithm. The number of text filter matches found in the email is recorded and becomes the weight of that filter. The weight is added to Phishrank, a list that maps each phishing filter to its weight.
In the next step, the email is checked for a received domain mismatch, i.e., whether the domains in the Received from and From fields agree. The first Received from field and the From field are obtained from the e-mail header. If the two fields do not have the same domain, we assume that the source address was spoofed, and the appropriate weight is assigned to the received domain mismatch filter.
Next we look for all the hyperlinks in the email. Each link found is passed to the linkCharacteristics() function, which scans for any misrepresentation of the link and thereby checks for link encoding. If any misrepresentation is noticed, the appropriate weight is assigned to the link encoding filter. The length of the link is also checked; if it exceeds a predefined threshold, a weight based on the length is set for the length of link filter. Similarly, the number of folders and the number of sub domains in each hyperlink are checked and the corresponding filter weights are set.
In the next step, the anchor tag for each hyperlink is fetched from the HTML source of the email. Each link is compared with its anchor tag to check for any discrepancy between the visual link and the actual link. If the two do not match, the appropriate weight is assigned to the link mismatch filter.
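The link checks described above might look roughly like the following sketch. It is not the essay's linkCharacteristics() function, whose body is not given: the class and method names, the regular expression and the naive way of counting sub domains (host labels beyond a two-label domain) are assumptions for illustration, and path segments are used as an approximation of folders.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LinkChecks {
    // Simplified anchor matcher; real-world HTML calls for a proper parser.
    private static final Pattern ANCHOR = Pattern.compile(
        "<a\\s[^>]*href\\s*=\\s*[\"']([^\"']+)[\"'][^>]*>(.*?)</a>",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Lower-cased host of an absolute URL, or null if it cannot be parsed. */
    static String host(String url) {
        try {
            String h = new URI(url.trim()).getHost();
            return h == null ? null : h.toLowerCase(Locale.ROOT);
        } catch (URISyntaxException e) {
            return null;
        }
    }

    /** Host labels beyond a two-label domain, e.g. a.b.example.com gives 2. */
    static int subdomainCount(String url) {
        String h = host(url);
        return h == null ? 0 : Math.max(0, h.split("\\.").length - 2);
    }

    /** Number of non-empty path segments in the URL. */
    static int pathSegmentCount(String url) {
        try {
            String path = new URI(url.trim()).getPath();
            if (path == null) {
                return 0;
            }
            int count = 0;
            for (String segment : path.split("/")) {
                if (!segment.isEmpty()) {
                    count++;
                }
            }
            return count;
        } catch (URISyntaxException e) {
            return 0;
        }
    }

    /** Counts anchors whose visible text is a URL on a different host than the href. */
    static int mismatchedAnchors(String html) {
        int count = 0;
        Matcher m = ANCHOR.matcher(html);
        while (m.find()) {
            String actual = host(m.group(1));
            String shown = host(m.group(2).replaceAll("<[^>]*>", ""));
            if (actual != null && shown != null && !actual.equals(shown)) {
                count++;
            }
        }
        return count;
    }
}
```

Anchors whose visible text is not an absolute URL (for example "Click here") are skipped, since there is no displayed address to compare against. The link length check reduces to comparing `url.length()` with a threshold, which is left out here because the essay does not state the threshold it uses.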
The user is thus warned of likely phishing e-mails in the account as soon as the e-mail service is opened. Forewarned, the user is unlikely to fall into the trap set by the phishers. The algorithm tackles web-based attacks, spoofed mails and exploit-based attacks, and it is platform independent.
The present application is a single-user application. Making it a multi-user application would enhance its utility, and integrating it with web browsers would make it more useful still.
In this paper we have analyzed the various types of phishing attacks and have designed an algorithm, PhishFinder, to detect phishing. PhishFinder is a heuristic algorithm focused on detecting phishing links, alerting the user to a suspected phishing link and building an extensive data warehouse of information about phishing. This data warehouse can be used to analyze trends in phishing and to derive statistics about it. PhishFinder is a very lightweight algorithm that requires very little memory and CPU time. This work is a small attempt at developing an anti-phishing application for the end user.