Analysis and Detection of Metamorphic Viruses

Info: 5400 words (22 pages) Dissertation
Published: 12th Dec 2019

Tagged: Information Systems, Infectious Diseases, Biomedical Science

Chapter 1
Introduction
1.1 Motivation

Metamorphic viruses are a special type of virus that can reconstruct themselves into entirely new offspring that look completely different from the parent. The main purpose of this rebuilding is to avoid detection by antivirus software. Although some well-known metamorphic viruses are currently detectable, it is predicted that in the future we may face similar viruses capable of changing their identity while still performing malicious tasks. Our objective in this thesis is to perform an in-depth analysis of metamorphic code and to evaluate some best practices for the detection of metamorphic viruses.

1.2 Outline

This document is divided into five chapters. The first two chapters are introductory and provide basic information about viruses; in Chapter 2 we give some details about virus evolution and how metamorphic viruses came into existence. Chapter 3 contains detailed information about metamorphic viruses: a formal definition, the core components of their architecture, and some explanations of metamorphic viruses from a virus writer. Chapter 4 deals with some of the techniques used by metamorphic viruses and the advantages these techniques give them. Chapter 5 contains different detection methodologies used to detect metamorphic viruses. It also contains sample code from different metamorphic viruses for feature comparison.

Chapter 2
Computer Virus Introduction
2.1 Introduction

The term "virus" was first described by Dr. Fred Cohen in his PhD thesis in 1986 [1]. Different types of computer malware already existed at that time, but the term was specifically introduced by Dr. Cohen. That is why many research papers consider him the father of virus research [2]. His formal definition of a virus is:

"A program that can infect other programs by modifying them to include a possibly evolved copy of itself"[1]

Based on this definition, we reproduce the pseudocode of virus V from his research [25].

program virus:=

{1234567;

subroutine infect-executable:=

{loop:file = get-random-executable-file;

if first-line-of-file = 1234567 then goto loop;

prepend virus to file;

}

subroutine do-damage:=

{whatever damage is to be done}

subroutine trigger-pulled:=

{return true if some condition holds}

main-program:=

{infect-executable;

if trigger-pulled then do-damage;

goto next;}

next:}

This is a typical example of a computer virus. We can divide it into three major parts. The first subroutine, infect-executable, looks for an executable file (or any other target file) that it wants to infect; it contains a loop that skips files already carrying the marker and prepends the virus body to the target file. The second subroutine, do-damage, is the code for which the virus was written; this is called the virus payload, and upon execution it performs some damage to the system. The third subroutine, trigger-pulled, is a trigger for executing the payload; it could be a condition based on the date, the system, or a file. The main program infects a target file and then, once the trigger condition is met, runs the payload.

If we apply this definition strictly, some modern viruses cannot be considered viruses, because there are several types of virus that do no harm themselves, such as a “Co-Virus”, whose main purpose is to help the original virus so that it can execute without being detected. Peter Szor has redefined the term [2] as:

“A computer virus is a program that recursively and explicitly copies a possibly evolved version of itself.”

This definition is also self-explanatory: as the author suggests, a virus recursively and explicitly searches for target files and infects them with virus code to make possibly evolved copies of itself. A virus is a special kind of malware that always requires user action to propagate, such as accessing or executing an infected file. Grimes [26] extends this definition with boot sector infection and other methods, as viruses are not limited to file infections only.

2.1.1 Different Type of Malware

In this section we discuss some types of malware that resemble viruses but are not viruses. This section is for information purposes only. Viruses themselves can be of different kinds; based on their activity we can define categories such as boot sector viruses, file infection viruses, and the more advanced macro viruses, which live inside Microsoft Office documents and use their automation features. Basically, all viruses follow the same process of infection described by Dr. Fred Cohen in the sample virus V. We will describe some advanced code armoring techniques in Section 2.2.

2.1.1.1 Trojans

Trojans are a well-known kind of backdoor malware. They are sometimes not considered viruses, because their main objective is to let an attacker gain access to the target machine without being noticed by the user. Gaining access is not their only objective; they may also execute some sort of malicious code. Their name comes from the Greek story of the Trojan Horse, in which a giant wooden horse was built to carry soldiers inside the city walls. Trojans use the same trick: they display something on screen while doing something else behind it. Trojans do not infect files or attach their code to other files; usually they come with some sort of joiner utility that helps users embed their code or application inside the Trojan. Trojans can be used to gain access to infected systems, mount shared drives, or disrupt network traffic through denial-of-service attacks. Some famous examples of Trojans are NetBus, SubSeven, Deep Throat, and Beast.

Some remote administration Trojans have a client side that is used to communicate with the infected computer. The image above shows the client side of the Beast Trojan, which can perform many operations on the target machine once it is connected.

2.1.1.2 Spyware and Adware

Spyware is a very common problem for today's internet users. It is used to gather information about users and monitor their activity, with or without their knowledge. So far, antivirus companies have been unable to settle on how to detect and remove spyware, because some well-known companies sell spyware to monitor user activity and obtain legal support to prevent antivirus products from removing it. Spyware can, without the user's knowledge, send all user information and activity to a monitoring email address. Some spyware only captures key press events, recording whatever the user types, whether in an email or a password field. The keystrokes are recorded and, depending on the software settings, sent by email or saved to disk.

Adware is slightly different from other malware: it collects information about the user's internet activity and, based on that, tries to display targeted advertisements, or it installs software on the user's system that displays unwanted advertisements.

2.1.1.3 Rootkits

Rootkits are specially crafted malware whose main objective is to gain administrative-level access on the target system. Usually they contain a virus or script that executes malicious code on the target machine, enables root-level access for the attacker, and hides the process, giving the attacker full access to the machine without being noticed. Detailed information about rootkits is beyond the scope of this thesis. Based on their functionality, we can say that they hijack the target system and monitor all system calls. They are now also capable of patching the kernel so that the attacker can obtain a higher level of permissions.

Security researchers have demonstrated a technology called “Blue Pill” [27], which allowed them to create a super rootkit without any performance degradation or system restart. It uses the virtualization support inside the processor to run in virtual machine mode.

2.1.1.4 Worms

Worms are considered the most advanced form of malware. Unlike viruses, they do not require any user interaction to propagate, but like viruses they can replicate their code by infecting other target files. They can be combined with Trojan horses to execute on the target machine. Unlike viruses, however, they always depend on some specific software for their execution; without that software they cannot perform their actions. They try to exploit vulnerabilities in software or the operating system to perform malicious actions. Love Bug is one famous example of a worm; it used Microsoft email software to distribute copies of itself. CodeRed and Nimda are other examples, which used Microsoft protocols to distribute themselves and infect other systems.

2.2 Virus Evolution

Viruses have evolved over time, which is why today we are dealing with the most advanced viruses of all time. Researchers are frequently challenged by virus writers to detect their creations and produce a vaccine for them. In the following sections we describe some of the techniques that viruses use to satisfy the main objective of the virus writer: “make the virus completely undetectable”. Virus writers have used different techniques over time; in this section we discuss those techniques and how they led toward metamorphic viruses.

2.2.1 Encryption

Encryption is one of the main means of hiding information. It has been used for centuries, and virus writers use it in the same way to avoid detection by antivirus software. A decryptor is attached to the main virus code to decrypt the virus body before it performs its actions.

lea si, Start ; position to decrypt (dynamically set)

mov sp, 0682 ; length of encrypted body (1666 bytes)

Decrypt:

xor [si],si ; decryption key/counter 1

xor [si],sp ; decryption key/counter 2

inc si ; increment one counter

dec sp ; decrement the other

jnz Decrypt ; loop until all bytes are decrypted

Start: ; Encrypted/Decrypted Virus Body

The code above, from [5], is the decryptor of the Cascade virus. In the same article the author suggests four major reasons why a virus writer would use encryption:

  1. Prevention of code analysis: Encryption makes it quite difficult to disassemble the virus and examine the code for instructions that are of interest to virus researchers, for example calls to INT 26H or to specific Crypto API functions. With encryption, analysts will not get an idea of the author's intentions, because most of the file contents are encrypted and may also contain junk code.
  2. Making disassembly more difficult: Virus writers can use encryption not only to make disassembly difficult but also to make it more time-consuming. They can include junk code or wrong instructions so that researchers cannot perform static analysis and are left with a confusing picture of the code.
  3. Making the virus tamper-proof: As with real-life business products, some virus writers do not want others to use their virus code under their own name or to generate new variants from it, since someone could decrypt the virus and produce another virus by modifying the code. This is also a form of protection against reverse engineering.
  4. Avoiding detection: The core objective of the virus writer is to evade detection by antivirus software. New techniques have been developed over time; in the following sections we discuss some of these techniques and how they use encryption.

Most encrypted viruses carry the decryptor within their code, which has helped virus researchers detect them based on a signature of the decryptor. This method is not always reliable, however, as it may raise a false alarm when other software uses a similar routine to decrypt data. Over time, virus writers developed new techniques. In assembly they mostly use simple XOR operations to decrypt the virus code. For example, the Cascade code above uses XOR to decrypt each byte of the virus body until the whole body is decrypted. XOR has two advantages: it is a very simple operation, and XORing a value with the same key twice yields the original value, so the same routine serves for both encryption and decryption, which also makes static code analysis more confusing. Peter Szor has described several strategies that make the process of encryption and decryption harder to analyze [2, Chapter 7]. According to him:

  • Virus writers are not required to store the decryption key inside the virus body. Some advanced viruses, such as RDA.Fighter, generate their decryption key upon execution. This technique is called Random Key Decryption. They use a brute-force method to find the key at run time. These viruses are very hard to detect.
  • The attacker controls the flow of the decryption algorithm: it can run forward or backward, and a single body can contain multiple loops or multiple layers of encryption. The second most important factor is the key size, since a longer key makes decryption more difficult. Obfuscation is another factor. Among metamorphic viruses, Simile.D used non-linear encryption: it decrypts the virus body in semi-random order and, most importantly, accesses each encrypted portion of the virus body only once [3].
  • Another factor is the strength of the encryption. Some viruses are encrypted with a very strong algorithm, such as the IDEA virus [9], which contains several decryptors. The interesting point is that the virus is quite easy to detect and remove, but the infected file is extremely difficult to repair, because the second layer of IDEA uses RDA for key generation.
  • The Microsoft Crypto API is part of the Windows operating system, and it can also be used for malicious purposes. Virus writers can call the Crypto API from virus code to encrypt data with a secret key. This is difficult to detect, because other programs such as Internet Explorer also use this API to encrypt transmissions over secure channels.
  • Another variation in decryption was demonstrated by the W95/Silcer virus: the first portion of the virus, which is already decrypted, forces the Windows loader to relocate infected program images when they are loaded into memory. For the purpose of decryption, the virus itself supplies the relocation information.
  • There are other possibilities. Some viruses use the file name as their decryption key; in that case, if the file name is changed, the virus cannot execute, and the file may not be recoverable after infection. Others use the decryptor code itself as the decryption key, which helps them when someone modifies the code or runs the virus under a debugger: decryption then fails and raises an exception.
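The XOR property mentioned before the list (applying the same key twice restores the original value) can be checked with a minimal Python sketch. It operates on a harmless text buffer only; the function name `xor_bytes`, the sample text, and the key value 0x5A are assumptions chosen for illustration and do not come from any of the cited viruses.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key (0..255)."""
    return bytes(b ^ key for b in data)


plain = b"harmless sample text"
key = 0x5A  # arbitrary illustrative key (assumption)

encoded = xor_bytes(plain, key)
decoded = xor_bytes(encoded, key)  # the same routine undoes itself

print(encoded != plain)  # True: the encoded bytes differ from the input
print(decoded == plain)  # True: XOR with the same key twice is the identity
```

This is why a single short loop, like the one in the Cascade listing, is enough for both directions.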

2.2.2 Oligomorphism

With an encrypted virus it is quite possible to find the decryption routine. To counter this, virus writers implemented a new technique: create multiple decryptors and choose among them randomly when infecting other files. The major difference between encryption and oligomorphism is that an encrypted virus always uses the same decryptor, while an oligomorphic virus has multiple decryptors and can use any of them. The Whale virus was the first of this kind to use multiple decryptors. W95/Memorial [7] is a famous example of an oligomorphic virus; it uses 96 different decryptor patterns.

mov ebp,00405000h ; select base

mov ecx,0550h ; this many bytes

lea esi,[ebp+0000002E] ; offset of "Start"

add ecx,[ebp+00000029] ; plus this many bytes

mov al,[ebp+0000002D] ; pick the first key

Decrypt:

nop ; junk

nop ; junk

xor [esi],al ; decrypt a byte

inc esi ; next byte

nop ; junk

inc al ; slide the key

dec ecx ; are there any more bytes to decrypt?

jnz Decrypt ; until all bytes are decrypted

jmp Start ; decryption done, execute body

; Data area

Start:

; encrypted/decrypted virus body

The sliding key is also worth noting, as this feature makes it possible to vary the instructions of the decryptor. Another instance of the same virus shows small variations, including a change in the loop instruction. Another variant of W95/Memorial:

mov ecx,0550h ; this many bytes

mov ebp,013BC000h ; select base

lea esi,[ebp+0000002E] ; offset of "Start"

add ecx,[ebp+00000029] ; plus this many bytes

mov al,[ebp+0000002D] ; pick the first key

Decrypt:

nop ; junk

nop ; junk

xor [esi],al ; decrypt a byte

inc esi ; next byte

nop ; junk

inc al ; slide the key

loop Decrypt ; until all bytes are decrypted

jmp Start ; Decryption done, execute body

; Data area

Start:

; Encrypted/decrypted virus body

It has been mentioned [2] that a virus is only called oligomorphic if it can mutate its decryptor slightly. Detecting an oligomorphic virus is difficult because, with randomly chosen decryptors, a detection mechanism based on decryptor signatures may miss infections when the number of decryptors is large.
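The sliding-key loop that both W95/Memorial listings share can be modeled on plain data. The sketch below is an illustration in Python, not code from the virus: it XORs each byte with a key that is incremented after every byte and wraps at 256, as `inc al` does. The function name `sliding_xor`, the sample buffer, and the starting key are assumptions.

```python
def sliding_xor(data: bytes, first_key: int) -> bytes:
    """Model of the loop body: xor [esi],al / inc esi / inc al."""
    out = bytearray()
    key = first_key & 0xFF
    for b in data:
        out.append(b ^ key)
        key = (key + 1) & 0xFF  # 'inc al' wraps around at 256
    return bytes(out)


sample = bytes([0x10, 0x20, 0x30, 0x40])
encoded = sliding_xor(sample, 0xFE)  # keys used: 0xFE, 0xFF, 0x00, 0x01

print(encoded.hex())                         # eedf3041
print(sliding_xor(encoded, 0xFE) == sample)  # True: same key schedule inverts it
```

Both listings compute this same function; they differ only in the order of the two initial `mov` instructions, the base address, and the use of `loop` in place of `dec ecx` followed by `jnz`.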

2.2.3 Polymorphism

The term polymorphism is of Greek origin: "poly" means many and "morphi" means forms. These viruses can therefore take multiple forms. They are much more advanced than their ancestors. Like oligomorphic viruses, they rely on mutating their decryptor, but in such a way that a large number of variants of the same virus are generated. The core of their operation is a mutation engine. For each infection the engine generates a completely new instruction sequence for the decryptor. The result is a new virus with exactly the same functionality as its parent but with an entirely different sequence of instructions [28].

Antivirus software is challenged by this method: every time a new file is infected, a new encryption code and decryptor are generated, so software that relies on decryptor signatures cannot detect these viruses, because each offspring has a completely different decryptor signature. Research has shown that a mutation engine can generate several million different decryptors for new viruses [28].

The Dark Avenger Mutation Engine (MtE) is a famous example of a polymorphic engine; the following code is taken from [2].

mov bp,A16C ; This Block initializes BP

; to "Start"-delta

mov cl,03 ; (delta is 0x0D2B in this example)

ror bp,cl

mov cx,bp

mov bp,856E

or bp,740F

mov si,bp

mov bp,3B92

add bp,si

xor bp,cx

sub bp,B10C ; Huh ... finally BP is set, but remains an

; obfuscated pointer to encrypted body

Decrypt:

mov bx,[bp+0D2B] ; pick next word

; (first time at "Start")

add bx,9D64 ; decrypt it

xchg [bp+0D2B],bx ; put decrypted value to place

mov bx,8F31 ; this block increments BP by 2

sub bx,bp

mov bp,8F33

sub bp,bx ; and controls the length of decryption

jnz Decrypt ; are all bytes decrypted?

Start:

; encrypted/decrypted virus body

The idea behind a reusable engine was that, in the beginning, virus writing was difficult and time-consuming, so experienced virus writers helped novices by giving them a code mutation engine. With little modification they could use this engine within their own virus code to perform the same operations.

Depending on the virus type and the engine's capabilities, the engine can enhance the virus's functionality; several viruses use the Microsoft CryptoAPI in their polymorphic operations. Marburg is another famous polymorphic virus, with an entirely different file infection mechanism. Up to this point one might assume that all polymorphic viruses infect in the same way and only the decryptor changes, but Marburg introduced new methods: the key length can vary, and each infected file uses a different encryption mechanism [8].

Start:

; Encrypted/Decrypted Virus body is placed here

Routine-6:

dec esi ; decrement loop counter

ret

Routine-3:

mov esi,439FE661h ; set loop counter in ESI

ret

Routine-4:

xor byte ptr [edi],6F ; decrypt with a constant byte

ret

Routine-5:

add edi,0001h ; point to next byte to decrypt

ret

Decryptor_Start:

call Routine-1 ; set EDI to "Start"

call Routine-3 ; set loop counter

Decrypt:

call Routine-4 ; decrypt

call Routine-5 ; get next

call Routine-6 ; decrement loop register

cmp esi,439FD271h ; is everything decrypted?

jnz Decrypt ; not yet, continue to decrypt

jmp Start ; jump to decrypted start

Routine-1:

call Routine-2 ; Call to POP trick!

Routine-2:

pop edi

sub edi,143Ah ; EDI points to "Start"

ret

There are examples of other viruses which shows that

2.2.4 Metamorphism

After all this evolution, we are now dealing with one of the most advanced kinds of virus. Polymorphic viruses were really challenging to detect and remove, but in time researchers built solutions against them. Virus writers then tried something more ambitious: a virus able to rebuild itself with the same functionality but entirely different from its parent. This idea was first implemented in W32/Apparition: if it finds a compiler on a machine, it tries to rebuild itself into a completely new shape. The following code, taken from [2], shows two different variants of W95/Regswap. This virus was the first of its kind to implement metamorphism by swapping registers.

a.)

5A pop edx

BF04000000 mov edi,0004h

8BF5 mov esi,ebp

B80C000000 mov eax,000Ch

81C288000000 add edx,0088h

8B1A mov ebx,[edx]

899C8618110000 mov [esi+eax*4+00001118],ebx

b.)

58 pop eax

BB04000000 mov ebx,0004h

8BD5 mov edx,ebp

BF0C000000 mov edi,000Ch

81C088000000 add eax,0088h

8B30 mov esi,[eax]

89B4BA18110000 mov [edx+edi*4+00001118],esi
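The two Regswap variants above keep the same opcodes and constants and change only the bytes that encode registers. A minimal Python sketch of how a scanner might exploit this is shown below: it derives a byte-level wildcard pattern from the two listings and matches both against it. The helper names `wildcard_pattern` and `matches` are hypothetical, and whole-byte wildcards are a simplifying assumption; a production signature would be built more carefully.

```python
# Byte sequences copied from listings a.) and b.) above.
VARIANT_A = bytes.fromhex(
    "5A BF04000000 8BF5 B80C000000 81C288000000 8B1A 899C8618110000"
)
VARIANT_B = bytes.fromhex(
    "58 BB04000000 8BD5 BF0C000000 81C088000000 8B30 89B4BA18110000"
)


def wildcard_pattern(a: bytes, b: bytes) -> list:
    """Keep bytes on which both samples agree; use None as a wildcard elsewhere."""
    if len(a) != len(b):
        raise ValueError("samples must have equal length")
    return [x if x == y else None for x, y in zip(a, b)]


def matches(pattern: list, data: bytes) -> bool:
    """True if data has the pattern's length and agrees on every fixed byte."""
    return len(data) == len(pattern) and all(
        p is None or p == d for p, d in zip(pattern, data)
    )


pattern = wildcard_pattern(VARIANT_A, VARIANT_B)
print(" ".join("??" if p is None else f"{p:02X}" for p in pattern))
print(matches(pattern, VARIANT_A), matches(pattern, VARIANT_B))  # True True
```

Of the 28 bytes, 20 are identical in both variants, which is why register swapping alone offers little protection against wildcard-based scanning.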

Although no major incident due to metamorphism has been reported so far, because ordinary computers do not contain the utilities such as compilers or scripting support needed to rebuild a virus, the situation could be very dangerous for Linux machines, where scripting languages and compilers are enabled by default. Upcoming versions of Microsoft Windows also support .NET and MSIL, which make it easy to generate such viruses; MSIL/Gastropod is a famous example of a metamorphic virus. In the next chapter we describe the main architecture of metamorphic viruses.

Chapter 3
Metamorphic Virus Architecture

The idea behind metamorphism comes from biology: parents mutate and produce offspring that look entirely different from them but perform the same actions. Virus writers have adopted the same idea and implemented it in the form of metamorphic viruses. The power of any virus lies in its ability to bypass the antivirus scanner and perform its actions. Constants in the virus body, specific register allocations, patterns, and heuristic scanning are some of the common means of detecting a virus.

Metamorphic viruses are capable of transforming their code in each new generation: they change their syntax while their semantics remain the same throughout generations. Polymorphic viruses were difficult to detect, but their main weakness was the decryption mechanism. Once researchers identified the decryption method and added it as a signature to antivirus products, they could detect every generation of a polymorphic virus. With metamorphic viruses this approach fails, because the syntax of the code and the mechanism of operation differ entirely between generations. They are considered shape shifters [2] because each generation looks entirely different from the others.

Metamorphic engines are mostly buggy; it may be our luck that no perfect metamorphic engine exists so far. It has been reported that metamorphism has also been used as a means of software protection, in the same way that viruses use it to protect themselves. Metamorphic viruses can work stand-alone, generating new copies themselves and acting on the target system, or they can take help from the surrounding environment, for example by downloading plug-ins from the internet or generating their new copies with its help.

Metamorphic viruses are capable of changing the arrangement of their instructions. This gives them the ability to generate new, undetected variants. For example, if a virus contains n subroutines and can reorder them freely, it can generate n! different orderings. The BadBoy virus has 8 subroutines and can rearrange them, so it can generate 8! = 40320 different variants. This number grows factorially as the number of subroutines in the virus body increases.

The image above shows a code module of the BadBoy virus. Within the file it only needs to keep track of the entry point, regardless of where that is located; the remaining subroutines are reached through jump instructions throughout the code.

Zperm is another example of a metamorphic virus; the code sample above is from the Zperm virus and shows its rearrangement of code.
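The 8! figure can be checked directly. The short Python sketch below counts the orderings both by formula and by explicit enumeration, and prints a few more values to show how quickly the count grows. The function name `ordering_count` is an assumption for illustration.

```python
import math
from itertools import permutations


def ordering_count(n: int) -> int:
    """Number of distinct orderings of n freely reorderable subroutines."""
    return math.factorial(n)


# Enumerate every ordering of 8 placeholder subroutine indices and count them.
enumerated = sum(1 for _ in permutations(range(8)))

print(ordering_count(8))  # 40320
print(enumerated)         # 40320

for n in (4, 8, 12, 16):
    print(n, ordering_count(n))
```

Enumeration is feasible for 8 subroutines but not for much larger n, which is one reason signature sets cannot simply list every ordering.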

3.1 Formal Definition

This formal definition is presented in [13]. Let φ_P(d,p) denote the function computed by a program P in the current environment (d,p), where p represents the programs stored on the computer and d represents the data processed. D(d,p) and S(p) are two recursive functions, T(d,p) is a trigger (the injury condition), and I(d,p) is the infection condition.

In this case we say that the pair (v, v') of recursive functions is a metamorphic virus if all the conditions X(v, v') are satisfied.

Here T(d,p), I(d,p), and S(p) are entirely different from T'(d,p), I'(d,p), and S'(p). On that basis we can say that v and v' are forms of a metamorphic virus that perform the same actions. Polymorphic viruses share their kernel, whereas in metamorphic viruses each form has its own kernel.

3.2 Core Architecture

In this section we discuss the major components of a metamorphic virus. Although several other descriptions exist, the architecture presented in [10] is considered the best. Its authors divide metamorphic viruses into two categories, closed-world and open-world. Open-world viruses are those that interact with the executing environment and perform actions such as downloading spyware. Here we describe the functional architecture of closed-world viruses. Most of them perform binary transformation.

3.2.1 Locate Own Code

The virus must be able to locate its own code each time it transforms into a new form or infects a new file. Metamorphic viruses that infect other files and use them as carriers must be able to locate their code inside the infected file. Most use a predefined location for their startup code within the file; this location usually remains constant throughout generations. Only in a few cases does the engine use dynamic locations.

3.2.2 Decode

Once the metamorphic engine has located the virus code, it tries to obtain a blueprint of how to transform it. This is one of the drawbacks of metamorphic viruses: they carry within themselves the description of how they are transformed. This information is critical, because it is encoded again in the body of the new virus. The unit can also retrieve flags, bit vectors, markers, and hints that help in building new viruses. Because the virus engine itself needs this information, the virus writer cannot obfuscate this area.

3.2.3 Analyze

Once the core information is gathered, further information is needed for proper execution of a metamorphic virus; without it the transformation cannot be performed. The metamorphic engine must have information about register liveness. If this is not available from the Decode phase, the engine must be able to construct it via "def-use" analysis. A control flow graph is also required by the transformation phase, because it helps in rewriting the logic and flow of the program.

A control flow graph is required if the malware generates code that can shrink or grow across generations, and it is also needed to process the control flow logic that is then transformed into code. In the following examples the engine has gathered what the code is required to do and transforms it into simplified instructions.
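Register liveness is a standard compiler analysis, and the same analysis is useful to an analyst who wants to recognize garbage instructions. The Python sketch below is a minimal backward pass over straight-line code, under stated assumptions: the def and use sets are annotated by hand, there are no branches (so no control flow graph is needed), and the function name `liveness` and the sample instructions are hypothetical.

```python
def liveness(instructions, live_out):
    """Backward liveness over straight-line code.

    Each instruction is (text, defs, uses). Returns a list of
    (text, set of registers live immediately after that instruction).
    """
    live = set(live_out)
    result = []
    for text, defs, uses in reversed(instructions):
        live_after = set(live)
        # Standard transfer function: live_in = (live_out - defs) | uses
        live = (live - set(defs)) | set(uses)
        result.append((text, live_after))
    result.reverse()
    return result


code = [
    ("mov eax, 5",   {"eax"}, set()),
    ("mov ebx, eax", {"ebx"}, {"eax"}),
    ("mov eax, 2Bh", {"eax"}, set()),
    ("add ebx, eax", {"ebx"}, {"ebx", "eax"}),
]

for text, live_after in liveness(code, live_out={"ebx"}):
    print(f"{text:<14} live after: {sorted(live_after)}")
```

In this sample, eax is not live after the second instruction, so any code placed there may overwrite eax without changing the result; that is the kind of fact a transformation (or a normalizing detector) needs.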

1)

mov [esi+4], 9

mov [esi+4], 6

add [esi+4], 3

2)

mov [ebp+8], ecx

push eax

mov eax, ecx

mov [ebp+8], eax

pop eax

3)

push 4

mov eax, 4

push eax

4)

push eax

push eax

mov eax, 2Bh
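Example 1 above appears to pair a single store of 9 with a store of 6 followed by an addition of 3. Reducing the longer form to the shorter one is ordinary constant folding, which a detector can also apply to normalize code before comparing it. The Python sketch below is a minimal version under stated assumptions: instructions are modeled as `(op, destination, operand)` tuples, only `mov`/`add`/`sub` with immediate operands are folded, values wrap at 32 bits, and CPU flags are ignored. The function name `fold_constants` is hypothetical.

```python
def fold_constants(instrs):
    """Merge 'mov X, a' followed by 'add/sub X, b' into one 'mov X, a+/-b'.

    Assumption: the flags set by add/sub are not used afterwards, because
    mov does not set flags and the folded code would otherwise differ.
    """
    out = []
    for op, dst, src in instrs:
        prev = out[-1] if out else None
        if (
            prev is not None
            and prev[0] == "mov"
            and prev[1] == dst
            and isinstance(prev[2], int)
            and isinstance(src, int)
            and op in ("add", "sub")
        ):
            value = prev[2] + src if op == "add" else prev[2] - src
            out[-1] = ("mov", dst, value & 0xFFFFFFFF)  # 32-bit wraparound
        else:
            out.append((op, dst, src))
    return out


expanded = [("mov", "[esi+4]", 6), ("add", "[esi+4]", 3)]
print(fold_constants(expanded))  # [('mov', '[esi+4]', 9)]
```

Whether the fold is safe depends on liveness information (here, of the flags), which is why the Analyze step has to come before any rewriting.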

3.2.4 Transform

This unit is the most important part of the virus, as it generates the new virus; most of the virus logic resides here. It generates new instruction blocks that are semantically identical to the existing code but syntactically different. Some obfuscation is also performed here: the metamorphic engine renames registers, inserts NOPs and garbage instructions, and reorders the execution of blocks.

The following code blocks are taken from the examples in [10].

1)

mov eax, 10

mov eax, 5

add eax,5

2)

mov eax, 5

sub eax, 10

mov eax, 1

add eax, 2

sub eax, 8

3)

mov eax, 5

add eax, 5

mov eax, 10

4)

cmp eax, 5

ja L1

cmp eax, 2

je L2

cmp eax, 5

jb L3

L1 : mov ebx, 3

jmp L4

L2 : mov ebx, 10

jmp L4

L3 : mov ebx, 10

jmp L4

L4

cmp eax, 5

ja L1

cmp eax, 5

jb L2

L1 : mov ebx, 3

jmp L3

L2 : mov ebx, 10

jmp L3

L3
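The arithmetic examples above can be checked for semantic equivalence with a tiny evaluator. The Python sketch below is an illustration under stated assumptions: it models only `mov`, `add`, and `sub` with immediate operands on 32-bit registers, and the grouping of example 2 into the two sequences shown is our reading of the flattened listing. The function name `run` is hypothetical.

```python
def run(instrs, width=32):
    """Evaluate mov/add/sub with immediates; return the final register values."""
    mask = (1 << width) - 1
    regs = {}
    for op, dst, imm in instrs:
        if op == "mov":
            regs[dst] = imm & mask
        elif op == "add":
            regs[dst] = (regs.get(dst, 0) + imm) & mask
        elif op == "sub":
            regs[dst] = (regs.get(dst, 0) - imm) & mask
        else:
            raise ValueError(f"unsupported op: {op}")
    return regs


# Example 1: one instruction versus two.
short_1 = [("mov", "eax", 10)]
long_1 = [("mov", "eax", 5), ("add", "eax", 5)]

# Example 2 (assumed grouping): two instructions versus three.
short_2 = [("mov", "eax", 5), ("sub", "eax", 10)]
long_2 = [("mov", "eax", 1), ("add", "eax", 2), ("sub", "eax", 8)]

print(run(short_1) == run(long_1))  # True: both leave eax = 10
print(run(short_2) == run(long_2))  # True: both leave eax = 0xFFFFFFFB (-5)
```

Comparing final register state in this way is a much weaker check than full program equivalence, but it illustrates why detectors for metamorphic code compare behavior or normalized forms rather than raw bytes.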

3.2.5 Attach

The attach unit is only present in viruses that infect files and use them as a means of replication. The transform unit transforms not only the virus's own code but also the code of the target file, where it sets an entry point to the virus main routine. During the attachment process it also shuffles the code.
