Among the reverse engineering topics currently under discussion, the reconstruction of malicious actions performed by Android malware has repeatedly drawn attention from researchers. Android is used most widely on smart mobile devices, which hold extremely sensitive information and are prone to malware. The malicious events caused by malware critically affect the evidence available for digital forensic analysis, and reconstructing those events is one of the main challenges of mobile forensics. Reconstruction depends heavily on analysis of the malware code. The hurdles are how quickly a suspicious application can be identified, how to work around anti-forensic code protection, and how to relate the discovered code to the behaviour of the malicious event. To address these problems, this paper studies a systematic reconstruction method for Android. Building on that method, static and dynamic approaches are also reviewed to give a broader understanding of how Android malware is evolving.
General Terms / Keywords: Android Malware, Forensic Analysis, Static Analysis, Dynamic Analysis, Reconstruction, Malicious Event
Since modern mobile operating systems emerged in the early 2000s, they have changed significantly. Advances in the available technology have contributed greatly to a new, dynamic choice of operating systems for mobile devices. This area has become a primary focus for developers and users alike, and many parties are aware of it. As a result, many of them try, directly or indirectly, to take advantage of modern mobile operating systems and architectures by any possible method in order to gain an upper hand. This also leads to possible harm, such as malware attacks.
As mobile operating systems such as Android evolve rapidly, the processing capability of mobile hardware is also improving, which drives rapid growth in the number of smart mobile devices. Such devices now play an important role in daily life: they serve as personal companions and contain sensitive information. Because information is lucrative, these devices are a target for cybercriminals. An unknown amount of malware is produced to damage the data privacy and security of smart mobile devices. These challenges are new to mobile forensic analysts, so study is needed to build a knowledge base for combating mobile malware.
Forensic personnel usually focus only on data acquisition and simple analysis. This is not enough, because mobile malware analysis is not restricted to surface-level information retrieval; it extends to reconstructing a chain of events. In other words, the malicious event must be deciphered by understanding the data analysed, which includes challenges such as discovering evidence through byte-level analysis. A typical malware forensics case runs as follows: an investigator obtains a mobile device and is required to obtain data about the malicious event caused by the malware. In more challenging cases, only a memory dump is provided. The investigator then has the authorization either to extract data from the device statically or to power up the device, without altering its data environment, and conduct dynamic analysis.
Investigators have four primary concerns: recognizing suspicious applications, working around anti-forensic defences, recovering malicious code from malware, and deciphering the malicious actions that occurred. Classic forensic methods often focus on acquiring static data as evidence. In the case of malware, however, the application code and its activity are vital for reconstruction and are often neglected. Application code is raw input that cannot be used for analysis until it is deciphered. Reconstructing a malware event requires bridging the connections between the application, the operating system, input/output data and the hardware architecture.
Some indispensable pieces of information are also hard to obtain for the reconstruction effort. Today's malware is distinctive in that it is developed for a particular target. The intricacy of new operating systems and applications, together with the variety of CPUs, file systems and hardware environments, is a challenge for novice investigators. The structure and architecture of mobile programs differ broadly from those of programs for general-purpose computers. Analysing a mobile application is therefore challenging, because analysis tools built for the investigation environment are lacking.
The priority of this paper is to study the available approaches to a systematic forensic analysis procedure for Android malware, paying close attention to deciphering and reconstructing malicious events. Three main steps of reconstruction are studied in depth. The first is the recognition of potentially harmful applications on Android. The second is how to work around anti-forensic code protection on Android. The last is the identification of malicious characteristics commonly found on Android, on which investigators should focus, and the deciphering of such events.
This paper contains eight segments presented in two sections. Section II covers the related area of research and includes segment 1, Android Malware: A Simple Introduction; segment 2, Recognition of Suspicious Program; segment 3, Workaround Anti Forensic Defences; and segment 4, Reconstruction of Malicious Situation. Section III presents the results of the study, drawn from the reviewed research publications. It includes segment 5, Analysis by Static Method; segment 6, Analysis by Dynamic Method; segment 7, Analysis by Interval Monitoring; and segment 8, Classification of Reconstructed Malicious Sample. A conclusion follows, summarizing the study and its findings.
Limitations of Research Scope
This paper focuses on analysis trends from early 2010 onwards. Because authors differ considerably in technical detail and in the approaches they use, it is not possible to cover a wide scope; the study is instead narrowed to a more specific point of research. This approach should give more accurate results, and the findings are presented as accurately as possible. The study should not be relied on entirely, however, because the nature of the research and the environment used may not be applicable in other, similar research contexts.
In addition, most of the sampled research papers on similar topics provide only limited information on the area of research. Most of them concentrate on facts and theoretical knowledge. Only a minority of the relevant sources present statistical and graphical data, which is considered significant because it shows that proper studies were conducted to meet the requirements of the research area.
Most of the papers reviewed come from the reverse engineering context and are primarily IEEE publications and related professional research. These publications and journals are collectively limited to a group of people working in that area of interest. The reliability of the source knowledge therefore remains an open question, which the researchers view as a positive challenge. The research effort aims to clear up the doubts identified and to open new approaches to presenting informative knowledge with stricter discipline in its deliverables.
In trying to understand every possible way of reconstructing malicious events in Android malware forensics, much care was taken in acquiring the required information. This was to ensure a high quality of research results despite the scarcity and complexity of the information available. The information selected for this research was checked and double-checked so that no valuable piece of information was overlooked.
As a group of researchers, we emphasized a few key points of analysis, covering both qualitative and quantitative aspects. Each relevant research document was thoroughly inspected and filtered to obtain the necessary volume and type of information.
The information gathered may not be extremely precise in detail and scope. For this reason the researchers focused mainly on the current results and evidence available in the collected information to support the findings of the study. This keeps the content on reverse engineering clear and focused on details backed by the sample data currently available. The discussion that follows is therefore based on currently available information. The focus shifts over the course of the paper because the information collected is spread over a period of time; this is not unusual, since the underlying information evolves.
While the core information is challenging, privacy protection law is also a major challenge for the research group. The sample results, which are widely available, had to be filtered from their original sources. The information gathered also had to be arranged appropriately to support the current research. It had to be placed with care, layer after layer, to achieve a coherent flow for the paper.
RELATED AREA OF RESEARCH
Studies show that nearly all malware discovered on the Android OS is written in Java and runs on the Dalvik VM. Although Android is based on Linux, one of the most effective ways for malware to enter a device is through the standard installation of an application. It is important to understand the structure of Dalvik-based applications before analysing malware on Android. Android applications are packaged and stored on a device in the APK format.
An application is first compiled and then archived, with all its parts including assets and code, into a single APK file. An APK file is a ZIP archive containing code, resources, assets, certificates and a manifest file. The folders and files within the archive follow the JAR file format specification. After installation, the APK file is copied to a specific location in the system: usually /system/app for system applications and /data/app for user-installed applications. From the viewpoint of a forensic analyst, the APK file holds three parts of information: signature, bytecode and resources.
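Because an APK is an ordinary ZIP archive, its three parts can be separated with standard tooling. The following is a minimal Python sketch, not a forensic tool: the grouping rules (everything under META-INF/ counted as signature material, every .dex file as bytecode) and the function names are assumptions made for illustration, and the demo archive contains placeholder bytes rather than a real application.

```python
import io
import zipfile


def summarize_apk(apk_file):
    """Group the entries of an APK (a ZIP archive) into signature, bytecode and resources."""
    summary = {"signature": [], "bytecode": [], "resources": []}
    with zipfile.ZipFile(apk_file) as archive:
        for name in archive.namelist():
            if name.startswith("META-INF/"):
                # JAR-style signing data lives under META-INF/ (assumed grouping rule).
                summary["signature"].append(name)
            elif name.endswith(".dex"):
                # Compiled Dalvik bytecode, e.g. classes.dex.
                summary["bytecode"].append(name)
            else:
                # Everything else, including AndroidManifest.xml, is treated as a resource.
                summary["resources"].append(name)
    return summary


def build_demo_apk():
    """Build a tiny in-memory archive that mimics an APK layout (placeholder contents only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("AndroidManifest.xml", b"placeholder")
        archive.writestr("classes.dex", b"placeholder")
        archive.writestr("META-INF/CERT.RSA", b"placeholder")
        archive.writestr("res/layout/main.xml", b"placeholder")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for part, names in summarize_apk(build_demo_apk()).items():
        print(part, names)
```

In practice the argument to summarize_apk would be the path of an APK copied from the device, opened read-only so that the evidence is not altered.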
The signature contains the message digest of the APK file. Since any modification to the APK file changes the message digest in the signature, an analyst can quickly identify whether an application has been tampered with by checking the signature. Analysts can also collect the signatures of known malware to detect it quickly.
The bytecode is the executable part of the application. It consists of the classes.dex file in the archive, which holds the entire compiled program in the form of bytecode. For Android programs, the original Java bytecode is converted into the instruction set used by the register-based Dalvik VM.
Resources are the non-executable part of the application and cover all additional data that the application needs. Most of the resources in an application are user interface components, including bitmaps, menus, layouts and widgets. In the majority of cases the malicious part of the malware runs as a background application and does not require any user interface, so the user interface resources are rarely of concern. AndroidManifest.xml, however, is a resource file that carries critical forensic information. It is encoded in a binary format inside the APK file and holds the authorization requests of the application. The most significant forensic information in it is the permissions and the components.
Permissions: To access certain protected Android APIs, an application declares permission requests in AndroidManifest.xml, such as the permissions to read messages or contacts. Permission requests are an important clue for revealing malicious functions. For instance, if an ordinary application such as a calculator declares the READ_CONTACTS permission, it is very suspicious, because a calculator should never need contact information. This characteristic is unique to Android applications and is useful for analysis.
Components: Android applications are built from components, of which there are four kinds: activities, services, broadcast receivers and content providers. Malware that runs in the background often has a service component, together with a receiver component that receives the boot Intent when the system starts. By checking the components and the intents they receive, an analyst can get a brief view of the potential behaviour of an application.
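The service-plus-boot-receiver pattern described above can be checked mechanically once the manifest has been decoded to plain XML. The sketch below assumes the binary manifest has already been decoded to text; the sample manifest, the package name com.example.demo and the function name background_start_profile are hypothetical and serve only as illustration.

```python
import xml.etree.ElementTree as ET

# Attribute names in a decoded manifest carry the Android XML namespace.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"
BOOT_ACTION = "android.intent.action.BOOT_COMPLETED"

# Hypothetical decoded manifest used only for demonstration.
SAMPLE_MANIFEST = """<manifest xmlns:android="http://schemas.android.com/apk/res/android" package="com.example.demo">
  <application>
    <service android:name=".SyncService"/>
    <receiver android:name=".BootReceiver">
      <intent-filter>
        <action android:name="android.intent.action.BOOT_COMPLETED"/>
      </intent-filter>
    </receiver>
  </application>
</manifest>"""


def background_start_profile(manifest_xml):
    """Report services and boot receivers declared in a decoded AndroidManifest.xml."""
    root = ET.fromstring(manifest_xml)
    app = root.find("application")
    services, boot_receivers = [], []
    if app is not None:
        services = [s.get(ANDROID_NS + "name") for s in app.findall("service")]
        for receiver in app.findall("receiver"):
            # Collect every intent action this receiver listens for.
            actions = {a.get(ANDROID_NS + "name") for a in receiver.iter("action")}
            if BOOT_ACTION in actions:
                boot_receivers.append(receiver.get(ANDROID_NS + "name"))
    return {
        "services": services,
        "boot_receivers": boot_receivers,
        # Both present: the application can start background work at boot.
        "starts_at_boot": bool(services and boot_receivers),
    }


if __name__ == "__main__":
    print(background_start_profile(SAMPLE_MANIFEST))
```

A positive result is only a hint: many legitimate applications also start a service at boot, so the finding should be combined with the other features discussed in this section.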
A smartphone today may hold a very large number of applications. Malware resides in only a few of them, and the majority are benign. The opening phase of forensic analysis is to separate the malicious programs from the benign ones. Although many research works and tools claim to support malware detection, some problems remain unsolved for forensics. On the one hand, automatic tools need samples to build a malware database, and the rapid evolution of malware makes it difficult for automatic detection tools to keep up.
Furthermore, some malware is aimed at specific devices and is difficult to collect beforehand. On the other hand, forensics needs not only to discover the suspicious programs but also to analyse their code and reconstruct events. Manual checking is therefore helpful for later in-depth analysis, and manual methods are essential for the forensic analyst to confirm the identification.
An important observation for identifying malicious programs is that malware is usually associated with unusual structures, which indicate where suspicion should fall. We suggest checking the following features to identify suspicious applications efficiently and effectively.
The message digest is a useful cryptographic feature for separating benign applications from affected ones. A trusted application is frequently released through an online market, and the market provides its message digest. A database of normal applications can be built by collecting message digests from online markets. The analyst then simply checks an application's message digest; if it cannot be found in the database, the application may be harmful. The message digest alone, however, is not sufficient to determine that an application is malicious. Further in-depth analysis is needed to complete the identification.
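The digest lookup described above can be sketched in a few lines of Python. SHA-256 is chosen here as an assumption, since the text does not name a digest algorithm, and the byte strings below are placeholders standing in for real APK files and a real market database.

```python
import hashlib


def sha256_digest(data: bytes) -> str:
    """Return the hexadecimal SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def is_known_release(apk_bytes: bytes, known_digests: set) -> bool:
    """True if the APK's digest appears in the database of trusted releases."""
    return sha256_digest(apk_bytes) in known_digests


if __name__ == "__main__":
    # Placeholder content standing in for an APK published by a market.
    trusted_release = b"placeholder bytes of the published application"
    database = {sha256_digest(trusted_release)}

    print(is_known_release(trusted_release, database))
    # A single changed byte yields a different digest, so the lookup fails.
    print(is_known_release(trusted_release + b"!", database))
```

A failed lookup only marks the application for further inspection; as noted above, it does not prove that the application is malicious.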
The permission requirement is a characteristic unique to Android programs. Owing to the design of the Android OS, an application only needs to apply for permissions when it is installed, and it then holds those permissions permanently without requesting them again. Users may ignore the initial request, and a common malware sample posing as a benign application with fake normal functions will ask for a set of permissions such as SMS and Contacts database access, even though the fake functions do not need these permissions at all. A suspicious permission requirement is the leading clue for confirming Android malware. Most malware declares a list of high-privilege permissions to carry out its malicious functions. From the AndroidManifest.xml file of an APK, we can find the permissions requested by the application and filter out the suspicious requests.
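Filtering suspicious requests amounts to comparing what an application asks for with what its declared purpose plausibly needs. The sketch below is illustrative only: the set of sensitive permissions and the idea of an "expected" permission list per application are assumptions introduced here, not a classification taken from the reviewed work.

```python
# Permissions treated as sensitive in this sketch (an illustrative, incomplete set).
SENSITIVE_PERMISSIONS = {
    "android.permission.READ_SMS",
    "android.permission.SEND_SMS",
    "android.permission.READ_CONTACTS",
}


def suspicious_permissions(requested, expected):
    """Return sensitive permissions that are requested but not expected for the app's purpose."""
    return sorted((set(requested) & SENSITIVE_PERMISSIONS) - set(expected))


if __name__ == "__main__":
    # A calculator has no plausible need for contacts or network access.
    calculator_requests = [
        "android.permission.INTERNET",
        "android.permission.READ_CONTACTS",
    ]
    print(suspicious_permissions(calculator_requests, expected=[]))
```

The list of requested permissions would normally be read from the uses-permission entries of the decoded AndroidManifest.xml.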
At the conceptual level, an Android application is built from components, and the structure of its components can be used to judge the program's character. As mentioned above, service and receiver components are useful weapons for malware. By examining the components and the intents they receive, an analyst can get a brief view of the prospective malicious function of an application and differentiate suspicious applications from normal ones.
Workaround Anti Forensic Defences
In this section we introduce three common anti-forensic techniques and discuss how to deal with them.
Events can be inferred from the code, but malware developers continuously try to prevent that inference or make it hard. Before code analysis, one important task is to clear the barrier of anti-forensic code. Anti-forensic code is common in malware for commodity personal computers. For instance, many malware samples inspect the execution environment to check whether they are running inside a virtual machine. Android malware inherits this property to hinder forensic analysis.
Obfuscation. The obfuscation techniques in Android malware are much the same as those found in Java obfuscation, because the programming languages are very similar. In an obfuscated program all packages, classes, methods and fields are renamed to short, meaningless names such as a, b, c.a(), d, e.b, f.a, g.b(). This makes it hard for the analyst to distinguish different parts of the code and to understand their functionality.
String encryption. For an experienced reverse engineer, the strings in a program are a valuable source of information. Many malware samples encrypt their strings to avoid plaintext detection. Constant strings are encrypted with symmetric algorithms such as DES and AES, and the key is fixed (a dynamic key is seldom used because, no matter how complex the key is, it must eventually be used to decrypt the ciphertext). The encryption makes static analysis hard. If the analyst is able to execute the code dynamically, however, she can manually extract the key and decrypt the ciphertext, so the information can still be retrieved.
Environment verification. Some mobile malware is designed to attack certain types of mobile devices. Specific values such as Android system properties (from android.os.Build) are often checked to make sure the malware is not running in an emulator or on another type of device, and the subscriber ID (IMSI) is used to make sure the malware is running on a particular device with a specific IMSI. If verification fails, the malicious code stops executing, so analysts cannot simply reproduce the malicious behaviour by emulation or on an unsuitable device. This anti-forensic technique lets malware evade dynamic black-box analysis.
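The renaming pattern described under obfuscation suggests a simple triage measure: the share of very short names among all package, class and member names. The sketch below is a rough heuristic, not an established detector; the two-character cut-off and the function name short_name_ratio are assumptions chosen for illustration.

```python
def short_name_ratio(identifiers, max_len=2):
    """Fraction of dotted-name parts that are at most max_len characters long."""
    parts = [part for ident in identifiers for part in ident.split(".") if part]
    if not parts:
        return 0.0
    short = sum(1 for part in parts if len(part) <= max_len)
    return short / len(parts)


if __name__ == "__main__":
    obfuscated = ["a.b.c", "d.e", "f.a"]
    readable = ["com.example.bank.LoginActivity"]
    # A ratio close to 1.0 suggests that names were replaced by an obfuscator.
    print(short_name_ratio(obfuscated))
    print(short_name_ratio(readable))
```

A high ratio tells the analyst to expect unreadable decompiled output; it says nothing on its own about whether the application is malicious, since legitimate applications are often obfuscated as well.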
We review some countermeasures to the anti-forensic techniques mentioned above.
Decompilation and deobfuscation: For an Android application, high-level Java-like source code is much easier to read and understand than bytecode. State-of-the-art decompilation tools, however, cannot decompile programs perfectly, and the decompiled source code typically contains mistakes or missing code. Moreover, the bytecode of malware is often obfuscated, which makes decompilation more difficult and less accurate. The bytecode itself is always correct and accurate, although it is much harder to analyse. The analyst should therefore use both the bytecode and the decompiled source code so that each compensates for the shortcomings of the other.
Three key stages are advised for decompilation and deobfuscation. First, the analyst can use apktool to extract the bytecode (in .dex format). Second, the combination of dex2jar and jd-gui helps to decompile the bytecode file into Java source code. Third, because the decompiled Java source code may contain a large number of errors, it needs to be fixed. The following measures are possible options for code fixing.
removing empty classes
decompile errors correction
control flow error correction
name conflict correction
missed information fixing
String decryption: Strings are important sources of information, and most constant strings in malware (e.g. the remote server URL) are encrypted. A decryption process is often required to extract these strings. The whole process involves recognizing the encryption algorithm, extracting the secret key and decrypting the strings. One convenient aspect is that many malware samples use the system cryptographic APIs for encryption and decryption. The analyst can search for these calls and quickly identify the key.
Program patching: As mentioned above, malware often verifies system properties and the subscriber ID to defeat dynamic analysis. To make dynamic analysis possible, the analyst can automatically search for these checks and manually patch the code to bypass them.
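Both searches mentioned above, for calls into the system cryptographic APIs and for environment checks, can be approximated by scanning disassembled bytecode for the relevant class and method references. The sketch below assumes the bytecode has been disassembled to smali-style text; the marker list is a small illustrative selection, and the sample listing is hypothetical.

```python
# References of interest, written in smali notation (an illustrative, incomplete list).
MARKERS = {
    "crypto": (
        "Ljavax/crypto/Cipher;->getInstance",
        "Ljavax/crypto/spec/SecretKeySpec;-><init>",
    ),
    "environment_check": (
        "Landroid/os/Build;->",
        "Landroid/telephony/TelephonyManager;->getSubscriberId",
    ),
}

# Hypothetical disassembly fragment used only for demonstration.
SAMPLE_SMALI = "\n".join([
    "sget-object v0, Landroid/os/Build;->MODEL:Ljava/lang/String;",
    'const-string v1, "AES"',
    "invoke-static {v1}, Ljavax/crypto/Cipher;->getInstance(Ljava/lang/String;)Ljavax/crypto/Cipher;",
])


def locate_markers(smali_text):
    """Return (line_number, category) pairs for lines that reference a marker."""
    hits = []
    for line_no, line in enumerate(smali_text.splitlines(), start=1):
        for category, needles in MARKERS.items():
            if any(needle in line for needle in needles):
                hits.append((line_no, category))
    return hits


if __name__ == "__main__":
    for line_no, category in locate_markers(SAMPLE_SMALI):
        print(line_no, category)
```

Each hit gives the analyst a starting point: a crypto hit is a place to look for the fixed key, and an environment-check hit is a candidate location for a manual patch.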
Reconstruction of Malicious Situation
The main part of mobile malware forensics is to reconstruct events from the program code and additional information such as network traffic. In most cases, however, the binary program is the only form of the malicious code available. Reverse analysis is therefore important for understanding the logic of the program and helps to unlock unsolved clues about the crime. Although there is no standard procedure for reverse analysis, some typical behaviours on Android can be related back to malicious code with clear patterns. Analysts can thus search for and locate the malicious behaviour according to these patterns and eventually infer the events behind it.
Specific suspicious behaviors on Android
Android malware is likely to steal personal information such as SMS messages and contacts and send it automatically to a remote server. For example, a malicious program may pose as a normal financial application while accessing the user's personal information with a background service. The background service then waits for a specific command from the remote server and transmits certain personal information through a self-defined protocol. The suspicious functions that are essential to this malicious behaviour are listed below:
Service core loop. Most malware contains a service that supports continuous execution. On the Android OS, this is often implemented using a service component.
Self-defined communication protocol. Malware often contains a self-defined protocol for communicating with a particular remote server in its own "language". Modules inside the malware handle the communication between the client and the command server. Malware often packs and encrypts the information it sends, and decrypts and parses the data returned by the server. On the Android OS, this function always involves network access permissions such as "android.permission.INTERNET" and "android.permission.ACCESS_NETWORK_STATE".
Cryptographic utilities. The cryptographic utilities in the system libraries support the malware's encryption operations, such as generating a message digest of device information, decrypting encrypted strings, exchanging keys with the server and communicating with the server over an encrypted channel.
Sensitive data access. From the point of view of privacy leakage, sensitive data access is the core function of the malware, and it needs high privileges to achieve its goal. First, data access permissions such as "android.permission.READ_SMS" and "android.permission.READ_CONTACTS" are required. Then, specific protected APIs and content providers are used to access the database. An application that requests sensitive data access permissions is highly suspicious.
Using Code Analysis to Reconstruct the Crime
Since Android malware is written in Java and the bytecode of the malware contains all of its logic, it is quite difficult to grasp the malicious behaviour of the malware from a top-level view. In mobile malware forensic analysis, the malicious event can only be discovered through the malicious code, and only the code related to the malicious behaviour can help the analyst to locate and analyse the event. In brief, it is not simple for an analyst to combine functions into high-level abstract events; this can only be done through reverse code analysis that first extracts the relevant fragments of the program.
The reconstructed events may include the following information: the work flow of the malicious code, the sensitive information that the malware accessed, the encryption algorithms used and the details of the malware's communication protocols. Reconstructing malicious events depends not only on mining functions from the code; the analyst also needs to put these functions in the correct order. Android keeps records of high-level operations, such as service logs and system API calls, in its logcat mechanism. The analyst should also try to recover the execution of the malware and the record of its operations.
ANALYSIS AND DISCUSSION OF RESULTS
Analysis by Static Method
In static analysis, the body of the malware is studied in depth. The instructions in the binary image are parsed to locate the malicious functions or shell code that have been placed in it.
IDA Pro is a static analysis tool that can perform string searches and disassemble a binary program. It also includes a debugger that can operate on both local and remote hosts. The debugger supports environments such as Win32 local and remote, Win64 remote, Linux remote (x86), OS X remote (x86) and WinCE remote (ARM).
Visualizing program execution has a record of producing accurate results, because it makes program execution clear to study and monitor. For example, Xia et al. present the methodology of their system visually.
The visualization graph divides the disassembled code into basic blocks. A basic block is a sequence of instructions in which control flow enters only at the first instruction and exits only at the last instruction, without any jumps in the middle. Basic blocks are connected by directed edges that represent jumps in the control flow. These edges correspond to control transfer instructions (CTIs) such as conditional and unconditional jumps. As a result, the investigator gains a considerable advantage by working from the CTIs.
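The division into basic blocks can be illustrated with the classic "leader" rule: a block starts at the first instruction, at every jump target, and at every instruction that follows a jump. The sketch below is a simplified illustration and not the procedure of any tool named in this paper; the instruction format (a mnemonic plus an optional target index) and the mnemonics themselves are hypothetical.

```python
# Mnemonics treated as control transfer instructions in this sketch.
JUMPS = {"jmp", "jz", "jnz"}


def split_basic_blocks(instructions):
    """Split a list of (mnemonic, target_index_or_None) into basic blocks of instruction indices."""
    if not instructions:
        return []
    leaders = {0}  # the first instruction always starts a block
    for index, (mnemonic, target) in enumerate(instructions):
        if mnemonic in JUMPS:
            if target is not None:
                leaders.add(target)  # a jump target starts a block
            if index + 1 < len(instructions):
                leaders.add(index + 1)  # so does the instruction after a jump
    ordered = sorted(leaders)
    bounds = ordered + [len(instructions)]
    return [list(range(bounds[k], bounds[k + 1])) for k in range(len(ordered))]


# Hypothetical listing: indices 0..6, with a conditional and an unconditional jump.
PROGRAM = [
    ("mov", None),  # 0
    ("cmp", None),  # 1
    ("jz", 5),      # 2: conditional jump to index 5
    ("add", None),  # 3
    ("jmp", 6),     # 4: unconditional jump to index 6
    ("sub", None),  # 5
    ("ret", None),  # 6
]

if __name__ == "__main__":
    print(split_basic_blocks(PROGRAM))
```

For this listing the blocks are [0, 1, 2], [3, 4], [5] and [6]; the directed edges of the graph would then run from each block to the blocks its final instruction can reach.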
Analysis by Dynamic Method
Dynamic analysis is built on debuggers (e.g. OllyDbg, GDB, WinDbg), aided by operating system state tracking such as Sysinternals' Process Monitor and system call tracing. These methods are used to examine and monitor the run-time actions of malware by identifying the executed instructions and behaviours from the execution trace. They monitor changes made to the system registry, newly installed services and network communications. In addition, dynamic analysis tools can control run-time execution.
OllyDbg, one debugger used in the dynamic approach, is a 32-bit assembler-level debugger for Microsoft Windows executables. It plays an important role in binary code analysis when the source code is not available. It supports reverse engineering by tracing registers, recognizing procedures, API calls, switches, tables, constants and strings, and locating routines from object files and libraries.
The VERA framework presents the flow of a program visually in order to speed up reverse engineering.
Static analysis provides the overall structure of a binary executable, while dynamic analysis provides a tracing feature that monitors and visualizes the execution path of the binary. The idea is to trace the executable together with the graph generated by static analysis, as described above. The relevant portion of the code is highlighted as the program runs through it, which shows how the two results are merged. This method gives an overall picture of program execution, showing the various paths of control flow. The intermediate values of the parameters involved can help the security analyst to understand the details of a suspicious program's execution accurately.
Reverse engineering can be tedious as well as time-consuming. The benefit of this approach is that it gives the security analyst the opportunity to re-examine instructions that were overlooked.
Analysis by Interval Monitoring
XManDroid is a security framework that monitors communication between applications in real time and verifies inter-process communication against a set of pre-defined security policies. It protects Android's permission model against privilege escalation.
Another method is Paranoid Android, which runs security checks on remote servers that host identical copies of the Android OS in a virtual environment. The information needed to replay the execution is gathered on the device and transmitted to the remote server, where the execution is replayed on the virtual Android OS. The purpose of this method is to run security checks on applications while keeping computational and battery overhead on the device to a minimum.
Classification of Reconstructed Malicious Sample
The reconstruction of malicious events involves connecting the relationships between programs, the operating system, hardware and I/O data. Recent mobile malware is designed for a specific platform. The complexity of mobile hardware and software is a challenge for inexperienced forensic analysts, because mobile applications differ from those on personal computer platforms. Mobile program analysis is made even harder by the lack of well-developed analysis tools for mobile platforms.
First, the analyst needs to separate suspicious programs from benign ones, since some malware detection tools are incomplete for forensic purposes. Forensics requires code analysis and event reconstruction. Manual checking should not be forgotten, as it helps the forensic analyst in later in-depth analysis. Malicious programs are usually associated with unusual features, which the analyst should treat as suspicious.
The message digest is an important feature because of its cryptographic properties. The analyst only needs to check whether the application's message digest can be found in a database. If it cannot be found, the application is potentially malicious.
The permission requirement is another important feature, since it is a characteristic unique to Android applications. An application requests its permissions when it is installed, and research has shown that most users may ignore the request, while malware requests permissions such as SMS and Contacts database access.
An Android application is built from components. This is important for the forensic analyst, since a program's character can be inferred from the structure of its components. Because service and receiver components are often used by malware, it is important for the analyst to examine them; potential malicious functions may be found from the intents they receive.
Another method is to list the execution times of all applications in an opinion table, together with their behaviours and services, such as service, filewrite and netwrite operations. For example, if an application uses filewrite but does not employ a service, it does not belong to the bridge family. If it employs a service and filewrite but does not use netwrite, it belongs to the bridge family. The analyst can then judge whether the application is potential malware.
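The two rules stated above can be written as a small membership test over the operations recorded for each application. This is a sketch under assumptions: the table contents and application names are invented for illustration, and the case of an application that uses all three operations is not covered by the text, so treating it as outside the bridge family is an assumption of this sketch.

```python
def in_bridge_family(operations):
    """Apply the stated rule: service and filewrite are used, netwrite is not."""
    ops = set(operations)
    return {"service", "filewrite"} <= ops and "netwrite" not in ops


# Hypothetical opinion table: application name -> operations observed.
OPINION_TABLE = {
    "app_a": ["filewrite"],                         # no service: not in the family
    "app_b": ["service", "filewrite"],              # matches the stated rule
    "app_c": ["service", "filewrite", "netwrite"],  # not covered by the text (assumed: excluded)
}

if __name__ == "__main__":
    members = sorted(name for name, ops in OPINION_TABLE.items() if in_bridge_family(ops))
    print(members)
```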
Malware can also be found by noticing an operation, such as netwrite, being called without any apparent purpose. The analyst can examine the list of execution times and look for suspicious behaviours.
FUTURE WORK, OPEN QUESTIONS AND CONCLUSION
This paper has reviewed various areas in depth, including Android malware, the recognition of suspicious applications, working around anti-forensic defences, recovering malicious code from malware and deciphering the malicious actions that occurred. The flow of the content in this paper is summarized below:
From the forensic point of view, Android malware contains three kinds of information: signature, bytecode and resources. Classic forensic methods often focus on acquiring static data as evidence. In the case of malware, however, the application code and its activity are vital for reconstruction and are often neglected. Application code is raw input that cannot be used for analysis until it is deciphered. Reconstructing a malware event requires bridging the connections between the application, the operating system, input/output data and the hardware architecture.