Virtualization Techniques And Cloud Computing Computer Science Essay


Virtualization enables multiple operating systems to run on a single physical machine, allowing limited resources to be shared among multiple users at lower cost. There are two main trends in the development of virtual machine systems: full system virtualization and para-virtualization. Virtualization techniques are assisted at the software level as well as the hardware level.

Cloud computing is another contemporary topic closely linked with virtualization. Cloud computing can be seen as a development of grid computing, parallel computing and distributed computing. It provides Infrastructure as a Service (IaaS), Platform as a Service (PaaS) and Software as a Service (SaaS). Various virtualization techniques can be used at different levels to enhance and improve scalability and availability in cloud computing.

This survey will explore the possibilities of using well-known virtualization techniques, especially Xen para-virtualization, to overcome the limitations of cloud computing and enhance the performance of a cloud.



Virtualization, Cloud computing, Xen, Grid computing



\IEEEPARstart{A}{long} with the technological advancement of the present day, many computer users face difficulty in maintaining their hardware. It has become mandatory to provide separate machines to all employees, to constantly add new servers to cater for increasing data, and to address various user requirements at once. Sharing a single machine is clearly not adequate today, so one might argue that it is crucial to adopt multiple computer systems. However, physically purchasing these devices is hard to justify, since the capital expenditure and maintenance cost of the hardware is very high and will eventually turn out to be a waste.

Virtualization allows one computer to be shared among several users. This guarantees the multiple computer systems needed, with less expenditure. Virtualization provides an excellent solution to the lack of resources, along with additional benefits. It adds reliability to a system: it ensures safe migration from one server to another at times of hardware failure. Security can be enhanced, since untrusted applications can be sandboxed. Data and performance isolation is provided by virtual machines, because each application has its own operating system and is isolated from other applications. Furthermore, the system resources are accessed separately by each operating system, so thrashing in one application has minimal effect on applications running on other operating systems. As defined by Popek and Goldberg, classical virtualization means that applications running in the virtualized software have equivalent execution, performance and safety compared to the real environment\cite{classicVirt}.

Virtualization can be supported by software or hardware. Intel and AMD have included virtualization support in their CPU designs\cite{HWVirt}. These hardware implementations can simplify the design of a stand-alone virtual machine monitor (VMM). Techniques such as paravirtualization and full virtualization provide software emulations of the hardware and so enable virtualization.

Another popular technology advancing in the distributed environment is cloud computing. It implements a model to distribute and consume resources, usually over the Internet, which scales dynamically through the use of virtualized resources. Cloud computing can enhance the availability of IT resources, and it is possible to provide services without any manual interaction with service providers. Cloud computing can provide many services, Infrastructure as a Service (IaaS), Platform as a Service (PaaS) and Software as a Service (SaaS) being the most popular among them. There are public clouds, private clouds and hybrid clouds. Leading commercial clouds in use today are EC2 from Amazon, Azure from Microsoft, AppEngine from Google, Blue Cloud from IBM and CRM from Salesforce. Abicloud, Eucalyptus, Nimbus and OpenNebula are several popular cloud platforms in use today.

Virtualization is heavily used in cloud computing, and grid computing and cloud computing are closely linked. Research has shown that virtualization techniques can be used in the services provided by clouds. Virtualization at the grid level could boost the performance of clouds. Furthermore, many applications running in clouds can be virtualized to increase availability for cloud clients.

This survey will illustrate the facts mentioned above in detail. Section 1 contains a detailed description of prevailing virtualization techniques, Section 2 provides insight into cloud computing platforms, and Section 3 describes several ways virtualization can be used in cloud computing to enhance performance, scalability and availability. Finally, the survey closes with the conclusion.

\section{Virtualization Techniques}

\subsection{Full virtualization}

Full system virtualization provides an exact copy of the system's hardware. The operating system running on top of it need not be changed, because the virtual hardware is exactly similar to the original hardware. The underlying hardware needs to be virtualizable in order to implement full virtualization.

One of IBM's prime virtualizable architectures is the Virtual Machine Facility/370 (VM/370), where the underlying hardware is the IBM System/370\cite{HWVirt}. This VMM consists of three main components:


\begin{itemize}
\item \textbf{Control Program:} handles all the duties of the VMM and starts the virtual machines.

\item \textbf{Conversational Monitoring System:} the operating system which runs in all the virtual machines.

\item \textbf{Remote Spooling and Communication Subsystem:} handles network communication between virtual machines, and also workstations.
\end{itemize}


The VM/370 uses shadow page tables to translate virtual addresses. The guest OS assumes that the whole memory is allocated to it; thus dynamic address translation takes place with the assistance of shadow page tables.\\
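The shadow-page-table mechanism can be sketched as a toy model (all names here, such as \textsl{ShadowPager}, are illustrative assumptions, not VM/370 code):

```python
# Toy model of shadow page tables: the guest keeps its own
# virtual -> guest-"physical" mapping, the VMM keeps a
# guest-physical -> machine-frame mapping, and the shadow table
# composes the two so the hardware can walk it directly.

class ShadowPager:
    def __init__(self, guest_page_table, p2m):
        self.guest_pt = guest_page_table  # guest virtual page -> guest "physical" frame
        self.p2m = p2m                    # guest "physical" frame -> real machine frame
        self.shadow = {}                  # guest virtual page -> machine frame

    def translate(self, vpage):
        """Resolve a guest virtual page, filling the shadow entry on a miss."""
        if vpage not in self.shadow:              # shadow miss: VMM intervenes
            gframe = self.guest_pt[vpage]         # walk the guest's own table
            self.shadow[vpage] = self.p2m[gframe] # compose with the VMM mapping
        return self.shadow[vpage]                 # later accesses hit directly

guest_pt = {0: 7, 1: 3}   # the guest believes frames 7 and 3 are its memory
p2m = {7: 42, 3: 19}      # the VMM actually placed them at machine frames 42 and 19
pager = ShadowPager(guest_pt, p2m)
```

The costly part the 370-XA assists avoid is exactly the VMM intervention on a shadow miss; once an entry is filled, translation proceeds without the VMM.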

Simulating many of the instructions in the architecture for each virtual machine in this manner was considered costly. This led to the IBM System/370 Extended Architecture (370-XA) VMM. To improve performance, specific CPU instructions were allowed to execute safely in hardware instead of being simulated for each virtual machine. Extensions called \textsl{"assists"} were incorporated to enable the new feature. Assists introduced a new execution mode called \textsl{"interpretive execution"}, which is entered via a privileged \textsl{"Interpretive Execution Start"} instruction. In interpretive mode, privileged instructions can be recognized and executed directly. A state description is maintained in order to check the current state and to handle any exceptions. 370-XA was able to bypass the costly references to shadow page tables by providing assists in the hardware that allowed trusted guests to directly access the memory mapping tables.\\

VMWare uses full virtualization. VMWare hosts each guest operating system in a separate, secure virtual machine. Each virtual machine (VM) has its own virtual CPU, memory and disk, and all of the virtual hardware is mapped to the computer's real hardware. Each virtual machine also has its own BIOS that can be edited. VMWare is recognized as a virtualization tool that is easy to implement on the x86 architecture, which, unlike the other architectures discussed up to this point, was not initially built for virtualization. Its \textsl{"Hosted Virtual Machine Architecture"} is the reason for this: VMWare lets the host operating system provide abstractions for the various devices available on the x86 architecture. Furthermore, an operating system driver named VMDriver, installed in the host operating system kernel by VMWare, allows each virtual machine to access the devices on the system faster. The VMDriver puts the physical network card in promiscuous mode and creates a virtual Ethernet bridge that receives all the packets, which are then routed either to the host OS or to a virtual machine. Since network address translation is done in the virtual bridge, each virtual machine is effectively provided with its own IP address. It must also be noted that only a generic set of devices is provided to the virtual machines in VMWare, which simplifies its implementation on the x86 architecture. It is also possible to take snapshots of a virtual machine in VMWare, which allows rolling back to any saved point in the virtual machine's execution.
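The bridge's routing decision can be sketched as a toy model (the \textsl{VirtualBridge} class and the MAC addresses are illustrative assumptions, not VMWare's implementation):

```python
# Toy sketch of a virtual Ethernet bridge: the physical NIC is read in
# promiscuous mode, and each frame is routed to whichever endpoint
# (the host OS or a VM) owns the destination MAC address.

class VirtualBridge:
    def __init__(self):
        self.ports = {}  # MAC address -> endpoint name ("host" or a VM id)

    def attach(self, mac, endpoint):
        self.ports[mac] = endpoint

    def route(self, frame):
        """frame is (dst_mac, payload); returns the endpoints that receive it."""
        dst, _payload = frame
        if dst == "ff:ff:ff:ff:ff:ff":          # broadcast goes to every port
            return sorted(set(self.ports.values()))
        return [self.ports[dst]] if dst in self.ports else []

bridge = VirtualBridge()
bridge.attach("aa:aa", "host")
bridge.attach("bb:bb", "vm1")
```

Because the decision is made per destination address inside the bridge, each VM behaves as if it had its own network identity on the wire.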



\subsection{Paravirtualization}

Full virtualization as discussed above can be complex on the now most widely used computer architecture, x86. Since full virtualization intercepts every hardware communication of the guest operating systems, its performance degrades, and trying to overcome these drawbacks makes the VMM more complex rather than more manageable.

A well-established solution to overcome these pitfalls is paravirtualization. In paravirtualization the hardware is not emulated exactly, and the guest operating system is not used unmodified as in its original design. Instead, the operating system is modified so that when executing a protected task it is directed to the virtual machine monitor rather than to the CPU.

Denali is one of the earlier systems which used paravirtualization\cite{denali}, so most virtual instructions are executed directly on the physical processor. However, the Denali architecture is different from the x86 architecture. It has virtual instructions which are similar to the system calls in an operating system, but which operate at the physical level rather than at the OS level; these instructions are non-blocking at the physical level. Denali has virtual registers as a lightweight method for passing data between the virtual machine monitor and the virtual machines, and these registers are mapped to a predefined region of a virtual machine's address space. It must be noted that many architectural features are modified or eliminated in this technique. However, this deviation from the standard virtualization approach of creating an exact replica of the system has helped Denali to boost the performance of the virtual system.

For example, the idle time an OS spends waiting on I/O devices is eliminated by Denali by introducing a new \textsl{"idle"} instruction, which the OS in a virtual machine must call when it enters such situations. When the idle instruction is called, a context switch happens and the VMM takes control; the VMM can then schedule tasks for the other VMs. Another architectural difference in Denali is the queueing of interrupts. When the number of VMs increases, the probability is low that a particular interrupt is for the currently running VM, so context switching becomes expensive in this case. Denali therefore simply queues the interrupts and dispatches them when the related VM is running. Along with this modification, Denali has to provide new interrupt semantics to notify a VM that an interruption occurred recently. A VM requests that a timer interrupt be scheduled with the VMM, saving a great deal of context-switching processor time. Denali constrains each virtual machine to a single address space and does not use virtual memory. Since the focus of Denali was to support small-scale virtual machines, the development team accepted this constraint and proposes to run any application causing problems in its own virtual machine. System information is provided in read-only virtual registers. Just as in VMWare, Denali chose to give the VMs access only to generic I/O devices, reducing complexity.
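Denali's interrupt batching can be sketched as a toy model (the \textsl{InterruptQueue} class is an illustrative assumption, not Denali code):

```python
from collections import defaultdict, deque

# Toy model of Denali-style interrupt batching: an interrupt for a VM
# that is not currently running is queued instead of forcing a context
# switch, and the whole batch is delivered when that VM is scheduled.

class InterruptQueue:
    def __init__(self):
        self.pending = defaultdict(deque)  # vm id -> queued interrupts
        self.running = None                # the VM currently on the CPU

    def raise_irq(self, vm, irq):
        """Deliver immediately if vm is running, otherwise queue it."""
        if vm == self.running:
            return [irq]               # delivered now, no switch needed
        self.pending[vm].append(irq)   # deferred: no context switch
        return []

    def schedule(self, vm):
        """Switch to vm and drain its queued interrupts as one batch."""
        self.running = vm
        batch = list(self.pending[vm])
        self.pending[vm].clear()
        return batch
```

The saving comes from the `schedule` step: one context switch delivers every deferred interrupt for that VM at once.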

Certain drawbacks can be highlighted in Denali\cite{xenArt}. With all the above modifications it is not an easy task to run an existing operating system on Denali; to compensate for this drawback Denali provides its own guest operating system, called \textsl{"Ilwaco"}. Denali does not target existing application binary interfaces. The Denali VMM performs all paging to and from the disk, where malicious VMs can misuse CPU time and disk bandwidth. Furthermore, virtualizing the namespaces of all machine resources can be viewed as an unnecessary overhead.

One of the well-established para-virtualization techniques in use today is Xen\cite{xenArt}\cite{xenArtRepeat}. All the operating systems run on top of a hypervisor, similar to the VMM we have been referring to until now. The guest operating systems need not go through the hypervisor for every communication with the hardware; for validation and management purposes, however, the hypervisor needs the guest operating systems to call the Xen API. To achieve this "Xen-aware" execution, the guest operating systems need to be modified in some parts. However, these modifications are not as drastic as they were in Denali. Since Xen is designed to be binary compatible, a Xen version of Linux, also known as XenoLinux, or of Windows can run on top of the hypervisor straight away.







In a Xen-installed system, Xen itself is booted first. Xen then boots the Dom0 guest OS: a privileged guest OS running in domain 0, the modified OS supported by Xen, which creates and manages all the other guest operating systems in Xen (DomU). The environments in which guest operating systems run are called domains and are referred to as virtual machines. Dom0 is the only guest OS which has access to the real hardware. The guest operating systems use the Xen API to call the hypervisor through hypercalls, which are similar to system calls in an OS. In a hypercall, the Xen hypervisor takes control, performs the task and returns control to the guest OS. There are no hardware interrupts in Xen; they are replaced by a lightweight event system. XenStore is a data structure in shared memory, maintained by Dom0, containing the details about the devices which can be accessed by a guest OS. Since devices listed in the XenStore are not real devices, guest operating systems use split device drivers\cite{xenGuide}. These are communication channels between Dom0 and the guest operating systems: the front end presents the device to the guest, and the back end uses Xen shared memory to transfer the data.
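The split-driver arrangement can be sketched as a toy model (the \textsl{SharedRing}, \textsl{FrontEnd} and \textsl{BackEnd} names are illustrative assumptions, not Xen's actual ring protocol):

```python
from collections import deque

# Toy sketch of a Xen-style split driver: the front end (in a DomU)
# places requests on a shared ring, and the back end (in Dom0, which
# owns the real device) consumes them and posts responses. A simple
# list of "events" stands in for Xen's event channels.

class SharedRing:
    def __init__(self):
        self.requests = deque()
        self.responses = deque()
        self.events = []          # stand-in for event-channel notifications

class FrontEnd:
    def __init__(self, ring):
        self.ring = ring

    def submit(self, request):
        self.ring.requests.append(request)
        self.ring.events.append("req")   # notify the back end

class BackEnd:
    def __init__(self, ring, device):
        self.ring = ring
        self.device = device             # the real device; only Dom0 touches it

    def service(self):
        while self.ring.requests:
            req = self.ring.requests.popleft()
            self.ring.responses.append(self.device(req))
            self.ring.events.append("rsp")

ring = SharedRing()
front = FrontEnd(ring)
back = BackEnd(ring, device=lambda req: ("done", req))
```

The guest never touches the device object directly; everything crosses the shared ring, which is the essence of the front-end/back-end split.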

\section{Cloud computing}

In general terms, cloud computing is a technology for lowering computing costs by providing computational resources in a shared infrastructure. This sharing saves capital expenditure through simple payment for the usage of the resources. Furthermore, it reduces deployment times, since companies can sign up and start using the resources straight away. Virtualization facilitates the cloud computing paradigm in many ways: cloud clients can be provided with an isolated execution environment regardless of the dynamic hardware beneath. One might argue that virtualization is not mandatory in cloud computing. That argument cannot be dismissed; however, certain performance improvements can be expected with virtualization in cloud computing\cite{cloudtrend}.

As mentioned earlier in this survey, cloud computing provides three kinds of service models: IaaS, PaaS and SaaS. IaaS products deliver a full computer infrastructure to the users via the Internet. PaaS offers a full or partial application development environment that cloud clients can access and utilize online in collaboration with others. SaaS products provide an application, such as ERP or CRM, via the Internet. Application integration and support, availability and flexibility improve with cloud computing.

Three kinds of clouds can be seen when deploying: public clouds, private clouds and hybrid clouds\cite{compareCloud}. In a private cloud, the infrastructure is owned by a single management group. Cloud service sales organizations own public clouds and attempt to sell cloud computing services to the industry. A hybrid cloud is a combination of public and private clouds, combined with some standards or special techniques, between which data and applications can be transplanted.

\subsection{Abicloud cloud computing platform}

Abicloud can be used to integrate both public and private clouds in a single environment. Abicloud has a powerful web-based management function, and deploying a cloud is very easy in Abicloud due to its powerful encapsulation manner: deploying a new cloud only requires dragging a virtual machine in the web interface, in contrast to the numerous command lines needed on other platforms. Abicloud is built on the assumption that each cloud client will have different requirements; hence a homogeneous cloud computing core is provided with an extensible infrastructure. Abicloud also provides various interfaces which make it easier to support third-party products. These features enable Abicloud to provide a cloud computing platform which can be customized according to client requirements. If the Abicloud platform is used in a cloud, the cloud can be packed and redeployed on any other Abicloud platform. Abicloud is built on Java. These features make Abicloud easy to deploy on many platforms and flexible to use.

\subsection{Eucalyptus cloud platform}

Eucalyptus is an open source private cloud platform \cite{eucalyptus}. It implements virtualization based on Linux and Xen, similar to EC2. The system allows users to start, control, access and terminate virtual machines using an emulation of Amazon EC2's SOAP and Query interfaces. Eucalyptus implements each high-level system component as a stand-alone web service. Because of this, the web services expose a well-known API in the form of WSDL, and existing web-service features can be used in the operations. There are three high-level components in Eucalyptus:


\begin{itemize}
\item Instance Manager: controls the execution of the VMs.

\item Group Manager: manages virtual instance networks and scheduling.

\item Cloud Manager: manages the cloud as seen by the users and the administrators.
\end{itemize}


A Eucalyptus cloud has only one Cloud Manager. The client interface is the connection between the inside and the outside of Eucalyptus, through which users can access all kinds of resources on the cloud.




\caption{Eucalyptus cloud computing platform \cite{compareCloud} }



\subsection{Nimbus cloud computing platform}

Nimbus is a cloud computing platform which provides Infrastructure as a Service. The basic design pattern allows users to obtain infrastructure from the cloud and deploy systems using virtual machines. The platform is a combination of many components which can be classified into three main categories: client supported modules, service supported modules and background resource management modules. The platform includes an independent virtual machine manager called the Workspace service module, which can be reached over various remote protocols; each application has a front end implemented in WSDL, which is the protocol accessed by the Workspace service module. The Cloud client module enables user interaction with the established cloud. The IaaS gateway provides access to other cloud platforms such as EC2, and there is a remote management interface to implement security protocols. A context agent module manages the startup services of large clusters and supports the client cloud; this module can run on both the Nimbus cloud platform and EC2. Several advantages of integrating the EC2 gateway into this platform can be listed: running public Amazon virtual machine images on the Amazon cloud platform, checking the status of a homogeneous wireless sensor network, and notifying the user of the public IP of a virtual machine, through the characteristics of the resources, when it becomes available.

\subsection{OpenNebula cloud computing platform}

OpenNebula makes extensive use of virtualization infrastructure. It permits users to deploy virtual machines on physical devices, turning the users' data centres or clusters into a more flexible virtual infrastructure. It can be viewed as a virtual infrastructure management tool, useful for building scalable cloud computing environments. Control interfaces in OpenNebula allow users to access the services. The platform can be considered a centralized management of virtually and physically distributed devices. From the point of view of infrastructure users, OpenNebula is scalable and can respond rapidly to user requirements. Compared with Eucalyptus, OpenNebula is stronger in its support for private cloud platforms and in the dynamic management of the scalability of the virtual machines on clusters. Refer to Table 1 for a summary.




\begin{table}[h]
\centering
\begin{tabular}{|l|l|l|l|l|}
\hline Feature & Abicloud & Eucalyptus & Nimbus & OpenNebula\\
\hline cloud character & public/private & public & public & private \\
\hline scalability & scalable & scalable & scalable & dynamical, scalable \\
\hline cloud form & IaaS & IaaS & IaaS & IaaS \\
\hline compatibility & not supported by EC2 & not supported by EC2, S3 & supports EC2 & open, multiplatform \\
\hline deployment & pack and deploy & dynamic deployment & dynamic deployment & dynamic deployment \\
\hline deployment manner & web interface drag & command line & command line & command line \\
\hline transplantability & easy & common & common & common \\
\hline VM support & VirtualBox, Xen, VMWare, VM & VMWare, Xen, KVM & Xen & Xen, VMWare \\
\hline structure & open platform, encapsulated core & module & lightweight components & module \\
\hline web interface & libvirt & Web Service & EC2 WSDL, WSRF & libvirt, EC2, OCCI, API \\
\hline reliability & - & - & - & rollback host and VM \\
\hline OS support & Linux & Linux & Linux & Linux \\
\hline development language & ruby, C++, python & Java & Java, python & Java \\
\hline
\end{tabular}
\caption{Summary of cloud computing platforms \cite{compareCloud} }
\end{table}


\section{Cloud computing and virtualization}

This section mainly focuses on a few pieces of research that have been done on virtualization in cloud computing.

\subsection{Virtual machine scalability for cloud computing workloads}

Most of the underlying hardware in a cloud is based on multi-core processors. VMs are created on top of these systems and applications execute in isolation in each VM. While virtualization adds many benefits to cloud computing, it is also important to understand the overhead of virtualization. A typical cloud computer is recognized to have four types of interactions: intra-processor, inter-processor, across a Local Area Network (LAN), and across a Wide Area Network (WAN) \cite{VTCC1}. Virtualization introduces overheads in each of these interactions.

It is straightforward to characterize the performance of a single operating system image based SMP system, but it is difficult to measure the performance of a cloud. Virtualization in a cloud adds an additional layer between the hardware and the operating system. These methods are proven to improve resource utilization capabilities; however, the resulting performance might not be comparable with native performance. This survey will discuss an evaluation of the CPU, memory and network features of VMs on Intel multi-core processors: a piece of research conducted by M. Hasan Jamal, Abdul Qadeer, Waqar Mahmood, Abdul Waheed and Jianxun Jason Ding\cite{VTCC1}.

Their experiment compares an SMP system running a non-virtualized image of Linux with a Xen-based virtualized kernel image. The virtualized instance executes processes in multiple VMs. Figure 3 in \cite{VTCC1} illustrates some important aspects of CPU throughput in VMs: the linear scalability trend of the non-virtualized baseline prevails in the virtualized cases as well. Hence it is deduced that virtualization provides isolation without compromising the linear CPU throughput scalability.

The research measured memory throughput for different data sizes. The different data sizes make it possible to observe the behaviour of the private L1 cache, the shared L2 cache and the shared memory bus in both cases. According to the test results shown in Figure 4 of \cite{VTCC1}, when the data fits in the L1 cache both cases show the highest memory throughput and linear scalability. However, at the L2 cache level, throughput in both the baseline and VM cases is lower than in the previous case and the scaling is not linear; the reason given for this behaviour is that after the saturation point throughput starts to decline due to L2 cache conflicts. When the data size increases further, it is observed that throughput reaches the bus saturation level and drops lower still.
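The kind of data-size sweep described above can be illustrated with a minimal single-core microbenchmark sketch (the buffer sizes and the `memory_throughput` helper are assumptions for illustration; absolute numbers depend entirely on the machine, and small buffers that fit in cache typically show higher apparent throughput than buffers that spill to main memory):

```python
import time

# Sweep memory throughput over increasing buffer sizes: the same total
# amount of data is moved for each size, so differences come from where
# the buffer lands in the cache hierarchy. Illustrative only.

def memory_throughput(buffer_bytes, total_bytes=16 * 1024 * 1024):
    buf = bytearray(buffer_bytes)
    passes = max(1, total_bytes // buffer_bytes)
    start = time.perf_counter()
    for _ in range(passes):
        data = bytes(buf)   # read the whole buffer
        buf[:] = data       # write it back
    elapsed = time.perf_counter() - start
    return (passes * buffer_bytes) / elapsed / 1e6  # MB/s

# Roughly L1-sized, L2-sized, and larger-than-cache buffers.
results = {size: memory_throughput(size)
           for size in (32 * 1024, 1024 * 1024, 8 * 1024 * 1024)}
```

Running the sweep inside a VM and on bare metal, as the cited study does, lets the two resulting curves be compared point by point.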

From the above results, with the benchmarks used in the test, the research group came to the following conclusion: the overhead of multi-threading is not very different from the overhead of virtualization. Hence it can be concluded that virtualization does not add any significant additional overhead to the system.

To measure network I/O throughput, five cases are recognized by the group: first, client and server as multiple pairs of threads on a non-virtualized SMP system; second, client and server in a single VM; third, client and server in different VMs on the same host; fourth, clients and servers on different hosts connected through a GigE LAN, running on SMP systems (baseline) or inside VMs; fifth, client and server on different hosts connected through a WAN, running on SMP systems (baseline) or inside VMs.

Figure 5 in \cite{VTCC1} shows that with client and server in a single VM, throughput increases linearly until it hits the bus throughput limit. Here all the transactions are within a single VM, hence the communication overhead is lower than in the non-virtualized cases. However, poor scalability is observed for client and server in different VMs due to the saturation of the Xen bridge: virtualization becomes a bottleneck when multiple VMs communicate. Two proposed solutions to overcome the bottlenecks caused by virtual bridges are XenWay and XenSocket.

\subsection{Virtualized Software as a Service in cloud computing}

Software as a Service is a popular feature in cloud computing whose main focus is to make software access simpler. However, there are unresolved matters in this service. For example, legacy systems such as desktop applications have to be redesigned to make them web accessible, and the web-based presentation of the service might not be appealing to users who are accustomed to a desktop environment. Another issue is transferring data over the Internet: users will be reluctant to pass on their sensitive information in this manner.

As a solution to these problems, Liang Zhong, Tianyu Wo, Jianxin Li and Bo Li have proposed a virtualization-based SaaS-enabling architecture for cloud computing, named vSaaS \cite{VTCC3}. A detailed description of the proposed architecture follows.

The proposed architecture has six layers: the virtual resource layer, virtual resource management layer, virtual execution layer, virtual display layer, schedule layer and user agent layer. Besides these six layers, a security mechanism and a development utility model are integrated.

The proposed architecture deploys the software dynamically, the reason being that it would be costly to maintain dedicated environments for each user. Hence the software is dynamically deployed and executed without being installed in the running environment. As described in earlier sections, OS-level virtualization is sufficient for this feature. This virtualization enables the user to stream the needed software: while the software executes in the back-end resource pool, virtualized software can start without being fully downloaded, with other needed sections streamed later. Another feature of the proposed architecture is desktop merging. Each user is given a virtual display, similar to a desktop, which merges all the presentation windows of the software instances. To merge the virtual desktop with the local desktop, a VClient needs to be installed on the device.
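The stream-on-demand behaviour can be sketched as a toy model (the `StreamedPackage` class and the block names are illustrative assumptions, not the vSaaS implementation):

```python
# Toy sketch of stream-on-demand execution: the virtualized package is
# split into blocks; the launch block is fetched first so the program
# can start, and the remaining blocks are pulled lazily on first access.

class StreamedPackage:
    def __init__(self, repository, launch_block):
        self.repository = repository   # block id -> block data (the software store)
        self.cache = {}                # blocks fetched so far
        self.fetch(launch_block)       # enough to start; not a full download

    def fetch(self, block_id):
        self.cache[block_id] = self.repository[block_id]

    def read(self, block_id):
        if block_id not in self.cache:  # lazily stream a missing block
            self.fetch(block_id)
        return self.cache[block_id]

repo = {"launcher": b"entry point", "feature-a": b"rarely used code"}
pkg = StreamedPackage(repo, "launcher")
```

Startup latency depends only on the launch block; a feature the user never opens is never transferred at all.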




\caption{vSaaS architecture \cite{VTCC3} }



This model also provides mobile device access to the software. Since all the software runs in the back-end resource pool, the client access device is loosely coupled with the vSaaS. The virtual execution layer in the architecture mainly assists the execution of the virtual software. In a virtualized solution the software is executed on different physical machines; to present the execution to the user, the output must all be gathered and displayed on one monitor, and this troublesome task is handled by the virtual display layer. A user awaiting a service cannot be kept idling for a long time, while at the same time the system needs to balance the workload across the virtual systems and physical devices; this scheduling is handled by the schedule layer. The user agent layer contains many user agents, which reduces the complexity of the user clients: a user client has no strict requirements and can simply access its virtual desktop with a lightweight client. An authentication and authorization framework based on a certificate model is provided for all six vSaaS layers, and system integrity is ensured with an HMAC code. The implementation of the system is as follows. A software repository server called the vSoftware store holds the massive virtual software packages and provides management interfaces to publish virtual software. A separate utility called vSequncer is used to virtualize and pack legacy systems.

The vProcess server can be deployed on Linux or Windows. vProcess uses different protocols to stream the software packages and is responsible for the whole life cycle of the vSaaS. There is a virtual file system, a virtual dynamic library, a virtual registry and configuration, and user data to be handled by vProcess. The vSpace server is the hosting environment for the virtual display instances. The results published in \cite{VTCC3} for the above architecture show that it is a feasible and effective solution for virtualized Software as a Service.

\subsection{Related Work}

Many other aspects of cloud computing have been taken up in research, and new architectures have been proposed. One popular technique related to cloud computing is grid computing, and much research has been carried out on virtualizing the cloud computing platform at the grid level. Virtualized distributed computing infrastructure is used in cloud technologies: \cite{VTCC2} provides an in-depth view of the possibility of using virtualized distributed systems to enhance the performance of cloud computing.

Job scheduling is also an important aspect of virtualization and cloud computing. A successful experiment has been carried out on the Xen Grid Engine \cite{xenschedule}, in which a new method for cluster scheduling using operating system virtualization techniques is introduced.

A Cloud Computing Open Architecture has been proposed\cite{ccoa}. It integrates service-oriented architecture with the virtualization of hardware and software. This architecture can be used in infrastructure clouds and business clouds.


\section{Conclusion}

Virtualization and cloud computing together can lead to phenomenal creations in the IT industry. However, as virtualization increases, the complexity of the cloud could increase. Even so, virtualization adds little overhead to cloud computing and can instead increase its performance. In future, people might be less concerned with owning their own servers and instead simply rely on clouds to provide infrastructure, services and platforms.