Grid computing is the concept of using computers at distributed, interconnected sites for computing and resource sharing in order to achieve high performance computing. Grid computing encourages the formation of virtual organizations, in which groups of people, both geographically and organizationally distributed, work together on a problem, sharing computers and other resources such as databases and experimental equipment.
- Problems which could not be dealt with previously due to limited computing resources can now be tackled, e.g. understanding the human genome or searching for new drugs.
- Users can have access to greater computing resources and expertise than are available locally.
- Teams can be formed with people from different fields of study, institutions and organizations to tackle problems that need expertise from multiple fields.
- Specialized, localized experimental equipment can be accessed remotely and collectively.
- Large cooperative databases can be created to hold vast amounts of data.
- Unutilized compute cycles can be employed at remote sites, thereby making more efficient use of computers.
- Processes in an organization can be re-implemented using Grid technology, resulting in drastic cost savings.
Spans a number of administrative domains: resources being shared are owned either by members of the virtual organization or donated by others. This introduces challenging technical and socio-political issues and encourages true collaboration.
- Shared multi-owner computing capability.
- The grid computing software has advanced security and cross-organization management structures.
- It has tools to bring together computers at distributed sites owned by others.
Resources being shared:
- Storage capacities
- Application software
- Network Capacity
- Experimental sensors available at only a few sites.
History of Distributed Computing
Distributed computing goes back a long time; certain forms of distributed computing existed even in the 1960s. Many people were interested in connecting computers for high performance computing. It started with connecting processors/computers locally in the 1960s and 1970s, and now extends to connecting geographically distant computers, which is known as modern-day grid computing. The distributed computing technologies that have been developed concurrently since then, and which rely upon each other, are networks, computing platforms, and software techniques.
Development of computing platforms over the years-
- 1960s onwards: it was recognized that increased speed could potentially be obtained by having more than one processor inside a single computer system.
- 1970s and 1980s: many projects involving parallel computers were rolled out with the emergence of low-cost microprocessors.
- 1990s: a computing platform was formed by interconnecting a group of computers through a network switch.
Development of programming clusters over the years-
A programmer uses message-passing routines between processes to do message-passing programming.
- Late 1980's-early 1990s - PVM (Parallel Virtual Machine)
- Late 1990s - MPI (Message Passing Interface)
- Late 1980s onwards - harnessing the unused cycles of networked computers for high performance computing. When not being used locally, the networked computers can be given over to remote access.
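The message-passing model that PVM and MPI embody can be sketched in a few lines. The example below uses Python threads and queues purely as a stand-in for real MPI send/receive calls between networked machines; the `worker` task is made up for illustration.

```python
# Conceptual sketch of message-passing programming (PVM/MPI style).
# Threads and queues stand in for processes on networked computers;
# real MPI code would use send/receive calls over the network instead.
import threading
import queue

def worker(inbox, outbox):
    # Receive a message containing work, compute, and send the result back.
    data = inbox.get()
    outbox.put(sum(data))

def run():
    inbox, outbox = queue.Queue(), queue.Queue()
    t = threading.Thread(target=worker, args=(inbox, outbox))
    t.start()
    inbox.put([1, 2, 3, 4])   # "send" the work to the worker
    result = outbox.get()     # "receive" the reply
    t.join()
    return result             # returns 10
```

The key point is that the processes share no memory: all cooperation happens through explicit sends and receives, which is what lets the same model scale from one box to machines spread across sites.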
Development of software clusters over the years-
- Mid 1980s - the remote procedure call (RPC) was developed for invoking a procedure on a remote computer. A service registry was introduced in order to locate remote services.
- 1990s - CORBA (Common Object Request Broker Architecture), an object-oriented version of RPC, was developed.
- 2000s - Web services were used to provide remote actions in the manner of RPC, invoked through standard protocols and Internet addressing. XML was adopted into grid computing soon after its introduction.
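The RPC-with-XML pattern described above can be illustrated with Python's standard-library xmlrpc modules: a registered procedure is invoked over HTTP with XML-encoded arguments and results. The `add` procedure and the in-process server are just an illustrative sketch.

```python
# Minimal XML-based RPC sketch: a registered procedure is invoked
# remotely over HTTP, with arguments and results encoded as XML.
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

def add(a, b):
    return a + b

# Start a server on an automatically chosen free port.
server = SimpleXMLRPCServer(("localhost", 0), logRequests=False)
server.register_function(add, "add")
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# The client calls the procedure as if it were local.
client = ServerProxy(f"http://localhost:{port}")
result = client.add(2, 3)   # returns 5
server.shutdown()
```

The client never sees the XML or the HTTP transport; that transparency is exactly what made RPC, and later web services, attractive building blocks for grid middleware.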
History of Grid Computing
Grid computing began in the mid 1990s with experiments using computers spread out across different sites. One of the first was the "I-WAY" experiment, a seminal demonstration conceived for the 1995 Supercomputing conference (SC'95), using 17 computing sites across the US. Computing power for 60+ applications was gathered using 10 existing networks.
- Globus Project - This was the next project undertaken in the field of grid computing at that time, led by Ian Foster, who was also co-developer of the I-WAY demonstration and a founder of the grid computing concept. Globus was a middleware grid computing software toolkit. The toolkit went through several implementation versions, but its basic structural components remained the same: security, data management, execution management, information services, and the runtime environment.
- Legion Project - Conceived in 1993, the Legion project was essentially a software infrastructure project that used an object-based approach to grid computing. Its software development started in 1996, and its first public release took place at the Supercomputing conference in 1997.
- UNICORE (UNIform Interface to COmputing REsources) - Another grid computing project, funded by the German Ministry for Education and Research. It continued with other European funding later on and became the basis of several European efforts in grid computing. It had many similarities with Globus.
Applications of Grid Computing
- e-Science applications - These are computationally intensive and use traditional high performance computing to address large problems. The work may not necessarily be one big problem, but a problem that has to be solved repeatedly with different parameters. Due to the data-intensive nature of grid computing, large amounts of data can be processed and stored.
- e-Business applications - Grid computing can be used to improve business models and practices, and makes sharing of corporate computing resources and databases easier.
Grid, cluster and cloud computing comparison
Cluster Computing Course
In this course one learns about programming for message passing using tools such as MPI, as well as programming for shared memory using threads and OpenMP, given that most computers in a cluster today are multi-core shared memory systems. In cluster computing, network security is not a big issue: an SSH connection to the front node of the cluster is sufficient, since the user is logging into a single computing resource. All the computers are connected together locally under one administrative domain.
Grid Computing course
In this course one learns about running jobs on remote machines, scheduling jobs, and distributed workflow. Knowledge of the underlying grid infrastructure, and of how Internet technologies are applied to grid computing, is also gained. Here network security is a key issue, as shared computing resources and databases are involved.
Cloud Computing Course
The business model for this kind of computing is that services provided on servers can be accessed through the Internet. Cloud computing can be traced back to the early 2000s, when on-demand grid computing was emerging.
Grid Computing versus Cluster Computing
Grid computing and cluster computing have things in common, such as hands-on programming experience, the use of multiple computers, and the requirement for job schedulers to place jobs.
Grid Computing versus Cloud Computing
Both grid computing and cloud computing use the Internet to access resources, but cloud computing is quite distinct from the original purpose of grid computing. Grid computing focuses on collaboration and distributed shared resources, whereas cloud computing concentrates on placing services for users on demand.
Computational Grid Applications
- Biomedical Research
- Industrial Research
- Engineering Research
- Research in Physics and Chemistry.
Other Sample Grid Computing Projects
- SCOOP Project - Southern Coastal Observing and Prediction Program - The main aim of this project was to integrate data from various regional observing systems for real-time coastal forecasts in the southeastern US.
- NEES Project - This was an NSF-funded earthquake engineering project; NEES stands for Network for Earthquake Engineering Simulation. Its main aim was to transform our ability to carry out research vital to reducing vulnerability to catastrophic earthquakes.
- eDiamond Project - This grid computing project, undertaken by Oxford University, was meant to gather and distribute information on breast cancer treatment, enable early screening and diagnosis, and provide medical professionals with tools and information to treat the disease. eDiamond would give patients, physicians, and hospitals fast access to a vast database of digital mammogram images.
- TeraGrid - This project was initially funded by NSF in 2001 to link five supercomputer centers, with hubs in Chicago and Los Angeles. The hubs were interconnected using a 40 Gb/sec optical backplane network.
- Open Science Grid(OSG) - This was started around 2005 and it received $30 million funding from NSF and DOE in 2006.
- UK National Grid Service - This service was founded in 2004 to provide distributed access to computational and database resources, with four core sites: the Universities of Manchester, Oxford and Leeds, and the Rutherford Appleton Laboratory. It grew to 16 sites by 2008 and offered free access to any academic with a legitimate need.
- European multi-national Grids - Several European countries joined together in 2004 to form grid-like infrastructures for sharing computing resources, funded by European programs.
Grid computing infrastructure software
Objective - To create a unified environment for users to gain access to resources at various distributed sites.
- Creates a secure envelope across all transactions.
- Provides single sign-on, enabling access to all available resources and the running of jobs without having to supply additional passwords or account information.
- Provides information on resources and their status.
- Provides APIs and services that enable applications themselves to take advantage of the Grid platform.
- Provides a user-friendly interface.
The Globus project, started in 1996, was one of the most influential projects and resulted in the development of an open-source software toolkit for grid computing. Globus was basically a toolkit of services and packages for creating the basic grid computing infrastructure. Tools were eventually added to this infrastructure, with version 4 being web-service based. The Globus toolkit has five major parts:
- Common run-time for libraries and services.
- Components to provide secure access.
- Execution Management
- Data Management
- Information - enables discovery and monitoring the use of resources and services.
Basic Globus Components
- GSI (Grid Security Infrastructure) - provides a security envelope around grid resources by using public key cryptography.
- GRAM (Globus/Grid Resource Allocation Management) - used to issue and manage jobs.
- MDS (Monitoring and Discovery Service) - used to discover resources and their status.
- GridFTP - used to transfer files between resources; enables large, fast data transfers with security.
Job Submission in Grid Computing
- The types of jobs that can be submitted to a grid include uncompiled programs written in C or C++, Java programs (which need a Java Virtual Machine), and pre-compiled application packages.
- The ways in which jobs can be submitted for collectively solving a problem are: parallel programs, which break the problem down into tasks and submit them to different computers that work on them simultaneously; and parameter sweeps, which run the same job on different computers simultaneously but with different input parameters.
- Streaming - This refers to sending the contents of a stream of data from one location to another as it is generated.
- Batch Submission - Jobs are submitted to system in a group and wait their turn to be executed sometime in the future.
- File Staging - Moving complete files to where they are needed: an input file may need to be moved to where the program is located, and the output files generated need to be moved back to the user or passed as input to other programs.
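The parameter-sweep idea above can be sketched in a few lines of Python; worker threads stand in for the separate grid machines, and the `job` function is a made-up placeholder for the real computation.

```python
# Parameter sweep sketch: the same job runs with different input
# parameters; worker threads stand in for separate grid machines.
from concurrent.futures import ThreadPoolExecutor

def job(param):
    # Stand-in for the real computation submitted with each parameter.
    return param * param

def sweep(params):
    # Each parameter value becomes an independent job instance.
    with ThreadPoolExecutor(max_workers=len(params)) as pool:
        return list(pool.map(job, params))

results = sweep([1, 2, 3, 4, 5])   # [1, 4, 9, 16, 25]
```

Because the job instances are completely independent, parameter sweeps are often called "embarrassingly parallel" and are an ideal fit for loosely coupled grid resources.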
Specifying a Job
- Direct specification of simple jobs using a command such as "globusrun-ws -submit -c prog1 arg1 arg2". This executes the program prog1 with arguments arg1 and arg2 on the local host; globusrun-ws generates a job description with the named program and the arguments that follow.
- Using a job description file - This uses a resource specification language that gives details such as the name of the executable, number of instances, arguments, input files, output files, directories, and environment variables. The resource requirements given in the file cover processor type, number of cores, speed, and memory. Examples of specification languages are RSL v1 and RSL v2 (XML-based, as used in GT3).
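A job description file in an XML-based resource specification language might look like the following sketch. The element names, paths, and values are illustrative only and do not reproduce any particular RSL dialect.

```xml
<!-- Hypothetical job description; element names are illustrative only. -->
<job>
  <executable>/home/user/bin/prog1</executable>
  <argument>arg1</argument>
  <argument>arg2</argument>
  <count>4</count>                    <!-- number of instances -->
  <directory>/home/user/run</directory>
  <stdin>input.dat</stdin>
  <stdout>output.dat</stdout>
  <environment>
    <name>DATA_DIR</name>
    <value>/scratch/data</value>
  </environment>
</job>
```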
Job schedulers are used to allocate work to the computer resources required by a specified job, with the aim of maximizing job throughput. Some traditional scheduling policies are: first in first out, shortest job first, smallest memory first, and priority-based scheduling. The specified job should be matched with the resources available, which requires the characteristics of both the jobs and the resources to be described.
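One of the listed policies, shortest job first, can be sketched in a few lines; the job names and estimated run times below are made up for illustration.

```python
# Toy shortest-job-first policy: dispatch jobs in order of their
# estimated run time (smallest first).
def shortest_job_first(jobs):
    """jobs: list of (name, estimated_runtime_seconds) pairs."""
    return [name for name, runtime in sorted(jobs, key=lambda j: j[1])]

pending = [("simulate", 120), ("compile", 5), ("render", 45)]
order = shortest_job_first(pending)   # ["compile", "render", "simulate"]
```

Shortest job first minimizes average waiting time when run-time estimates are accurate, but it can starve long jobs, which is one reason real schedulers combine several policies with priorities.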
Types of Jobs that can be scheduled:
- jobs that can execute on target resources by naming the input and output files
- OS (Linux) commands
- programs that have not been compiled
- array jobs that can be executed as multiple instances
- series of interdependent jobs
The computer resources that are scheduled are usually individual local computers, sometimes connected in a cluster, and schedulers are usually designed to handle cluster configurations. Computer resource characteristics that schedulers will consider:
- Static Characteristics of Machines like processor type, speed, number of cores, threads, main memory, cache memory etc.
- Dynamic Machine Conditions like load on machine, available disk storage, network load etc.
- Network connections and characteristics
- Characteristics of job like code size, data, expected execution time, memory requirements, location of input files, output files
- User preferences or requirements.
Advance reservation of Jobs
Sometimes jobs may require the use of several resources simultaneously, and may also need network resources for connections between the multiple resources. For this purpose, jobs need to reserve the resources in advance, to be used according to a set schedule.
Advantages of advance reservation in Grid Computing
- A reserved time slot reduces network or resource contention.
- Advance reservation aids jobs in accessing a collection of resources simultaneously.
- Advance reservation helps parallel programming jobs which need to communicate during execution.
- Advance reservation also helps workflow tasks in which jobs must communicate among themselves during execution.
- Without reservation, schedulers schedule jobs from a queue with no guarantee of when they will actually run.
Condor is basically a job scheduler that harnesses the wasted compute power of idle workstations, and it has been hugely successful. It was developed at the University of Wisconsin-Madison in the mid 1980s to convert collections of distributed workstations and clusters into a high-throughput computing facility. Condor schedules jobs in the background on distributed computers, without the user needing an account on the individual computers. Features:
- Finds resources
- Manages batches in queue
- Migrates jobs between machines
- Runs jobs even if machines crash, disk space is exhausted, software is not installed, or machines are far away from each other.
Another job scheduler schedules jobs over distributed sites; it was designed specifically for the grid computing environment and interfaces to Globus components. It has the ability to match jobs to resources using both static and dynamic information about the resources. It provides automatic job migration, with reporting and accounting facilities, and offers both fault tolerance and dynamic job migration. It can be installed on a client machine to interact with the distributed system, or on a server accessed by multiple users.
When working on computers that are part of a grid, the most important concern is the presence of secure connections. The principal objective of a secure connection is to be able to send or receive confidential information over the grid without the information being accessible to people outside the grid, or to anyone not authorized to receive it. Anyone using grid computing to access or send confidential information has to be offered the following features by the connection:
- Data Confidentiality - Information exchange is protected against eavesdroppers and wiretaps.
- Data Integrity - Guarantee that the information was not tampered with in transit.
- Authentication - Process of validation of a particular identity as the sender or receiver of information.
- Authorization - Process of deciding whether a particular identity can be allowed access to a particular resource.
- Encryption - Conversion of the original message into an encrypted message, using an encryption algorithm, to prevent others from reading the information during transmission.
- Decryption - The reverse process of retrieving the original message from the encrypted message, using a decryption algorithm, to put the information back into understandable form.
- Non-repudiation - Making the sender of information fully responsible for the information sent. Accomplished using a Public Key Infrastructure, in which the message is signed (encrypted) with the sender's private key and verified with the sender's public key.
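The encryption/decryption pairing above can be illustrated with a deliberately toy cipher. Real grid security uses public key cryptography with X.509 certificates, not the XOR scheme below, which is shown only because it makes the inverse relationship between the two operations obvious.

```python
# Toy illustration of encryption and decryption as inverse operations.
# XOR with a repeating key is NOT secure; it only demonstrates the idea.
def xor_cipher(data: bytes, key: bytes) -> bytes:
    # XOR is its own inverse, so the same function encrypts and decrypts.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

message = b"confidential result"
key = b"passphrase"
ciphertext = xor_cipher(message, key)    # encryption
recovered = xor_cipher(ciphertext, key)  # decryption reverses it
```

Applying the cipher twice with the same key recovers the original message, which is the round-trip property every encryption/decryption pair must satisfy.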
Grid Computing Infrastructure Software
The grid computing software has gone through several improvement cycles, and work on it started even before grid computing standards were established. Such standards were certainly needed, as standardized protocols and interfaces were essential for wide adoption of grid computing. The foundations for this software were set by standards bodies such as:
- IETF (Internet Engineering Task Force)
- W3C consortium (World Wide Web Consortium)
- OASIS (Organization for the Advancement of Structured Information Standards)
- DMTF (Distributed Management Task Force)
Standards in Web Services World
After the ratification of XML in 1998 and SOAP in 2000, web services were introduced in 2000, with the consequent development of standards such as WSDL and the WS-* family. There are two types of web services: stateless and stateful. A stateless web service does not remember information from one invocation to the next, and does not need to know what happened in a previous invocation by another client. A stateful web service is one that needs to remember information from one invocation to the next.
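The stateless/stateful distinction can be sketched with two toy service classes (hypothetical names, not a real web-service framework): the stateless one depends only on its inputs, while the stateful one remembers information between invocations.

```python
# Toy sketch of the stateless vs. stateful service distinction.
class StatelessConverter:
    def to_fahrenheit(self, celsius):
        # Output depends only on the input; nothing is remembered.
        return celsius * 9 / 5 + 32

class StatefulCounter:
    def __init__(self):
        self.total = 0            # state kept between invocations
    def add(self, amount):
        # Each call builds on the result of the previous calls.
        self.total += amount
        return self.total
```

Calling `StatefulCounter().add(5)` and then `add(3)` on the same instance yields 5 and then 8, because the second invocation sees the state left by the first; the converter gives the same answer no matter what was called before.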
Open Grid Services Architecture (OGSA)
The OGSA was first proposed by Foster et al. in the paper "The Physiology of the Grid" and was announced as a grid computing standard at GGF4 in February 2002. OGSA defines standard mechanisms for formulating, naming, and identifying service instances, and addresses architectural concerns relating to interoperable services for grid computing. The architecture requires stateful services, but does not say how that is to be accomplished.
Open Grid Services Infrastructure (OGSI)
The OGSI was introduced in 2002-03 and was the first attempt to standardize how stateful web services would be implemented. It altered WSDL to enable state to be specified, using a language called GWSDL (Grid Web Services Description Language). OGSI included inheritance, an addressing mechanism, and a message-passing mechanism, and it was implemented in Globus Toolkit version 3 (GT3). However, OGSI was not well received by the community at large, due to its requirement for new tools and its heavily object-oriented approach. It was also seen as specifying too much in one standard, leading to incompatibility issues.
WS - Resource Framework (WSRF)
With WSRF, the web and grid communities converged on the WS-Resource Framework methodology. The WSRF specification was developed by OASIS in 2004; it replaced OGSI and makes the implementation of a stateful web service acceptable. The reason for its acceptance was that it specifies how to make a web service stateful, among other features, without drifting from the original web services concept.
WSRF is a collection of six specifications:
- WS-Resource Properties - These specify how resource properties are defined and accessed. Properties can consist of data values (about the current state of services), metadata (information about the data), and information about the whole resource.
- WS-Resource Lifetime - This specifies methods to manage resource lifetimes.
- WS-Service Group - This group specifies how to group services or WS-Resource together.
- WS-Base Faults - These specify how to report faults.
- WS-Notification - It is a collection of specifications that specify how to configure services as notification producers or consumers.
- WS-Addressing - It specifies how to address web services and provides a way to address a web service/resource pair.
In a grid environment, users may carry out activities which may not be supported by the traditional Globus grid environment, such as:
- Use one or multiple resources to perform tasks.
- Send files from one resource to another where needed, without necessarily logging into computer systems to send files to/from different computer systems.
- Duplicate or partition large data files among separate resources.
- Adapt programs to make them executable on the different types of computers serving as grid resources.
- Adapt programs so that they can automatically discover resources and allocate tasks accordingly.
(Globus) Grid Security Infrastructure (GSI)
The GSI addressed the shortcomings of the traditional Globus grid environment through GSI communication protocols (GT3/GT4), based upon web services security. These protocols are of two types:
- Transport Level Protocols - In these protocols, the whole message is encrypted before being sent and decrypted when received. This approach is used in SSL (Secure Sockets Layer) and TLS (Transport Layer Security, the successor to SSL).
- Message Level Protocols - In these protocols, only the message content, or some particular part of the message, gets encrypted. As a result, different authentication techniques and intermediate message processing methods can be employed. This approach is slower than the transport level protocols, but operates at a higher level.
GSI authentication is similar to regular PKI authentication, which is the process of adjudicating whether a particular identity is actually what it claims to be. Users are given credentials which they use to prove their identity; these credentials consist of an X.509 certificate and a private key. The private key is kept secret by the owner and encrypted with a passphrase. The X.509 certificate must be signed by one of the certificate authorities for grid computing and is available to all. Since a virtual organization needs to control who becomes a member of the organization, groups cannot use existing commercial certificate authorities.
For users who just want to test grid software without having to set up, or have access to, a certificate authority, there is an online certificate-signing service which issues low-quality GSI certificates. This service can be used in training tutorials on grid software, but is insufficient for serious grid computing work, as it does not provide any means of validating user identity.
Various computing resources distributed geographically need their identities verified in an orderly manner in order to join the grid infrastructure. They need their own host certificates, signed by certificate authorities accredited by the grid, in order to take part in grid activities.
As we know, authorization is the process of adjudicating whether a particular identity can access a particular resource, and in what manner. All users and computing resources on a particular grid have their own valid signed certificates as proof of identity, and each user needs authorization to access the resources on the grid. One approach employs a network-accessible (LDAP) database which lists users and their access privileges, and integrates the distinguished-name format found in X.509 certificates.
Gridmap files are used to provide account-name mapping and blanket access; the access privileges are derived from the local system's access control list. Delegation is often used, which gives authority to another identity to act on someone's behalf. Single sign-on, coupled with delegation, enables a user and its agents to acquire additional resources without repeated physical authentication by the user. Proxy certificates, a means of delegation introduced by Globus, give the resource possessing the proxy the authority to act on someone else's behalf. Proxy credentials include a proxy certificate and a proxy private key. The proxy private key is kept secure in an encrypted form based upon a passphrase established by the user, and is decrypted whenever the user performs the PKI authentication protocol.
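A gridmap file is typically a plain text file in which each line maps the distinguished name from a user's X.509 certificate to a local account name. The entries below are a hypothetical sketch; the organization and user names are made up.

```
"/O=Grid/OU=ExampleVO/CN=Alice Smith"  alice
"/O=Grid/OU=ExampleVO/CN=Bob Jones"    bob
```

When a user authenticates with a certificate whose distinguished name appears in the file, requests are executed under the mapped local account, inheriting that account's access control list.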
Security Assertion Markup Language (SAML)
Gridmap files, which provide a mapping of distinguished names to local machine accounts, were a primitive approach that does not scale well and does not include any precise access control or high-level control of authorization for a grid environment. For this purpose, SAML was developed by OASIS to facilitate the exchange of security information between business partners and to provide single sign-on for web users.