Disclaimer: This essay is provided as an example of work produced by students studying towards a computer science degree; it is not illustrative of the work produced by our in-house experts.

Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of UKEssays.com.

Replica System in Distributed File Sharing Environments

Paper Type: Free Essay Subject: Computer Science
Wordcount: 1754 words Published: 12th Mar 2018


AN EFFECTIVE FRAMEWORK FOR MANAGING REPLICA SYSTEM IN DISTRIBUTED FILE SHARING ENVIRONMENTS

Tesmy K Jose, Dr. V. Ulagamuthalvi

 

Abstract-An enhanced file system called the Probabilistic File Share System is proposed to resolve distributed file update issues. The system comprises three mechanisms: the Lazy Adaptive Synchronization Approach, the Standard Replica System Replay Approach and a probabilistic method. The adaptive replica synchronization and Standard Replica System Replay approaches are implemented among the Storage Servers (SSs), which frees the Meta Data Server (MDS) from replica synchronization. Furthermore, a probabilistic control system is deployed to manage replica replacement, overloading and failures: it estimates the probability of replacement, overloading and failure for every replica from its communication overhead and physical information. If the communication overhead or physical failure probability of a replica is high, the replica is removed from the replica environment and a notification message carrying the details of the failed system is sent to its neighbor replicas.

Keywords- Metadata Server, Lazy Adaptive Synchronization, Standard Replica System Replay, Probabilistic Control System.

1. Introduction

As the volume of digital data grows, reliable, low-cost storage systems that do not compromise on access performance are increasingly important. A number of storage systems (e.g., libraries, tape and optical jukeboxes) provide high reliability coupled with low I/O throughput. However, as throughput requirements grow, using high-end components leads to increasingly costly systems. In general, the client contacts the metadata server (MDS), which handles all the properties of the whole file system, to obtain authorization to work on the file and the file’s layout information. Then, after parsing the layout information obtained from the MDS, the client accesses the corresponding storage servers (SSs), which handle file data management on the storage machines, to execute the actual file I/O operations. A number of existing distributed storage systems (e.g., cluster-based and peer-to-peer storage systems) attempt to offer cost-effective, reliable data stores on top of unreliable, commodity or even donated storage components. To tolerate failures of individual nodes, these systems use data redundancy through replication or erasure coding. This approach faces two problems. First, regardless of the redundancy level used, there is always a non-zero probability of a burst of correlated permanent failures up to the redundancy level used; hence the possibility of permanently losing data always exists. Second, data loss probability increases with the data volume stored when all other characteristics of the system are kept constant.

One disadvantage of clusters is that programs must be partitioned to run on multiple machines, and it is difficult for these partitioned programs to cooperate or share resources. Perhaps the most significant such resource is the file system. In the absence of a cluster file system, the individual components of a partitioned program must share cluster storage in an ad hoc manner. This typically complicates programming, restricts performance, and compromises reliability. In addition, the Meta Data Server is responsible for handling all the information about chunk replicas and for triggering replica synchronization when one of the storage servers has been updated. However, saving recently written data to disk becomes a bottleneck for the whole file system, because all other threads must wait until the synchronous flush-and-sync procedure started by one of the SSs has completed.

A Probabilistic File Share System is proposed to resolve the abovementioned issues. It supports lazy and adaptive replica synchronization with replica replacement management among the SSs, and it frees the MDS from replica synchronization and failure maintenance.

2. Literature Survey

Different types of distributed file systems support chunk replication for reliability and high data bandwidth, and they use similar replica synchronization mechanisms. One class of file systems extends the traditional file server architecture to a storage area network (SAN) environment, which allows the file server to access data directly from disk through the SAN. Examples of SAN file systems are IBM/Tivoli SANergy and Veritas SAN Point Direct [8,9].

GPFS supports chunk replication by allocating space for multiple copies of each data chunk on different Storage Servers and updating all locations synchronously. Before a write operation completes, GPFS tracks the updates to the chunk replicas: the file is updated on the primary SS first and the other replicas are then updated [7]. Ceph has a similar replica synchronization policy, i.e., newly written data must be applied to all replicas stored on the different Storage Servers [5].

In the Hadoop file system, the replicated chunks are stored on the Storage Servers. The list of Storage Servers that hold copies of any stripe is produced and managed by the Metadata Server. The Metadata Server therefore handles replica synchronization, which is triggered whenever new data is written to any of the replicas [4]. In GFS, the Metadata Server computes the location and data layout among the various chunk servers. Every chunk is replicated on multiple chunk servers, and replica synchronization is done by the Metadata Server (MDS) [6]. The Lustre file system, which is a parallel file system, has a similar chunk replication mechanism [10].

Researchers have successively presented the MinCopysets and Copysets replication techniques, which improve data durability (i.e., reduce data loss) while retaining the benefits of randomized load balancing by using a derandomized replica placement policy. However, they did not describe algorithms for replica synchronization or replica replacement [3,2].

3. Proposed System

3.1 Probabilistic File Share System Architecture

The probabilistic file share system copies the locations of all replicas belonging to the same file chunk and distributes them to the Storage Servers (SSs) where the replicas are stored. Fig. 1 shows the architecture of the probabilistic file share system. The probabilistic control system calculates the failure rate of every replica in the probabilistic file share system environment. To calculate the failure rate, the system examines the communication overhead of each replica and also obtains its CPU and memory utilization. In this way the proposed system maintains better data consistency in the distributed file sharing environment.

Fig. 1 Probabilistic File Share System Architecture

3.2 Data Updating

Fig. 2: Adaptive Synchronization Approach

When processing a write request, the probabilistic file share system uses lazy replica synchronization. The system first completes the write operation, and each update in the probabilistic file share system storage is then replicated using adaptive replica synchronization. The adaptive replica synchronization approach copies each modification within the storage management of the distributed file system: the primary replica forwards the update to replica n, and replica n returns an acknowledgement to the primary replica.
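The write path described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the class names `StorageServer` and `PrimaryReplica`, the in-memory `chunks` dictionary and the `pending` queue are all hypothetical, and the sketch covers only the lazy part of the approach (acknowledge first, propagate later), since the text does not specify how the adaptive schedule is chosen.

```python
from dataclasses import dataclass, field


@dataclass
class StorageServer:
    """Hypothetical in-memory stand-in for a Storage Server (SS)."""
    name: str
    chunks: dict = field(default_factory=dict)  # chunk_id -> data


@dataclass
class PrimaryReplica:
    """Primary SS: completes writes locally, synchronizes replicas later."""
    server: StorageServer
    replicas: list
    pending: list = field(default_factory=list)  # updates not yet propagated

    def write(self, chunk_id, data):
        # Complete the write on the primary only, then acknowledge.
        self.server.chunks[chunk_id] = data
        self.pending.append((chunk_id, data))
        return "ack"

    def synchronize(self):
        # Lazy step: forward each queued update to every replica and
        # count the acknowledgements that come back to the primary.
        acks = 0
        for chunk_id, data in self.pending:
            for replica in self.replicas:
                replica.chunks[chunk_id] = data
                acks += 1
        self.pending.clear()
        return acks
```

In this sketch a call to `write` returns before any replica has seen the data, and a later call to `synchronize` brings the replicas up to date. When `synchronize` runs is left open here; that decision is assumed to be the adaptive part of the approach.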

3.3 System Crash Handling

The probabilistic file share system adopts a deferred replica synchronization mechanism for reconstructing lost file updates, i.e., only the primary Storage Server manages the latest data snapshot, which reduces write latency, and the synchronization of replicas to the other SSs is conducted later along the timeline. The Meta Data Server buffers the latest write requests in memory; when the number of cached requests is larger than a predefined threshold, the MDS directs the SSs to perform regular replica synchronization, so that the cached requests can be removed from memory.
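The threshold rule for the MDS request cache might look roughly like the following Python sketch. The class name `MetadataServer`, the `sync_callback` hook and the shape of a cached request are hypothetical names introduced for illustration; the idea that the still-cached requests are the ones replayed after a crash is an assumption drawn from the surrounding description, not a stated detail of the system.

```python
class MetadataServer:
    """Hypothetical MDS that buffers recent write requests in memory."""

    def __init__(self, threshold, sync_callback):
        self.threshold = threshold          # predefined limit on cached requests
        self.sync_callback = sync_callback  # stand-in for "direct the SSs to synchronize"
        self.cached_requests = []

    def record_write(self, chunk_id, offset, length):
        self.cached_requests.append((chunk_id, offset, length))
        # Synchronization is triggered once the number of cached
        # requests is larger than the threshold.
        if len(self.cached_requests) > self.threshold:
            self.flush()

    def flush(self):
        # Hand a copy of the cached requests to the SSs, then drop them.
        self.sync_callback(list(self.cached_requests))
        self.cached_requests.clear()

    def requests_to_replay(self):
        # ASSUMPTION: requests still cached are those whose replicas may
        # be stale, so they are the candidates for replay after a crash.
        return list(self.cached_requests)
```

A larger threshold means fewer synchronization rounds but more requests to replay after a crash, so in this sketch the threshold acts as the knob between write latency and recovery work.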

3.4 SS’s Failure and Replacement

The proposed file sharing system includes a probabilistic control system that examines the system details and the communication of every replica. The probabilistic control system keeps a replacement list that stores system details such as CPU utilization and memory utilization. Using this information, the probabilistic control system measures the failure rate of each replica. If the communication overhead or physical failure probability of a replica is high, the replica is removed from the replica environment and a notification message carrying the details of the failed system is sent to its neighbor replicas.

Figure 3: Illustration of the Probabilistic Control System

Figure 3 illustrates the replica replacement management process. The following function is used to measure the failure rate of a replica:
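As a rough illustration of how a replacement list could drive replacement decisions, consider the Python sketch below. Everything in it is an assumption made for the example: the `ReplicaStats` record, the equal-weight average in `failure_score`, the 0.8 threshold and the message format are hypothetical and are not the failure-rate function used by the proposed system.

```python
from dataclasses import dataclass


@dataclass
class ReplicaStats:
    """One hypothetical entry of the replacement list."""
    name: str
    cpu_utilization: float     # fraction in [0, 1]
    memory_utilization: float  # fraction in [0, 1]
    comm_overhead: float       # assumed normalized to [0, 1]


def failure_score(stats, weights=(1 / 3, 1 / 3, 1 / 3)):
    # ASSUMPTION: a plain weighted average of the three measurements.
    # It stands in for the system's own failure-rate function.
    w_cpu, w_mem, w_comm = weights
    return (w_cpu * stats.cpu_utilization
            + w_mem * stats.memory_utilization
            + w_comm * stats.comm_overhead)


def select_replacements(replacement_list, threshold=0.8):
    # Replicas whose score exceeds the (assumed) threshold are replaced.
    return [s.name for s in replacement_list if failure_score(s) > threshold]


def notify_neighbors(failed_name, replacement_list):
    # Build one notification per neighbor of the failed replica.
    return {s.name: f"replica {failed_name} scheduled for replacement"
            for s in replacement_list if s.name != failed_name}
```

For example, given a list with one lightly loaded and one heavily loaded replica, `select_replacements` returns only the heavily loaded one, and `notify_neighbors` produces a message for each remaining replica.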

5. Conclusion

This work proposed a new probabilistic file share system. The modified lazy adaptive synchronization approach updates the data in the Storage Servers and requires less I/O execution time, computation and storage than other approaches. The standard replica system replay approach handles crashes of Storage Servers and recovers the lost data. Finally, a probabilistic control system is deployed in the new probabilistic file share system; replica failure calculation and replacement management are governed by this probabilistic control system.

 
