An analysis of fractured server distribution.

Published: Last Edited:

This essay has been submitted by a student. This is not an example of the work written by our professional essay writers.


Todays Era is of web services plays important role in our lives , which leads to maximum usage of services that represents bulky data on Servers . These services are often implemented on collections of machines residing at multiple geographic locations such as a set of corporate data centers. This data is controlled by system called as distributed system.

In large-scale distributed systems, the likelihood of some components experiencing failures is quite large. Thus it is important for a distributed storage system to be fault-tolerant. In this paper, we focus on two kinds of failures. First, we consider the possibility that some fraction of the servers can become arbitrarily corrupted (i.e., Byzantine faulty). Second, we also consider the possibility that some of the client processes can become non-responsive (i.e., crash faulty). Replication is a well-known technique for fault-tolerance, but imposes its own costs in terms of additional storage required and the complexity of schemes for keeping

replicas consistent (cf. Chapter 7 of [1]).

Distributed system are long lived which is continued even if the machine were breakdown or are decommissioned. Failed nodes must be replaced with new machines and of course we have to Add nodes to the system for increased storage or throughput. Thus, the systems need to be reconfigured regularly so that they can continue to function. This paper provides a complete solution for reliable, automatic reconfiguration in distributed systems.

Our approach is unique because .

It delivers the abstraction of a global data consistently of the system membership. This abstraction

Make simpler the design of applications that use it, it allows different nodes to agree on which servers

are responsible for which subset of the service.

. It is aimed to work at large scale, e.g., tens or hundreds or thousands of servers. Support for large

scale is essential since systems today are already large and expected to improve further.

. It is secure against Byzantine (arbitrary) faults. Handling Byzantine faults is important because it

captures complex failure modes that have been described for our target arrangements. For

instance, a recently report says about Amazon’s S3 revealed that a bit flip in a server’s internal state triggered it to

send messages with the wrong info [2]. Furthermore, the Byzantine fault model makes it possible

to abide malicious intrusions where an attacker gains control over a number of servers.

Earlier, proposals for keeping track of a dynamic system membership doesn’t offer all three properties. Many do

not tolerate Byzantine failures, (e.g., [3],[4]). Some of them handle Byzantine faults provide consistency but the problem is they provide for minimum number of system (e.g.,[5]),

while others consistency to achieve scalability (e.g.,[6]). The one exception is Census [7]; this builds on our

Techniques for flexible reconfiguration, but uses other mechanisms to track system membership.

Our solution has two parts. The first is a membership service (MS) that keep tracks and replies to membership

changes. The MS works typically automatically, and requires only minimal human interference; this way we can decrease physical configuration errors, which causes distraction in computer systems [8]. Sporadically, the MS

Announces a new system membership; this way it provides a view of the set of available servers. The choice of strong consistency makes it easier to implement applications. We run the MS on a small group of replicas and use a

number of protocols [9], [10], [11], [12] to enable the MS to tolerate malicious attacks; we were able to take advantage of protocols developed by others but combine them in different ways. . The design provides scalability to a large number of nodes, most of which are clients of the MS. In addition, it avoids overloading the servers that form the MS by offloading expensive tasks to other nodes.

When there is a reconfiguration, the MS may need to move to a new group of servers. This way, we allow the

system to continue to work correctly. We demonstrate a design for reconfiguring the Byzantine-fault-tolerant group.

Tracking membership is only part of what is needed for automatic reconfiguration. In addition, applications need to

respond to membership changes appropriately. Therefore, the second part of our solution talks the

problem of how to reconfigure applications automatically as system membership changes. We present a storage

system, that provides Byzantine-fault-tolerant replicated storage with strong consistency. dBQS serves as an

example application that uses the membership service and takes advantage of its strong consistency guarantees.

Additionally, dBQS is important on its own for two reasons. First, to develop dBQS we had to extend existing Byzantine quorum protocols, originally designed for a static replica set, to enable them to be reconfigurable while continuing to provide atomic semantics across changes in the replica set.

Second, dBQS implements the popular DHT interface[13], but differs from previous DHTs by handling Byzantine faults and focusing on strong semantics, which can facilitate design of applications that build on a DHT interface. In addition, the techniques used to handle membership changes in dBQS could be generalized to other applications. We have implemented the membership service and dBQS. We present performance results that show that the MS is able to manage a large system and reconfigure in a reasonably short interval, and that the impact of reconfiguration on dBQS performance is small.



This section defines our module and assumptions.

We assume a system contained of nodes that can be servers implementing a storage service or clients using that service. Nodes are connected by an unreliable asynchronous network like the Internet, where Data may be lost, corrupted, delayed, duplicated, or delivered out of order. Here we make no assumptions for the system to meet its safety guarantees.

Membership Services:

This section defines the membership service (MS), which provides a reliable source of membership information. The MS defines membership changes by generating a configuration, which identifies the set of servers currently in the system, and sending it to all servers. Here Configuration of the node will be exchange with other nodes with the help of, the MS authenticates with the use signature is verified with a known public key. The MS produces configurations periodically when membership changes. The system moves in a sequence of time intervals called epochs, and we group all configuration changes at the end of epoch. It allows applications that use the MS to be optimized for long periods of stability (we assume that in storage applications epochs could store for hours, although our Calculation handle short epochs if needed), and it reduces costs associated with signing configurations or transmitting them). to avoid unnecessary data movement due to temporary disconnections, to offer additional protection against denial of service attacks, and to avoid thrashing, While trying to recover failure network gets overstresses. The idea of epochs provides uniformity: every nodes in the same epoch see the same system membership. Each epoch has a sequential epoch number. These numbers compare the existing different configurations. Epochs can terminate after a fixed duration or some number of membership changes or a combination of the two. The termination condition is a system parameter that must be set by a system administrator based on deployment characteristics

We discuss some refinements to the design to achieve better scalability in Section 3.3. Section 3.4 discusses the impact of faulty servers on our system.

@@@ Load at the MS replicas.

Load on the MS. Running the MS on a small subset of system members is crucial for scalability since the agreement protocol is quadratic in the number of nodes in the group. However, we must also ensure that server nodes acting as MS replicas are not overloaded. T, are three activities of concern. First, is communication at the end of an epoch—if the MS had to inform every node about the new configuration. We can avoid this expense by using distribution trees [21], [8]_[7]. Even if some node does not receive the message containing the delta (e.g., due to a failure of an ancestor in the tree), it will detect the failure when it receives a message with a larger epoch number (recall that application messages contain the epoch number of the sender); then it can retrieve the configuration from the sender of that message. The second potential source of overload is probing. To avoid this expense, we use committees. A committee is a group 2fMS þ 1 servers who are not members of the MS. Each committee is responsible for probing part of the server space; the committee members do these probes independently. The MS interrogates committees periodically and infrequently. If fMS þ 1 members of a committee report that a server is unresponsive, that node can be moved to the replica’s inactive list (since at least one honest server has found it to be unresponsive). We require only 2fMS þ 1 members in committees since they need not carry out agreement. Committee reports can be used as the argument of the EVICT or RECONNECT operation to prove that the request is valid. The third potential source of overload is freshness challenges. Although, freshness certificates can be valid for a long time, there could be so many clients that the MS could not keep up with the challenges. We solve this problem using aggregation. Clients send challenges to particular (non-MS) servers; different groups of clients use different servers. A server collects challenges for some time period, or until it has enough of them (e.g., a hundred), hashes the nonces, sends the challenge to the MS, and forward the signed response to the clients, together with the list of nonces that were hashed. Fig. 2 illustrates the final system architecture containing committees and aggregators.