When creating and maintaining distributed applications, we must take care over the design choices we make. One of them is deciding how the applications will interact with each other. What will the communication model between them look like? How independent will the systems be from each other? Communication and synchronization between the various components across the network must be efficient. Much of the difficulty in constructing distributed applications comes from factors such as uncertainty about the state of remote components.
This is where selecting a suitable group communication system comes into the picture. Ideally, an efficient group communication system should provide multicast of messages to a group of applications, reliable data transfer and fault tolerance. It should also be easy to use and maintain, scalable and robust.
In this paper, we assess one such group communication system, the Spread Toolkit. We look at some of its features and capabilities and at the cases where it is best used. We then describe how Spread was used in a media aggregator project to optimize operations.
Overview of Spread
The Spread Toolkit offers a high-performance messaging service that tolerates faults across the network. Spread operates as a messaging bus for distributed components, providing features such as multicasting and group communication. It offers reliable messaging with ordered delivery of messages. The toolkit is language independent, with client APIs for a variety of languages, and can support many messaging formats.
Services that Spread provides and the benefits of using it
Multicasting of messages from senders to receivers with no restriction on the number of users involved.
Reliable messaging service.
Scalable service which can accommodate large numbers of groups.
Ordered delivery: the receivers in a group receive messages in the same order in which they were sent, so applications avoid the overhead of ordering messages themselves.
A powerful yet very simple API whose basic use requires just six method calls.
A set of distributed algorithms that avoids a single point of failure.
It also provides membership services which inform the member components about the other components that are running, hence making failure detection in the network very easy.
Spread is optimized to handle about 8,000 one-kilobyte messages per second in a LAN environment.
Spread also supports cross-platform operation between Windows (2000/NT/98/95) and UNIX (BSD, Linux, Solaris, IRIX, AIX, Mac OS X).
Spread has programming APIs for C/C++, Java, C#, Ruby, Perl and Python.
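To make the group services listed above concrete, here is a minimal in-memory sketch in Python. It is not the Spread API and does not talk to a Spread daemon; the class ToyGroupBus and its method names are hypothetical, chosen to loosely mirror operations such as connect, join, leave, multicast and receive (which calls the "six" in the list refer to is an assumption here). It only illustrates two ideas from the list: every member of a group sees messages in the same order, and members are told when the group membership changes.

```python
from collections import defaultdict, deque


class ToyGroupBus:
    """In-memory stand-in for a group messaging bus (NOT the Spread API)."""

    def __init__(self):
        self.members = defaultdict(list)  # group name -> member names, in join order
        self.inboxes = {}                 # member name -> queue of pending events

    def connect(self, name):
        self.inboxes[name] = deque()

    def join(self, name, group):
        self.members[group].append(name)
        self._notify(group)  # membership service: announce the new view

    def leave(self, name, group):
        self.members[group].remove(name)
        self._notify(group)

    def multicast(self, sender, group, payload):
        # One bus appends to every inbox in a single step, so all members
        # observe messages in the same order.
        for member in self.members[group]:
            self.inboxes[member].append(("message", sender, payload))

    def receive(self, name):
        return self.inboxes[name].popleft()

    def _notify(self, group):
        view = tuple(self.members[group])
        for member in self.members[group]:
            self.inboxes[member].append(("membership", group, view))


def drain(bus, name):
    """Collect every pending event for one member."""
    events = []
    while bus.inboxes[name]:
        events.append(bus.receive(name))
    return events


bus = ToyGroupBus()
for name in ("blog", "image"):
    bus.connect(name)
    bus.join(name, "search")
bus.multicast("blog", "search", "first")
bus.multicast("image", "search", "second")

blog_events = drain(bus, "blog")
image_events = drain(bus, "image")
print(blog_events)
print(image_events)
```

Both members receive "first" before "second", and each is notified when the group view changes. A real deployment gets these guarantees from the messaging system across machines rather than from a single in-process object.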
Ideal usage of Spread
Applications that can benefit greatly from Spread:
Collaborative applications that share data among each other extensively.
Replicated servers that require shared state maintenance among many computers.
Similarly, replicated databases that are distributed but must stay in a synchronized state can make use of Spread.
It can act as a generic message bus. Consider a system of 30 machines, each running 30 processes (900 processes in total), that all need to communicate among themselves. With Spread acting as a generic message bus, each process opens just one Spread connection instead of maintaining separate TCP connections to its peers.
Group chat systems that need a lightweight tool for transferring data can use Spread effectively.
Technorati uses Spread in its Watchlist feature which allows for posts to be multicast to groups.
Flickr uses Spread to create a log of real-time events such as photo uploads, discussions and blog posts.
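The message-bus item in the list above can be checked with quick arithmetic. The Python sketch below assumes the worst case, in which every process would otherwise hold a direct connection to every other process (a full mesh), and compares it with one connection per process to a shared bus. The function names are illustrative only.

```python
def full_mesh_links(processes: int) -> int:
    """Point-to-point links needed if every process connects to every other one."""
    return processes * (processes - 1) // 2


def bus_links(processes: int) -> int:
    """Links needed if each process opens a single connection to a message bus."""
    return processes


machines, per_machine = 30, 30
processes = machines * per_machine
mesh = full_mesh_links(processes)
bus = bus_links(processes)

print(f"{processes} processes: full mesh = {mesh} links, message bus = {bus} links")
```

Under that assumption, the number of connections grows quadratically with the number of processes in a mesh but only linearly with a bus, which is the saving the example describes.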
How Spread can be implemented to optimize operations
The media aggregator project required building a ring of four web service components that act as media aggregator services for a social site, collecting and locating data. The web services (Blog, Image, Video and Association) were to be arranged in a ring topology and to communicate asynchronously. The services were not allowed to share process space or a common file system. The ring could be traversed in one direction only.
[Diagram labels for the architecture figure: Accept client-initiated search; 1.2 Initiate the search; 1.3 Poll DB for results; 1.4 Return result; 2.1 Place the request as a Spread message; 2.2 Initiate search in the adjacent node; 2.3 Give control back to the search service; 3.1 Accept request through the Spread listener; 3.2 Search for data; 3.3 Put results in DB. WS@Left and WS@Right each run in a new thread.]
System Operation with Spread
Figure 1.1: Architecture of the system using Spread
The client requests blog, image, video or association data, or all of them, filtered by tag, type of service, time or author. The client may call any of the services in the ring; the service that receives the request is termed the originator (or initiator). The client places its request by calling the searchService() method. The originator then calls initiateSearch() and starts polling a store in the DB for search results. The initiateSearch() method puts the search request into the service's Spread queue and then calls the initiateSearch() method of the adjoining service. The Spread listener is activated as soon as a message is placed in the Spread queue; it calls the searchOperation() method, which searches the service's own DB store and writes what it finds to the result DB store. Meanwhile, the initiateSearch() method of the adjoining service does the same work and calls initiateSearch() on its own adjoining service, and so on around the ring.
We use a dedicated Spread queue for each service so that a service passes the request on to its neighbor before it begins its own search. Each neighbor can then search in parallel, which saves time and returns the search results faster. Once the originator's first thread, which has been polling the result store in the DB, has all the search results, it returns them to the client.
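The flow described above can be sketched in Python with standard-library threads and queues. This is a simplified illustration under several assumptions, not the project's code: queue.Queue stands in for each service's dedicated Spread queue and for the result store in the DB, a blocking get replaces DB polling, the ring order and the contents of DATA are invented for the example, and RingService, build_ring and the snake_case method names are hypothetical counterparts of searchService(), initiateSearch() and searchOperation().

```python
import queue
import threading

SERVICES = ["blog", "image", "video", "association"]  # ring order is an assumption

# Hypothetical per-service data stores, keyed by tag.
DATA = {
    "blog": {"sunset": ["blog:42"]},
    "image": {"sunset": ["img:7", "img:9"]},
    "video": {},
    "association": {"sunset": ["assoc:1"]},
}


class RingService:
    def __init__(self, name, results):
        self.name = name
        self.results = results      # shared result store (stands in for the DB)
        self.inbox = queue.Queue()  # stands in for this service's Spread queue
        self.next = None            # adjoining service, set by build_ring()
        threading.Thread(target=self._listen, daemon=True).start()

    def search_service(self, tag):
        """Client entry point: this service becomes the originator."""
        self.initiate_search(tag, originator=self.name)
        collected = {}
        while len(collected) < len(SERVICES):  # wait for one result per service
            name, hits = self.results.get(timeout=5)
            collected[name] = hits
        return collected

    def initiate_search(self, tag, originator):
        self.inbox.put(tag)  # place the request as a message
        if self.next.name != originator:
            # Forward to the neighbor right away; our own search runs
            # on the listener thread in the meantime.
            self.next.initiate_search(tag, originator)

    def _listen(self):
        while True:
            tag = self.inbox.get()                # listener wakes on a message
            hits = DATA[self.name].get(tag, [])   # search this service's store
            self.results.put((self.name, hits))   # write to the result store


def build_ring():
    results = queue.Queue()
    ring = [RingService(name, results) for name in SERVICES]
    for i, service in enumerate(ring):
        service.next = ring[(i + 1) % len(ring)]
    return ring


ring = build_ring()
found = ring[1].search_service("sunset")  # the client calls the image service
print(found)
```

The point of the design is that initiate_search returns as soon as the request has been queued and forwarded, so no service waits for its neighbor's search to finish before starting its own.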
Optimizing Asynchronous Search using Spread
When four services communicate around the ring, delay is introduced if each service waits for the adjacent service to finish executing completely and send back its results. The architecture described above removes this dependency of one service on another. Without it, the execution pattern is as follows.
Suppose all four services (image, blog, video and association) have search results and each takes the same time, 3T, to execute (2T to search and T to fetch the data).
The time taken for the entire search is then 4 x 3T = 12T.
Total time to execute = 12 T
Figure 1.2: Execution Time without Spread
Here the total time taken is the sum of the times taken by all the services.
Can this be optimized? Can the total time of the system be reduced?
Yes, the total time can be reduced, because the searches can be done in parallel. Each service's search is independent of the others, so the searches can run concurrently instead of one after another.
Making the Search calls in the ring Asynchronous.
This can be achieved by using a messaging service across the ring or by using threads. The messaging service we have implemented across the ring is Spread.
We chose Spread over other messaging systems in our architecture because of the following features:
It is a tool specially designed for passing lightweight messages.
It is not resource hungry and hence doesn't take much of a toll on the resources used.
It is self-managing. It manages all the threads it creates, so the synchronization issues associated with threads do not arise when using Spread.
It is fault tolerant and has a highly scalable architecture.
Total execution time after implementing the optimization technique with Spread.
Total time to execute << 12 T
Figure 1.3: Execution Time using Spread
We can now see that the total time taken for the ring is drastically reduced when the searches run asynchronously: the search is now only as slow as the slowest service, rather than the sum of the times taken by all the services. In this way the Spread messaging system effectively optimizes the search operation.
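The comparison can be written as a small timing model in Python. It uses the figures from the example (four services at 3T each) and adds one assumption of our own: an optional forward_cost parameter for the time needed to pass the request to the next service, set to zero by default. The function names are illustrative.

```python
T = 1.0  # one abstract time unit

# Each service takes 3T: 2T to search and T to fetch the data.
service_times = {name: 3 * T for name in ("image", "blog", "video", "association")}


def sequential_time(times):
    """Each service waits for the previous one: total is the sum."""
    return sum(times.values())


def parallel_time(times, forward_cost=0.0):
    """Searches overlap: total is set by the service that finishes last.

    The k-th service in the ring starts after k forwards of the request.
    """
    return max(k * forward_cost + t for k, t in enumerate(times.values()))


print("sequential:", sequential_time(service_times))              # 12T
print("parallel:", parallel_time(service_times))                  # 3T
print("parallel, forward cost 0.1T:", parallel_time(service_times, 0.1 * T))
```

With equal service times and negligible forwarding cost, the model gives 3T instead of 12T. Any real forwarding or result-collection overhead adds to the parallel figure, but it stays well below the sequential sum as long as that overhead is small compared with the search time.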
Choosing an appropriate group communication system is critical when designing distributed applications. Many such systems are available, but Spread is a lightweight messaging system with a scalable architecture, fault tolerance and the ability to manage itself. When used appropriately in an architecture, it can help improve performance and yield better results.