QoS prediction for web services

The last decade has seen widespread adoption of web services as the medium for platform-independent and language-neutral communication amongst applications over the web (Park et al. (2005) [17]). Today an increasing number of businesses are planning to develop future solutions that leverage web service technology. As the number of web services that meet similar functional requirements grows, consumers are presented with many services offering practically the same functionality. Recent years have therefore seen a growing need for information addressing the non-functional properties of these services, such as response time, availability and performance, so that consumers can make more informed decisions during the process of selection. Members of the research community argue that the success of critical business applications depends highly on the selection of web services that meet certain Quality of Service (QoS) properties (Shao et al. (2007) [10]). The current web services architecture is based on three main entities, as described by Ambrosi et al. (2005) in [18]:

Service Providers: They develop web services, publish their service descriptions (in the form of a WSDL file) with registries and handle requestor invocations of the web service.

Service Requestors: They look up a service description of a required service in a service registry. Once selected, they get a copy of the WSDL file of the selected service and use it to obtain implementation details in order to bind and invoke the service using SOAP messages.

Service Registries: A UDDI registry contains listings of services advertised by different providers.

Ambrosi et al. (2005) state that the problem with the current publish-find-bind architecture is that there is no support for non-functional requirements such as availability and reliability. A lot of research has been done towards setting up architectures that record the QoS users experience when utilizing different services, but in comparison much less work has been done on predicting the QoS a user will experience on a future invocation. In this paper we present different approaches that use QoS ratings given by previous users of a service to predict the QoS a current user will experience on the same service. Such QoS information is useful to a service user about to make a selection, as he/she will be able to identify which services are likely to provide better quality than others.

This paper will address the issue of QoS Prediction for a service user on an unused service based on either data gathered by monitoring previous service usage or feedback ratings given by past service users. Section 2 will identify the different QoS attributes used by research works today. Section 3 will identify the different perspectives with which QoS can be realized. Section 4 will provide an overview of the area of reputation management which deals with aspects related to QoS prediction. Finally Section 5 will provide a survey of the different prediction techniques that have been proposed by research today.

2. QoS Identification

In this section we provide a list of attributes that is, to the best of our knowledge, comprehensive across all works. Research works up to now have focused their attention only on a subset of the attributes listed here, and future works should ideally encompass a majority of the identified set. One of the issues in this area is that formalized methods of quantifying all the attributes do not yet exist. We have categorized attributes using a scheme similar to the ones used by Nurhayati et al. (2008) in [9] and by Wang & Vassileva (2007) in [12]. A list of attributes has been provided by the World Wide Web Consortium (W3C) in [20], but this list was produced in 2003 and is outdated in terms of comprehensiveness considering research today. Here we provide a categorized listing of identified attributes aggregated from (Nurhayati et al. (2008) [9]; Wang & Vassileva (2007) [12]; W3C (2003) [20]; Ran (2003) [21]; Mani & Nagarajan (2002) [22]), followed by a brief definition of each attribute.

Performance (metrics describing how quickly and at what cost requests are serviced)
* Execution Time - Time taken by provider to service a user request.

* Execution Price - Amount paid to invoke the service.

* Throughput - Average rate of data transfer provided by the service hosting server.

* Latency - Delay experienced in transferring the request and reply from the user to the provider and vice versa.

Dependability (metrics describing how consistently and correctly the service behaves)
* Availability - A measure of how often the service is present for a client invocation.

* Accessibility - A probability measure indicating chances of a successful instantiation upon invocation.

* Accuracy - Measure that indicates the similarity between the service reply and the objectively correct output.

* Robustness - Ability to perform correctly given insufficient or partially incorrect input.

* Capacity - Maximum number of concurrent requests that can be handled by the providing server.

* Reliability - Degree to which a service can maintain service quality agreed by provider and user before invocation.

* Regulatory - Degree to which the service conforms to regulations such as governmental laws, Service Level Agreements (SLAs) etc.

* Reputation - A measure of provider trustworthiness based on user feedback.

Security (metrics describing how the service protects users and their data)
* Authentication - Method used for authenticating users of secured services.

* Authorization - Indication of how authorization is done for secured services.

* Traceability - Degree of logging of user-related requests and data.

* Confidentiality - Indication of level of confidentiality with which provider handles and maintains user's data.

* Encryption - Encryption mechanism used for data transfer.

Transaction Support (metrics describing the ACID properties offered by the service)
* Atomicity - Degree to which a service can hide its divisibility from the outside world.

* Consistency/Integrity - Degree to which a service does not violate the provider's business model constraints.

* Isolation - Degree to which concurrent behavior can be accommodated providing individual service instances with isolated environments.

* Durability - Degree to which replies from services are made permanent.

Domain-Dependent (reserved for metrics that are specific to the domain)

3. QoS Perspectives

The term QoS is used in a fair number of research works as referring to a unique objective entity shared amongst the three collaborating parties involved in a web service invocation. However, this oversimplifies the issue: QoS is subjective, as portrayed by the works of Nan et al. (2008) in [8] and Nurhayati et al. (2008) in [9]. For example, a provider's delivered view of the throughput attribute could differ from a service user's perceived view due to network congestion. Nan et al. (2008) in [8] consider the life-cycle events of a web service, from publishing to the completion of an invocation, and use them to clearly identify the different perspectives attributed to the term QoS:

Expected QoS (EQoS): It represents the minimum standard of quality desired by service users and is stated in the process of service selection.

Agreed QoS (AQoS): It represents the standard of quality that service providers are willing to guarantee and can be obtained from SLAs when providers advertise their services or just before invocation when a provider and a service user negotiate on a contract.

Delivered QoS (DQoS): It represents the quality of service which providers claim to have supplied during the period of invocation and can be obtained by monitoring by server-side agents.

Perceived QoS (PQoS): It represents the quality of a service which users claim to have perceived during the period of the invocation and can be obtained by monitoring by client-side agents.

Transmitted QoS (TQoS): It represents the network performance of the service delivery and can be obtained by monitoring by intermediate entities routing messages or trusted third party entities.

Statistic QoS (SQoS): It represents a statistic aggregated from the values of DQoS and PQoS over a period of time.

4. Reputation Management

One of the key factors that must be taken into account when using historical feedback data to make a future prediction is the trustworthiness of that data. As we describe in detail below, both service users and providers can engage in different forms of malicious behavior, producing dishonest feedback ratings that influence the prediction made. Reputation management in our context relates to identifying such malicious behavior in order to adjust feedback data, addressing, as related by Park et al. (2005) in [17], the discrepancies introduced when two unknown entities interact with each other.

Reputation management for web services is a wide area of research and covering its respective approaches is beyond the scope of this paper, but since the authenticity of feedback data plays an important role in the prediction process, we highlight the types of threats a QoS rating system will encounter, followed by a brief overview of the types of approaches used to resolve this particular issue.

Dishonest Feedback

According to Park et al. (2005), reputation management can be applied to a wider threat model dealing with malicious behavior on the part of both the provider and the user, but in this section we deal only with issues related to user feedback and do not focus on issues such as a provider advertising inauthentic services. Threats in this area can be divided into two categories, as proposed by [17]: those by individual providers and those by collusive providers.

a) Individual Providers

These types of service providers attempt to alter QoS prediction values in two ways. Firstly, they invoke competitor services repeatedly, each time badmouthing them by giving false ratings, with the goal of bringing down the QoS prediction value computed for competitors. Secondly, they invoke their own services repeatedly, each time assigning high ratings to increase their own QoS prediction rating. What distinguishes this category of providers from the other is that providers in this category have no knowledge of the behaviors of other providers and conduct their malicious activity individually. An obstacle when devising reputation management models to identify such providers is that threshold ratios on the number of false ratings must be set carefully when classifying a provider as malicious, in order to differentiate malicious providers from providers which, when acting as service users, seldom and unintentionally offer false ratings.

b) Collusive Providers

In this category service providers work together to achieve objectives similar to those mentioned for the first category. An example would be providers offering services in different areas, and not in direct competition, invoking each other's services and each time assigning false ratings aimed at boosting each other's prediction value. In some cases this malicious behavior can be really harmful to other service users who, under the impression of a high prediction rating, invoke the service and in turn are critically infected by a virus injected into the reply by the provider. As a greater obstacle than the one mentioned for the previous category, reputation management models are faced with the challenge of dealing with complex cheating patterns, where providers are aware of the thresholds mentioned in the previous section and oscillate between behaving honestly and dishonestly. In such a scenario a malicious provider starts behaving honestly when he/she realizes that he/she is coming close to the thresholds used for classifying a provider as dishonest, and once he/she has entered a safe zone after a period of honesty, starts behaving dishonestly again. As a result of this behavior the prediction process is periodically altered with bogus data. What makes this challenge even greater is that providers may collude so that at all times some providers are being honest while others are being dishonest, leading to a permanently altered prediction process.

Some Reputation Management Approaches

Most of the research works that relate to reputation management dealing with dishonest feedback can be classified as decentralized approaches.

Vu et al. (2005) in [16] propose the employment of trusted third parties that themselves provide QoS reports for a percentage of registry-offered services. The proposed algorithm compares consumer reports with "trusted reports" from the third-party agents in a two-step procedure to identify the trustworthiness of consumer reports. In the first step certain ratings are classified as false based on a deviation from the trusted values that is greater than a specified threshold. In the second step a clustering scheme is used to assign ratings to groups, followed by weights being assigned to the ratings in each group, where the weight assigned to all ratings in a group indicates the probable correctness of ratings in that group.

An approach proposed by Malik & Bougetta (2009) in [3] gives users the ability to view individual ratings and proposes aggregating feedback as a subjective value relative to a particular user, in contrast to a global aggregated value that is consistent across all users. It specifies a procedure where, when users execute a search query, they are returned not only a set of functionally matching services but also a set of users that have previously invoked those services and hold feedback data. The current user can then contact the listed users to obtain feedback results. To identify rater trustworthiness, raters are assigned a "credibility measure", which is used as a weight when feedback ratings are aggregated. In [3] the credibility measure forms the crux of evaluating feedback ratings and is calculated based on:

a) A "majority-rating" policy to deal with unfair ratings: if the current user's rating on a service differs from the majority of ratings assigned by other users, the current user's credibility is decreased; otherwise it is increased.

b) To encompass the case where outlier values represent users that are the first to recognize a deviation in a service's QoS, user trustworthiness evaluated from previous ratings is used to adjust user credibility.

c) In addition to rating services after an invocation, service users are required to indicate the usefulness of feedback ratings. A "propensity to default" measure is calculated as the ratio of the number of times a user's ratings were found useful to the user's total number of invocations. Colluding malicious behavior by service users can be identified by an abnormally high ratio, and this ratio is incorporated into the credibility measure, lowering the credibility of malicious colluding users.

d) The approach proposes that the credibility is subject to change due to irregularities introduced over time. Different case scenarios can be found in [3], and the credibility measure is adjusted in each scenario.
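To illustrate points (a) and (c) above, the following is a minimal sketch of how a credibility measure could be updated and then used as an aggregation weight. The median-as-majority choice and the tol and step parameters are our illustrative assumptions, not values taken from [3]:

```python
def update_credibility(cred, rating, others, tol=1.0, step=0.1):
    """Majority-rating policy (sketch): raise credibility when the user's
    rating agrees with the majority view of other raters, lower it otherwise.
    The majority view is approximated here by the median of the other ratings."""
    majority = sorted(others)[len(others) // 2]
    agrees = abs(rating - majority) <= tol
    cred = cred + step if agrees else cred - step
    return min(1.0, max(0.0, cred))  # keep credibility in [0, 1]

def aggregate(ratings, creds):
    """Credibility-weighted aggregate of feedback ratings."""
    return sum(r * c for r, c in zip(ratings, creds)) / sum(creds)
```

A rating close to the consensus (e.g. 4.0 against a median of 4.1) raises the rater's credibility, while an outlier (e.g. 9.0) lowers it, so persistent badmouthers gradually lose influence in the aggregate.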

Khorsravifar et al. (2009) in [4] suggest a solution for dealing with dishonest feedback in a very similar problem, where feedback data is used to compute service provider reputation, i.e. a provider's trustworthiness in complying with the QoS-related attribute values specified in its Service Level Agreement (SLA). Although the presented approach does not specifically target QoS prediction, its methodologies for countering dishonest behavior can be directly applied to our scenario. [4] focuses on formulating a methodology for managing the reputation of Communities of Web Services (CWSs). In such an environment every web service must be registered with a service community, where a community encompasses functionally matching or similar services. [4] claims that this architecture is likely to give rise to communities that offer similar services, leading to competition amongst communities. [4] presents an incentive-based technique to mitigate collusion between communities and service users aimed at using feedback data to inflate community reputation, which is based on QoS-related attributes. It refers to the cases of a community unjustly improving its own reputation level and badmouthing competitor communities as positive and negative alteration respectively. To monitor the system [4] employs a special controller agent Cg.

To handle the case of positive alteration, Cg checks recent feedback ratings periodically. If the QoS-based community reputation has increased beyond a threshold point in a certain time slot, the feedback values in the violating time slot are frozen and the concerned community agent is informed that it will be put under special observation. If, after the observation period has ended, the community has been able to maintain the enhanced reputation level of the previous time slot, the frozen values are restored and a new reputation value is evaluated, leaving no trace of the monitoring. If the community is not able to reproduce the enhanced results during the observation period, the new reputation value is evaluated using the reputation value the community had in the time slot before the violation together with the reputation value from the observation period. Finally, the newly calculated reputation value is decreased by applied penalties.

To handle the case of negative alteration no monitoring mechanism is set in place; instead, communities are given the responsibility of reporting malicious feedback attacks. Once a report is received, similarly to the procedure above, a violation time slot is identified and an observation period is started. If negative fake feedback is noticed during the observation period, the suspected negative feedbacks are removed from both the violation time slot and the observation period, which are then used to compute the new reputation value. In addition, the newly computed reputation value is increased by an applied reward.

5. Prediction Techniques

As stated earlier, a prediction technique in this context can be explained as a methodology that analyzes historical QoS data from previous users of a particular service and uses it to predict the QoS that a current user of the service will experience. Approaches provided by the presented research works can be divided into two types: data mining-based and simulation-based. Data mining-based approaches aim to find similarities in the recorded QoS data and exploit these similarities to produce a vector of QoS attribute values that is predicted to most closely match the user's objective rating after the service invocation. In contrast, simulation-based approaches, which have been employed only recently, aim to model the system by assigning QoS attributes discrete state spaces and predicting the probabilities that QoS attributes will take on particular values from their state spaces. As this research area has become prominent only recently, few works exist that focus solely on the concept of QoS prediction, but the concept has been addressed as a component of works on related issues such as service provider reputation management based on QoS and web service composition based on QoS. In the presented works we focus on the uniqueness of the prediction techniques in order to bring together the state of the art in the subject area.

Below we have categorized approaches of each type into sub-sections where each sub-section provides a brief overview of a particular technique followed by research works proposing an approach applying that technique. Due to the complexity and depth of some techniques we have only provided details required for understanding works in their respective sub-section.

5.1 Data Mining-based Approaches

5.1.1 Simple Regression & Clustering

Cluster analysis or Clustering (Dubes & Jain (1980) [28]) is a data mining technique where recorded observations are grouped into clusters such that observations in a cluster exhibit some form of similarity. In the context of our problem scenario the observations refer to historical QoS-related feedback ratings.

An important term in the context of clustering is the proximity measure [28] or distance measure which is used as a means of measuring similarity. The Euclidean distance [28] is used as the distance measure in the research works (Deng & Xing (2009) [5]; Vu et al. (2005) [16]) we present in this section and is calculated between two QoS vectors q1 = (x1, x2, x3) and q2 = (y1, y2, y3) using the Pythagorean formula i.e.

D(q1, q2) = sqrt((x1 - y1)^2 + (x2 - y2)^2 + (x3 - y3)^2)

The minimum distance [28] between two clusters A and B is given by: min {D(x, y): x in A, y in B}.
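These two distance measures translate directly into code; a minimal sketch (function names are illustrative):

```python
import math

def euclidean(q1, q2):
    """Euclidean distance between two QoS vectors, e.g. (time, price, throughput)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(q1, q2)))

def min_cluster_distance(A, B):
    """Minimum distance between clusters A and B: min{D(x, y): x in A, y in B}."""
    return min(euclidean(x, y) for x in A for y in B)
```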

Based on the research works we present in this section, we highlight two types of clustering algorithms:

Agglomerative [28] or Hierarchical Clustering algorithms: These algorithms construct clusters in an iterative fashion and can be classified into two types: merging [28] or splitting [28]. A merging algorithm starts off by creating an individual cluster for each observation and then iteratively reduces the number of clusters by merging clusters based on the distance measure. A splitting algorithm starts off with one cluster containing all observations and then iteratively increases the number of clusters by breaking up individual clusters, also using the distance measure.

Partitional Clustering [28] algorithms: Algorithms of this category construct a single partition of the data rather than a nested hierarchy, using the distance measure in the grouping process. A k-means clustering algorithm (Choo et al. [13]) is a type of partitional clustering algorithm where observations must be grouped into a predefined number (k) of clusters. In a k-means algorithm cluster centroid values are used as representatives of clusters, and each observation is placed in the cluster whose centroid is closest to it across all clusters. Clusters produced by a k-means clustering algorithm are limited to being convex in shape [13].
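A minimal, illustrative k-means sketch over low-dimensional QoS vectors (this is a generic textbook formulation, not code from any of the cited works; initialization and iteration count are simplistic assumptions):

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal k-means on QoS vectors; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # naive initialization from the data
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each observation joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(d) / len(members)
                                     for d in zip(*members))
    return centroids, labels
```

Two well-separated groups of QoS ratings end up in two distinct clusters, which is the property the approaches below rely on when weighting honest versus dishonest reports.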

Early works in the area of QoS prediction for web services based on historical data provided rather simple solutions that ignored most of the factors involved in making a prediction. In particular, these solutions ignored the fact that consumer differences need to be taken into account when a prediction is made. They also suffered from the drawback that outlier values greatly impacted the computed prediction value, making it easy for a malicious service user to alter the prediction process. A majority of early approaches computed the predicted QoS value using a simple regression analysis technique, taking as input average QoS values of the service at past time instances.

To give us an idea of how early works were formulated, we start off by presenting the approach proposed by Ambrosi et al. (2005) in [18]. This approach proposes a hybrid technique employing linear, multi-linear and curvilinear regression models. It provides an architecture where web services are allocated to a pool and each pool has a monitoring agent that is responsible for recording QoS attribute values for each invocation of each service in the pool. Periodically the monitoring agent provides an aggregated statistic for each QoS-related attribute of each web service in the pool. In particular, for each web service it takes the recorded values for a QoS attribute and calculates the central tendency ict and the index of dispersion ioe for the QoS attribute for that period ∆t. These values are computed as the sample median and the semi-interquartile range respectively. For the purpose of prediction [18] introduces the notion of a learning period L = {∆t1, ∆t2, ∆t3,...,∆tm}. At the end of every ∆ti the central tendency is calculated both as the sample median and using each of the regression techniques mentioned above. The difference between the sample median and each of the values obtained by the regression techniques is then calculated; these differences signify the errors introduced by the respective techniques. The regression technique having the lowest sum of errors across all ∆ts in the learning period is chosen as the technique to be employed for prediction using past ict and ioe values. Finally, [18] proposes that the learning algorithm be repeated periodically to ensure the accuracy of the prediction process.
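The per-period statistics and the regression step can be sketched as follows. For brevity this sketch fits only a plain linear model over the period medians, whereas [18] compares linear, multi-linear and curvilinear models; the helper names are our own:

```python
import statistics

def central_tendency(samples):
    """Per-period central tendency, computed as the sample median."""
    return statistics.median(samples)

def dispersion(samples):
    """Index of dispersion: the semi-interquartile range (Q3 - Q1) / 2."""
    q1, _, q3 = statistics.quantiles(samples, n=4)
    return (q3 - q1) / 2

def linear_fit(ys):
    """Least-squares line through (0, y0), (1, y1), ...; returns (slope, intercept).
    Requires at least two periods."""
    n = len(ys)
    xbar, ybar = (n - 1) / 2, sum(ys) / n
    slope = (sum((x - xbar) * (y - ybar) for x, y in enumerate(ys))
             / sum((x - xbar) ** 2 for x in range(n)))
    return slope, ybar - slope * xbar

def choose_and_predict(period_samples):
    """Compute the median for each learning period, fit a line over the
    medians and extrapolate one period into the future."""
    medians = [central_tendency(s) for s in period_samples]
    slope, intercept = linear_fit(medians)
    return slope * len(medians) + intercept
```

With steadily rising response-time medians of 2, 3 and 4 across three periods, the linear fit extrapolates 5 for the next period, which is the kind of trend-following prediction this family of approaches produces.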

Introduced in the previous section, Vu et al. (2005) in [16] employ a reputation management scheme, followed by a convex k-means clustering technique, and finally a linear regression technique on QoS ratings taken over a period of time to make a QoS prediction. This research work can be formalized into three phases, where the first two phases are applied separately to each group of ratings recorded at a particular time instant. In the first phase a reputation management scheme is used where trusted third parties are employed to provide "trusted" ratings for a subset of the services. These ratings are used to identify some users as behaving honestly and others as behaving dishonestly, based on predefined thresholds on deviations from the trusted ratings. In the second phase a k-means clustering algorithm is used to place feedback ratings into clusters, with the condition that the minimum Euclidean distance between two clusters must be greater than a threshold value. At this point reports have been grouped into clusters and some reports have been classified as false due to the dishonest behavior exhibited by the users producing them. Once clusters have been obtained, a weight in the range [0, 1] is assigned to each cluster. A trust-distrust propagation approach described in the research work is applied to infer that some clusters contain honest ratings and receive a weight of 1, and that some clusters contain dishonest ratings and hence are given a weight of 0. The remaining clusters are assigned a weight based on their size and other attributes. In the third phase the average QoS for a particular time instance in the past is calculated as the ratio of the sum of the products of cluster means and their associated weights over the sum of all the weights. Averages are calculated for all time instances measured in the past. The value of the QoS vector at a future time instance is predicted using linear regression, which takes as input the average values calculated for all recorded time instances. Although a promising approach, [16] has a few shortcomings. Firstly it requires that third-party agents be employed at all times, which is not always feasible. Secondly it places too much emphasis on output from third parties, and hence service users operating in different environments may not get an accurate prediction.

A criticism faced by simple regression and clustering based approaches such as the ones mentioned in this section is that they view the predicted QoS vector as an objective rating applicable to all service users and hence do not take service user differences into account.

5.1.2 Collaborative Filtering

This category of approaches involves mining historical feedback data with the objective of finding similarities amongst entities (users or services) and using these similarities to predict user experience on an unused service, as stated by Shao et al. (2007) [10]. The motivation behind this category of approaches, as stated by Zheng et al. (2009) in [2], is the argument that viewing QoS data as objective values and using a simple aggregation function is inadequate, as it lacks the ability to take account of consumer differences such as programming language, network environment etc. A number of works [2, 10] that employ collaborative filtering have used the Pearson Correlation Coefficient (PCC) as the measure of similarity amongst entities, since the PCC is easy to implement and gives high accuracy [2]. The PCC, as presented below, has the range [-1, 1], with positive values indicating a positive correlation amongst entities and negative values indicating a negative correlation. Collaborative filtering techniques can be divided into two sub-categories [2]:

User-Based: Similarities are found between different users. A current user's QoS prediction is obtained by computation on the historical QoS values of other similar users. The PCC in its general form in research works is given by:

Sim(a, u) = Σ_{i ∈ I} (ra,i − r̄a)(ru,i − r̄u) / ( sqrt(Σ_{i ∈ I} (ra,i − r̄a)²) × sqrt(Σ_{i ∈ I} (ru,i − r̄u)²) )

PCC for computing similarity between two users a, u [2], where:

I - Subset of services invoked by both a and u.

ra,i - QoS rating on service i by user a for attribute x1.

r̄a - Average of QoS ratings by user a on attribute x1.

ru,i - QoS rating on service i by user u for attribute x1.

r̄u - Average of QoS ratings by user u on attribute x1.

Prediction for a QoS attribute x1 on a service i by a user α is calculated as the sum of r̄α and the expected deviation from r̄α. In essence, the amount user α's rating will deviate from its average rating r̄α is determined by the amount the rating on service i by another user u, where Z is the set of users for whom the PCC has been calculated, has deviated from the average rating r̄u. The impact of the deviation witnessed for u depends on how similar u is to α, as measured by the PCC. The expected deviation is computed, in its basic form in research works, as a weighted mean of the deviations of the recorded values ru,i for x1 on service i from the users' average ratings r̄u, with the PCC as the weight, i.e.

P(rα,i) = r̄α + Σ_{u ∈ Z} Sim(α, u)(ru,i − r̄u) / Σ_{u ∈ Z} Sim(α, u)

Prediction using the PCC (general form derived from [2]).
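The user-based similarity and prediction scheme can be sketched as follows, assuming each user is a dictionary mapping service identifiers to ratings for a single QoS attribute (the function names and data layout are our illustrative assumptions):

```python
import math

def pcc(a, u):
    """Pearson Correlation Coefficient between users a and u over the
    services both have invoked."""
    common = set(a) & set(u)
    if not common:
        return 0.0
    abar = sum(a[i] for i in common) / len(common)
    ubar = sum(u[i] for i in common) / len(common)
    num = sum((a[i] - abar) * (u[i] - ubar) for i in common)
    den = (math.sqrt(sum((a[i] - abar) ** 2 for i in common))
           * math.sqrt(sum((u[i] - ubar) ** 2 for i in common)))
    return num / den if den else 0.0

def predict(alpha, others, service):
    """User-based prediction: alpha's mean rating plus the PCC-weighted
    mean deviation of other users' ratings on the target service."""
    abar = sum(alpha.values()) / len(alpha)
    sims = [(pcc(alpha, u), u) for u in others if service in u]
    num = sum(s * (u[service] - sum(u.values()) / len(u)) for s, u in sims)
    den = sum(s for s, _ in sims)
    return abar + num / den if den else abar
```

A perfectly correlated neighbor whose rating on the target service sits one unit above that neighbor's own average shifts the prediction one unit above the current user's average, exactly as the deviation-based formula prescribes.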

Item-Based: Similarities are found between different services. A current user's QoS prediction on a particular service is obtained by computation on the historical QoS values given by the same user on similar services. The PCC in its general form in research works is given by:

Sim(i, j) = Σ_{u ∈ U} (ru,i − r̄i)(ru,j − r̄j) / ( sqrt(Σ_{u ∈ U} (ru,i − r̄i)²) × sqrt(Σ_{u ∈ U} (ru,j − r̄j)²) )

PCC for computing similarity between two services i, j [2], where:

U - Subset of service users that have invoked both services i and j.

ru,j - QoS rating on service j by user u for attribute x1.

r̄j - Average of QoS ratings on service j for attribute x1.

ru,i - QoS rating on service i by user u for attribute x1.

r̄i - Average of QoS ratings on service i for attribute x1.

Prediction for a QoS attribute x1 on a service i by a user α is calculated as the sum of r̄i and the expected deviation from r̄i. In essence, the amount user α's rating will deviate from the average rating r̄i is determined by the amount α's rating on another service j, where Z is the set of services for which the PCC has been calculated, has deviated from the average rating r̄j. The impact of the deviation on j depends on how similar j is to i, as measured by the PCC. The expected deviation is computed, in its basic form in research works, as a weighted mean of the deviations of the values rα,j recorded by α for other services j from the services' average ratings r̄j, with the PCC as the weight, i.e.

P(rα,i) = r̄i + Σ_{j ∈ Z} Sim(i, j)(rα,j − r̄j) / Σ_{j ∈ Z} Sim(i, j)

Prediction using the PCC (general form derived from [2]).

Early research works using collaborative filtering techniques use relatively simple algorithms for similarity computation and the prediction process. Manikrao & Prabhakar (2005) in [19] use an item-based collaborative filtering method in a two-stage procedure to make a prediction. In the first stage the similarity between two services is defined as the difference between the arithmetic means of the ratings of the two services across all users that have invoked both services. In the second stage a prediction for a QoS attribute for a service i by a user is calculated as the weighted sum of the user's ratings on similar services, where a weight is the computed similarity value from the first stage divided by the sum of all similarity values. This approach suffers from the drawback that one outlier rating would greatly impact the average rating of a service and hence the similarity amongst services.

Shao et al. (2007) in [10] propose a user-based collaborative filtering method for predicting QoS values. The approach is formalized as a four-step procedure: 1) Data Preparation 2) Normalization 3) Similarity Mining 4) Prediction. The first two steps are geared towards structuring the data into a format suitable for applying the collaborative filtering technique. Steps 3 and 4 form the heart of this approach. Step 3 calculates the PCC using the general formula mentioned above. Once the PCC has been calculated, [10] argues that the sign of the PCC should be taken into consideration when aggregation takes place using the general prediction formula. Step 4 addresses this issue with an enhanced computation scheme for the expected deviation in which positively correlating users and negatively correlating users are treated separately: [10] calculates the weighted average deviation for each set of users separately and then adds the two values to give the cumulative deviation. Two issues need to be highlighted about this work. Firstly, the ratings of unrelated users can affect the prediction if included in the aggregation methodology. To address this issue Shao et al. propose an optimization technique for future work where users' ratings will only be considered if their PCC value with the current user is greater than a predefined threshold. Secondly, the system will give fairly inaccurate readings for users who are new to it and have not rated enough services to build a significant-sized set of similar users.

In a recent work [2] by Zheng et al. (2009), a hybrid scheme comprising both the user-based and the item-based method is used to make a prediction. This approach claims two advantages. Firstly, using either method individually would not exploit all of the available information and would hence make a less accurate prediction than a hybrid scheme. Secondly, a prediction can still be made using the item-based scheme when similarity amongst users is weak, and vice versa. To achieve its objective the approach is formalized into three phases. In the first phase the item-based PCC and the user-based PCC are computed using the general formula. Proprietary to this approach, each PCC is then enhanced with a "similarity weight" aimed at reducing similarity resulting from a small number of co-rated values. In the second phase a Top-K algorithm is employed to find up to k most similar neighboring users for the user-based method, i.e. up to k users having the highest PCC. Only neighbors having PCC > 0 are selected, so if fewer than k neighbors hold this property with the current user then the produced neighbor set has size less than k. A similar Top-K algorithm is employed to find the k most similar services for the item-based method. In the third phase a prediction is made using only the similar neighbors collected for each method. A prediction is computed using the general prediction formula for each method separately and the two predictions are then summed using a weighting scheme to get a final predicted value. The assigned weights are tuned by the user and indicate the degree to which the proposed system should rely on each method.
The case where both similar-neighbor sets are empty is handled by giving a prediction value that is the weighted sum of the average of the ratings given by the invoking user and the average of the ratings given to the to-be-invoked service by previous service users. The weights in this case again represent how much we want to rely on each of the two values. The user-based segment of this approach differs from the approach mentioned before in that, firstly, it only uses ratings by users having a high PCC with the user for whom the prediction is being made, and secondly, it calculates the deviation using the general formula.
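The final blending step of the hybrid scheme can be sketched as follows. The convex combination with a single user-tuned parameter lam is a simplification of [2]'s weighting scheme, and the function name is an assumption:

```python
def combine(pred_user, pred_item, lam=0.5):
    """Blend user-based and item-based predictions. lam expresses how much
    to rely on the user-based method; when one neighbor set is empty the
    other prediction is used alone (simplified from [2])."""
    if pred_user is None and pred_item is None:
        return None   # fall back to the average-based rule described above
    if pred_user is None:
        return pred_item
    if pred_item is None:
        return pred_user
    return lam * pred_user + (1 - lam) * pred_item
```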

5.2 Simulation-based Approaches

5.2.1 Bayesian Learning Models

Bayesian learning techniques (Neal (1999) [25]) are used in modeling systems exhibiting uncertain behavior. In essence, Bayesian learning aims to model a system based on existing knowledge and to improve the model as new evidence is found. It can be classified as a statistical technique where the uncertainty is modeled as a set of unknown parameters θ having a probability distribution [25]. The Bayesian learning process starts with a prior probability distribution [25], or prior, P(θ), usually chosen based on subjective beliefs about the uncertainty [25]. As more data (in the form of observations x(1), x(2), x(3), ..., x(n)) is collected, the prior is refined to give the posterior distribution [25], the conditional distribution of the uncertainty given the observed data, by Bayes rule [25]:

P(θ | x(1), ..., x(n)) = P(x(1), ..., x(n) | θ) P(θ) / P(x(1), ..., x(n))

The impact of the observed data is captured by the likelihood function [25], expressed in terms of the unknown parameters θ:

L(θ) = P(x(1), ..., x(n) | θ)

A popular approach to finding a value of θ is the maximum likelihood approach [25], which finds the values of θ that maximize the likelihood function.

Prediction for x(n+1) is done using the formula [25]:

P(x(n+1) | x(1), ..., x(n)) = ∫ P(x(n+1) | θ) P(θ | x(1), ..., x(n)) dθ
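For a concrete instance of the prior-to-posterior-to-prediction cycle, consider the conjugate Beta-Bernoulli case (the same family used later by [11]): here both the posterior and the predictive formula have closed forms, so no integration is needed. The code is an illustrative sketch, not drawn from any of the surveyed works.

```python
def posterior(a, b, observations):
    """Update a Beta(a, b) prior with 0/1 observations; conjugacy keeps
    the posterior in the Beta family: Beta(a + x, b + n - x)."""
    x = sum(observations)              # number of successes
    n = len(observations)
    return a + x, b + n - x

def predict_next(a, b):
    """Predictive probability that the next observation is 1: the
    posterior mean a / (a + b)."""
    return a / (a + b)

# Uniform prior, then observe three successes in four trials.
a, b = posterior(1.0, 1.0, [1, 1, 0, 1])
```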

A Bayesian network (Jensen (2007) [14]) is a directed acyclic graph (DAG) used to model probabilistic relationships amongst a group of variables. In such a network variables are represented as nodes and arcs amongst nodes represent conditional dependencies amongst the connected nodes [14]. Although continuous variables are possible, we limit our scope in this research work to variables that can be defined discretely. Each node is assigned a conditional probability function that determines the probability that the variable represented by the node will take on a particular value from its state set. The probability function takes as input the values assigned to the variables of the nodes that have a parent-child relationship with the current node. The conditional probability function for a node is represented as a conditional probability table (CPT) [14] constructed at that node, specifying the probabilities of all possible combinations of values for the variable represented by the node and its parents. A CPT entry for a node represents the probability that its variable takes a particular value given that each of its parents has already been assigned a particular value. Two important aspects of Bayesian network modeling are parameter learning and inference.

Given a problem scenario to be modeled using a Bayesian network, some CPT entries may be calculated using prior available knowledge of the modeled system where as others need to be determined. The Parameter Learning [14] (or Training) aspect aims to find out more details about the relationships amongst represented variables by using the observed data to estimate value for the unknown CPT entries.

The most common task for Bayesian Networks is probabilistic inference. The term Generative Models (Jebara (2003) [27]) is frequently used when referring to Bayesian Networks as they model the uncertainty between different variables as a joint probability distribution giving the ability to artificially generate samples using different configurations for the involved variables.

Wu et al. in [15] suggest that a significant number of works have ignored, during QoS prediction, the QoS requirements requested by users in their service search queries. They claim that these requirements are important, especially when the provider offers a service at discrete QoS levels set out in the SLA, and that a simple aggregation technique such as an arithmetic mean does not in this case provide an accurate estimation of service capability. Wu et al. in [15] propose an approach where users' requirements are taken into account when making a QoS prediction. [15] models the dependencies between users' QoS requirements and the provider's service capability as a Bayesian network and uses fuzzy logic to make an inference about service capability. In [15] the QoS attributes have discrete values (low, medium, high) and the service capability (SC) has discrete values (poor, moderate, good, excellent). During service invocation, values for each of the QoS attributes are observed by monitoring tools and sent back to the central server at the end of the invocation. The problem is reformulated as follows: given values for each of the QoS requirements from the user's search query, predict the probabilities that the service capability takes on a particular value in its associated state space.

In [15] the Bayesian network is set up with one root node, which represents service capability, while the remaining nodes are leaves modeling the QoS attributes.

In [15]'s example network three QoS attributes are shown: Availability (AV), Reliability (RE) and Duration (DU). Operation Type (OP) is a special node which accounts for the possibility that a service could offer multiple operations, and hence has a discrete state space consisting of the offered operations. An entry of the CPT associated with a QoS attribute leaf node represents the conditional probability of the respective QoS attribute taking on a value in its state space given a value for SC, and is calculated, for example, as given by [15]:

P(AV = "High" | SC = "Excellent")

where the probability is calculated by [15] as P(AV = "High" | SC = "Excellent") = m / n, where m is the number of times both AV is requested to be high by the user and SC is assessed to be excellent after service invocation, and n is the total number of service invocations to date. The prior distribution is given by P(SC = "Excellent") = k / n, where k is the number of times SC is assessed to be excellent and n is the total number of service invocations. Once entries have been computed for all the CPTs associated with the QoS attribute leaf nodes, [15] claims that we can then predict the probability that the service capability takes on a value from its state space given the value for a QoS attribute, i.e. P(SC = "Excellent" | AV = "High"), using Bayes rule.
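The counting scheme and the Bayes-rule inversion can be illustrated with a toy invocation log. The records below are hypothetical, not from [15], and the conditional entries are computed as honest relative frequencies:

```python
# Toy invocation log: (AV level requested, SC level assessed) per invocation.
records = [("High", "Excellent"), ("High", "Excellent"),
           ("High", "Good"), ("Low", "Poor"), ("High", "Excellent")]
n = len(records)

def joint(av, sc):
    return sum(1 for a, s in records if a == av and s == sc) / n

def p_sc(sc):
    return sum(1 for _, s in records if s == sc) / n

def p_av(av):
    return sum(1 for a, _ in records if a == av) / n

def p_av_given_sc(av, sc):
    """CPT entry: conditional frequency of AV = av given SC = sc."""
    return joint(av, sc) / p_sc(sc)

def p_sc_given_av(sc, av):
    """Bayes rule: P(SC | AV) = P(AV | SC) P(SC) / P(AV)."""
    return p_av_given_sc(av, sc) * p_sc(sc) / p_av(av)
```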

[15] states that the proposed system learns continuously online, being trained by the monitored data sent back at the end of every invocation; this data is used to recompute and improve the CPT entries. What needs to be highlighted in the formulas used to compute the CPT entries is the notion of assessment of SC. The value of SC in its state space is computed in [15] from a set of compliance values, one per attribute, each computed as the normalized difference between the server-delivered and the user-requested QoS values for that attribute. [15] claims that a direct mapping from compliance values to SC values using a simple function is too coarse-grained, since small deviations from threshold compliance values would then assess the SC to be in a different category. As a solution, [15] proposes to use fuzzy logic and provides a fuzzy representation for both the compliance values and the service capability. In [15] the set of compliance values is passed through a set of fuzzy inference rules, each of which provides a value for SC if the antecedents of the rule are met. Where the antecedents of multiple rules are met by the compliance values, [15] suggests employing a Mamdani min-max inference approach to obtain the conclusion fuzzy set and a defuzzifying center-of-gravity method to obtain the crisp value.

A simpler approach employing Bayesian learning is presented by Li et al. in [11], which deals with latency and domain-specific QoS attributes. This work proposes an approach for predicting the probability that service providers will comply with the values for QoS-related attributes advertised in the Service Level Agreement (SLA) at the time the service was published. [11] introduces the calculation of a quantitative factor, Cre, using objective feedback given at the end of each service invocation by client-side agents, which are deemed trustworthy. [11] states that the state space of a feedback rating is limited to 1 or 0, representing compliance or violation respectively. The calculation of Cre is separated into two levels: CoA, expressing the probability of a single QoS attribute complying with the SLA, and CoWS, representing the overall probability of all QoS attributes meeting compliance levels.

[11] assumes that prior knowledge of previous values of CoA, represented as θ, is available and is given by the set {θ1, θ2, ...}. Using this assumption [11] claims that the number of times x a service complies with the SLA on a particular attribute over n consecutive invocations follows a binomial distribution b(n, θ), and that the prior can be taken to be a beta distribution (π(θ) ~ β(a, b)). According to Bayesian learning theory, if the prior distribution is a beta distribution then the posterior distribution is also a beta distribution. [11] uses this lemma to give the posterior distribution as h(θ|x) ~ β(a + x, n + b − x) and proposes the following formulas to estimate a and b from the mean θ̄ and the standard deviation δ calculated over the set of previous values of θ:

a = θ̄ × ( θ̄(1 − θ̄) / δ² − 1 )        b = (1 − θ̄) × ( θ̄(1 − θ̄) / δ² − 1 )

Finally [11] computes CoA as the posterior average estimation of θ, given by the formula:

CoA = θ̂E = (a + x) / (a + b + n)
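Putting the two formulas together, here is a minimal sketch of the CoA computation; the function names are assumptions, and the beta-parameter estimation is the standard method of moments:

```python
# Sketch of [11]'s CoA: method-of-moments estimates of the Beta prior's
# parameters from the mean and standard deviation of historical theta
# values, then the posterior mean after x compliances in n invocations.
def beta_params(mean, std):
    """a, b such that Beta(a, b) has the given mean and standard deviation."""
    common = mean * (1 - mean) / (std ** 2) - 1
    return mean * common, (1 - mean) * common

def coa(mean, std, x, n):
    """Posterior mean of Beta(a + x, b + n - x), i.e. (a + x) / (a + b + n)."""
    a, b = beta_params(mean, std)
    return (a + x) / (a + b + n)
```

For example, a history with mean 0.5 and standard deviation 0.1 gives a = b = 12, and observing 8 compliances in 10 invocations then yields CoA = 20/34 ≈ 0.588.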

CoWS is computed as the weighted sum of the CoA values computed for each of the QoS attributes. Weights in the computation represent service user preferences where heavier weights are assigned to CoA values that represent attributes that are of higher priority to a subjective user.

A recent work [1] by Vu & Aberer (2007) employs Bayesian Network Modeling for the purposes of QoS prediction. The motivating problem can be explained in terms of the example scenario presented in the work. A data hosting service has domain-dependent QoS attributes: Maximum number of concurrent downloads M, maximum download speed D and maximum upload speed U. The maximum values of these domain-dependent attributes are determined by the contextual factors subscription price P and the type of internet connection I. Each of the variables just mentioned is discrete-valued and has an associated finite state space. Service Users are allowed to provide a feedback rating on a subset of the set of QoS attributes and user trustworthiness can itself be categorized into a finite state space. In the example scenario presented by [1] a trusted user provides feedback just on U and another unknown user, whose trustworthiness is not known, provides feedback on M, D and U.

[1] argues that little emphasis has been placed by research works on certain areas in the field of QoS prediction. In particular it claims that service users are more interested in the domain-specific attributes of QoS than in those that are similar across all domains. Firstly it states that client contextual factors and related constraints play an important role in the quality of service perceived by users, and hence dependencies between contextual factors and quality attributes must be carefully taken into account; [1] claims this is a complex issue to deal with, as contextual factors influence quality attributes in a non-deterministic fashion. Secondly it states that not all users will provide a QoS rating for all QoS attributes, so a mechanism is needed to accommodate such user behavior. Thirdly it identifies the need for a measure to handle feedback from users with malicious motives. To address these issues [1] proposes to model the system as a fixed-structure Bayesian network and use it to predict the probability that the service will offer a particular set of values for the quality attributes given a set of values for the contextual factors.

The approach can be formulated into three distinct steps:

Step 1: Generating a QoS model of the service.

Step 2: Training the model based on feedback from users. This step involves estimating the remaining CPT entries from Step 1 using user feedback and identifying malicious behavior.

Step 3: Making a prediction based on the trained model.

The first step involves setting up a DAG to model the different entities (quality attributes, contextual factors, behaviors of untrusted users and their feedbacks) and specifying the relationships between them. According to [1], in the Bayesian network model for the example scenario presented above, the contextual factors P and I are parent nodes of the quality attributes M, D and U, and each untrusted user's trustworthiness node is a parent of the nodes representing that user's feedback ratings.

Once a DAG modeling the system has been created Step 1 aims to fill the maximum possible number of CPT entries for nodes in the DAG. [1] claims that prior knowledge of the problem scenario enables us to create a knowledge base KB of well-known constraints amongst the different nodes representing the system variables. Each constraint is then used by [1] to fill out some CPT entries. The formal definition of a QoS-generative model given by [1] is

"The QoS generative model of a service is a Bayesian network <U, D, θ> with node set U, edge set D, and parameter θ. The tuple <U, D> is a directed acyclic graph where D = {<p, x>, x ∈ U, p ∈ πx} defines probabilistic dependencies among nodes in U. The parameter θ is the set of unknown conditional probability table (CPT) entries Pr(x*| πx*) of each node x ∈ U."

In this definition Pr(x*| πx*) refers to the probability that x takes on a value from its state space given that its parents πx have taken particular values from their respective state spaces. As stated above, the purpose of the second step is to analyze feedback data with the goal of identifying dishonest behavior and approximating the remaining CPT entries. In the first phase of Step 2, dishonest users are identified as those whose feedback ratings violate domain constraints in the KB. The proposed system handles violations by eliminating the respective ratings and removing the nodes representing the trustworthiness of the user and his/her feedbacks from the DAG. In the example scenario mentioned above, if the unknown user behaved dishonestly then b1, M1, D1 and U1 would be removed.

Before undergoing the estimation process, to account for the possibility that a feedback rating may cover only a subset of the quality attributes and contextual factors, the remaining feedback ratings are grouped into samples, where each sample consists of ratings recorded within a specific time frame. For estimation [1] proposes to use Maximum Likelihood Estimation (MLE): it constructs a log-likelihood function LL(θ) and selects the values of θ that maximize the likelihood of the collected set of samples, the training set. [1] notes that QoS-generative models can become very complex and samples may contain missing values, so traditional techniques cannot be used. In light of this, [1] proposes an algorithm employing an Expectation-Maximization technique (Neal & Hinton (1999) [23]) for estimation. This technique requires initial values for the unknown CPT entries, and [1] proposes to use subjective prior beliefs to provide the initialization set.

In the third step [1] claims that once CPT entries have been calculated or estimated and given values for the client's contextual factors, the problem of computing the required probabilities is a simple task and proposes to make a prediction using Junction Tree Algorithm (JTA) (Huang & Darwiche (1996) [24]).

5.2.2 Markov Models

A Stochastic process (Medhi (1994) [26]) can be defined as a non-deterministic process where future evolution can take different paths and each path is associated with a probability measure i.e. given a starting state, the next state to which a transition will be made is non-deterministic and routes to all possible next states are assigned a probability called the transition probability.

A Markov process [26] is a type of stochastic process where the next state is only determined by the present state and is independent of all visited states in the past i.e. it is memory-less.

A Markov Chain [26] is a type of Markov process where the number of states is finite.

A Discrete-Time Markov Chain [26] is a type of Markov Chain where state changes are only made at discrete time intervals.

A Semi-Markov Process [26] is a type of Discrete-Time Markov Chain where the duration of time spent in a particular state is itself non-deterministic. In this type of process the transition probability is also dependent on the duration spent in a particular state (Dai et al. (2009) [6]).

For a semi-Markov process, the duration d spent in a state i before a transition to the following state j follows a distribution Fij(d) [6]. Hence the distribution of the time spent in a particular state i is given by [6]: Hi(d) = Σj Fij(d) × Pij

Average duration at a state i is given by µi [6]. For a system of j states a stationary distribution exists and is given by [π1, π2, ..., πj] [6].

The steady state occupancy probability [6] at a state i is given by:

Pi = πi µi / Σj πj µj
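This standard semi-Markov lemma is straightforward to compute; the sketch below takes the embedded-chain stationary distribution and the mean sojourn times as inputs, and the example numbers are illustrative only:

```python
# Steady-state occupancy for a semi-Markov process: weight each state's
# embedded-chain stationary probability pi_i by its mean sojourn time mu_i
# and renormalize.
def occupancy(pi, mu):
    total = sum(p * m for p, m in zip(pi, mu))
    return [p * m / total for p, m in zip(pi, mu)]

# Three states in the order (Qualified, Soft Damage, Hard Damage);
# probabilities and mean durations are made up for illustration.
probs = occupancy([0.7, 0.2, 0.1], [10.0, 2.0, 30.0])
```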

An approach for predicting the change in data transmission speed (1 / latency) is presented by Dai et al. (2009) in [6] which is a research work on QoS-based Web Service Composition. Web Service Composition based on QoS aims to select component services in such a manner that the composite service meets user QoS constraints. Unlike previously presented works which aimed at predicting a particular value for a QoS attribute, this approach, given an estimated QoS value for a component service at the time of composite service invocation and a maximum allowed deviation value, provides a mechanism to predict whether a component service will incur a QoS violation when the component service is actually invoked in the sequence specified for the composite service. A violation is defined as the QoS level being below the specified threshold. In this work only the data transmission speed is studied.

Dai et al. (2009) claim that the data transmission speed exhibits stochastic behavior in the sense that speed can change frequently giving high values during periods of low network utilization and low values either for short periods due to network congestion or for longer periods due to network hardware failure which would require manual intervention. With this behavior in mind, Dai et al. (2009) in [6] propose the modeling of data transmission speed as a Semi-Markov Process having the following states:

a) Qualified: data transmission speed >= threshold

b) Soft Damage: 0 < data transmission speed < threshold

c) Hard Damage: data transmission speed = 0

The threshold is evaluated as the data transmission speed value recorded by 70% of the observations.
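One plausible reading of this rule is that the threshold is the speed level met or exceeded by 70% of the recorded observations, i.e. their 30th percentile; the sketch below encodes that assumption together with the three-state classification:

```python
def threshold(speeds, frac=0.7):
    """Speed level met or exceeded by `frac` of the observations
    (interpretation assumed; [6]'s exact rule is not reproduced here)."""
    s = sorted(speeds)
    return s[int((1 - frac) * len(s))]

def state(v, thr):
    """Classify a speed observation into the three states of [6]."""
    if v >= thr:
        return "Qualified"
    return "Soft Damage" if v > 0 else "Hard Damage"
```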

The algorithm proposed in [6] takes as input a set of recorded observations and ve. ve is calculated as the minimum allowed data transmission speed. A QoS observation in the observation set is specified as the 3-tuple qc = <tob, v, stob> where tob signifies the observation timestamp, v signifies the data transmission speed and stob represents the state at that timestamp.

FVi(v) is presented as the distribution of the data transmission speed while being in state i.

Using the above information [6] reformulates the problem as follows: given the current time ti, current state si and the duration d that the service has been in si, calculate the probability that the data transmission speed vf is greater than ve at a future time tf where the state which vf falls into is j. In such a scenario a low computed probability would indicate a higher likelihood of a QoS violation and the composition algorithm can take appropriate measures.

To evaluate the probability [6] first distinguishes between two cases where the first case assumes state i is the same as state j and the second case assumes that both states are unequal. Based on this distinction, for each case [6] derives a probability function for vf being greater than ve. Probabilities for each case are computed by analyzing all possible transition scenarios between ti and tf that would allow satisfactory results at tf, formulating a probability for each scenario and finally calculating the probability of a case as the aggregation of the probabilities assigned to each scenario. Probabilities for both cases depend on the distributions Hi(d), FVi(v) and the steady-state occupancy probabilities, all of which are determined by a combination of lemmas for Semi-Markov processes[6] and probability theory.

6. Related Recent Work

A very similar motivating problem is addressed by Deng & Xing (2009) in [5]. A subset of this research work aims at predicting the QoS level of a Web Service Group, which is a group of services having the same functional requirements. A hierarchical merging clustering scheme is used to accomplish this objective. As input the algorithm takes a QoS vector for each web service, which is considered to be the objective rating of the service. Initially n clusters are created, one for each service in the group. Once the clusters have been created, the algorithm iteratively merges them: in each iteration the two clusters having the minimum distance are merged. The "Maximum Similar Degree" (MSR) is defined in the research work as the ratio of the size of the largest cluster to the total number of services in a group. The MSR is computed at the end of each iteration, and the iteration process completes when the calculated MSR is greater than a pre-defined threshold. Finally, the cluster with the largest size is identified and each element of the QoS vector is predicted as the arithmetic mean of the corresponding elements of the QoS vectors in that cluster.
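The merging loop can be sketched as follows. The use of Euclidean distance between cluster centroids is an assumption, as [5]'s exact distance measure is not reproduced here:

```python
# Hierarchical merging sketch after [5]: one cluster per QoS vector,
# repeatedly merge the two closest clusters, stop once the MSR (largest
# cluster size / total vectors) exceeds a threshold, then predict each
# QoS element as the mean over the largest cluster.
from math import dist

def centroid(cluster):
    k = len(cluster[0])
    return [sum(v[i] for v in cluster) / len(cluster) for i in range(k)]

def predict_qos(vectors, msr_threshold=0.6):
    clusters = [[v] for v in vectors]
    n = len(vectors)
    while len(clusters) > 1:
        msr = max(len(c) for c in clusters) / n
        if msr > msr_threshold:
            break
        # merge the pair of clusters with minimum centroid distance
        a, b = min(((x, y) for x in range(len(clusters))
                           for y in range(x + 1, len(clusters))),
                   key=lambda p: dist(centroid(clusters[p[0]]),
                                      centroid(clusters[p[1]])))
        clusters[a].extend(clusters.pop(b))
    biggest = max(clusters, key=len)
    return centroid(biggest)       # element-wise mean of the largest cluster
```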

Porting this over to our scenario where a rating for a particular service needs to be made using a set of previous ratings, individual clusters could have been created for each rating and the exact same approach could have been applied specifying a threshold value for the comparison with the MSR. The MSR would have been calculated as the ratio of the size of the largest cluster over the total number of historical ratings available.

Songhua et al. (2009) in [7] offer a solution to a similar problem where users need to be provided with QoS predictions for connections to different servers hosting replicated copies of web content they desire. Songhua et al. (2009) claim that when users search for web content they are supplied by the search engine with a list of providers, each of which can offer a connection with a possibly variable level of service quality, in particular the bandwidth and the connection time. Such variations in service quality can arise because the user and the provider subscribe to different ISPs or are part of different sub-networks. Considering this issue, Songhua et al. (2009) in [7] propose a solution that takes account of the network connection conditions (i.e. the IP address, the subnet address, the ISP) of both the user and the provider to make a prediction of the bandwidth and connection-time QoS attributes for a connection between the two parties. In [7] the relationship ψ between the QoS attributes and the network connection conditions of the user and the provider is modeled as a function:

ψ (P(Ci),P(Sj)) → (ti,j , bi,j)

Here P(Ci) and P(Sj) represent the network connection conditions of the user and the provider respectively, while ti,j and bi,j represent the connection time and the bandwidth of the connection between the two parties.

[7] is motivated by the observation that if we can learn this relationship, we can use similarities between two connections to predict the QoS of one connection based on the QoS experienced on the other. To learn the relationship from a training set of input/output pairs recorded from previous interactions between different users and providers, [7] proposes to employ a group of neural networks. Each neural network in the group is responsible for learning the relationship exhibited by the data in one category of training data, and is specified in [7] as a multilayer perceptron with two hidden layers trained by back-propagation over a thousand iterations. Initially there is only one neural network in the group and one category of data. Training data passed through the neural network is divided according to the error produced when the network is used for prediction. Error ratings are discretized, using a frequency-based method, into one of a fixed number of error levels. Based on their discretized error levels, error ratings are clustered using a k-nearest clustering algorithm. As a result of the clustering process the training set may be divided into smaller categories, and for each new category a new neural network is introduced to the group. This process executes over the training data set in an iterative fashion until the introduction of a new neural network no longer reduces the errors produced beyond a certain threshold. Once the system has been trained, [7] employs a decision-tree-based algorithm to decide which neural network in the group to utilize for predicting values of the QoS attributes given a set of values for the network connection conditions of the user and the provider.


7. Conclusion

In this paper we presented the different techniques that have been employed in predicting the Quality of Service (QoS) that a user will incur on an unused service. Such a prediction is made from historical QoS data gathered from previous service users. Techniques were divided into data mining-based and simulation-based approaches. In addition, we provided a comprehensive listing of the QoS-related attributes used across the different works in the subject area. An issue that still has not been incorporated in a significant number of research works is the different perspectives from which QoS can be viewed: most works simplify the issue by considering only a single perspective, and in this paper we used the life-cycle of a web service to identify the possible perspectives. Finally, we provided an overview of the trust factor involved with user data and how it can affect the prediction process, and analyzed some of the recent works addressing this matter. In summary, we have presented the state of the art in prediction techniques along with the surrounding issues that would need to be addressed by a globally implemented prediction system.


[1] Vu, L.-H., Aberer, K. (2009). Towards Probabilistic Estimation of Quality of Online Services. Proceedings of the 2009 International Conference on Web Services, ICWS 2009, pg 99-106.

[2] Zheng, Z., Ma, H., Lyu, M. R., King, I. (2009). WSRec: A Collaborative Filtering Based Web Service Recommender System. Proceedings of the 2009 IEEE International Conference on Web Services, ICWS 2009, pg 437-444.

[3] Malik, Z., Bouguettaya, A. (2009). RateWeb: Reputation Assessment for Trust Establishment among Web Services. VLDB Journal, 18 (4), pg 885-911.