This paper describes a conceptual risk-based decision support methodology for the operation of water distribution systems (WDS). More specifically, the methodology focuses on providing near real-time support to network operators to improve their response to failure conditions, such as pipe bursts, equipment failures, etc. The risk related to the impact of failures in WDS is used to guide, in near real-time, the operator's approach and response to the failures. The impact metric of each failure is formed by a value tree comprising several basic impact factors affecting either the water utility or the customers. The impact assessment is undertaken by using a pressure-driven hydraulic model coupled with a Geographic Information System (GIS) considering explicitly the vulnerability of individual types of customers within the areas (potentially) affected by the failures under investigation. An intervention manager is also developed to assist the operator in testing the appropriateness of proposed interventions/responses to each failure. The intervention manager provides suggestions for improved courses of action determined through offline analyses of the system, stored in an interventions knowledge base in the form of ordered sets. The reduction of risk for each intervention is evaluated and compared to a “do nothing” alternative explicitly supporting risk-aware, near real-time decision making.
Water utilities all over the world are obliged to supply potable water of ever increasing quality and meet minimum service requirements in terms of pressure and quantity of delivered water. In addition, they are required to minimise their carbon footprint and operate with economic efficiency. Operation of water distribution systems (WDS) is thus a very complex process that, until recently, relied extensively on the experience and expertise of the operators in trying to reach an optimum balance between the above conflicting objectives.
With recent improvements in automation, monitoring technologies and computing power, optimisation methods are being introduced into the operation of WDS. Ulanicki et al. (2000) proposed a methodology for centralised and decentralised control of pressure reducing valves (PRV) to reduce leakage. Bounds et al. (2003) applied an optimal pump-scheduling algorithm based on automatic network skeletonisation to minimise energy costs. A shift in the operation of WDS is evident in the attempts of water utilities to move from offline control rules, typically developed heuristically over time, towards near real-time control based on Supervisory Control And Data Acquisition (SCADA) and data logging technologies. Jamieson et al. (2007) developed a tool combining genetic algorithms (GA), artificial neural networks (ANN), demand forecasting and SCADA to provide real-time, near-optimal control of pumps and valves in a WDS.
In terms of unplanned interruptions, similar trends can be observed where efforts are made to resolve failures in a WDS before they start to affect the customers. Some of these challenges are being addressed by the EPSRC-funded project NEPTUNE (Savic et al., 2008) aimed at improvements in energy efficiency of water delivery, better customer service and reduction of leakage. As part of the project an integrated decision support system (DSS) is being developed to enhance the understanding of the behaviour of WDS under normal conditions and also, more importantly, to facilitate support for operators' actions under abnormal conditions, i.e. when failures occur.
Research in the field of WDS operation under failure conditions has, thus far, been restricted primarily to strategic reliability analysis (e.g. Xu and Goulter, 1999); hence methodologies to support the operators in these situations are currently lacking. It has been recognised that in order to represent the failure more realistically, some measure of “risk” - considering both the probability and the impact of a failure - should be used instead of a purely frequentist approach focusing only on the occurrence of the failure (Kapelan et al., 2006). The concept of risk is becoming more regularly employed in water systems design and rehabilitation; however, it has not yet been incorporated within the control room environment.
Expert systems, although very popular in the 1970s and 1980s, have failed to deliver their promise of fully automated control rooms (Bell, 1985). At the same time, a significant increase in data availability is being observed as a result of the development and deployment of sensors monitoring (inter alia) pressure and flow in a WDS. This increase of data coming into the control room is starting to overwhelm the operators, particularly for large systems. Attention is therefore now paid to supporting operators (rather than substituting them) by providing them with relevant information (extracted from the massive data inflows), in a readily comprehensible form, on demand, while allowing the final decision(s) to be made by the operators themselves (Angehrn and Jelassi, 1994).
In this paper, an overview of a DSS for the operation of WDS under failure conditions is presented. The major components of the system are described with a focus on the risk and intervention management components that form the core of the system. The discussion includes a presentation of the current state of the associated research, an identification of key challenges faced and suggestions for future work and research directions.
- WDS OPERATION
Dealing with failure conditions is one of the primary functions of WDS operators. However, the process of discovering that the WDS is not functioning normally, investigating the problems and deciding on how to deal with them is still difficult, even with the recent progress in monitoring and communication technologies. Data coming from sensors and notifications from customers in the form of phone calls are the two main indicators that a problem has occurred in a WDS that warrants further investigation and possibly repairs. The operator then has to check and process information coming from various systems in order to assess whether the perceived problem in the network is real, rather than a consequence of malfunctioning monitoring and communication devices. The investigation depends strongly on the internal business processes of the particular water utility but frequently requires a field technician to be sent out to visually inspect the situation at a particular location and confirm (or not) the potential problem. A simplified work flow capturing the steps involved in the operation of WDS when an anomaly is detected
Furthermore, in situations where several alarms are occurring simultaneously, the operator is forced to prioritise both investigative and intervention actions with dynamically changing information about the potential incidents. The purpose of an integrated DSS is to filter and generate alarms in a more intelligent fashion, to partially automate the process of investigation by mimicking the behaviour of an operator (while taking into account the potential risks and threats associated with an alarm) and to assist in the prioritisation of both investigative and intervention actions. A DSS which operates on the basis of risk assessment of failure conditions could comprise several fundamental modules whose interaction is
- The Detector module is responsible for recognition of anomalies in time series data and customer contacts. When a sufficient level of confidence is gained that an anomaly is a true event, an alarm is raised to notify the operator. The detector also identifies a set of potential incidents that could be the cause of a particular anomaly.
- The Risk Evaluator (RE) processes the inputs from the detector and assesses the risks caused by potential incidents (based on the likelihood of occurrence and potential impact on customers) also considering the operator's attitude towards risk. It then proceeds to aggregate these partial risks in order to calculate a single measure reflecting the overall risk of an anomaly, which is then used to prioritise the alarm it triggered.
- The Intervention Manager (IM) generates a set of possible responses to a particular incident. In addition to proposing pre-generated solutions (from a knowledge base), it also enables the operators to develop their own solutions by modifying existing ones or by creating a completely new response, which is then stored in the knowledge base for future use. It interacts closely with the RE to estimate the reduction of risk after the implementation of a chosen response.
- The Graphical User Interface (GUI) is used by the operator to interact with the DSS, prioritise actions, interactively access information coming from the field and to explore alternatives showing how to best respond to failure conditions. It further serves as a means of presenting spatial-temporal data in the form of risk maps generated by the RE corresponding to levels of risk of a particular incident.
The rest of the paper focuses on the description of the RE and IM modules, which form the core of the proposed DSS architecture.
- RISK EVALUATOR
For the purpose of this work, risk is defined as a set of triplets, each comprising a risk scenario, a probability and an impact (Kaplan and Garrick, 1981). The task of the RE is to evaluate the probability of occurrence of a particular potential incident, under a particular risk scenario (defined below), and to estimate its impact over a specified period of time (typically 24 hours). When an intervention is proposed to mitigate the impact of a particular incident, the RE is also used to compute the resulting reduction of risk (i.e. reduction of the impact) for the same (or alternative) risk scenarios.
The estimation of risk associated with an alarm for the purpose of prioritisation of actions is shown in Figure 3. The risk is estimated by generating a set of the most likely causes (potential incidents) of the anomaly, calculating the probability of occurrence and impact of each of the potential incidents within the set and aggregating the overall risk of the set - for a given risk scenario. To incorporate the operator's attitude towards risk into the process of prioritising alarms, an aggregation function based on Yager's ordered weighted averaging (OWA) operators is used (Yager, 1992), expressing the operator's level of risk-aversion.
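The aggregation step can be illustrated with a minimal sketch. The paper does not specify how the OWA weights are derived or how probability and impact are combined per incident, so two assumptions are made here purely for illustration: each partial risk is taken as probability times impact, and the weights are generated from a power-law quantifier Q(r) = r^alpha, a common way of parameterising OWA operators. The names `owa_weights`, `owa`, `alarm_risk` and the parameter `alpha` are hypothetical.

```python
def owa_weights(n, alpha):
    """OWA weights from the quantifier Q(r) = r**alpha (assumed, not from the paper).

    alpha < 1 puts more weight on the largest risks (risk-averse operator),
    alpha = 1 gives the plain average, alpha > 1 emphasises the smallest risks.
    """
    return [(i / n) ** alpha - ((i - 1) / n) ** alpha for i in range(1, n + 1)]


def owa(values, weights):
    """Ordered weighted average: weights apply to values sorted in descending order."""
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


def alarm_risk(incidents, alpha):
    """Aggregate the partial risks of all potential incidents behind one alarm.

    incidents: list of (probability, impact) pairs; the product used as the
    partial risk is an illustrative assumption.
    """
    risks = [p * impact for p, impact in incidents]
    return owa(risks, owa_weights(len(risks), alpha))


# Hypothetical potential incidents: (probability, impact score)
incidents = [(0.5, 10.0), (0.25, 8.0), (0.25, 4.0)]
risk_averse = alarm_risk(incidents, alpha=0.2)
risk_neutral = alarm_risk(incidents, alpha=1.0)
```

With this parameterisation a single number (`alpha`) captures the operator's attitude: a risk-averse setting pulls the alarm score towards its worst-case potential incident, while a neutral setting treats all potential incidents equally.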
Once the priority of an alarm has been established using the means described above, the operator can inspect the potential incidents used to calculate the risk score of the alarm. The real incident (cause) which has triggered the investigation should (ideally) be a member of the set of potential incidents and have a higher probability of occurrence than any other potential incident (cause).
In this work, the “risk scenario” is defined as the ensemble of: (1) a potential (i.e. assumed) incident (in terms of its type, location, timing, etc.), (2) the known initial, i.e. current, network conditions (pressures/flows, tank levels, statuses of automatically regulating devices, etc.) and (3) the assumed future network conditions (e.g. forecasted nodal demands and assumed statuses of manually controlled devices) over some risk analysis horizon (e.g. the next 24 hours). The “do nothing” impact of a potential incident on different stakeholders (water utility and customers - see below) can then be evaluated over this time horizon by utilising the relevant pressure-driven hydraulic model (e.g. impact measured in terms of water not delivered, etc.). Note that the risk scenario can potentially be used as a tool for handling various uncertainties inherent in our understanding and modelling of the actual WDS (e.g. uncertain forecasted demands).
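As a rough illustration of how a “do nothing” impact such as water not delivered might be accumulated over the risk analysis horizon, the sketch below uses a square-root pressure-demand relationship that is common in pressure-driven analysis. This is an assumption made for illustration: the formulation in the pressure-driven hydraulic model used in this work may differ, and the nodal pressures would in practice come from that model rather than being supplied by hand. The class `RiskScenario`, the functions and the threshold values `p_min` and `p_req` are hypothetical.

```python
from dataclasses import dataclass, field
from math import sqrt


@dataclass
class RiskScenario:
    """The three parts of a risk scenario as described in the text."""
    incident: str                                         # assumed incident (type, location, timing)
    initial_state: dict = field(default_factory=dict)     # known current network conditions
    demand_forecast: list = field(default_factory=list)   # forecast demand per time step


def delivered_fraction(pressure, p_min=0.0, p_req=20.0):
    """Fraction of demand delivered at a node for a given pressure head (m).

    Assumed relationship: nothing below p_min, full demand above p_req,
    square-root transition in between.
    """
    if pressure <= p_min:
        return 0.0
    if pressure >= p_req:
        return 1.0
    return sqrt((pressure - p_min) / (p_req - p_min))


def water_not_delivered(demands, pressures, step_hours=1.0):
    """Sum the undelivered volume over the horizon for one node."""
    return sum(
        d * (1.0 - delivered_fraction(p)) * step_hours
        for d, p in zip(demands, pressures)
    )


# Hypothetical three-hour horizon: demand in m3/h, simulated pressure head in m
scenario = RiskScenario(incident="pipe burst", demand_forecast=[10.0, 10.0, 10.0])
shortfall = water_not_delivered(scenario.demand_forecast, [25.0, 5.0, 0.0])
```

Evaluating the same function with pressures simulated after a proposed intervention would give the reduced impact against which the “do nothing” value is compared.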
Various types of incidents can occur in WDS (e.g. water quality problems, deliberate acts of terrorism, hydraulic failures, etc.). In this work, of primary concern are the following incidents: pipe bursts, equipment failures and power outages.
In the past, researchers have focused primarily on the detection of anomalies in pressure and flow data obtained from the network (Mounce et al., 2002). The problem of identification and location of a particular incident causing an anomaly is, however, far from trivial. The correct identification of incidents causing alarms is fundamental for the success of a DSS such as the one developed here and is further complicated by an incomplete knowledge of the system behaviour. This lack of information is due to, for example, the accuracy of measurements, the calibration of models, stochastic water consumption and ongoing maintenance work. More often than not, there is a need to consult several sources of information, based on different data and approaches (from asset data, to real-time data, to customer calls). Several such methodologies are being developed within the NEPTUNE project; ultimately, however, their outputs need to be combined and their results reconciled in order to improve situation knowledge and to handle uncertainty and potential conflict.
Dempster-Shafer (D-S) theory of evidence (Shafer, 1976) has proven to be a powerful method for dealing with uncertainty and has already been successfully applied in many other industries (Sentz and Ferson, 2002) and also in the water sector (Sadiq et al., 2006). In this work it is utilised to combine probabilities of correct identification of a potential incident, generated by several independent bodies of evidence, and to compute levels of belief and plausibility (i.e. lower and upper bounds for these probabilities). Furthermore, the credibility (w1, w2,..., wN) of each body of evidence is dynamically adjusted based on the quality of evidence it provides and also its performance in terms of its success rate of correct identifications (e.g. using entropy and specificity measures).
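The combination of evidence can be sketched as follows. The sketch applies Shafer's classical discounting to represent the credibility of each body of evidence, followed by Dempster's rule of combination; whether the DSS uses exactly these two operations is not stated in the text, so they should be read as an illustrative assumption. The incident names, mass values and credibilities are hypothetical.

```python
from itertools import product


def discount(mass, credibility, frame):
    """Shafer discounting: scale each mass by the credibility w and
    move the remaining 1 - w onto the whole frame (total ignorance)."""
    out = {s: credibility * m for s, m in mass.items() if s != frame}
    out[frame] = 1.0 - credibility + credibility * mass.get(frame, 0.0)
    return out


def combine(m1, m2):
    """Dempster's rule: multiply masses of intersecting focal sets and
    renormalise by the non-conflicting mass."""
    joint, conflict = {}, 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            joint[common] = joint.get(common, 0.0) + ma * mb
        else:
            conflict += ma * mb
    if 1.0 - conflict < 1e-12:
        raise ValueError("bodies of evidence are in total conflict")
    return {s: m / (1.0 - conflict) for s, m in joint.items()}


def belief(mass, hypothesis):
    """Lower bound: total mass of focal sets contained in the hypothesis."""
    return sum(m for s, m in mass.items() if s <= hypothesis)


def plausibility(mass, hypothesis):
    """Upper bound: total mass of focal sets that intersect the hypothesis."""
    return sum(m for s, m in mass.items() if s & hypothesis)


# Hypothetical frame of potential incidents and two bodies of evidence
FRAME = frozenset({"burst", "pump_failure", "sensor_fault"})
BURST = frozenset({"burst"})
flow_evidence = {BURST: 0.6, FRAME: 0.4}
call_evidence = {frozenset({"burst", "pump_failure"}): 0.7, FRAME: 0.3}

combined = combine(
    discount(flow_evidence, 0.9, FRAME),   # credibility w1 = 0.9 (assumed)
    discount(call_evidence, 0.8, FRAME),   # credibility w2 = 0.8 (assumed)
)
bounds = (belief(combined, BURST), plausibility(combined, BURST))
```

The pair `bounds` gives the belief and plausibility interval for the hypothesis that the anomaly is caused by a burst. Because `combine` is applied pairwise, a new body of evidence arriving later can be folded into the existing combined mass in the same way.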
Apart from the static probability based on the strategic asset data analysis (e.g. burst frequencies), all the other basic probabilities, as shown in Figure 4, generated by (near) real-time sub-systems are time dependent and can dynamically change as new evidence becomes available. In the case of the probability of identification of an incident, the updating capability of D-S theory is effectively used to incorporate new evidence in order to reflect the current state of knowledge of the system. The updating process will utilise new data obtained from the WDS in order to increase or decrease the belief that a particular incident is the true cause of the problem.
Estimating the impact of WDS failure is complex since it involves social aspects and can be perceived differently by each stakeholder. Any disturbance in water supply can cause inconvenience to the customers in terms of low pressure or no water, interruption to industrial customers, damage to properties and, ultimately, loss of life in the case of fire (Filion et al., 2007). The impact model employed in this research builds upon a list of basic impact factors (i.e. water and energy losses, supply interruptions, low pressure problems, discolouration and damage to third parties) as shown in Figure 5. The impact factors have been classified into two broad categories representing the parties of main interest in this research to form a value tree similar to Michaud and Apostolakis (2006).
The first category of impact factors affects, directly or indirectly, the water utility and the other affects the customers. The impact of failures (potential incidents) is simulated using a pressure-driven version of EPANET (Rossman, 2007) and a GIS is applied to relate the physical effects of failures to the customers. GIS has been suggested as a powerful visualisation tool for water resources problems, particularly suitable for use in DSS applications (Watkins and McKinney, 1995). However, combining hydraulic models with a GIS is not straightforward and presents many difficulties and challenges. The lack of correspondence between hydraulic models and a GIS stems primarily from the different purposes of the two. A GIS is meant to serve as a spatial database, whereas a model is focused on reproducing the hydraulics of the system and thus the pipe network is frequently simplified (skeletonised). Although hydraulic models are often created based on available GIS asset data and customer records, the reverse process of correlating model elements (e.g. pipes) with those in the GIS and assigning customers to demand nodes has been found challenging and introduces further uncertainty that needs to be reflected in the impact assessment (i.e. as part of the risk scenario introduced before).
Rather than calculating the impact of a failure at the time of detection, the impact model estimates the development of the incident over a specified period of time using demand forecasts to predict future water consumption. Water utilities in the UK are obliged to report their performance to the Water Services Regulation Authority (OFWAT) on a yearly basis (OFWAT, 2008). Some of the indicators monitored by the regulator consider the quality of service provided by the water utility. The DG2 (low pressure) and DG3 (interruptions) indicators, although important for the water utility, do not consider the character and sensitivity of individual customers and are thus unsatisfactory for a comprehensive impact assessment. Customers in this work are hence classified into the following groups:
- commercial (shops, businesses, etc.),
- industrial (factories, etc.),
- critical (hospitals, schools and other vulnerable customers) and
- sub-zones representing whole DMAs whose supply depends on an affected network and for which service could thus be compromised.
Each type of customer is assigned a particular weight reflecting its criticality as it is perceived by the operators and management of a particular water utility. The weights reflecting the criticality of specific customer groups, as well as the weights indicating the importance of impact factors and impact categories, as shown in Figure 5, will be obtained from industrial partners using the Analytic Hierarchy Process (Saaty, 1980).
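To make the weighting step concrete, the sketch below derives priority weights from a pairwise comparison matrix and rolls leaf scores up a weighted value tree. It uses the row geometric mean, a widely used approximation of AHP priorities that coincides with the principal-eigenvector result when the matrix is perfectly consistent; it is not necessarily the procedure that will be applied with the industrial partners. The comparison values, the leaf scores and the tree layout are hypothetical and do not reproduce Figure 5.

```python
from math import prod


def ahp_weights(pairwise):
    """Approximate AHP priorities with the row geometric mean method.

    pairwise[i][j] states how much more important item i is than item j
    (reciprocal matrix, pairwise[j][i] == 1 / pairwise[i][j]).
    """
    n = len(pairwise)
    geometric_means = [prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geometric_means)
    return [g / total for g in geometric_means]


def tree_score(node):
    """Evaluate a value tree: a leaf is a normalised score in [0, 1],
    an inner node is a list of (weight, child) pairs."""
    if isinstance(node, (int, float)):
        return float(node)
    return sum(weight * tree_score(child) for weight, child in node)


# Hypothetical judgements for the customer groups listed above
groups = ["commercial", "industrial", "critical", "sub-zones"]
judgements = [
    [1.0, 1 / 2, 1 / 5, 1 / 3],
    [2.0, 1.0, 1 / 4, 1 / 2],
    [5.0, 4.0, 1.0, 2.0],
    [3.0, 2.0, 1 / 2, 1.0],
]
group_weights = dict(zip(groups, ahp_weights(judgements)))

# Hypothetical normalised impact per group for one incident
affected = {"commercial": 0.3, "industrial": 0.0, "critical": 1.0, "sub-zones": 0.0}
customer_impact = tree_score([(group_weights[g], affected[g]) for g in groups])
```

Because `tree_score` is recursive, the same function can evaluate deeper trees in which impact categories contain impact factors, each with its own weight.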
The impacts of each incident are presented to the operator using the GIS, visually indicating the spatial scale of the impact and the number and nature of affected customers. Figure 6 is a sample screen of the envisaged DSS showing the impact of a pipe burst at peak hour affecting a part of a DMA. The DSS is further able to display such impact maps for any time within a 24-hour window beginning from the time at which an alarm was raised.
The impact model in its current form is not fully developed and the impact factors displayed using shading in Figure 5 (i.e. damage to third parties and discolouration) have not yet been incorporated. It is, however, envisaged that these will be included in the next stage of the project, and that the existing set of basic impact factors and categories may be extended further to account for environmental impacts.
- INTERVENTION MANAGER
The current effort on the intervention management module is concentrated on valve manipulation for isolating parts of a WDS in order to contain an incident and allow repairs. The module consists of a pre-generated knowledge base, developed using the techniques presented by Jun and Loganathan (2007) for the identification of segments of a WDS affected by valve closures. Particular attention is paid to the size, age and perceived condition of the isolating valves, because valves on smaller-diameter pipes, older valves and valves receiving less maintenance are more likely to be inoperable. The effects of these factors are studied and contained in an offline knowledge base, which is anticipated to be updated periodically as further data, particularly from valve exercise programs, become available. Other types of responses, such as the manipulation of pumps, provision of by-pass, booster pumping and the use of spare or reserve capacity, will subsequently be considered and incorporated into the module.
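The idea of a valve-controlled segment, and the effect of an inoperable valve on it, can be sketched with a plain graph search. This is a simplified illustration and not the algorithm of Jun and Loganathan (2007): it assumes each valve sits at one end of a pipe, next to a node, and expands outwards from the failed pipe until every path is blocked by an operable valve. The network, the identifiers and the function name `isolate` are hypothetical.

```python
from collections import deque


def isolate(pipes, valves, failed_pipe, inoperable=frozenset()):
    """Find the segment that must be shut down to isolate a failed pipe.

    pipes:  {pipe_id: (node_a, node_b)}
    valves: {valve_id: (pipe_id, node)}  valve located on pipe_id next to node
    Returns (pipes in segment, nodes in segment, valves to close).
    """
    valve_at = {loc: v for v, loc in valves.items() if v not in inoperable}
    pipes_at = {}
    for pid, ends in pipes.items():
        for node in ends:
            pipes_at.setdefault(node, []).append(pid)

    seg_pipes, seg_nodes, to_close = {failed_pipe}, set(), set()
    queue = deque([failed_pipe])
    while queue:
        pid = queue.popleft()
        for node in pipes[pid]:
            if (pid, node) in valve_at:          # operable valve blocks this end
                to_close.add(valve_at[(pid, node)])
                continue
            if node in seg_nodes:
                continue
            seg_nodes.add(node)                  # no valve: the node is in the segment
            for other in pipes_at[node]:
                if other in seg_pipes:
                    continue
                if (other, node) in valve_at:    # neighbouring pipe can be shut off
                    to_close.add(valve_at[(other, node)])
                else:                            # otherwise the segment grows
                    seg_pipes.add(other)
                    queue.append(other)

    # Valves with both sides inside the segment do not need to be operated
    to_close = {
        v for v in to_close
        if not (valves[v][0] in seg_pipes and valves[v][1] in seg_nodes)
    }
    return seg_pipes, seg_nodes, to_close


# Hypothetical network: A --P1-- B --P2-- C --P3-- D
pipes = {"P1": ("A", "B"), "P2": ("B", "C"), "P3": ("C", "D")}
valves = {"V1": ("P1", "B"), "V2": ("P3", "C")}

normal = isolate(pipes, valves, "P2")
degraded = isolate(pipes, valves, "P2", inoperable={"V1"})
```

In this small example a burst on P2 is normally contained by closing V1 and V2. If V1 cannot be operated, the segment grows to include P1 and node A, which is the kind of effect an offline knowledge base of the sort described above would need to record.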
Decision support tools were principally developed in the past to address strategic design and rehabilitation issues in WDS (e.g. Makropoulos et al., 2003). With the recent innovations in monitoring technologies, attempts have been made to apply them to near real-time environments. This, however, introduces new challenges in terms of strict constraints on computational time, the dynamically and stochastically changing state of the network and other uncertainties stemming from a lack of knowledge of the system and its operation. The situation is further complicated by the need to integrate data sourced from several independent systems (e.g. GIS, trend database, hydraulic models, etc.).
A risk-based approach for the development of a DSS is proposed in this paper, which offers a way of supporting the operation of a WDS under normal and, in particular, failure conditions. The approach considers both the frequency of occurrence of failures and (importantly) the impact of failures on customers, which is of growing importance to the water industry. The broad risk assessment process proposed in this work will allow the operators to explicitly visualise and accommodate a wider range of risks and will assist them in prioritising actions and interventions more effectively.
The methodology presented introduces a novel concept in risk-based operation for WDS under failure conditions, proposes a new definition of risk - appropriate for operational conditions - and extends existing impact models to account for further impact classes. The work presented above is very much a “work-in-progress” and will be further developed, extended and formalised to reflect the current needs of water utilities both in the UK and internationally. In the next stage of the NEPTUNE project, the decision support methodology will be implemented in the form of an integrated DSS and validated in a real-world WDS control room environment.
This work is developed within the NEPTUNE research project funded by the UK's Engineering and Physical Sciences Research Council (EPSRC) and Industrial Collaborators. The authors would also like to thank Lewis A. Rossman for kind provision of the pressure-driven modification of EPANET.
Angehrn, A. A., and Jelassi, T. (1994). "DSS research and practice in perspective." Decision Support Systems, 12(4-5), 267-275.
Bell, M. Z. (1985). "Why Expert Systems Fail." The Journal of the Operational Research Society, 36(7), 613-619.
Bounds, P. L. M., Ulanicka, K., and Ulanicki, B. (2003). "Optimal scheduling of South Staffordshire water supply system using the FINESSE package." Advances in Water Supply Management (CCWI 2003), Imperial College London, UK, 283-292.
Filion, Y. R., Adams, B. J., and Karney, B. W. (2007). "Stochastic Design of Water Distribution Systems with Expected Annual Damages." Journal of Water Resources Planning and Management, 133(3), 244-252.
Jamieson, D. G., Shamir, U., Martinez, F., and Franchini, M. (2007). "Conceptual design of a generic, real-time, near-optimal control system for water-distribution networks." Journal of Hydroinformatics, 9(1), 3-14.
Jun, H., and Loganathan, G. V. (2007). "Valve-controlled segments in water distribution systems." Journal of Water Resources Planning and Management, 133(2), 145-155.
Kapelan, Z., Savic, D. A., Walters, G. A., and Babayan, A. V. (2006). "Risk- and robustness-based solutions to a multi-objective water distribution system rehabilitation problem under uncertainty." Water Science & Technology, 53(1), 61-75.
Kaplan, S., and Garrick, B. J. (1981). "On The Quantitative Definition of Risk." Risk Analysis, 1(1), 11-27.
Makropoulos, C., Butler, D., and Maksimovic, C. (2003). "Fuzzy Logic Spatial Decision Support System for Urban Water Management." Journal of Water Resources Planning and Management, 129(1), 69-77.
Michaud, D., and Apostolakis, G. E. (2006). "Methodology for Ranking the Elements of Water-Supply Networks." Journal of Infrastructure Systems, 12(4), 230-242.
Mounce, S. R., Day, A. J., Wood, A. S., Khan, A., Widdop, P. D., and Machell, J. (2002). "A neural network approach to burst detection." Water Science and Technology, 45(4), 237-246.
OFWAT. (2008). "June return 2008 reporting requirements." available from: http://www.ofwat.gov.uk/aptrix/ofwat/publish.nsf/content/jr08_reporting_requirements [accessed 27 April 2008].
Rossman, L. A. (2007). "Discussion of 'Solution for Water Distribution Systems under Pressure-Deficient Conditions' by Wah Khim Ang and Paul W. Jowitt." Journal of Water Resources Planning and Management, 133(6), 566-567.
Saaty, T. L. (1980). The Analytic Hierarchy Process, McGraw-Hill Int., New York.
Sadiq, R., Kleiner, Y., and Rajani, B. (2006). "Estimating risk of contaminant intrusion in water distribution networks using Dempster-Shafer theory of evidence." Civil Engineering and Environmental Systems, 23(3), 129-141.
Savic, D. A., Boxall, J. B., Ulanicki, B., Kapelan, Z., Makropoulos, C., Fenner, R., Soga, K., Marshall, I. W., Maksimovic, C., Postlethwaite, I., Ashley, R., and Graham, N. (2008). "Project Neptune: Improved Operation Of Water Distribution Networks." The 10th Water Distribution System Analysis Symposium, Kruger National Park, South Africa.
Sentz, K., and Ferson, S. (2002). "Combination of Evidence in Dempster-Shafer Theory." SAND 2002-0835, Sandia National Laboratories.
Shafer, G. A. (1976). A mathematical theory of evidence, Princeton University Press, Princeton; London.
Ulanicki, B., Bounds, P. L. M., Rance, J. P., and Reynolds, L. (2000). "Open and closed loop pressure control for leakage reduction." Urban Water, 2(2), 105-114.
Watkins, D. W., and McKinney, D. C. (1995). "Recent developments associated with decision support systems in water resources." Reviews of Geophysics, 33(S1), 941-948.
Xu, C., and Goulter, I. C. (1999). "Reliability-Based Optimal Design of Water Distribution Networks." Journal of Water Resources Planning and Management, 125(6), 352-362.
Yager, R. R. (1992). "Decision Making Under Dempster-Shafer Uncertainties." International Journal of General Systems, 20(3), 233-245.