In view of worldwide concern for the sustainability of groundwater resources, basin-wide modeling of groundwater flow is essential for the efficient planning and management of groundwater resources in a groundwater basin. The objective of the present study was to evaluate the performance of finite difference-based numerical model MODFLOW and an artificial neural network model developed in this study in simulating groundwater levels in the Kathajodi-Surua Inter-basin within Mahanadi deltaic system of eastern India. Calibration of the MODFLOW was done by using weekly groundwater level data of 2 years and 4 months (February 2004 to May 2006) and validation of the model was done using one year groundwater level data (June 2006 to May 2007). Calibration of the model was performed satisfactorily by a combination of trial and error method and automated calibration code PEST. Groundwater levels at 18 observation wells were simulated for the validation period. Moreover, an artificial neural network (ANN) model was developed to predict groundwater levels in 18 observation wells in the basin one time step (i.e., week) ahead. The inputs to the ANN model consisted of weekly rainfall, evaporation, river stage, water level in the drain, pumping rate of the tubewells and groundwater levels in these wells at the previous time step. The time periods used in the MODFLOW were also considered for the training and testing of the developed ANN model. Out of the 174 data sets, 122 data sets were used for training and 52 data sets were used for testing. The simulated groundwater level by MODFLOW and ANN model were compared with the observed groundwater levels. It was found that the ANN model provided better prediction of groundwater levels in the study area than the numerical model for short-horizon predictions.
Keywords: Groundwater flow modeling, MODFLOW, Artificial neural network, Deltaic aquifer system, Kathajodi-Surua Inter-basin.
Groundwater is an invaluable natural resource used for a variety of purposes, including domestic, agricultural and industrial supply. Most of the water of our planet (97%) occurs as salt water in the oceans (Bouwer, 2000). Of the remaining 3%, two-thirds occur as snow and ice in polar and mountainous regions, which leaves only 1% of the global water as liquid freshwater. Most of this (more than 98%) occurs as groundwater, while less than 2% occurs in the more visible form of streams and lakes. During the last few decades, groundwater has become an important source of freshwater throughout the world. It is estimated that groundwater provides about 50% of the current global domestic water supply, 40% of the industrial supply, and 20% of water use in irrigated agriculture (World Water Assessment Program, 2003). However, aquifer depletion due to over-exploitation and the growing pollution of groundwater are threatening our ecosystems (Bouwer, 2000; Shah et al., 2000; Sophocleous, 2005; Evans and Saddler, 2008). Hence, the key concern is how to maintain a long-term sustainable yield from aquifers (e.g., Hiscock et al., 2002; Alley and Leake, 2004).
The total annual replenishable groundwater resource of India is about 43 million hectare meter (Mham). Although the national picture of groundwater availability is favorable, pockets in certain areas of the country face water scarcity. This is because groundwater development is not uniform across the country and is quite intensive in some areas (CGWB, 2006). Excessive pumping has led to an alarming decrease in groundwater levels in several parts of the country, such as Gujarat, Tamil Nadu, West Bengal, Orissa, Rajasthan, Punjab and Haryana (CGWB, 2006; Mall et al., 2006). Recent studies using GRACE satellite data found that groundwater reserves in states such as Rajasthan, Punjab and Haryana are being depleted at a rate of 17.7 ± 4.5 km³/yr. The same data suggest that between August 2002 and December 2008 the region lost 109 km³ of groundwater, which is double the capacity of India's largest reservoir, Wainganga, and almost triple the capacity of Lake Mead, the largest man-made reservoir in the United States (Rodell et al., 2009). This in turn has increased the cost of pumping, caused seawater intrusion in coastal areas and raised questions about the future availability of groundwater.
In order to avoid overdraft and declining groundwater levels, it is important to understand the behavior of an aquifer system subjected to artificial stresses. Simulation modeling is an excellent tool to achieve this goal. Groundwater simulation models are useful for simulating groundwater flow scenarios under different management options and thereby for planning corrective measures for the sustainable, conjunctive use of surface water and groundwater. During the last 20 years, various studies have simulated groundwater flow in different basins using MODFLOW and other models (Reichard, 1995; Onta and Das Gupta, 1995; Ting et al., 1998; Reeve et al., 2001; Lin and Medina, 2003; Rodriguez et al., 2006; Zume and Tarhule, 2008; Al-Salamah et al., 2011). However, physically based groundwater simulation models are very data intensive, labour intensive and time consuming. Empirical models generally require less data and less effort than physically based models. Artificial neural network (ANN) models are one such class of models; they are treated as universal approximators and are well suited to dynamic nonlinear system modeling (ASCE, 2000). The ability to learn and generalize from sufficient data pairs makes it possible for ANNs to solve large-scale complex problems. A few studies have used neural networks for groundwater level forecasting (Coulibaly et al., 2001; Coppola et al., 2003; Daliakopoulos et al., 2005; Nayak et al., 2006; Uddameri, 2007; Krishna et al., 2008; Banerjee et al., 2009; Mohanty et al., 2010; Ghose et al., 2010).
However, only one study (Coppola et al., 2003) has been reported to date that compares physically based models such as MODFLOW with empirical models such as artificial neural networks. Coppola et al. (2003) developed a neural network model for predicting water levels at 12 monitoring well locations screened in different aquifers in a public supply well field in Florida, USA, in response to changing pumping and climatic conditions. The developed neural network model predicted the groundwater level more accurately than the calibrated numerical model at the same location over the same time period. In the present paper, a groundwater flow simulation model has been developed using Visual MODFLOW, an empirical ANN model has been developed for forecasting groundwater levels, and the two models have been compared. For this purpose, a study area was selected in the Kathajodi-Surua Inter-basin within the Mahanadi deltaic system of the state of Odisha, India.
2. STUDY AREA
The study area is a typical river island within the Mahanadi deltaic system of eastern India and is surrounded on both sides by the Kathajodi River and its branch, the Surua (Fig. 1 and Fig. 2). It is locally called 'Bayalish Mouza' and lies between 85° 54' 21" and 86° 00' 41" E longitude and between 20° 21' 48" and 20° 26' 00" N latitude. The total area of the river island is 35 km². The study area has a tropical humid climate with an average annual rainfall of 1650 mm, of which 80% occurs from June to October. The normal mean monthly maximum and minimum temperatures of the region are 38.8 °C and 15.5 °C in May and December, respectively. The mean monthly maximum and minimum evapotranspiration rates are 202.9 mm and 80.7 mm in May and December, respectively. Agriculture is the major occupation of the inhabitants and groundwater is the major source of irrigation in the area. There are 69 functioning government tubewells in the area, which constitute the major sources of groundwater withdrawal for irrigation. These tubewells were constructed and managed by the Orissa Lift Irrigation Corporation, Government of Orissa, India, and have since been gradually handed over to the local water users' associations. There is no water shortage during the monsoon season in the study area, but in the summer season the farm ponds dry up and the groundwater from tubewells is not sufficient to meet the entire water requirement of the farmers.
The river basin is underlain by a semi-confined aquifer which mostly comprises coarse sand. The thickness of the aquifer varies from 20 to 55 m and the depth from 15 to 50 m over the basin (Mohanty et al., 2012). The aquifer hydraulic conductivity varies from 11.3 to 96.8 m/day, whereas the values of storage coefficient range between 1.43 × 10⁻⁴ and 9.9 × 10⁻⁴.
3. MATERIALS AND METHODS
3.1 Data Collection and Analysis
Daily rainfall data of 20 years (1990-2009) and daily pan evaporation data of 4 years (2004-2007) were collected from a nearby meteorological observatory at Central Rice Research Institute (CRRI), Cuttack, Orissa located at about 2 km from the study area. The recharge from rainfall was estimated by the empirical method suggested by Rangarajan and Athavale (2000) for alluvial geological provinces of India. The recharge from the return flow from irrigation was estimated according to the guidelines of Central Ground Water Board, New Delhi, India (CGWB, 1997). The river-stage data available at an upstream site named Naraj (Fig. 1) were collected from the office of Central Water Commission (CWC), Bhubaneswar, Orissa.
The lithologic data at 70 sites over the study area were collected from the Orissa Lift Irrigation Corporation (OLIC) Office, Cuttack, Orissa. The lithologic data were analyzed in detail and, along with other field data, were used for developing the numerical groundwater-flow model of the study area. Since no groundwater data were available in the study area, a groundwater monitoring program was initiated by the authors. Groundwater levels were monitored in nineteen tubewells selected so that they represent approximately four west-east and four north-south cross-sections of the study area (Fig. 2). Weekly groundwater-level data at the nineteen sites were monitored from February 2004 to October 2007 and were used for studying the groundwater characteristics in the study area, calibrating the groundwater-flow simulation model, and training the neural network model for groundwater level forecasting.
3.2 Groundwater Flow Simulation using Visual MODFLOW
A groundwater flow simulation model was developed using Visual MODFLOW for simulating groundwater scenario in the study area. Visual MODFLOW, which integrates the MODFLOW for simulating the flow, MODPATH for calculating advective flow pathlines, MT3D/RT3D for simulating the transport and SEAWAT for simulating coupled flow and transport processes is not only a versatile and robust model for simulating groundwater flow, but also an easily accessible model and is used by the researchers worldwide (Ting et al., 1998; Wilsnack et al., 2001; Fleckenstein et al., 2006). MODFLOW is a modular three-dimensional finite difference groundwater flow model (McDonald and Harbaugh 1988), which simulates transient/steady groundwater flow in complex hydraulic conditions with various natural hydrological processes and/or artificial activities.
3.2.1 Conceptual model
A conceptual model of the study area was developed based on the hydrogeologic information and field investigation. The lithologic investigation indicates that a confined aquifer exists in the river basin. The thickness of the aquifer varies from 20 to 55 m and its depth from the ground surface varies from 15 to 50 m over the basin. The upper confining layer mostly consists of clay, whereas the aquifer material comprises medium to coarse sand. Patches of medium and coarse sand within the clay bed make the system behave like a leaky confined aquifer. Some clay lenses are present in the confined aquifer; to simplify the model for simulation, these lenses were ignored while developing the conceptual model of the study area. The eastern boundary is bounded by the Kathajodi River and the western boundary is bounded by the Surua River (Fig. 2). Therefore, these boundaries were simulated as Cauchy (head-dependent flux) boundary conditions. The conceptual model of the study area at Section J-J' (Fig. 2) is shown in Fig. 3(a,b), which provides a basis for the design and development of the numerical model of the study area using Visual MODFLOW software.
3.2.2 Governing equation
Based on the conceptual model of the study area, a three-dimensional groundwater flow model was developed for simulating flow in the confined aquifer under study. The following governing equation was used for simulating transient groundwater flow in the heterogeneous and anisotropic confined aquifer of the study area (Anderson and Woessner, 1992):
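$$\frac{\partial}{\partial x}\left(K_x \frac{\partial h}{\partial x}\right) + \frac{\partial}{\partial y}\left(K_y \frac{\partial h}{\partial y}\right) + \frac{\partial}{\partial z}\left(K_z \frac{\partial h}{\partial z}\right) \pm W = S_s \frac{\partial h}{\partial t}$$

where $K_x$, $K_y$ and $K_z$ = aquifer hydraulic conductivities in the x, y and z directions, respectively $[LT^{-1}]$; $h$ = hydraulic head $[L]$; $W$ = volumetric flux per unit volume representing sources and sinks of water in the aquifer system ('+' for source and '-' for sink) $[T^{-1}]$; $S_s$ = specific storage of the aquifer $[L^{-1}]$; and $t$ = time $[T]$.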
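To make the role of each term in the transient confined-flow equation concrete, the following is a minimal one-dimensional sketch that advances heads with an explicit finite-difference step. It is an illustration only: MODFLOW itself solves the three-dimensional problem with a block-centered implicit formulation, and the function name `step_heads`, the homogeneous conductivity, the grid spacing and all numerical values are assumptions made for this example, not values from the calibrated model.

```python
def step_heads(h, K, Ss, dx, dt, W=None):
    """Advance heads one explicit time step for 1-D confined flow.

    Discretizes Ss * dh/dt = K * d2h/dx2 + W with fixed-head end cells.
    h  : list of heads [L]; the first and last entries are held constant
    K  : hydraulic conductivity [L/T] (assumed homogeneous here)
    Ss : specific storage [1/L]
    W  : optional list of source (+) / sink (-) terms [1/T]
    """
    r = K * dt / (Ss * dx * dx)
    if r > 0.5:
        # Stability limit of the explicit scheme.
        raise ValueError("explicit scheme unstable: reduce dt")
    if W is None:
        W = [0.0] * len(h)
    new_h = list(h)
    for i in range(1, len(h) - 1):
        diffusion = r * (h[i + 1] - 2.0 * h[i] + h[i - 1])
        new_h[i] = h[i] + diffusion + W[i] * dt / Ss
    return new_h


# Hypothetical example: 5 cells, 200 m apart, fixed heads at both ends.
heads = [12.0, 10.0, 10.0, 10.0, 11.0]
for _ in range(500):
    heads = step_heads(heads, K=50.0, Ss=1e-5, dx=200.0, dt=0.002)
```

With no sources or sinks, the interior heads relax toward a straight line between the two fixed boundary heads, which is the steady-state solution of the same equation.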
3.2.3 Discretisation of the basin and model design
The study area was discretized into 40 rows and 60 columns using the Grid module of Visual MODFLOW software (Fig. 4). This resulted in 2400 cells, each having a dimension of approximately 222 m × 215 m. The cells lying outside the study area were assigned as inactive cells. The hydrogeologic setting of the study area as conceptualized earlier was divided into two model layers with the lower one representing the confined aquifer. The thickness of the two layers at different points was assigned considering the hydrogeologic framework of the basin. The data on surface elevation, bottom elevation of the top layer and bottom elevation of the aquifer layer at available 19 sites were imported to the MODFLOW software from the database prepared using MS-Excel files. Similarly, the location of pumping wells, observation wells and weekly groundwater levels of the model period were also imported from the MS-Excel databases.
3.2.4 Boundary conditions
The Kathajodi and Surua rivers completely surround the basin from the east and west directions, respectively making this study area a complete river island. Therefore, the boundaries of the groundwater basin were modeled as head-dependent flux or Cauchy boundary condition. The river heads were assigned as varying head boundary conditions using the 'River Package' of Visual MODFLOW software. The base of the aquifer was modeled as a no-flow boundary, because it consists of dense clay. The river boundary around the study area as modeled in Visual MODFLOW software has been depicted in Fig. 4. The water flux between the rivers and the aquifer was simulated by dividing the rivers into 10 reaches. The input parameters such as river stage at different time steps, river-bed elevation, river-bed conductivity, river-bed thickness, and river width at the upstream and the downstream site for all the river reaches were assigned. MODFLOW linearly interpolates these values between both the ends of a river reach.
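The head-dependent flux boundary described above can be sketched in a few lines. The block follows the standard formulation of the MODFLOW River package, in which leakage is proportional to the difference between river stage and aquifer head and stops increasing once the head falls below the river-bed bottom. The function names and the numerical values are hypothetical and chosen only for illustration; they are not the parameters used in this study.

```python
def riverbed_conductance(k_bed, length, width, thickness):
    """River-bed conductance [L^2/T] for one reach segment in a cell."""
    return k_bed * length * width / thickness


def river_leakage(conductance, river_stage, aquifer_head, bed_bottom):
    """Flow from river to aquifer [L^3/T]; negative means the aquifer drains to the river."""
    if aquifer_head > bed_bottom:
        return conductance * (river_stage - aquifer_head)
    # Head below the river-bed bottom: leakage no longer depends on aquifer head.
    return conductance * (river_stage - bed_bottom)


def interpolate_along_reach(upstream_value, downstream_value, fraction):
    """Linear interpolation between the two ends of a reach (fraction in [0, 1])."""
    return upstream_value + fraction * (downstream_value - upstream_value)


# Hypothetical values for a single cell.
c_riv = riverbed_conductance(k_bed=1.0, length=200.0, width=50.0, thickness=2.0)
gaining = river_leakage(c_riv, river_stage=15.0, aquifer_head=14.5, bed_bottom=12.0)
losing = river_leakage(c_riv, river_stage=15.0, aquifer_head=16.0, bed_bottom=12.0)
stage_quarter_way = interpolate_along_reach(15.0, 13.0, 0.25)
```

In this example the river recharges the aquifer when the stage exceeds the head (`gaining` is positive) and receives water from the aquifer when the head is higher (`losing` is negative).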
3.2.5 Initial conditions
Initial conditions refer to the head distribution everywhere in the system at the beginning of the simulation and thus are boundary conditions in time. It is a standard practice to select as the initial condition a steady state head solution generated by a calibrated model (Anderson and Woessner, 1992). In this study, steady state head solution of 1st February 2004 groundwater level was used as the initial condition for the calibration period and steady state head solution of 4th June 2006 groundwater level was used as the initial condition for the validation period.
3.2.6 Assigning model parameters
The model input includes hydrogeological parameters such as hydraulic conductivity and specific storage (Ss), and hydrological stresses such as recharge, evapotranspiration and groundwater abstraction. Hydraulic conductivity and specific storage were determined by conducting pumping tests at nine different sites of the study area. The hydraulic conductivity values ranged from a minimum of 11.25 m/day at Site B to a maximum of 96.80 m/day at Site O. Similarly, the specific storage values ranged from a minimum of 4.3 × 10⁻⁶ at Site B to a maximum of 2.75 × 10⁻⁵ at Site O (Mohanty et al., 2012). The ratio of horizontal hydraulic conductivity (Kh) to vertical hydraulic conductivity (Kv) was assumed to be 10 to account for aquifer anisotropy. Since historical records of pumping from these tubewells were not available, the groundwater abstractions were obtained by conducting a detailed survey among the farmers. The pumping schedule and the position and extent of the well screens of the respective pumping wells were assigned using the Well Package of the model.
3.2.7 Model calibration and validation
The developed groundwater-flow simulation model was firstly calibrated for the steady-state condition and then for the transient condition. The steady-state calibration was achieved by matching the model-calculated groundwater levels with average groundwater-levels observed in the 19 observation wells during 1st February 2004. The solution of the steady-state calibration was used as an initial condition for the transient calibration. Transient calibration was performed using weekly groundwater level data of 19 selected sites for the period 01 February 2004 to 04 June 2006, following the standard procedures (Anderson and Woessner, 1992; Zheng and Bennett, 2002; Bear and Cheng, 2010). A combination of trial and error technique and automated calibration code PEST was used to calibrate the developed flow model by adjusting the hydraulic conductivity, specific storage and recharge within reasonable ranges. The calibration results were evaluated relative to the observed values at the 19 sites by using statistical indicators and comparing observed and simulated groundwater level hydrographs.
After calibrating the model, validation was performed using the observed groundwater level data from June 2006 to May 2007. The calibrated hydraulic conductivity and storage coefficient values were used during validation of the model whereas other input parameters like pumping, river stage, recharge and observation head of the corresponding validation period were used.
3.2.8 Criteria of evaluation
Six statistical criteria (or statistical indicators) were used in order to evaluate the performance of the calibration and validation of the MODFLOW-based numerical model. They are bias, mean absolute error (MAE), root mean squared error (RMSE), correlation coefficient (r), mean percent deviation (Dv) and Nash-Sutcliffe efficiency (NSE) and are given by the following equations:
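$$\text{bias} = \frac{1}{N}\sum_{i=1}^{N}\left(h_{si} - h_{oi}\right)$$

$$\text{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|h_{si} - h_{oi}\right|$$

$$\text{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(h_{si} - h_{oi}\right)^{2}}$$

$$r = \frac{\sum_{i=1}^{N}\left(h_{oi} - \bar{h}_{o}\right)\left(h_{si} - \bar{h}_{s}\right)}{\sqrt{\sum_{i=1}^{N}\left(h_{oi} - \bar{h}_{o}\right)^{2}\,\sum_{i=1}^{N}\left(h_{si} - \bar{h}_{s}\right)^{2}}}$$

$$D_v = \frac{100}{N}\sum_{i=1}^{N}\frac{h_{si} - h_{oi}}{h_{oi}}$$

$$\text{NSE} = 1 - \frac{\sum_{i=1}^{N}\left(h_{oi} - h_{si}\right)^{2}}{\sum_{i=1}^{N}\left(h_{oi} - \bar{h}_{o}\right)^{2}}$$

where $h_{oi}$ = observed groundwater level of the ith data [L], $h_{si}$ = simulated/predicted groundwater level of the ith data [L], $\bar{h}_{o}$ = mean of observed groundwater levels [L], $\bar{h}_{s}$ = mean of simulated groundwater levels [L], and $N$ = number of observations. The best fit between observed and simulated groundwater levels under ideal conditions would yield bias = 0, MAE = 0, RMSE = 0, r = 1, Dv = 0 and NSE = 1.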
Moreover, the observed groundwater level hydrographs and MODFLOW-based numerical model simulated groundwater level hydrographs were plotted for a visual checking of model performance. Scatter plots (along with 1:1 line) of observed versus simulated groundwater levels were also prepared for calibration and validation periods for examining the efficacy of the models in simulating groundwater levels.
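The six indicators can be computed together from paired observed and simulated levels, as in the sketch below. It assumes the standard forms of each statistic, with errors taken as simulated minus observed so that a negative bias corresponds to under-simulation, as interpreted in the results. The per-observation form used for Dv is an assumption, and the function name `evaluate` and the sample values are hypothetical.

```python
import math


def evaluate(observed, simulated):
    """Return bias, MAE, RMSE, r, Dv (%) and NSE for paired level series."""
    n = len(observed)
    if n == 0 or n != len(simulated):
        raise ValueError("series must be non-empty and of equal length")
    errors = [s - o for o, s in zip(observed, simulated)]
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    if var_o == 0.0 or var_s == 0.0:
        raise ValueError("r and NSE are undefined for a constant series")
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    sq_err = sum(e * e for e in errors)
    return {
        "bias": sum(errors) / n,
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(sq_err / n),
        "r": cov / math.sqrt(var_o * var_s),
        # Assumed form: mean of per-observation percent deviations.
        "Dv": 100.0 * sum(e / o for e, o in zip(errors, observed)) / n,
        "NSE": 1.0 - sq_err / var_o,
    }


# Hypothetical levels (m): the simulation is uniformly 0.5 m too low.
obs = [10.0, 11.0, 12.0, 13.0]
sim = [9.5, 10.5, 11.5, 12.5]
stats = evaluate(obs, sim)
```

A uniform offset leaves r at 1 while lowering NSE, which is why the correlation coefficient alone is not sufficient to judge a simulation.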
3.3 Groundwater Level Forecasting using Artificial Neural Network Model
Besides the development of a groundwater flow simulation model, ANN models were also developed to assess their efficacy in predicting groundwater levels in the study area. In most of the past studies on groundwater level prediction by ANN, models have been developed for predicting groundwater levels in a single well or a few selected wells using a set of input parameters. However, in the present study, an attempt was made to predict groundwater levels simultaneously in a large number of wells over the basin by using ANN technique.
3.3.1 Design of ANN model
In the present study, the widely used feedforward neural network (FNN) architecture was adopted. It is one of the simplest neural networks and has been successfully used for water resources variable modeling and prediction (Maier and Dandy, 2000; ASCE, 2000). In a feedforward network, the nodes are generally arranged in layers, starting from a first input layer and ending at the final output layer. The nodes in one layer are connected to those in the next, but not to those in the same layer. Thus, the output of a node in a layer depends only on the input it receives from previous layers and the corresponding weights. Fig. 5 shows the feedforward network for the current study, which has one hidden layer and several nodes in the input and output layers.
Initially, three ANN algorithms, namely gradient descent with momentum and adaptive learning rate backpropagation (GDX) algorithm, Levenberg-Marquardt (LM) algorithm and Bayesian regularization (BR) algorithm were compared for predicting groundwater levels in Kathajodi-Surua Inter-basin. But as Bayesian regularization performed better than other two algorithms (Mohanty et al., 2010), it was used in the current study for groundwater level forecasting. The ANN model was designed to predict groundwater levels in 18 tubewells (Fig. 2) with one-week lead time using a set of suitable input parameters. Based on the correlation analysis between groundwater level and the selected input parameters, groundwater level at 1-week lag time, weekly rainfall, river stage, weekly evaporation, water level in the main drain and weekly pumping from the tubewells were considered as final input parameters. There were altogether 40 input nodes and 18 output nodes in the initial ANN model of the study area. The 40 input nodes represent groundwater levels with 1-week lag time at the 18 sites, groundwater pumping rates of the 18 tubewells, weekly rainfall, average weekly pan evaporation, average weekly river stage, and average weekly water level at the drain outlet. The 18 output nodes represent groundwater levels at the 18 sites in the next time step (i.e., one week ahead).
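The input and output vectors described above can be assembled from weekly records as sketched below. The sketch assumes that only the groundwater levels are lagged by one week and that the remaining drivers are taken from the target week; the text does not state this timing explicitly. The function name and the placeholder data are hypothetical.

```python
def make_one_week_ahead_pairs(levels, drivers):
    """Build (input, target) pairs for one-week-ahead prediction.

    levels  : list of weekly records, each listing the levels at the wells
    drivers : list of weekly records of the other inputs (pumping rates,
              rainfall, evaporation, river stage, drain water level)
    Assumption: drivers come from the target week; only levels are lagged.
    """
    if len(levels) != len(drivers):
        raise ValueError("levels and drivers must cover the same weeks")
    pairs = []
    for t in range(1, len(levels)):
        inputs = list(levels[t - 1]) + list(drivers[t])
        target = list(levels[t])
        pairs.append((inputs, target))
    return pairs


# Placeholder data: 5 weeks, 18 wells, 18 pumping rates plus 4 other drivers.
weeks = 5
levels = [[10.0 + 0.1 * t] * 18 for t in range(weeks)]
drivers = [[0.0] * 22 for _ in range(weeks)]
pairs = make_one_week_ahead_pairs(levels, drivers)
```

Each pair then has 40 inputs (18 lagged levels, 18 pumping rates and 4 basin-wide drivers) and 18 targets, matching the node counts of the initial network.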
3.3.2 Clustering of study area
The ANN model with 40 input nodes and 18 output nodes was difficult to train by trial and error using the Bayesian regularization (BR) algorithm; it consumed a lot of computer memory and proved to be very time consuming. Maier and Dandy (1998) reported that the Levenberg-Marquardt algorithm has large computational and memory requirements, and hence it is mostly useful for small networks. The same is true for the Bayesian regularization algorithm. In order to run the model effectively, the size of the neural network was reduced by dividing the study area into three clusters (Mohanty et al., 2010) and developing a separate ANN model for each cluster to predict groundwater levels one week in advance at the sites in that cluster. Cluster 1 contains 7 sites, namely A, B, D, E, H, I and J. Cluster 2 contains 5 sites, namely C, F, G, K and L, and Cluster 3 contains 6 sites, namely M, O, P, Q, R and S (Fig. 2). Dividing the study area into three clusters and modeling groundwater separately in each was not expected to affect the final output, because pumping from the tubewells in a given cluster has only a very minor effect on the water levels in the tubewells of the other clusters.
In each cluster, groundwater levels at the sites in the previous time step, pumping rates of the tubewells, weekly total rainfall, weekly pan evaporation and weekly river stage were considered as input parameters. In the third cluster, however, an additional input parameter weekly water level in the drain was considered as it has potential to affect the groundwater level in this cluster only. Thus, Cluster 1 had 17 input nodes and 7 output nodes, Cluster 2 had 13 input nodes and 5 output nodes and Cluster 3 had 16 input nodes and 6 output nodes.
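The node counts quoted for the three clusters follow directly from the site lists and the inputs named above, as the short check below shows. The dictionary and function names are hypothetical; the site assignments and input variables are taken from the text.

```python
# Site membership of each cluster, as listed in the text.
CLUSTERS = {
    1: ["A", "B", "D", "E", "H", "I", "J"],
    2: ["C", "F", "G", "K", "L"],
    3: ["M", "O", "P", "Q", "R", "S"],
}
# Basin-wide inputs shared by every cluster.
COMMON_DRIVERS = ["rainfall", "pan_evaporation", "river_stage"]
# Additional inputs used only in specific clusters.
EXTRA_DRIVERS = {3: ["drain_water_level"]}


def node_counts(cluster_id):
    """Return (input nodes, output nodes) for one cluster's network."""
    sites = CLUSTERS[cluster_id]
    # One lagged groundwater level and one pumping rate per site.
    per_site_inputs = 2 * len(sites)
    shared_inputs = len(COMMON_DRIVERS) + len(EXTRA_DRIVERS.get(cluster_id, []))
    return per_site_inputs + shared_inputs, len(sites)


counts = {cid: node_counts(cid) for cid in CLUSTERS}
```

For example, Cluster 1 has 7 lagged levels, 7 pumping rates and 3 shared drivers, giving 17 inputs and 7 outputs.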
3.3.3 Model training and testing
The structure of the neural network consisted of one hidden layer along with the input and output layers. The optimal number of nodes in the hidden layer was determined by trial and error: the number of hidden nodes corresponding to the least root mean squared error (RMSE) was selected as the optimal number of hidden neurons. The activation function of the hidden layer and the output layer was set as the log-sigmoid transfer function, as this proved by trial and error to be the best among a set of other options. In this study, supervised learning with a batch mode of data feeding was used for ANN modeling. Out of the 174 weekly datasets available, 122 datasets were used for training the ANN models and 52 datasets were used for testing the models. The ANN modeling was performed using MATLAB 6.5 software. The six statistical indicators described in Section 3.2.8 were used to evaluate the performance of the training and testing of the ANN model.
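The data split and the network structure described above can be sketched as follows. The block shows a chronological 122/52 split and a forward pass through one hidden layer with log-sigmoid activations. Because the log-sigmoid output lies between 0 and 1, the targets must be rescaled; the min-max scaling shown here is an assumption, since the scaling actually used is not stated. All function names are hypothetical, and the sketch does not reproduce the MATLAB implementation or the Bayesian regularization training itself.

```python
import math


def logsig(x):
    """Log-sigmoid transfer function, with output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def chronological_split(samples, n_train):
    """Earliest n_train samples for training, the remainder for testing."""
    return samples[:n_train], samples[n_train:]


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: input -> log-sigmoid hidden layer -> log-sigmoid output layer."""
    hidden = [logsig(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return [logsig(sum(w * hj for w, hj in zip(row, hidden)) + b)
            for row, b in zip(w_out, b_out)]


def scale(value, lo, hi):
    """Assumed min-max scaling of a level into [0, 1]."""
    return (value - lo) / (hi - lo)


def unscale(value, lo, hi):
    """Inverse of scale: map a network output back to a level."""
    return lo + value * (hi - lo)


# 174 weekly samples (indices stand in for the real records).
train, test = chronological_split(list(range(174)), 122)

# Tiny hypothetical network: 2 inputs, 3 hidden nodes, 1 output, zero weights.
y = forward([0.2, 0.7], [[0.0, 0.0]] * 3, [0.0] * 3, [[0.0, 0.0, 0.0]], [0.0])
```

With all weights and biases at zero, every node outputs 0.5, which is a convenient sanity check before training.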
3.4 Comparison of Numerical Model and Neural Network Model
A comparison of the performance of the MODFLOW-based numerical model with that of the ANN model was carried out to study their efficacy in simulating/predicting groundwater levels. In order to have a fair comparison between the models, the training and testing periods of the ANN model were maintained same as that of calibration and validation period of the numerical model. The predicted groundwater levels by the ANN model at 18 sites during the testing period were compared with the groundwater levels simulated by the numerical model during the validation period using statistical indicators like bias, MAE, RMSE, r, Dv and NSE as described in Section 3.2.8. In addition, groundwater levels simulated by both the models were plotted along with the observed groundwater levels for visual comparison of performance of the two models.
4. RESULTS AND DISCUSSION
4.1 Groundwater Simulation by Numerical Model
4.1.1 Calibration results
During calibration, the groundwater flow-simulation model was found more sensitive to aquifer hydraulic conductivity values in comparison to aquifer specific storage. The statistical indicators, i.e., bias, mean absolute error (MAE), root mean squared error (RMSE), correlation coefficient (r), mean percent deviation (Dv) and Nash-Sutcliffe efficiency (NSE) along with the calibrated hydraulic conductivity values at nineteen calibration sites are presented in Table 1. The bias values range from a minimum of 0.006 m at Site D and F to a maximum of -0.517 m at Site G, whereas the MAE values range from a minimum of 0.335 m at Site H to a maximum of 0.663 m at Site R. The RMSE values range from a minimum of 0.442 m at Site D to a maximum value of 0.817 m at Site R, whereas the correlation coefficient values range from a minimum of 0.891 at Site G to a maximum of 0.974 at Site J. The Dv values range from a minimum of -0.012% at Site F to a maximum of -3.154% at Site G, whereas the NSE values range from a minimum of 0.602 at Site C to a maximum of 0.918 at Site J. These results indicate that the simulated groundwater levels at sites D, F, H, J and N are more accurate compared to other sites (relatively low values of MAE and RMSE, and high values of r and NSE). On the other hand, there has been relatively inferior simulation of groundwater levels at sites C, E, G and R as the MAE and RMSE values are on a higher side, and r and NSE values are on a lower side. The bias values at sites B, C, E, G, I, K, L, P and R are negative, which indicates there is overall under-simulation at these sites. There is overall over-simulation at the remaining sites. However, there is an overall good calibration because the values of bias, MAE, RMSE, and Dv for almost all the sites are reasonably low and are within acceptable limits. Also, the correlation coefficient and NSE values are reasonably high at most of the sites. 
The calibrated values of hydraulic conductivity varied from a minimum of 20 m/day (sites A and B) to a maximum of 52 m/day (sites M, N and O) (Table 1), whereas the calibrated values of aquifer specific storage remained more or less the same (varying from 1.43 × 10⁻⁴ to 9.9 × 10⁻⁴) as the measured values.
The MODFLOW-generated scatter diagram along with 1:1 line, 95% interval lines and 95% confidence interval lines for the entire calibration period is shown in Fig. 6. The 95% interval is the interval where 95% of the total number of data points is expected to occur. The 95% confidence interval shows the range of calculated values for each observed value with 95% confidence that the simulation results will be acceptable for a given observed value. For an ideal calibration, the 1:1 line should lie within the 95% confidence interval lines (WHI, 2005). Fig. 6 shows that the 1:1 line lies within the 95% confidence interval lines indicating a good calibration of the developed groundwater flow model. The observed and calibrated groundwater levels at three sites, i.e., Baulakuda (Site A) in the upstream portion of the basin, Dahigan (Site K) in the middle portion of the basin and Chanduli (Site S) in the downstream portion of the basin are shown in Figs. 7(a to c), respectively. The visual comparison of observed and calibrated groundwater level hydrographs at all the sites including the above 3 sites indicated a reasonably good match between observed and calibrated groundwater levels at almost all the sites except sites C and E where there was under-simulation of groundwater levels during dry periods, Site G where there was under-simulation during both dry and wet periods, and Site R where there was over-simulation during dry periods and under-simulation during wet periods.
4.1.2 Validation results
The scatter diagram along with 1:1 line, 95% interval lines and 95% confidence interval lines for the entire validation period is shown in Fig. 8. The figure shows that the 1:1 line lies within the 95% confidence interval lines which indicates satisfactory validation of the developed groundwater flow model. The comparison between the observed and simulated groundwater levels by graphical as well as statistical methods is described in succeeding section dealing with comparison of MODFLOW-based numerical model and ANN model.
4.2 Groundwater Level Forecasting using Neural Network Model
4.2.1 Model training results
The optimum number of hidden neurons in the neural network model was found to be 10, 20 and 40 for Clusters 1, 2 and 3, respectively. Figs. 9(a to c) show the variation of RMSE and NSE values with the number of nodes in the hidden layer for the three clusters, respectively. In each figure, the RMSE value is lowest and the NSE value is highest at the optimum number of hidden neurons.
The statistical indicators, i.e., bias, mean absolute error (MAE), root mean squared error (RMSE), correlation coefficient (r), mean percent deviation (Dv) and Nash-Sutcliffe efficiency (NSE) for the training period at eighteen sites are presented in Table 2. The bias values range from a minimum of -0.003 m at Site G to a maximum of 0.057 m at Site R, whereas the MAE values range from a minimum of 0.116 m at Site B to a maximum of 0.361 m at Site R. The RMSE values range from a minimum of 0.149 m at Site B to a maximum value of 0.478 m at Site R, whereas the correlation coefficient values range from a minimum of 0.963 at Site O to a maximum of 0.994 at Site I. The Dv values range from a minimum of -0.005% at Site I to a maximum of -0.714% at Site R, whereas the NSE values range from a minimum of 0.926 at Site O to a maximum of 0.987 at Site O. The statistical indicators indicate that the training of the model is very satisfactory as the values of bias, MAE, RMSE, and Dv are reasonably low and the correlation coefficient and NSE values are reasonably high at all the sites.
4.3 Comparison between the Numerical Model and ANN Model
The comparison between the MODFLOW-based numerical model and the ANN model in terms of the bias, MAE, RMSE, correlation coefficient, Dv and NSE statistical indicators during the validation period is shown in Table 3. The bias for the numerical model ranges in magnitude from a minimum of -0.025 m at Site Q to a maximum of -0.505 m at Site P, whereas that for the ANN model ranges from a minimum of 0.01 m at Site C to a maximum of 0.239 m at Site K. The MAE for the numerical model ranges from a minimum of 0.297 m at Site Q to a maximum of 0.709 m at Site M, whereas that for the ANN model ranges from a minimum of 0.178 m at Site C to a maximum of 0.464 m at Site R. The RMSE for the numerical model ranges from a minimum of 0.38 m at Site Q to a maximum of 0.827 m at Site E, whereas that for the ANN model ranges from a minimum of 0.24 m at Site C to a maximum of 0.522 m at Site R. The correlation coefficient for the numerical model ranges from a minimum of 0.922 at Site H and Site J to a maximum of 0.982 at Site Q, whereas that for the ANN model ranges from a minimum of 0.958 at Site J to a maximum of 0.988 at Site K. The Dv for the numerical model ranges in magnitude from a minimum of 0.36% at Site F to a maximum of -3.78% at Site P, whereas that for the ANN model ranges from a minimum of -0.06% at Site B to a maximum of 1.82% at Site K. Further, the Nash-Sutcliffe efficiency (NSE) for the numerical model ranges from a minimum of 0.55 at Site C to a maximum of 0.95 at Site Q, whereas that for the ANN model ranges from a minimum of 0.90 at Site J to a maximum of 0.96 at Site P. The values of the correlation coefficient and NSE are generally higher, and the values of bias, MAE, RMSE and Dv are generally lower, for the ANN model than for the numerical model. Hence, it can be inferred that the ANN model predicted groundwater levels with higher accuracy than the numerical model.
Furthermore, simultaneous plots of the groundwater levels simulated by the MODFLOW-based numerical model and the ANN model, along with the observed groundwater levels, for three sites, i.e., Baulakuda (Site A) in the upstream portion of the basin, Dahigan (Site K) in the middle portion of the basin and Chanduli (Site S) in the downstream portion of the basin, are shown in Figs. 10(a to c), respectively. The visual comparison of the observed, numerical model-simulated and ANN model-simulated groundwater level hydrographs at all the sites, including the above three sites, indicated that the groundwater levels predicted by the ANN model matched the observed groundwater levels better than those simulated by the numerical model. Only at Site Q, and to some extent at Site R, did the accuracy of the numerical model almost match that of the ANN model. Thus, the visual checking of observed and simulated groundwater levels also confirms that the ANN model is superior to the numerical model in simulating groundwater levels.
A closer look at the quantitative indicators and graphical comparisons shows that there is very little difference between the correlation coefficient values obtained for the numerical and ANN models (Table 3), even though the graphical comparisons and other statistical indicators indicate a clear difference between the performances of the two models. Similarly, the values of bias and Dv at sites C, F and H are significantly lower for the numerical model, even though the other statistical indicators and graphical comparisons do not show a good match between the observed and simulated groundwater levels. This can be attributed to the fact that, in some cases, the over-calculated and under-calculated values cancel each other and produce a bias value close to zero. Sometimes, this can lead to a false interpretation of model calibration (WHI, 2005). The same logic holds true for Dv. On the other hand, the MAE, RMSE and NSE indicators are consistently better for the ANN model than for the numerical model, except at sites Q and R, where they are comparable. This is also in agreement with the graphical comparison of observed and simulated groundwater levels. Based on the above analysis, it is inferred that the MAE, RMSE and NSE statistical indicators are more powerful than the bias, Dv and r in evaluating model performance.
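The cancellation effect described above can be illustrated with a small numerical sketch. The two simulated series below are made-up values chosen for illustration, not data from the study, and the error is assumed to be simulated minus observed.

```python
import math

# Hypothetical observed groundwater levels (m)
observed = [10.0, 10.0, 10.0, 10.0]
# Errors of +0.6 m and -0.6 m alternate and cancel in the mean
alternating = [10.6, 9.4, 10.6, 9.4]
# A small constant over-prediction of 0.1 m
offset = [10.1, 10.1, 10.1, 10.1]


def bias(obs, sim):
    """Mean signed error; positive and negative errors can cancel."""
    return sum(s - o for o, s in zip(obs, sim)) / len(obs)


def mae(obs, sim):
    """Mean absolute error; the sign is discarded, so no cancellation."""
    return sum(abs(s - o) for o, s in zip(obs, sim)) / len(obs)


def rmse(obs, sim):
    """Root mean squared error; squaring also prevents cancellation."""
    return math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / len(obs))


for label, sim in (("alternating", alternating), ("offset", offset)):
    print(f"{label}: bias={bias(observed, sim):.3f} m, "
          f"MAE={mae(observed, sim):.3f} m, RMSE={rmse(observed, sim):.3f} m")
```

Judged by bias alone, the alternating series looks nearly perfect even though each of its values is off by 0.6 m, whereas the MAE and RMSE correctly rank the constant-offset series as the closer fit.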
Despite the limited data, the ANN model provides better prediction of groundwater levels. Neural networks also have the advantage of not requiring explicit characterization and quantification of the physical properties and condition of the aquifer system. Also, the data required by ANNs are generally easier to collect and quantify than those required by physically based models. However, in the case of the ANN model, any change in the input or output parameters requires the system to be modeled again from the beginning, which is not the case for the numerical model. The numerical models provide the total water balance of the system, whereas the ANN models are 'black box' models that do not provide any information about the processes of a system. The numerical models can help provide insights into the hydrogeologic framework and properties, and simulate future conditions (Coppola et al., 2003). They can also generate detailed output regarding head, flow and water budget components across the study area. Thus, the numerical models can be more appropriate for long-term predictions, whereas the ANN technique may be better for real-time short-horizon predictions at selected locations that require high accuracy (Coppola et al., 2005).
Thus, the ANN technique and numerical models offer different advantages, and each should be selected in accordance with the problem. In some cases, the two models can complement each other, for example by using the numerical model for long-term predictions and the ANN model for short-term predictions. If sufficient ANN prediction coverage exists for the study area, head and flow fields and water budget components can be estimated by using interpolation and estimation methods (Coppola et al., 2005). In cases where sufficient coverage is not available, the numerical modeling approach would have to be used for predictions. The ANN models can replace the numerical models as approximate simulators in simulation-optimization models, as has been reported by some researchers (e.g., Rao et al., 2004; Bhattacharya and Datta, 2005; Rao et al., 2006; Safavi et al., 2010). The replacement of the numerical model by the ANN model can help reduce the computational burden in distributed modeling.
A groundwater flow simulation model was developed for the Kathajodi-Surua Inter-basin of Odisha, India, using Visual MODFLOW for simulating groundwater scenarios. Artificial neural network models were also developed for forecasting groundwater levels in the study area. The comparison of the two models showed that the ANN model can provide better prediction of groundwater levels than the MODFLOW-based numerical model for short-horizon predictions. The data requirement of ANN models is also substantially less than that of numerical models. However, numerical models like MODFLOW provide the total water balance of the system, whereas the ANN models are like a 'black box' and do not describe the entire physics of the system. In the case of the ANN model, any change in the input or output parameters requires the system to be modeled again from the beginning, which is not the case for numerical models. The numerical models are more appropriate for long-term predictions, whereas the ANN technique is better for short-horizon predictions that require high accuracy. Hence, the ANN technique and numerical models offer different advantages, and each should be selected in accordance with the problem. In some cases, they can be used as complementary tools for sound decision making in groundwater management problems.