Normal, log-normal, Pearson type III, log-Pearson type III and Gumbel distributions are applied to the annual maximum flood series of 19 rivers across Malaysia. The statistical parameters of the fits are estimated by the method of moments and by L-moments. The flood data were obtained from the Department of Irrigation and Drainage Malaysia (DID). The main goal of this study is to find the most suitable probability distribution and method of parameter fitting. Comparing the fitted distributions with the actual data shows which probability distribution and which method of parameter fitting are suitable for Malaysia. The distributions and fitting methods are implemented in Excel spreadsheets to calculate the flood magnitudes of each river. The magnitude, or discharge, of each river for return periods of 2, 5, 10, 20 and 50 years is determined by linear interpolation.
Flood estimation may be carried out by numerous means, for example (1) rainfall models; (2) unit hydrograph and losses models; (3) flood frequency analysis; and (4) the rational method. Flood frequency analysis is favoured over the other methods because it takes actual hydrological data into account when estimating floods.
Flood frequency analysis is the calculation of statistical probabilities to determine the return period of a given flood magnitude on a given river. One of the primary objectives of frequency analysis is to determine the return period of a hydrological event of a given magnitude (Chow et al., 1988). The return period is the average time between occurrences of a defined event. The Weibull plotting position is used to determine the return periods of 2, 5, 10, 20 and 50 years for all the probability distributions. Annual maximum flood data are used in this study, and approximately 600 years of flood data in total are used to obtain the return periods and magnitudes.
Flood frequency analysis is an approach for determining the critical design discharge of a hydraulic structure that is capable of satisfying both economic and perhaps political concerns. For example, Water Resources Engineering by Larry W. Mays states that "The results of flow flood frequency analysis can be used for many engineering application: (1) for the design of dams, bridges, culverts, water supply systems and flood control structures; (2) determine economic value of flood control projects; (3) to determine the effect of encroachments in flood plain; (4) to determine reservoir stage for real estate acquisition and reservoir use purpose; (5) to select of runoff magnitudes for interior drainage, pumping plant and local protection project design; and (6) for flood planning zone" (Mays 2001:320-321).
Various probability distributions are used in estimating flood frequency, such as normal, log-normal (LN2), Pearson type III (P3), log-Pearson type III (LP3), Gumbel (EV1), generalized extreme value (GEV), Wakeby, kappa, Weibull and log-logistic distributions. The statistical parameters of these distributions may be estimated by the method of moments (MOM), L-moments (LM), maximum likelihood (ML), probability-weighted moments (PWM), entropy, Bobee's method, mixed moments and so forth. In this study, six distributions (normal, LN2, P3, LP3, EV1 and GEV) and two methods of statistical parameter fitting (MOM and LM) are used.
"When the measured data are very positively skewed, the data are usually log-transformed and the distribution is called log-Pearson type III distributions. These distributions are widely used in hydrology, primarily because it has been (officially) recommended for application to flood flow by the U.S. Interagency Advisory Committee on Water Data (1982). "(CHIN 2000:267)
The main goal of this study is to determine the most suitable probability distribution and method of parameter fitting for use in Malaysia. The study uses the six distributions and two methods of statistical parameter fitting mentioned above. Since each probability distribution is analysed twice, once with each fitting method, a total of 12 analyses is carried out for each river.
The outcomes of the statistical analyses are then compared with the actual data to identify the probability distribution and method of statistical parameter fitting that yield the best results and are consistent with the actual data.
Literature Review/Case Studies
2.1 Comparison of probability distributions and parameter estimator
Terafuk Haktanir (1992) tested which probability distributions and methods of parameter fitting suit 45 unregulated streams in Anatolia. The probability distributions included LN2, LN3, P3, LP3, Gumbel and GEV, and the parameter estimation methods were MOM, ML and PWM. LP3 was additionally evaluated with maximum entropy (ME), Bobee's method and the method of mixed moments to determine the best combination of distribution and fitting method. Quantiles for return periods of 100, 1000 and 10000 years computed from a parent population with representative parameters were compared with those computed from many synthetic series of finite length. The results show that LN2 is best for large return periods. LP3, Wakeby and GEV were used as parent distributions for generating the synthetic series. LN3 was the best at estimating quantiles, followed by LN2, Gumbel using ML, LP3 using ME and P3 using PWM. In certain cases Gumbel using ML showed the best performance. Terafuk Haktanir (1992) concluded that the most suitable combinations are LN3, Gumbel with PWM or ML, LP3 with ME or MOM, P3 with PWM and GEV with PWM.
J.C. Smither (1994) carried out goodness-of-fit tests to study which probability distributions are suitable for estimating short-duration rainfall in South Africa. The parameter estimator used in that study was L-moments. A total of 38 sites with approximately 30 years of data each were used. The probability distributions included LP3, LN2, LN3, P3, Gumbel, Wakeby and GEV. Chi-squared, standardised deviation and non-parametric goodness-of-fit tests were used to rank the probability distributions according to their performance. The results show that P3 is a sensible distribution to use, but it performs poorly when the L-skewness is larger than 0.2. LP3 likewise performs poorly when the L-skewness is greater than 0.2. The goodness-of-fit tests show that LN3 performed best, followed by GEV and P3, while L-EV1 performed worst, followed by LP3. According to J.C. Smither (1994), EV1 is generally used in South African analyses, yet the results show that EV1 is not a good estimator. LN3 and GEV are recommended for future short-duration rainfall analysis.
T.A. McMahon and R. Srikanthan (1980) analysed 172 Australian streams using moment ratio diagrams for LP3 and six other theoretical distributions, namely normal, LN2, P3, EV1, Weibull and exponential. They concluded that the LP3 distribution is more suitable for flood frequency analysis than the other theoretical probability distributions, which fitted the plotted data poorly. The LP3 moment diagram, with skewness almost equal to zero, indicates that LP3 is a good distribution for flood frequency analysis because it is not biased.
Joseph D. Countryman et al. (2008) found LP3 and P3 to be unreliable and ineffective after testing estimates of extreme events, namely the 90% confidence bounds of the 100-year, 200-year and 500-year floods, for 3 rivers in California. The test showed that the 90% confidence bound was unacceptable because LP3 extensively overestimated the actual 100-year, 200-year and 500-year flood flows, whereas P3 underestimated them. Joseph D. Countryman et al. (2008) mention that P3 is more suitable than LP3 for longer periods of record. For large floods, LP3 is not suitable for estimation because it leads to unrealistic designs.
Vogel, R.M. et al. (1993) evaluated LP3, LN2, LN3 and GEV using LM diagrams and goodness-of-fit procedures for 10 regions comprising 383 flood data sites in the south-western United States. For the 100-year event, LP3, GEV and LN2 produced exceedances that fall within the 95% likely interval. For the 1000-year event, LP3, GEV and LN3 produced exceedances that fall within the 95% likely interval. The final results show that LP3, LN2, LN3 and GEV gave satisfactory estimates for the 383 sites observed.
Vogel, R.M. and Fennessey, N.M. (1993) performed a case study with a sample of n = 5000 average daily stream flows in Massachusetts using L-moment diagrams and product moment diagrams. The L-moment diagram describes the relationship between L-skewness and L-kurtosis for the P3, LN3, GEV, Gumbel, normal, uniform, exponential, Wakeby and generalized Pareto distributions. The product moment diagram compares the sample skewness, kurtosis and coefficient of variation with their theoretical counterparts. A Monte Carlo simulation was carried out by generating stream flow traces of length n = 10, 20, 50, 100, 200, 500, 1000 and 5000 from LN2 and generalized Pareto populations with coefficients of variation of 1, 2, 5 and 10. The bias and root mean square error of the coefficient of variation Cv and of the kurtosis were computed for LN2. The results show that as the population coefficient of variation increases, product moment ratios provide no information on either the skewness or the coefficient of variation of the samples. Vogel, R.M. and Fennessey, N.M. (1993) conclude that L-moments are better because the L-skewness, L-kurtosis and L-Cv are close to unbiased. After taking logarithms of the observed data, product moments behave similarly to L-moments, but the distributional properties of the original data remain undetermined except for LP3 and LN3.
In the comparison of probability distributions, LP3 and GEV are favoured over P3, Gumbel and LN2. As for parameter estimation, LM and the LM diagram outperformed MOM and the moment ratio diagram.
2.3 Log-Pearson type III
Griffis and Stedinger (2007) conducted a Monte Carlo analysis of parameter estimation methods for the LP3 distribution: ML, the method of mixed moments, MOM, the method of moments in log space without regional skew, the method of moments in log space with regional skew (recommended by Bulletin 17), and the method of moments in real space. Sample sizes of 25, 50 and 100 were used to compare the methods, and the mean square error (MSE) was used to measure the efficiency of each method. Results for the 100-year event were reported. ML underperformed MOM for the sample size of 25, where the MSE for maximum likelihood is 60% larger than that for the method of moments. As the sample size increases, however, ML outperforms MOM. Griffis and Stedinger (2007) conclude that MOM improves when a highly informative regional skew is used, as suggested in Bulletin 17. The method of mixed moments is an excellent method that does not use regional information and is comparable to Bulletin 17. ML turned out not to be as efficient as originally thought, and its efficiency depends on the starting location and on any parameter constraints. The method of moments in log space with regional skew is better on average than ML.
I.A. Koutrouvelis and G.C. Canavos (2000) evaluated LP3 by comparing 8 methods, including the method of direct moments (MDM), the method of indirect moments (MIM), the method of mixed moments (MMM) and an adaptive mixed moment method (AAMMM). MDM is recommended by the US Water Resources Council (1967), whereas MIM is recommended by Bobee (1975). Rao (1980, 1983) was the first to suggest MMM, but the method has a limitation. Subsequently, Phien and Hira (1983) and Arora and Singh (1989) proposed simple procedures for obtaining MMM parameter estimates that overcome that limitation. AAMMM applies the mixed moments method of Koutrouvelis and Canavos (1999) to the logarithmically transformed data. Monte Carlo simulation was used to evaluate the performance of the methods specified above. AAMMM was found to have the best performance in terms of bias, followed by MMM and MID. The AAMMM, MMM and MID procedures maintain the smallest absolute bias in the approximation for return periods T≥50 with sample sizes n≤50, as well as for T≥10 with any n, and for high return periods with sample sizes n≤50. The standardized and normalized root mean squared error was computed for various return periods and sample sizes. For T=10 and n=25, AAMMM performs best while the generalised direct moment method performs worst. For T≥50 and n≤50, the generalised mixed moment method performs best and MID shows the worst performance. The generalised mixed moment method shows the best performance at the 98th and 99th percentiles.
In general, product moment estimation for LP3 is recommended by Griffis and Stedinger (2007), I.A. Koutrouvelis and G.C. Canavos (2000), Terafuk Haktanir (1992) and T.A. McMahon and R. Srikanthan (1980). Nevertheless, none of these studies compares L-moments with product moments for the LP3 distribution. Vogel, R.M. et al. (1993) conclude that LP3 is suitable for the south-western United States, but J.C. Smither (1994) found that LP3 is among the worst distributions for short-duration rainfall in South Africa. Joseph D. Countryman et al. (2008) did not recommend LP3 because the distribution overestimates large flood flows at the 90% confidence interval.
2.4 Pearson type III
Chen Yuan Fang et al. (2002) used the Monte Carlo method to compare parameter estimation methods for P3: MOM, curve fitting, PWM and the weighted function moment method. P3 was evaluated in terms of bias, efficiency of the parameters, the quantiles and the probability of failure. The root mean square error and the mean of the Monte Carlo samples show the bias, the efficiency of the parameters and the probability of failure for each estimation method. The comparisons show that the method of moments is not a good estimation method because it is biased in quantile estimation. PWM, curve fitting and the weighted function moment method were compared in two tests: a simple sample, and a simple sample with historical data. For the simple sample, curve fitting performed best, followed by PWM and the weighted function moment method. For the latter test, PWM performed best, followed by curve fitting and the weighted function moment method. Chen Yuan Fang et al. (2002) recommend PWM and curve fitting for parameter estimation because they show excellent statistical performance. For small samples, however, all the methods mentioned above give a larger expected probability for the estimated quantile than the design frequency.
Ding Jing and Yang Rongfu (1988) derived PWM for P3 so that PWM can be applied when extraordinary values exist in the sample. A sample with historical flood data was used in the test, and PWM for P3 was compared with ML, MOM and curve fitting. In the comparison between MOM and PWM, for skew CS≤2.5, PWM outperformed MOM in terms of bias but performed the same in terms of efficiency. In the presence of extraordinary values, PWM outperformed MOM on both criteria. For CS>2.5, PWM performed the same as MOM. In the comparison between ML and PWM, for skew CS≤2.5, PWM performed the same in terms of bias and underperformed ML in terms of efficiency. For CS>2.5, ML outperformed PWM on both criteria. In the comparison between curve fitting and PWM, PWM outperformed curve fitting on both criteria for both CS≤2.5 and CS>2.5. Ding Jing and Yang Rongfu (1988) conclude that the proposed PWM is sensible in engineering practice and is a good estimator for CS≤2.5, but not for CS>2.5.
P3 is found to be a good estimator when fitted with PWM by Chen Yuan Fang et al. (2002), Ding Jing and Yang Rongfu (1988), and Terafuk Haktanir (1992). Nonetheless, P3 performs badly with MOM because the estimates are biased, as shown by the two studies above.
2.5 Gumbel (extreme value 1)
M. Fiorentino and S. Gabriele (1984) evaluated a corrected maximum likelihood estimator for the Gumbel distribution, obtained by modifying the ML estimation procedure. Estimation methods were compared for sample sizes ranging from 5 to 300. The methods were MOM, ML, corrected maximum likelihood (CML), PWM and maximum entropy (ME). The mean, variance and mean square error were evaluated for each method and sample size. Relative efficiency was evaluated as the ratio of each method to ML. PWM and CML provide unbiased parameter estimates. The other methods are biased and overestimate the parameters. The mean square error depends on the variance and bias of the estimator, which influence the efficiency of the estimator relative to ML. The results show that CML and PWM outperformed ML, with CML giving the best performance. For quantile estimates, the least biased method is PWM, followed by CML and MOM.
Huynh Ngoc Phien (1986) estimated the parameters of the Gumbel distribution with four methods, namely MOM, ML, ME and PWM. The efficiency of each method was assessed by comparing its variance with the variance of ML. Huynh Ngoc Phien (1986) points out that ML has asymptotically minimum variance and is therefore more efficient than the other estimators, so efficiencies are conveniently compared using the variances of the estimators. Monte Carlo simulation was carried out to estimate the parameters and to compute the root mean square error (RMSE) of the parameters. PWM performs best in terms of bias and MOM performs worst among the estimators. In terms of RMSE, PWM and MOM underperform the other two estimators, and MOM is found to be the least satisfactory estimator. Huynh Ngoc Phien (1986) concludes that MOM is the worst estimator in terms of both RMSE and bias, PWM is the best in terms of bias and ML is the best in terms of RMSE. Considering both RMSE and bias, ME is favoured, since the RMSE performance of ME and ML is relatively similar.
For the Gumbel distribution, PWM performed best compared with the other parameter estimators, as suggested by M. Fiorentino and S. Gabriele (1984), Huynh Ngoc Phien (1986) and Terafuk Haktanir (1992). MOM is not a good estimator, as stated in the two studies above.
2.6 Generalized extreme value (GEV)
J.R.M. Hosking et al. (1985) derived the parameters and quantiles of GEV using PWM and studied the properties of the estimators. For large samples the properties were investigated using asymptotic theory, whereas computer simulation was used for medium and small samples. The results show that for large samples the PWM quantile estimates have higher variance than maximum likelihood estimates but smaller bias, especially in the upper tail of the distribution. For small samples, the PWM standard deviation is smaller than that of maximum likelihood, while for medium samples it is larger. A test on the shape parameter was carried out to determine whether the distribution is Gumbel or GEV. The null hypothesis takes the shape parameter as zero, and the statistic Z is compared with the critical values of the normal distribution. The null hypothesis is rejected when Z takes a significantly positive or negative value. A sample of 35 annual maximum floods for the River Nidd at Hunsingore, Yorkshire, England was fitted to the extreme value distributions. The test gives Z = 1.00, which is not significant, suggesting that the river data can be assumed to come from a Gumbel distribution.
For the generalized extreme value distribution, the suggested estimation methods are PWM and L-moments. Terafuk Haktanir (1992) showed that GEV is better fitted by PWM than by MOM, while J.C. Smither (1994) and Vogel, R.M. et al. (1993) evaluated GEV using L-moments.
The aim of this study is to identify the most appropriate probability distributions and methods of statistical parameter fitting in the Malaysian context. A total of six distributions and two methods of statistical parameter fitting are used. The method of moments poses no difficulty since it has been established for quite some time. The L-moment method is not as well established: Hosking and Wallis (1997) give the probability density function, cumulative distribution function and quantile function for all the distributions used here except log-Pearson type 3 and the two-parameter log-normal. The software used to model the flow data is an Excel spreadsheet.
The annual peak data were obtained from the Department of Irrigation and Drainage Malaysia (DID). The data cover 19 rivers across peninsular Malaysia, and the raw data were reduced to annual maximum series by taking the highest value recorded in each year. In total about 600 years of data were obtained from the 19 rivers, an average of about 30 annual floods per river.
3.2 Excel spread sheet procedures
An Excel spreadsheet is used to model the discharges. The mean, standard deviation, return period and skewness of the data are determined first, because the probability distributions need these parameters to determine the discharges. MOM and LM follow similar procedures except for the order in which the data are ranked: for MOM the data are sorted in descending order, whereas for LM the data are ranked in ascending order.
First, sort the data in ascending or descending order. The ranks and the sample size are obtained after sorting. The return periods are then determined using the Weibull plotting position given by equation 1 in the appendix. The sample size N is the number of flow records, which ranges from 11 to 50 per river. The rank m in descending order is given by equation 2. Knowing N and m, the return period T can be computed. This study focuses on return periods of 2, 5, 10, 20 and 50 years, so the discharges X2, X5, X10, X20 and X50 are obtained by interpolating between the ranked discharges.
Selected parameters are obtained using built-in Excel functions. The mean and standard deviation are obtained with the functions AVERAGE and STDEVA. The gamma function of a given variable is obtained by applying EXP to GAMMALN.
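The ranking and interpolation steps can be sketched in Python as follows. This is a minimal illustration, not the study's spreadsheet: it assumes equation 1 has the usual Weibull form T = (N + 1)/m with m = 1 for the largest flood, and the flow values are hypothetical.

```python
def weibull_return_periods(flows):
    """Rank flows in descending order and return (ranked_flows, return_periods).

    Assumes the standard Weibull plotting position T = (N + 1) / m.
    """
    ranked = sorted(flows, reverse=True)  # rank m = 1 is the largest flood
    n = len(ranked)
    periods = [(n + 1) / m for m in range(1, n + 1)]
    return ranked, periods


def interpolate_discharge(ranked, periods, target_t):
    """Linearly interpolate the discharge at return period target_t."""
    # Return periods decrease as the rank increases, so scan adjacent pairs.
    for i in range(len(periods) - 1):
        t_hi, t_lo = periods[i], periods[i + 1]
        if t_lo <= target_t <= t_hi:
            q_hi, q_lo = ranked[i], ranked[i + 1]
            frac = (target_t - t_lo) / (t_hi - t_lo)
            return q_lo + frac * (q_hi - q_lo)
    raise ValueError("target return period is outside the observed range")


# Hypothetical annual maximum series (m^3/s), for illustration only
flows = [120.0, 95.0, 210.0, 150.0, 80.0, 175.0, 130.0, 101.0, 160.0]
ranked, periods = weibull_return_periods(flows)
x2 = interpolate_discharge(ranked, periods, 2.0)
x5 = interpolate_discharge(ranked, periods, 5.0)
print(x2, x5)
```

With N records the largest observable return period is N + 1 years, so a short series cannot supply X20 or X50 by interpolation alone; the sketch raises an error in that case rather than extrapolating.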
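For readers working outside Excel, the same quantities can be approximated in Python. The sketch below assumes that `statistics.stdev` (sample standard deviation, n - 1 divisor) corresponds to STDEVA for purely numeric data, and it mirrors EXP(GAMMALN(x)) with `math.exp(math.lgamma(x))`; the flow values are hypothetical.

```python
import math
import statistics


def gamma_via_log(x):
    """Gamma(x) for x > 0, computed as exp(ln Gamma(x)), like EXP(GAMMALN(x))."""
    return math.exp(math.lgamma(x))


def gamma_ratio(a, b):
    """Gamma(a) / Gamma(b), evaluated in log space to avoid overflow."""
    return math.exp(math.lgamma(a) - math.lgamma(b))


# Hypothetical flows (m^3/s), for illustration only
flows = [120.0, 95.0, 210.0, 150.0, 80.0]
flow_mean = statistics.mean(flows)   # counterpart of AVERAGE
flow_std = statistics.stdev(flows)   # sample standard deviation (n - 1)

print(flow_mean, flow_std)
print(gamma_via_log(5.0))            # Gamma(5) = 4! = 24, up to rounding
print(gamma_ratio(200.5, 200.0))     # stays finite although Gamma(200) overflows a float
```

Working with the logarithm of the gamma function and exponentiating only at the end is one way to sidestep overflow when the argument is large.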
3.3 Comparisons of methods of statistical parameter fitting and probability distributions
3.3.1 Comparison of methods of statistical parameter fitting
Table 1: Comparisons of method of moment and L-moment (table body not recovered; surviving labels: Pearson type 3, Log-Pearson type 3, Generalized extreme value, Method of moment)
Table 1 compares the method of moments and L-moments in order to find the most suitable probability distribution and method of statistical parameter fitting for a tropical country such as Malaysia.
The method of moments is a common and relatively easy parameter estimation method. Parameters are estimated by equating the moments of the probability distribution function with the moments of the sample. A total of six probability distributions are used to estimate the discharges of the rivers. All of them except the generalized extreme value distribution use a frequency factor to compute discharges. The frequency factors used in this study are derived from the cumulative distribution function. The magnitude is given by equation 3 in the appendix. The frequency factor depends on the empirical formula for each probability distribution. In this study the magnitude is equivalent to the discharge of the river.
L-moments are linear functions of probability weighted moments (PWM) (Hosking, 1986 and 1990). The unbiased sample L-moments are given by equations 24 to 27. The L-coefficient of variation t1 and the L-skewness t3 are given by equations 28 and 29 respectively. Both are important parameters for determining the discharges.
The spreadsheet is validated using a worked example from a textbook. The data from the example are entered into the Excel spreadsheet and the result is compared with the answer in the book. If the answers agree, the formulas in the spreadsheet are considered validated.
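As an illustration of how sample L-moments follow from PWMs, the sketch below uses the standard unbiased PWM estimators b0 to b3 on data ranked in ascending order. It is assumed, not confirmed, that equations 24 to 29 in the appendix take this form; the function name and the test data are hypothetical.

```python
def sample_l_moments(data):
    """Return (l1, l2, t, t3, t4) from unbiased probability weighted moments."""
    x = sorted(data)  # ascending order, as used for the L-moment procedure
    n = len(x)
    if n < 4:
        raise ValueError("at least four observations are needed")
    b0 = b1 = b2 = b3 = 0.0
    for j, xj in enumerate(x, start=1):  # j is the ascending rank
        b0 += xj
        b1 += xj * (j - 1) / (n - 1)
        b2 += xj * (j - 1) * (j - 2) / ((n - 1) * (n - 2))
        b3 += xj * (j - 1) * (j - 2) * (j - 3) / ((n - 1) * (n - 2) * (n - 3))
    b0, b1, b2, b3 = b0 / n, b1 / n, b2 / n, b3 / n
    # L-moments as linear combinations of the PWMs
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
    t = l2 / l1    # L-coefficient of variation
    t3 = l3 / l2   # L-skewness
    t4 = l4 / l2   # L-kurtosis
    return l1, l2, t, t3, t4


l1, l2, t, t3, t4 = sample_l_moments([1.0, 2.0, 3.0, 4.0, 5.0])
print(l1, l2, t, t3, t4)
```

For the symmetric test series the L-skewness comes out as zero, which is a quick sanity check on the weights.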
3.3.2 Comparison of probability distributions
Table 2: Parameters for probability distributions for method of moment and L-moment (table body not recovered; surviving labels: Pearson type 3, Log-Pearson type 3, Generalized extreme value, Method of moment; surviving parameter symbols: uy, σy, u)
The parameters of the probability distributions are computed for each method of statistical parameter fitting in order to obtain the discharges. The parameters shown in table 2 are substituted into the quantile estimate equations to compute the discharges.
3.3.2.1 Normal distributions
In the method of moments, the frequency factor for the normal distribution is approximated by an empirical relation (Abramowitz and Stegun, 1965), shown in equation 4. First determine the exceedance probability p = 1/T. The exceedance probability is governed by the return period, so a return period shorter than 2 years makes p exceed 0.5, which the intermediate variable w does not allow. To overcome this, p is replaced with 1-p whenever p > 0.5. Then determine the intermediate variable w by substituting the exceedance probability p into equation 5. The frequency factor is then computed by substituting w into equation 4. The parameters α1 and α2 are the mean and standard deviation respectively. An example from Water Resources Engineering (WRE) by Chin (2000) was keyed into the spreadsheet to validate it. The answer given by WRE and the result from the spreadsheet differ slightly in the frequency factor and the discharge, because WRE did not include the error of 0.00045 in equation 4. Nonetheless, the error is so small that it is negligible.
For L-moments, the normal variate u of the normal distribution is determined by equation 33, and the intermediate variable w is given by equation 34. The intermediate variable w is computed by substituting the exceedance probability p into equation 34, provided that p is less than 0.5. If p > 0.5, substitute 1-p into equation 34 instead. The normal variate u is then computed by substituting w into equation 33. The parameters α1 and α2 are given by equation 30 and equation 31 respectively. The quantile estimate is given by equation 32, and the discharge is computed by substituting all the relevant parameters into that equation. The spreadsheet was validated using an example from Flood Frequency Analysis (FFA) (Rao, 2000). The answers from the spreadsheet and FFA are similar, thus the spreadsheet is validated.
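A minimal Python sketch of this procedure is given below. It assumes that equations 4 and 5 are the rational approximation from Abramowitz and Stegun as reproduced in Chow et al. (1988), and that equation 3 is X_T = mean + K_T × standard deviation. The negative sign applied when p > 0.5 follows the usual textbook convention and is not stated in the text above; the function names are illustrative.

```python
import math


def normal_frequency_factor(T):
    """Approximate standard normal variate for return period T (T > 1)."""
    p = 1.0 / T                      # exceedance probability
    flip = p > 0.5
    if flip:
        p = 1.0 - p                  # keep p in (0, 0.5] for the approximation
    w = math.sqrt(math.log(1.0 / p ** 2))
    z = w - (2.515517 + 0.802853 * w + 0.010328 * w ** 2) / (
        1.0 + 1.432788 * w + 0.189269 * w ** 2 + 0.001308 * w ** 3
    )
    return -z if flip else z         # assumed convention: negative when p > 0.5


def normal_quantile(mean, std, T):
    """Discharge X_T = mean + K_T * std for the normal distribution."""
    return mean + normal_frequency_factor(T) * std


for T in (2, 5, 10, 20, 50):
    print(T, round(normal_frequency_factor(T), 4))
```

The approximation error is below 0.00045, so the frequency factor for T = 2 is close to, but not exactly, zero.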
3.3.2.2 log-normal distributions
For the two-parameter log-normal distribution using the method of moments, the frequency factor is obtained from equation 4. The magnitude XT is given by equation 8. YT is obtained from the transformed data; it is the inverse transform of the magnitude XT and is given by equation 7. The data are transformed by taking their logarithms while keeping the return periods unchanged. From the transformed data, the parameters α1 and α2, which are the mean and standard deviation, are computed. The magnitude YT is obtained by substituting all the related parameters into equation 8. The discharge XT is ten raised to the power YT. An example from Applied Hydrology (AP) (Chow, 1988) was entered into the Excel spreadsheet to validate the formulation for the two-parameter log-normal distribution. The answers from the spreadsheet and the book are the same, thus the spreadsheet is validated.
For L-moments, the quantile estimate for the log-normal distribution is given by equation 38. First, the data must be transformed to obtain the L-coefficient of variation, t. The log-normal distribution function F is given in the appendix and is computed by substituting the L-coefficient of variation t into it. The intermediate variable w is then computed following the procedure for the normal distribution. The frequency factor is obtained from equation 33, and the parameters uy and σy are computed using equation 35 and equation 36. The parameters are substituted into equation 37 to obtain YT. Lastly, the discharge is obtained by exponentiating as in equation 38. The parameters for the log-normal distribution are not given in the book by Hosking and Wallis (1997), and since L-moments are linear functions of PWMs, the PWM example for the two-parameter log-normal distribution is used for comparison with the Excel results. The spreadsheet is therefore validated using an example from FFA. The answers from the spreadsheet and FFA are the same, thus the spreadsheet is validated.
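The log-transform procedure for the method of moments can be sketched as follows. This is an illustration under stated assumptions: base-10 logarithms are used, matching the "power of ten" back-transform, and the normal variate is taken from `statistics.NormalDist().inv_cdf` rather than from the approximation in equation 4. The function name and data are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def ln2_quantile_mom(flows, T):
    """T-year discharge for a two-parameter log-normal fit by moments."""
    y = [math.log10(q) for q in flows]         # transform the data
    y_mean, y_std = mean(y), stdev(y)          # moments of the log data
    k_t = NormalDist().inv_cdf(1.0 - 1.0 / T)  # exact normal variate (swapped in)
    y_t = y_mean + k_t * y_std                 # magnitude in log space
    return 10.0 ** y_t                         # back-transform to a discharge


# Hypothetical flows chosen so that the log data are 1, 2, 3
flows = [10.0, 100.0, 1000.0]
print(ln2_quantile_mom(flows, 2))    # median of the fitted distribution
print(ln2_quantile_mom(flows, 10))
```

Using the exact inverse normal instead of the rational approximation changes the frequency factor by less than 0.00045, so either choice should give nearly the same discharge.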
3.3.2.3 Pearson type 3
Pearson type 3 is also known as the three-parameter gamma distribution, and its frequency factor is governed by the return period and the skewness. In the method of moments the frequency factor can be approximated using the relation of Kite (1977) given by equation 9. Substitute the skewness gx into equation 10 to obtain the value K. XT' is obtained from the frequency factor of the normal distribution. The frequency factor is then computed from these parameters. The formulation for Pearson type 3 was validated with reference to AP and an example from WRE. There is a discrepancy between the spreadsheet result and the WRE answer, although the difference is relatively small. The reason is that WRE recommends finding XT' from a table of the Z-distribution, whereas AP recommends using equation 4. Computing XT' from the equation is more accurate than reading it from a table.
In the L-moment method, the quantile estimate for Pearson type 3 is given by equation 44 in the appendix. First, check the L-skewness t3 against equations 39 and 40 to determine which tm and which parameter expression apply. Once the parameter is computed, the gamma function can be evaluated with the spreadsheet's built-in function. If the argument is too large, the spreadsheet cannot compute the value. Parameters α, ϒ and u are obtained from equations 41, 42 and 43. The frequency factor of Pearson type 3 is given by the Wilson-Hilferty transformation in equation 45. The skewness of the data is determined using equation 46 and the standard normal variate u is computed using equation 33. The frequency factor is computed by substituting the skewness and u into equation 45, and the discharge is obtained from equation 44. The parameters for Pearson type 3 recommended by Hosking and Wallis (1997) differ slightly from those of the PWM method in FFA, even though the parameters are derived in a similar way. The PWM parameters are given by equations 47 and 48. The parameters recommended by Hosking and Wallis (1997) are used because they are based on the L-moment derivation. Since no L-moment example is available for validation, a separate spreadsheet using the PWM estimator was created. The answers from that spreadsheet and from FFA are the same, thus the spreadsheet is validated.
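The frequency factor calculation can be illustrated with the sketch below. It assumes that equation 9 is the Kite (1977) expansion as printed in Chow et al. (1988) and that equation 10 defines k = Cs/6; here `z` plays the role of XT' and `cs` the role of the skewness gx. The names are illustrative rather than taken from the spreadsheet.

```python
def kite_frequency_factor(z, cs):
    """Pearson type 3 frequency factor from normal variate z and skewness cs."""
    k = cs / 6.0
    return (
        z
        + (z ** 2 - 1.0) * k
        + (z ** 3 - 6.0 * z) * k ** 2 / 3.0
        - (z ** 2 - 1.0) * k ** 3
        + z * k ** 4
        + k ** 5 / 3.0
    )


def p3_quantile(mean, std, z, cs):
    """Discharge X_T = mean + K_T * std for a Pearson type 3 fit by moments."""
    return mean + kite_frequency_factor(z, cs) * std


print(kite_frequency_factor(0.0, 1.0))        # T = 2 years, skew 1.0
print(kite_frequency_factor(2.326348, 1.0))   # T = 100 years, skew 1.0
```

With zero skewness the factor reduces to the normal variate, so the normal distribution is recovered as a special case. The log-Pearson type 3 procedure uses the same factor on the log-transformed data.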
3.3.2.4 log-Pearson type 3
For the method of moments, log-Pearson type 3 follows the procedure of the two-parameter log-normal distribution to determine the mean, standard deviation and skewness. The parameters α1 and α2 are the mean and standard deviation respectively. The frequency factor of log-Pearson type 3 has the same form as that of Pearson type 3. K is obtained using equation 9 and XT' using equation 4. The frequency factor is computed by substituting the necessary parameters into equation 9. The formulation for log-Pearson type 3 was validated using an example from AP. The answers from the spreadsheet and AP are similar, thus the spreadsheet is validated.
For L-moments, the quantile estimate for log-Pearson type 3 is given by equation 49. First, transform the data by taking logarithms, then determine the new L-coefficient of variation, L-skewness and mean. The parameters β, α and ϒ are computed using equation 39 or 40, equation 47 and equation 48 respectively. KT is obtained using the Wilson-Hilferty transformation, as for Pearson type 3 with L-moments. The discharge is computed by substituting the parameters shown in table 2 into equation 49. Log-Pearson type 3 is in the same position as the two-parameter log-normal distribution, because the book by Hosking and Wallis (1997) provides no information on it. FFA suggests transforming the data by taking logarithms and then following the Pearson type 3 procedure. The procedure for α recommended in FFA differs from the Pearson type 3 procedure. The parameter recommended by Hosking is used to calculate the quantile estimates.
Gumbel/extreme value 1
For method of moment, the frequency factor for the Gumbel or extreme value 1 distribution is given by Chow (1953). The frequency factor is computed by substituting the return period into equation 3.11. The parameters α1 and α2 are the mean and standard deviation respectively. The formulation for the Gumbel distribution is validated using an example from the WRC. The answer obtained is similar to the answer in the book, thus the spreadsheet is validated.
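As an illustration, the Gumbel frequency factor can be written as a short function. The sketch assumes that equation 3.11 is the commonly quoted form of Chow's frequency factor for the extreme value 1 distribution; the function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant


def gumbel_frequency_factor(return_period):
    """Frequency factor K_T for the Gumbel (EV1) distribution."""
    t = return_period
    return -(math.sqrt(6.0) / math.pi) * (
        EULER_GAMMA + math.log(math.log(t / (t - 1.0)))
    )


def gumbel_quantile_mom(mean, std_dev, return_period):
    """Discharge estimate: mean plus frequency factor times standard deviation."""
    return mean + gumbel_frequency_factor(return_period) * std_dev
```

With this form the frequency factor is slightly negative at a 2-year return period and reaches zero near a return period of about 2.33 years, where the estimate equals the sample mean.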
For L-moment, the quantile estimate for Gumbel or extreme value 1 is given by equation 52 in the appendix. First, parameters α and β are obtained using equations 50 and 51. Subsequently, the relationship between return period and non-exceedance probability is used to obtain F. The discharge is computed using equation 52. The spreadsheet is validated using an example from FFA. The answers from the spreadsheet and FFA are similar, thus the spreadsheet is validated.
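The L-moment steps for the Gumbel distribution can be sketched as below. The block assumes the usual unbiased sample L-moment estimators, the standard Gumbel relations (scale = λ2 / ln 2, location = λ1 minus 0.5772 times the scale) and F = 1 - 1/T; whether these match equations 50 to 52 symbol for symbol is an assumption, and the names `scale` and `location` are hypothetical.

```python
import math


def sample_l_moments_12(flows):
    """First two sample L-moments from unbiased probability-weighted moments."""
    x = sorted(flows)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    return b0, 2.0 * b1 - b0


def gumbel_quantile_lmom(flows, return_period):
    """Gumbel discharge estimate for a return period using L-moments."""
    l1, l2 = sample_l_moments_12(flows)
    scale = l2 / math.log(2.0)
    location = l1 - 0.5772156649 * scale
    f = 1.0 - 1.0 / return_period  # non-exceedance probability (assumed relation)
    return location - scale * math.log(-math.log(f))
```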
Generalized extreme value
For method of moment, the quantile estimator for generalized extreme value is given by equation 23. The parameter k is given by equation 15, 16 or 17, depending on the skew Cs. Iteration is necessary if kn in equation 18 is not equal to ko, and it continues until kn is almost equal to kn+1. The parameters α and u are computed using equations 21 and 22. The formulation for the generalized extreme value distribution is validated using an example from FFA. The answer obtained is similar to the answer in the book, thus the spreadsheet is validated.
For L-moment, the quantile estimator is the same as for MOM. The parameters k, α and u are computed using equations 53 to 55 respectively. The formulation for the generalized extreme value distribution is validated using an example from FFA. The L-skewness t3 computed using equation 29 differs from the value in the book, although the L-CV and mean are similar, so the quantile estimate is significantly different. When the t3 suggested by FFA is entered into Excel, the answers obtained are similar to FFA.
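Because the L-skewness drives the GEV shape parameter, it can be useful to compute t3 independently when checking the spreadsheet. The sketch below assumes the unbiased sample L-moment estimators and Hosking's approximation for k; it is not confirmed that equations 29 and 53 to 55 take exactly these forms, and the function names are hypothetical.

```python
import math


def sample_l_moments(flows):
    """Return (l1, l2, t3) from unbiased probability-weighted moments."""
    x = sorted(flows)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    l1 = b0
    l2 = 2.0 * b1 - b0
    l3 = 6.0 * b2 - 6.0 * b1 + b0
    return l1, l2, l3 / l2


def gev_parameters(l1, l2, t3):
    """GEV shape k, scale alpha and location u (Hosking's approximation)."""
    c = 2.0 / (3.0 + t3) - math.log(2.0) / math.log(3.0)
    k = 7.8590 * c + 2.9554 * c * c
    # Note: k close to zero is the Gumbel case and is not handled here.
    g = math.gamma(1.0 + k)
    alpha = l2 * k / ((1.0 - 2.0 ** (-k)) * g)
    u = l1 - alpha * (1.0 - g) / k
    return k, alpha, u


def gev_quantile(k, alpha, u, return_period):
    """GEV discharge estimate for a return period, with F = 1 - 1/T."""
    f = 1.0 - 1.0 / return_period
    return u + alpha * (1.0 - (-math.log(f)) ** k) / k
```

Comparing the t3 returned by `sample_l_moments` with the spreadsheet value is one way to narrow down where a discrepancy such as the one described above arises.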
3.5 Data Analysis
The discharges estimated by each probability distribution and estimator method are compared with the actual data. Return periods of 2, 5, 10, 20 and 50 years are used in these comparisons. In fact, only a few data sets reach a return period of 50 years; most reach a return period of up to 20 years. The purpose of these comparisons is to evaluate the percentage error of each probability distribution against the actual data. The formula for this comparison is given by:
The percentage difference represents the accuracy of the probability distribution in estimating the discharge. A positive percentage difference indicates that the probability distribution overestimates the flow, whereas a negative percentage difference indicates that it underestimates the flow. The absolute differences are summed to give the total difference between each probability distribution and the actual data for each river.
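The actual-data discharges at fixed return periods can be obtained as sketched below. The block assumes the Weibull plotting position T = (n + 1) / m, with m the rank in descending order, and linear interpolation in T between neighbouring points; the function names are hypothetical.

```python
def weibull_return_periods(flows):
    """Pair each flow with its Weibull return period T = (n + 1) / m."""
    ranked = sorted(flows, reverse=True)  # rank m = 1 is the largest flood
    n = len(ranked)
    return [((n + 1) / m, q) for m, q in enumerate(ranked, start=1)]


def observed_discharge(flows, return_period):
    """Linearly interpolate the observed discharge at a return period.

    Returns None when the record is too short to reach the return period.
    """
    points = sorted(weibull_return_periods(flows))  # ascending return period
    for (t_lo, q_lo), (t_hi, q_hi) in zip(points, points[1:]):
        if t_lo <= return_period <= t_hi:
            w = (return_period - t_lo) / (t_hi - t_lo)
            return q_lo + w * (q_hi - q_lo)
    return None
```

Under this plotting position the largest return period in a record of n years is n + 1, so only a record of 49 years reaches a 50-year return period without extrapolation.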
The second comparison of discharges is between method of moment and L-moment. Return periods of 2, 5, 10, 20 and 50 years are used in this comparison. The main goal is to evaluate the difference between the two estimator methods. The comparison is made by taking the difference in discharge between method of moment and L-moment relative to the discharge for method of moment:
The percentage difference shows which estimator method gives the higher discharge. A positive percentage difference shows that MOM gives a higher discharge than L-moment, whereas a negative percentage difference indicates that MOM gives a lower discharge than L-moment. The absolute differences are summed to give the total difference between method of moment and L-moment for each probability distribution and each river.
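The comparison between the two estimators, as described in words above, can be sketched as follows. Summing the absolute percentage differences is one possible reading of the "absolute difference" step and is an assumption; the function names are hypothetical.

```python
def percent_difference(q_mom, q_lmom):
    """Difference between MOM and L-moment discharge, relative to MOM, in percent."""
    return 100.0 * (q_mom - q_lmom) / q_mom


def sum_absolute_difference(q_mom_list, q_lmom_list):
    """Sum of absolute percentage differences over the return periods (assumed)."""
    return sum(
        abs(percent_difference(a, b)) for a, b in zip(q_mom_list, q_lmom_list)
    )
```

A positive value from `percent_difference` means MOM gives the higher discharge, in line with the sign convention described above.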
Confidence intervals (CI) of 90%, 95% and 99% are computed to evaluate the performance of the probability distributions and estimator methods. The confidence intervals of 90%, 95% and 99% are given by:
The confidence interval can also be computed using a function in the Excel spreadsheet. The actual data ± CI gives a lower limit and an upper limit, which are compared with the discharges computed from the probability distributions and estimator methods. An estimated discharge that falls between the limits is accepted and marked as '1'. An estimated discharge that falls outside the limits is rejected and marked as '0'. The probability distribution and estimator method with the most '1' marks is considered the most appropriate for estimating discharges in Malaysia.
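The acceptance test can be sketched as below. The half-width formula is an assumption: it uses the normal-based form z·s/√n, which is what Excel's CONFIDENCE.NORM function returns, and may differ from the formula used in this study. The function names are hypothetical.

```python
from statistics import NormalDist


def ci_half_width(std_dev, n, level):
    """Normal-based confidence half-width z * s / sqrt(n) (assumed form)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return z * std_dev / n ** 0.5


def accept(estimate, observed, half_width):
    """Return 1 if the estimate lies within observed ± half_width, else 0."""
    lower = observed - half_width
    upper = observed + half_width
    return 1 if lower <= estimate <= upper else 0
```

Counting the '1' marks over all rivers and return periods then gives a score for each distribution and estimator at each confidence level.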
Chapter 4: Results and discussion
This study compares six probability distributions using two parameter estimators for 19 rivers in peninsular Malaysia. The comparisons are divided into three categories: method of moment vs actual data, L-moment vs actual data, and method of moment vs L-moment. The tests applied to the probability distributions are the Chi-squared test, the Kolmogorov-Smirnov test and the confidence interval.
4.1 Method of moment vs actual data
Figure 1 plots the six probability distributions and the actual data against return period for all the rivers. A total of 19 rivers with sample sizes ranging from 11 to 49 are included. For sample sizes of n = 40 and above and of n = 20 and below, the probability distributions on average overestimate the flood data compared with the actual data. For sample sizes between 20 and 40, the probability distributions on average underestimate the flood data. Where an exceptionally large flood is present in a river whose record consists mainly of small floods (i.e. Lui River, Gemencheh River and Bernam River), the probability distributions perform badly, in that the differences between the actual and estimated discharges are fairly large. In figure 1, the rivers mentioned above share the same characteristic: the flood data are overestimated initially and then underestimated at the end compared with the actual data. Nonetheless, the actual data for Kulim River are lower than the majority of the probability distributions for all return periods, even with the presence of an exceptionally large flood. For a river with large flood data (i.e. Berang River), the probability distributions overestimate the flood data throughout the return periods.
As shown in table 2, the percentage differences between the MOM estimates and the actual data range from -184% to 47% for N, -85% to 57% for LN2, -143% to 47% for P3, -94% to 47% for LP3, -187% to 42% for EV1 and -86% to 42% for GEV. EV1 has the largest negative percentage difference, followed by N, P3, LP3, GEV and LN2. Nonetheless, EV1 and GEV have the lowest positive percentage difference, followed by N, P3, LP3 and LN2. GEV is the distribution with the lowest average percentage difference, while N is the distribution with the highest. The sums of absolute difference between the actual data and the probability distributions for the 19 rivers are N = 27.5, LN2 = 15.8, P3 = 18.4, LP3 = 14.6, EV1 = 22.62 and GEV = 12.7. GEV has the lowest sum of absolute difference, so for method of moment GEV estimates the flood data with the highest accuracy compared with the actual data.
4.2 L-moment vs actual data
In figure 2, the log-Pearson type 3 for Kepis River is not shown because the gamma function is so large that the Excel spreadsheet is unable to compute it. For sample sizes of n = 40 and above, the probability distributions on average overestimate the flood data compared with the actual data, which is similar to the outcome for method of moment. For sample sizes between 20 and 40, the probability distributions on average underestimate the flood data, which is also similar to the outcome for method of moment. Nevertheless, for sample sizes of n = 20 and below, the probability distributions for L-moment are on average fairly evenly split between underestimating and overestimating the flood data. For example, the flood data for Buloh River and Pelarit River are on average underestimated, while those for Tasoh River and Berang River are on average overestimated. As with method of moment, the presence of exceptionally large flood data in a river influences the performance of the probability distributions. In figure 2, the rivers with exceptionally large flood data show the same characteristic as described for method of moment. Furthermore, Kulim River under L-moment behaves like the rivers with exceptionally large flood data, unlike the Kulim River case under method of moment.
Figure 2: Comparison of probability distributions using L-moment and actual data vs return period
Table 2: Percentage difference and absolute differences of actual data vs L-moment