ARIMA Model Bangladesh Pea And Lentil Pulse Production Biology Essay

This study examines which ARIMA model gives the most efficient forecasts of grass pea and lentil pulse production in Bangladesh. It also identifies the best of the deterministic growth models that are commonly used to describe growth patterns and to forecast. The grass pea and lentil pulse production time series are found to be first-order homogeneous stationary, that is, stationary after first differencing. ARIMA(0,1,8) and ARIMA(0,1,9) are the best models for grass pea and lentil pulse production respectively, and the cubic model is the best deterministic growth model for both series. Short-term forecasts from the ARIMA models are more efficient than those from the deterministic models. The production uncertainty of pulses can be minimized if production is forecast well and the necessary steps are taken against losses. The government and producers alike can use ARIMA methods to forecast future production more accurately in the short run.

Keywords: ARIMA, Growth model, Production and Forecasting.

Introduction

Bangladesh has been striving for rapid development of its economy. Economic development in Bangladesh cannot be achieved without a breakthrough in the agricultural sector (Alam, 1991, p.1). The agricultural sector contributed 20.87% of the country's gross domestic product (BER, 2008, p. xvii). In Bangladesh, production of the major food crops (rice, wheat, pulses and oilseeds) does not meet the present requirements of the country's population of about 135 million. The gap is widening in both quantity and quality. Rice and wheat have been the focus of concerted government effort in research and development. Similar attention was long overdue for the pulse crops, commonly known as the poor man's meat. Pulses are vital components in the diversification of Bangladesh's predominantly rice-based cropping system. Pulses contribute 2.3 per cent of the value added in agriculture. The total cultivated area in Bangladesh is 8031161 hectares (BBS, 2004a, pp.120, 134), of which pulses occupy 420763 hectares, i.e., 5.24 per cent of total cultivated land (BBS, 2004b, p. 63). Lentil and grass pea are the two most important pulses in Bangladesh, comprising more than 60% of the total pulses grown in the country.

Grass pea covered about 38 per cent of the total cultivated area of pulses and 40 per cent of the total production of pulses in Bangladesh (BBS, 2004). Nearly one-third of the total grass pea area lies in three southern districts: Patuakhali, Barisal, and Bhola. Other major grass pea-growing districts are Faridpur, Dhaka, Noakhali, Jessore, Rajshahi, Comilla, Tangail and Pabna. Lentil had a 37 per cent share of total pulse production and the same share of the cultivated pulse area in Bangladesh (own calculation, data from BBS, 2004). More than 85 per cent of the lentil area is concentrated within nine greater districts: Jessore, Faridpur, Kustia, Rajshahi, Pabna, Comilla, Noakhali, Manikganj, and Khulna.

To plan the development of the pulses sector so that it meets the country's pulse production requirement, it is extremely helpful to know the likely future movement of production. For this purpose, one or both of two types of models, usually known as structural regression models and time series models, are often used in practice. Structural regression models require information about the factors affecting the time series. Time series analysis, on the other hand, and especially Box-Jenkins type ARIMA models, lets the data speak for themselves, i.e. the future movements of a time series are determined using its own present and past values (Box and Jenkins, 1978). Among the stochastic time series models, ARIMA types are very powerful and popular, as they can successfully describe the observed data and can forecast with minimum forecast error. These models are, however, difficult to identify and estimate; they are also expensive and time-consuming, and they involve a complex model-building mechanism. Another type of time series model, the deterministic growth model, is also very common in practice for growth analysis and forecasting, as it is quick to estimate and less expensive, although less efficient. Such models are very good in many situations for describing the growth pattern and the future movement of a time series (Pindyck and Rubinfeld, 1991).

The purpose of this study is to develop appropriate ARIMA models for the time series of grass pea and lentil pulse production in Bangladesh and to make five-year forecasts for both series with appropriate prediction intervals. A further purpose is to compare the forecasting performance of ARIMA and deterministic models for grass pea and lentil pulse production in Bangladesh.

Methodology: Data and Models

The data on annual pulse production were collected from various issues of the Statistical Yearbook and the agricultural statistics yearbook of the Bangladesh Bureau of Statistics for the period 1967-68 to 2007-08. The time series of grass pea and lentil pulse production were modeled as a stochastic autoregressive integrated moving average (ARIMA) process. The most popular Box-Jenkins type ARIMA process of order p, d and q, denoted ARIMA (p, d, q), may be defined as follows (Box and Jenkins, 1978):

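$$\phi(B)\,(\Delta^{d} Y_t - \mu) = \theta(B)\,\varepsilon_t$$

with $\phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \dots - \phi_p B^p$ and $\theta(B) = 1 - \theta_1 B - \theta_2 B^2 - \dots - \theta_q B^q$, where

$Y_t$ = pulses production at time t,
$\mu$ = the mean of $\Delta^{d} Y_t$,
$\phi_i$ = the ith autoregressive parameter; i = 1, 2, 3, ..., p,
$\theta_i$ = the ith moving average parameter; i = 1, 2, 3, ..., q,
p = the autoregressive order,
q = the moving average order,
d = the number of times the series is differenced,
$\Delta$ = the difference operator,
B = the backshift operator,
$\varepsilon_t$ = the random error term at time t.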
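The difference operator in the definition above can be illustrated with a short sketch. This is a minimal Python example, assuming a made-up production series (the numbers are hypothetical and are not the study's data); `difference` is an illustrative helper name, not a function from any package used in the study.

```python
def difference(series, d=1):
    """Apply the difference operator (1 - B) to `series` d times."""
    values = list(series)
    for _ in range(d):
        # Each pass replaces Y_t by Y_t - Y_{t-1} and shortens the series by one.
        values = [curr - prev for prev, curr in zip(values, values[1:])]
    return values


# Hypothetical production figures, used only to show the mechanics.
production = [100.0, 104.0, 103.0, 110.0, 118.0]
first_diff = difference(production, d=1)   # d = 1, as in ARIMA(p, 1, q)
second_diff = difference(production, d=2)  # d = 2, shown for comparison
```

Each round of differencing costs one observation, which is why a series of 41 annual values leaves 40 first differences to model.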

The estimation methodology of the above model consists of three steps: identification, estimation of parameters and diagnostic checking. The identification step determines the values of p, d and q. Here, these values are determined using the autocorrelation and partial autocorrelation functions (ACF and PACF) and the Augmented Dickey-Fuller (ADF) test. The second step is to estimate the parameters of the model; the method of maximum likelihood is used for this purpose. The third step is to check whether the chosen model fits the data reasonably well. For this purpose the residuals are examined to find out whether they are white noise, using the ACF of the residuals and the Ljung and Box (1978) statistic. Where two or more competing models pass the diagnostic checks, the best model is selected using the criteria multiple R2, adjusted R2, root mean squared error (RMSE), Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), mean absolute error (MAE) and mean absolute percent error (MAPE). Nine deterministic growth models, presented in Table 01, are also considered in this study for comparing the forecasting efficiency of the stochastic models. The same model selection criteria are used to select the best deterministic model for forecasting purposes.

Criteria used for selection of model

Unless the rules of model selection are strictly followed, the forecasts generated with the assumed model may be unconvincing. To select the growth model that fits best for forecasting a particular time series, the model selection criteria used are R2, adjusted R2, RMSE, AIC, BIC, MAE, MSE and MAPE.
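The error-based criteria listed above can be computed directly from actual and fitted values. The sketch below uses textbook definitions (divisor n for MSE, and adjusted R2 = 1 - (1 - R2)(n - 1)/(n - k - 1)); these are assumptions, since the study does not spell out its formulas, and the sample data are hypothetical. AIC and BIC are left out because they require the model likelihood.

```python
import math


def selection_criteria(actual, fitted, n_params):
    """Return MAE, MSE, RMSE, MAPE, R2 and adjusted R2 for one fitted model."""
    n = len(actual)
    errors = [a - f for a, f in zip(actual, fitted)]
    mse = sum(e * e for e in errors) / n
    mean_actual = sum(actual) / n
    total_ss = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - sum(e * e for e in errors) / total_ss
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAPE": 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
        "R2": r2,
        "AdjR2": 1.0 - (1.0 - r2) * (n - 1) / (n - n_params - 1),
    }


# Hypothetical actual and fitted values, for illustration only.
criteria = selection_criteria([100.0, 200.0, 300.0, 400.0],
                              [110.0, 190.0, 310.0, 390.0], n_params=1)

# Consistency check on a reported pair: RMSE is the square root of MSE.
# The MSE below is the ARIMA(0,1,0) grass pea value from Table 04.
rmse_from_reported_mse = math.sqrt(738509415.11)
```

The last line reproduces the RMSE of 27175.53 reported for the same model, which suggests the tabulated RMSE and MSE follow this relationship.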

Results

Test of stationarity using ACF and PACF

The autocorrelation function is a very useful tool for finding out whether a time series is stationary or not. The ACF and PACF are also used to determine the autoregressive and moving average orders of the models. The ACF and PACF of grass pea and lentil pulse production at level are shown in figures 02 and 03. The graphs show that the autocorrelations taper off very slowly, indicating that both series are non-stationary. It is therefore necessary to take the 1st difference of each time series and construct its autocorrelation function to examine whether the differenced series is stationary. The autocorrelation functions of the 1st-differenced time series of grass pea and lentil pulse production are presented in figure 04 and figure 05. The 1st-differenced grass pea and lentil pulse production series seem to be stationary, as their autocorrelations decline faster. Before a decision is taken about the stationarity of the series, the ADF (Augmented Dickey-Fuller) test of stationarity needs to be carried out.
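The contrast between a slowly tapering ACF at level and a quickly declining ACF after differencing can be sketched as follows. The series is simulated (a random walk with drift from a fixed seed), not the study's production data, and the bound of 1.96 divided by the square root of n is the usual large-sample approximation for judging a spike significant; both are assumptions of this illustration.

```python
import math
import random


def sample_acf(series, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(v * v for v in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]


# Simulated stand-in for 41 annual observations: a random walk with drift.
rng = random.Random(0)
levels = [0.0]
for _ in range(40):
    levels.append(levels[-1] + 5.0 + rng.uniform(-1.0, 1.0))
first_diff = [b - a for a, b in zip(levels, levels[1:])]

acf_levels = sample_acf(levels, 16)
acf_diff = sample_acf(first_diff, 16)
bound_levels = 1.96 / math.sqrt(len(levels))     # approximate 5% bound, n = 41
bound_diff = 1.96 / math.sqrt(len(first_diff))   # approximate 5% bound, n = 40

# Lags at which each ACF exceeds its bound in absolute value.
spikes_levels = [k + 1 for k, r in enumerate(acf_levels) if abs(r) > bound_levels]
spikes_diff = [k + 1 for k, r in enumerate(acf_diff) if abs(r) > bound_diff]
```

For a trending series, `spikes_levels` typically contains a long run of early lags, while `spikes_diff` is short or empty, which is the visual pattern the figures are read for.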

Test of stationarity using ADF

Apart from the graphical methods of using ACF for determining stationarity of a time series, a very popular formal method of determining stationarity is the Augmented Dickey-Fuller test. Here, this test is done for all the time series. The estimates of necessary parameters and related statistics for the time series of pulses production are presented in table 02.

The analysis shows that the random walk hypothesis, i.e. that the underlying process generating the time series is non-stationary, cannot be rejected, as the related F statistics are insignificant at the 5% level (Table 02). All the undifferenced time series are therefore non-stationary, and they must be 1st-differenced to see whether the 1st differences are stationary.

Table 03 shows that the first differences of grass pea and lentil pulse production are stationary, as the F-values are significant at the 5% level. Both series are therefore integrated of order one.
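The F statistics in Tables 02 and 03 compare the residual sums of squares of the restricted and unrestricted regressions. The sketch below assumes the usual form F = m (RSS_R - RSS_U) / (q RSS_U) with q = 2 restrictions, and takes the multiplier m to be the degrees of freedom listed against the restricted regression; that choice is an assumption, made because it reproduces the reported grass pea and level-series values. `restricted_f` is an illustrative name.

```python
def restricted_f(rss_restricted, rss_unrestricted, n_restrictions, multiplier_df):
    """F statistic for testing restrictions from two residual sums of squares."""
    gain = rss_restricted - rss_unrestricted
    return multiplier_df * gain / (n_restrictions * rss_unrestricted)


# RSS values and critical values copied from Tables 02 and 03.
f_grass_pea_level = restricted_f(25396337996, 22479335325, 2, 37)
f_lentil_level = restricted_f(14583381584, 13196388734, 2, 37)
f_grass_pea_diff = restricted_f(40038872394, 24242883247, 2, 36)

CRITICAL_LEVEL = 6.95   # F05, 41 in Table 02
CRITICAL_DIFF = 7.06    # F05, 40 in Table 03

# A unit root is rejected only when F exceeds the critical value.
unit_root_rejected = {
    "grass pea, level": f_grass_pea_level > CRITICAL_LEVEL,
    "lentil, level": f_lentil_level > CRITICAL_LEVEL,
    "grass pea, 1st difference": f_grass_pea_diff > CRITICAL_DIFF,
}
```

Under these assumptions the level series fail to reject the unit root while the differenced grass pea series rejects it, matching the reading of the tables in the text.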

Modeling time series of grass pea pulse production

To select the best ARIMA model for the grass pea pulse production series, a routine identification test is carried out before the Box-Jenkins methodology is applied. Figure 02 shows the ACF and PACF plots of the series at level. The ACF has eight significant spikes at the beginning, while the PACF has only one significant spike at the beginning. Figure 04 shows the ACF and PACF plots of the series at first difference. In this figure, the ACF shows no significant autocorrelation at any lag. The Augmented Dickey-Fuller unit root tests (Tables 02 and 03) point to the same order of differencing. This implies that the grass pea pulse production series is non-stationary at level and stationary at first difference. Accordingly, the second parameter of the ARIMA model, d, is selected as 1. The ACF and PACF plots in Figure 04 behave differently from those in Figure 02.

The orders of the autoregressive and moving average processes are selected by estimating ARIMA models at p = 0, 1 and q = 0, 1, 2, 3, 4, 5, 6, 7, 8 using the same software package. Eighteen ARIMA models are estimated: ARIMA (0,1,0), ARIMA (0,1,1), ARIMA (0,1,2), ARIMA (0,1,3), ARIMA (0,1,4), ARIMA (0,1,5), ARIMA (0,1,6), ARIMA (0,1,7), ARIMA (0,1,8), ARIMA (1,1,0), ARIMA (1,1,1), ARIMA (1,1,2), ARIMA (1,1,3), ARIMA (1,1,4), ARIMA (1,1,5), ARIMA (1,1,6), ARIMA (1,1,7) and ARIMA (1,1,8). Their diagnostic checks are done using the ACFs of the residuals and the Box-Ljung chi-square test. In addition, the minimum values of RMSE, MSE, MAE, AIC, BIC and MAPE and a high value of R2 are used to select the best model; these are shown in Table 04. The chi-square value with its P-value and the values of the model selection criteria are given in Table 07 for the best model only.

Table 04 shows that ARIMA (0,1,0) is better than the other models in terms of AIC and BIC, but ARIMA (0,1,8) is better on the remaining criteria except MAE. ARIMA (0,1,8) is therefore considered the best-fitting model for forecasting grass pea pulse production in Bangladesh and is selected to represent the data-generating process.
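The comparison across criteria can be made explicit with a small tally. The sketch below lists the candidate orders and then finds the winner on each criterion using three of the eighteen rows of Table 04, chosen because together they contain every per-criterion winner of the full table. Counting wins is one way to read the selection; the study itself does not state a formal tie-breaking rule, so the tally is an assumption of this illustration.

```python
from collections import Counter

# The candidate grid described in the text: p in {0, 1}, d = 1, q in 0..8.
candidate_orders = [(p, 1, q) for p in (0, 1) for q in range(9)]

CRITERIA = ("MAE", "MSE", "RMSE", "AIC", "BIC", "MAPE", "R2", "AdjR2")
HIGHER_IS_BETTER = {"R2", "AdjR2"}

# Three rows copied from Table 04 (grass pea), in the column order above.
table_04_rows = {
    "ARIMA(0,1,0)": (14829.84, 738509415.11, 27175.53, 841.23, 844.65, 13.20, 0.721, 0.714),
    "ARIMA(0,1,8)": (15040.47, 654710288.40, 25587.31, 852.29, 869.42, 12.98, 0.753, 0.719),
    "ARIMA(1,1,7)": (13868.45, 659593487.10, 25682.55, 852.59, 869.73, 13.47, 0.751, 0.679),
}


def winner_per_criterion(rows):
    """Map each criterion to the model with the best value on it."""
    winners = {}
    for i, name in enumerate(CRITERIA):
        pick = max if name in HIGHER_IS_BETTER else min
        winners[name] = pick(rows, key=lambda model: rows[model][i])
    return winners


winners = winner_per_criterion(table_04_rows)
wins = Counter(winners.values())
most_wins = wins.most_common(1)[0][0]
```

ARIMA(0,1,0) wins on AIC and BIC and ARIMA(1,1,7) on MAE, while ARIMA(0,1,8) takes the remaining five criteria.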

Modeling time series of lentil pulse production

Figure 03 shows the ACF and PACF plots of the lentil pulse production series of Bangladesh at level, at 16 different lags. The ACF has eight significant spikes at the beginning, while the PACF has only one significant spike at the beginning. Figure 05 shows the ACF and PACF plots of the lentil pulse production series at first difference. The ACF and PACF plots of the 1st difference show one significant spike, at lag 13. This implies that the lentil series is non-stationary at level and stationary at first difference. The Augmented Dickey-Fuller unit root test points to the same order of differencing. This suggests that the second parameter d of the ARIMA model may tentatively be selected as 1. The ACF and PACF plots in Figure 05 behave quite differently from those in Figure 03.

The orders of the autoregressive and moving average processes of the lentil series are selected by estimating ARIMA models at p = 0, 1 and q = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 using the same software package. Twenty ARIMA models are estimated: ARIMA (0,1,0), ARIMA (0,1,1), ARIMA (0,1,2), ARIMA (0,1,3), ARIMA (0,1,4), ARIMA (0,1,5), ARIMA (0,1,6), ARIMA (0,1,7), ARIMA (0,1,8), ARIMA (0,1,9), ARIMA (1,1,0), ARIMA (1,1,1), ARIMA (1,1,2), ARIMA (1,1,3), ARIMA (1,1,4), ARIMA (1,1,5), ARIMA (1,1,6), ARIMA (1,1,7), ARIMA (1,1,8) and ARIMA (1,1,9). Their diagnostic checks are done using the ACFs of the residuals and the Box-Ljung chi-square test. In addition, the minimum values of RMSE, MSE, MAE, AIC, BIC and MAPE and a high value of R2 are used to select the best model; these are presented in Table 05. The chi-square value with its P-value and the values of the model selection criteria are given in Table 07 for the best model only.

Table 05 shows that ARIMA (0,1,0) is better than the other models in terms of AIC, BIC and adjusted R2, but ARIMA (0,1,9) is better on the remaining criteria. ARIMA (0,1,9) is therefore considered the best-fitting model for forecasting lentil pulse production in Bangladesh and is selected to represent the data-generating process.
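Why AIC and BIC favour the smallest model can be seen from their penalty terms. The sketch below assumes the standard definitions AIC = -2 ln L + 2k and BIC = -2 ln L + k ln n with n = 41 annual observations, so that BIC - AIC = k (ln n - 2); neither the definitions nor n are stated in the study, so the implied parameter counts are an inference, not a reported result.

```python
import math

N_OBS = 41  # assumed: annual observations from 1967-68 to 2007-08

# (AIC, BIC) pairs copied from Table 05 (lentil).
aic_bic = {
    "ARIMA(0,1,0)": (817.21, 820.63),
    "ARIMA(0,1,1)": (819.14, 824.28),
    "ARIMA(0,1,9)": (831.37, 850.22),
}


def implied_parameter_count(aic, bic, n_obs):
    """Solve BIC - AIC = k * (ln n - 2) for k."""
    return (bic - aic) / (math.log(n_obs) - 2.0)


def implied_minus_two_log_lik(aic, k):
    """Recover -2 ln L from AIC = -2 ln L + 2k."""
    return aic - 2.0 * k


implied_k = {m: round(implied_parameter_count(a, b, N_OBS))
             for m, (a, b) in aic_bic.items()}
minus_two_ll = {m: implied_minus_two_log_lik(aic_bic[m][0], implied_k[m])
                for m in aic_bic}
fit_gain = minus_two_ll["ARIMA(0,1,0)"] - minus_two_ll["ARIMA(0,1,9)"]
extra_aic_penalty = 2.0 * (implied_k["ARIMA(0,1,9)"] - implied_k["ARIMA(0,1,0)"])
```

On these assumptions the nine extra moving average terms improve -2 ln L by far less than the AIC penalty they incur, which is consistent with ARIMA(0,1,0) having the lowest AIC and BIC even though ARIMA(0,1,9) has the smallest errors.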

Best estimated models

The above discussion of the fit of the various models to the time series of pulses production in Bangladesh shows that ARIMA (0,1,8) is finally chosen for grass pea and ARIMA (0,1,9) for lentil pulse to describe future values. It also shows that selecting the best model for a particular series can sometimes be difficult, because different criteria favour different models. The discussion nevertheless recommends one best model for each series, as given in Table 06.
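For reference, the two selected models can be written out in general form. The block below is a sketch under the usual Box-Jenkins sign convention for the moving average terms, which is an assumption; the symbols $\mu$, $\theta_i$ and $\varepsilon_t$ stand for the mean of the differenced series, the moving average parameters and the random error, and no estimated coefficient values are implied.

```latex
% ARIMA(0,1,8) for grass pea: first difference as a moving average of order 8
\Delta Y_t = \mu + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \dots - \theta_8 \varepsilon_{t-8}

% ARIMA(0,1,9) for lentil: first difference as a moving average of order 9
\Delta Y_t = \mu + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \dots - \theta_9 \varepsilon_{t-9}
```

Here $\Delta Y_t = Y_t - Y_{t-1}$, so a forecast of production is obtained by adding the forecast change to the last observed level.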

Diagnostic checking

For diagnostic checking, the ACF of the residuals and the Ljung and Box chi-square statistic are widely used in practice. Table 07 gives the chi-square statistics with their P-values for the best selected stochastic models. All the chi-square values are insignificant. This implies that the residuals of the respective models are white noise and hence that the fit of the models is acceptable.
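The Ljung and Box statistic is Q = n(n + 2) times the sum over lags k = 1..m of r_k squared divided by (n - k), where r_k is the lag-k residual autocorrelation. The sketch below computes it for any residual series and compares the reported statistics with the 5% critical value of a chi-square distribution with 16 degrees of freedom (26.296). Using 16 degrees of freedom, i.e. not reducing them by the number of estimated parameters, is an assumption of this illustration.

```python
def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    c0 = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        total += r_k * r_k / (n - k)
    return n * (n + 2) * total


# Tiny hypothetical residual series, used only to exercise the function.
q_demo = ljung_box_q([1.0, -1.0, 1.0, -1.0], max_lag=1)

CHI2_CRIT_5PCT_DF16 = 26.296  # assumed degrees of freedom: 16 lags

# Chi-square statistics reported in Table 07 for the selected models.
reported_q = {"grass pea ARIMA(0,1,8)": 4.658, "lentil ARIMA(0,1,9)": 6.072}
white_noise_not_rejected = {name: q < CHI2_CRIT_5PCT_DF16
                            for name, q in reported_q.items()}
```

Both reported statistics lie well below the critical value, in line with the large P-values in Table 07.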

Forecasting

Five-year forecasts of grass pea and lentil pulse production are estimated using the best selected models and are given in Tables 08 and 09, together with their prediction intervals. For both series, short-term forecasts from the ARIMA models are more efficient than those from the deterministic models.

For grass pea, the ARIMA (0,1,8) forecasts are higher than the deterministic (cubic) forecasts. A close examination of the forecast values and prediction intervals in Table 08 shows that the forecasting error is sufficiently small and consequently the intervals are not too large.

For lentil, the ARIMA (0,1,9) forecasts are higher than the deterministic (cubic model) forecasts. A close examination of the forecast values and prediction intervals in Table 09 shows that the forecasting error is small and consequently the intervals are not too large.
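The width of the prediction intervals gives a simple way to compare the two kinds of forecast. The sketch below uses the grass pea figures from Table 08 for the two years in which both models report a lower limit, and computes the half-width of each interval in absolute terms and relative to the forecast. Treating half-width as the measure of forecast efficiency is an assumption of this illustration; the study does not define the comparison formally.

```python
# (forecast, lower limit, upper limit) copied from Table 08 (grass pea).
arima_forecasts = {
    "2008-09": (187528.44, 131624.22, 243432.66),
    "2009-10": (177568.03, 100207.56, 254928.51),
}
cubic_forecasts = {
    "2008-09": (107587.20, 20950.88, 194223.53),
    "2009-10": (95687.67, 4158.47, 187216.86),
}


def half_width(lower, upper):
    """Half the distance between the lower and upper prediction limits."""
    return (upper - lower) / 2.0


def relative_half_width(forecast, lower, upper):
    """Half-width expressed as a fraction of the point forecast."""
    return half_width(lower, upper) / forecast


arima_half = {y: half_width(lo, up) for y, (_, lo, up) in arima_forecasts.items()}
cubic_half = {y: half_width(lo, up) for y, (_, lo, up) in cubic_forecasts.items()}
arima_rel = {y: relative_half_width(*row) for y, row in arima_forecasts.items()}
cubic_rel = {y: relative_half_width(*row) for y, row in cubic_forecasts.items()}

# Check that each interval is centred on its point forecast (up to rounding).
centred = all(abs((lo + up) / 2.0 - f) < 0.01
              for table in (arima_forecasts, cubic_forecasts)
              for f, lo, up in table.values())
```

In both years the ARIMA interval is narrower than the cubic one, in absolute and in relative terms, and the ARIMA interval widens as the horizon moves from one to two years ahead.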

Table 01: The mathematical forms of the models considered and formulas of the growth rates

Name of the model: Linear, Logarithmic, Inverse, Quadratic, Cubic, Power, S, Exponential, Compound

Meaning and assumptions:
Y is the time series considered
t represents time, taking integer values starting from 1
ε is the regression residual
a, b, c and d are the coefficients of the models

(Source: Haque, 2004)
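For orientation, the nine model names correspond to widely used curve-estimation forms. The block below lists those standard forms as an assumption; the exact equations and error-term placement used by Haque (2004) are not available here, so the forms should be read as a sketch in the notation of the table (Y, t, a, b, c, d, and ε for the residual).

```latex
\begin{aligned}
\text{Linear:}      &\quad Y = a + bt + \varepsilon \\
\text{Logarithmic:} &\quad Y = a + b\ln t + \varepsilon \\
\text{Inverse:}     &\quad Y = a + \frac{b}{t} + \varepsilon \\
\text{Quadratic:}   &\quad Y = a + bt + ct^{2} + \varepsilon \\
\text{Cubic:}       &\quad Y = a + bt + ct^{2} + dt^{3} + \varepsilon \\
\text{Power:}       &\quad \ln Y = \ln a + b\ln t + \varepsilon \quad (Y = a\,t^{b}) \\
\text{S:}           &\quad \ln Y = a + \frac{b}{t} + \varepsilon \quad (Y = e^{a + b/t}) \\
\text{Exponential:} &\quad \ln Y = \ln a + bt + \varepsilon \quad (Y = a\,e^{bt}) \\
\text{Compound:}    &\quad \ln Y = \ln a + t\ln b + \varepsilon \quad (Y = a\,b^{t})
\end{aligned}
```

The last four are written in the log-linear form in which such curves are usually estimated by least squares.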

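Table 02: ADF test of stationarity of grass pea and lentil pulse production

| Area | Model | Coefficient estimates (SE in parentheses) | RSS | DF | DW | F | F05, 41 |
|---|---|---|---|---|---|---|---|
| Grass pea pulse production | Unrestricted | 13075 (10415.70); 850.26 (531.84); -0.249 (0.117); 0.099 (0.218) | 22479335325 | 35 | 1.77 | 2.40 | 6.95 |
| Grass pea pulse production | Restricted | 2671.92 (4195.85); -0.097 (0.201) | 25396337996 | 37 | 1.72 | | |
| Lentil pulse production | Unrestricted | 6164.87 (7279.39); 647.54 (416.21); -0.177 (0.093); 0.155 (0.182) | 13196388734 | 35 | 1.95 | 1.94 | 6.95 |
| Lentil pulse production | Restricted | 2845.25 (3194.81); 0.049 (0.177) | 14583381584 | 37 | 1.86 | | |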

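Table 03. ADF test of stationarity of grass pea and lentil pulse production (1st difference)

| Area | Model | Coefficient estimates, including ρ-1 (SE in parentheses) | RSS | DF | DW | F | F05, 40 |
|---|---|---|---|---|---|---|---|
| Grass pea pulse production | Unrestricted | 3563.96 (10704.52); -28.92 (404.09); -1.36 (0.294); 0.267 (0.211) | 24242883247 | 34 | 1.71 | 11.73 | 7.06 |
| Grass pea pulse production | Restricted | 2069.43 (5412.01); -0.423 (0.187) | 40038872394 | 36 | 1.93 | | |
| Lentil pulse production | Unrestricted | 1297.69 (7767.39); 65.52 (308.26); -0.909 (0.257); -0.041 (0.186) | 14546227765 | 36 | 1.86 | 8.01 | 7.06 |
| Lentil pulse production | Restricted | 1172.20 (3829.05); -0.495 (0.153) | 20057024339 | 34 | 2.15 | | |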

Table 04. Diagnostic tools and model selection criteria for grass pea pulse production of best fitted models

| Model | MAE | MSE | RMSE | AIC | BIC | MAPE | R2 | Adjusted R2 |
|---|---|---|---|---|---|---|---|---|
| ARIMA(0,1,0) | 14829.84 | 738509415.11 | 27175.53 | 841.23 | 844.65 | 13.20 | 0.721 | 0.714 |
| ARIMA(0,1,1) | 15150.88 | 731341558.54 | 27043.33 | 842.83 | 847.97 | 13.52 | 0.724 | 0.710 |
| ARIMA(0,1,2) | 14722.25 | 704717222.72 | 26546.51 | 843.31 | 850.16 | 13.47 | 0.734 | 0.713 |
| ARIMA(0,1,3) | 14265.08 | 696380140.65 | 26389.02 | 844.82 | 853.39 | 13.18 | 0.737 | 0.708 |
| ARIMA(0,1,4) | 14293.34 | 695538639.96 | 26373.07 | 846.77 | 857.05 | 13.21 | 0.738 | 0.700 |
| ARIMA(0,1,5) | 14082.28 | 693169338.94 | 26328.11 | 848.63 | 860.62 | 14.82 | 0.738 | 0.692 |
| ARIMA(0,1,6) | 15114.46 | 674843130.37 | 25977.74 | 849.53 | 863.24 | 14.71 | 0.745 | 0.691 |
| ARIMA(0,1,7) | 15272.29 | 656261118.16 | 25617.59 | 850.38 | 865.81 | 15.00 | 0.752 | 0.690 |
| ARIMA(0,1,8) | 15040.47 | 654710288.40 | 25587.31 | 852.29 | 869.42 | 12.98 | 0.753 | 0.719 |
| ARIMA(1,1,0) | 15040.51 | 734663499.46 | 27104.68 | 843.01 | 848.15 | 13.39 | 0.723 | 0.708 |
| ARIMA(1,1,1) | 15979.91 | 688133470.66 | 26232.30 | 842.33 | 849.18 | 15.06 | 0.740 | 0.681 |
| ARIMA(1,1,2) | 14154.21 | 689796224.96 | 26263.97 | 844.43 | 853.00 | 13.06 | 0.740 | 0.711 |
| ARIMA(1,1,3) | 14100.52 | 688869061.03 | 26246.32 | 846.37 | 856.65 | 13.08 | 0.740 | 0.703 |
| ARIMA(1,1,4) | 14611.44 | 663042698.88 | 25749.62 | 846.81 | 858.80 | 14.14 | 0.750 | 0.706 |
| ARIMA(1,1,5) | 14067.18 | 685090853.38 | 26174.24 | 850.15 | 863.86 | 13.11 | 0.741 | 0.687 |
| ARIMA(1,1,6) | 14215.81 | 654796983.68 | 25589.00 | 850.29 | 865.72 | 13.77 | 0.752 | 0.691 |
| ARIMA(1,1,7) | 13868.45 | 659593487.10 | 25682.55 | 852.59 | 869.73 | 13.47 | 0.751 | 0.679 |
| ARIMA(1,1,8) | 14028.55 | 666154301.31 | 25809.97 | 855.00 | 873.85 | 13.43 | 0.749 | 0.665 |

Note: The value of the criterion for a model with asterisk shows that the model is better than other models with respect to that criterion

Table 05. Diagnostic tools and model selection criteria for lentil pulse production of best fitted models

| Model | MAE | MSE | RMSE | AIC | BIC | MAPE | R2 | Adjusted R2 |
|---|---|---|---|---|---|---|---|---|
| ARIMA(0,1,0) | 9365.81 | 411093129.52 | 20275.43 | 817.21 | 820.63 | 10.33 | 0.844 | 0.840 |
| ARIMA(0,1,1) | 9204.72 | 410425424.43 | 20258.96 | 819.14 | 824.28 | 10.21 | 0.844 | 0.836 |
| ARIMA(0,1,2) | 9238.94 | 409798093.12 | 20243.47 | 821.08 | 827.93 | 10.20 | 0.844 | 0.832 |
| ARIMA(0,1,3) | 9219.60 | 409474784.98 | 20235.48 | 823.05 | 831.61 | 10.22 | 0.845 | 0.827 |
| ARIMA(0,1,4) | 9272.34 | 408756121.42 | 20217.72 | 824.97 | 835.26 | 10.25 | 0.845 | 0.823 |
| ARIMA(0,1,5) | 10298.06 | 407406630.27 | 20184.32 | 826.84 | 838.83 | 10.29 | 0.845 | 0.818 |
| ARIMA(0,1,6) | 9214.53 | 407280369.27 | 20181.19 | 828.83 | 842.53 | 10.37 | 0.845 | 0.813 |
| ARIMA(0,1,7) | 11154.79 | 390905395.70 | 19771.33 | 829.14 | 844.57 | 13.02 | 0.852 | 0.814 |
| ARIMA(0,1,8) | 10218.70 | 374588666.42 | 19354.29 | 829.39 | 846.53 | 12.04 | 0.855 | 0.816 |
| ARIMA(0,1,9) | 9164.17 | 374381738.67 | 19348.95 | 831.37 | 850.22 | 10.13 | 0.858 | 0.810 |
| ARIMA(1,1,0) | 9196.94 | 410371272.43 | 20257.62 | 819.14 | 824.28 | 11.20 | 0.844 | 0.836 |
| ARIMA(1,1,1) | 9216.57 | 410243443.73 | 20254.47 | 821.12 | 827.98 | 10.22 | 0.844 | 0.832 |
| ARIMA(1,1,2) | 9248.46 | 409638424.98 | 20239.53 | 823.06 | 831.63 | 10.23 | 0.844 | 0.827 |
| ARIMA(1,1,3) | 9428.96 | 392531572.98 | 19812.41 | 823.31 | 833.59 | 10.82 | 0.851 | 0.830 |
| ARIMA(1,1,4) | 9462.90 | 392781825.54 | 19818.72 | 825.34 | 837.33 | 10.78 | 0.851 | 0.825 |
| ARIMA(1,1,5) | 9357.97 | 393602143.70 | 19839.41 | 827.42 | 841.13 | 10.69 | 0.851 | 0.819 |
| ARIMA(1,1,6) | 10279.56 | 384004858.81 | 19596.04 | 828.41 | 843.83 | 11.99 | 0.854 | 0.818 |
| ARIMA(1,1,7) | 9249.86 | 393248745.35 | 19830.50 | 831.39 | 848.52 | 10.61 | 0.851 | 0.807 |
| ARIMA(1,1,8) | 10329.11 | 374706261.24 | 19357.33 | 831.41 | 850.26 | 12.14 | 0.856 | 0.810 |
| ARIMA(1,1,9) | 9379.45 | 394094523.74 | 19851.81 | 835.48 | 856.04 | 10.54 | 0.850 | 0.794 |

Note: The value of the criterion for a model with asterisk shows that the model is better than other models with respect to that criterion

Table 06. Best estimated models for pulses production in Bangladesh

| Variety | The name of the best model | The functional form of the model |
|---|---|---|
| Grass pea pulse production | ARIMA (0,1,8) | |
| Lentil pulse production | ARIMA (0,1,9) | |

Table 07: Diagnostic tools and model selection criteria for the best fitted models

| Area | Model | MAE | RMSE | AIC | BIC | MAPE | R2 | Adjusted R2 | χ2 (BL at 16 lag) | P-value |
|---|---|---|---|---|---|---|---|---|---|---|
| Grass pea | ARIMA (0,1,8) | Not satisfied | 25587.31 | Not satisfied | Not satisfied | 12.98 | 0.753 | 0.719 | 4.658 | 0.997 |
| Lentil | ARIMA (0,1,9) | 9164.17 | 19348.95 | Not satisfied | Not satisfied | 10.13 | 0.858 | Not satisfied | 6.072 | 0.987 |

Table 08: Grass pea pulse production forecasts

| Year | ARIMA(0,1,8) Forecast | ARIMA(0,1,8) LPL | ARIMA(0,1,8) UPL | Cubic Forecast | Cubic LPL | Cubic UPL |
|---|---|---|---|---|---|---|
| 2008-09 | 187528.44 | 131624.22 | 243432.66 | 107587.20 | 20950.88 | 194223.53 |
| 2009-10 | 177568.03 | 100207.56 | 254928.51 | 95687.67 | 4158.47 | 187216.86 |
| 2010-11 | 188395.43 | 102369.92 | 274420.95 | 82546.59 | - | 180158.20 |
| 2011-12 | 194126.86 | 99269.46 | 288984.26 | 68125.17 | - | 173091.63 |
| 2012-13 | 209170.27 | 107931.13 | 310409.41 | 52384.64 | - | 166037.01 |

LPL: Lower Predictive Value; UPL: Upper Predictive Value

Table 09: Lentil pulse production forecasts

| Year | ARIMA(0,1,9) Forecast | ARIMA(0,1,9) LPL | ARIMA(0,1,9) UPL | Cubic Forecast | Cubic LPL | Cubic UPL |
|---|---|---|---|---|---|---|
| 2008-09 | 173038.03 | 129326.18 | 216749.87 | 100795.91 | 25549.77 | 176042.04 |
| 2009-10 | 183657.31 | 120638.56 | 246676.06 | 89719.86 | 10224.13 | 169215.59 |
| 2010-11 | 187005.50 | 108896.99 | 265114.00 | 77385.16 | - | 162163.64 |
| 2011-12 | 191398.96 | 102670.93 | 280127.00 | 63748.72 | - | 154915.10 |
| 2012-13 | 195068.36 | 97946.97 | 292189.75 | 48767.46 | - | 147477.80 |

LPL: Lower Predictive Value; UPL: Upper Predictive Value