Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of UKEssays.com.
Weather forecasting, i.e. predicting how the weather will behave, remains a very challenging task despite rapid advances in science. The study of weather belongs to the field of meteorology. Forecasting begins with collecting data on the current state of the weather, such as rain, heat, wind and fog. The use of data mining techniques in this field has developed considerably over the last ten years.
Over the years, many researchers have successfully applied data mining tools in order to predict weather conditions and forecast climate change.
This literature review examines the use of data mining techniques in weather forecasting. It also discusses the properties a data mining technique needs in order to be used for weather prediction.
The main objective of this literature review is to provide a comprehensive comparative analysis of the various data mining techniques used in weather prediction, pointing out the advantages and disadvantages of each technique described.
Meteorology has devoted a great deal of research effort to weather prediction. The prediction of storms, clouds, wind, rain or hail is known as weather forecasting. One of the biggest challenges in weather forecasting is, as one would imagine, the unpredictability of the data sets, which can change frequently as the climate changes.
Many techniques have been tried and, of these, data mining is considered the most useful approach to weather forecasting. Data mining is the most feasible because of its capability to find hidden patterns or connections and, in turn, to support the authenticity of the data sets based on certain input factors.
Data mining is the process of sorting through very large data sets in order to identify connections and establish relationships within the data, which are in turn used to solve problems through data analysis. Data mining is also referred to as Knowledge Discovery in Databases (KDD).
Since the 1960s, data mining has been used as a branch of applied artificial intelligence and has grown over the years. It works with various types of database, such as spatial, relational and transactional.
In data mining, there are many methods to characterise data, such as clustering, data visualisation, classification and more.
The application of data mining can be classified broadly into two types:
- Descriptive data mining and analysis for analysing properties of existing data
- Predictive data mining which includes statistical analysis on data to make predictions
The difference between data mining techniques and statistical methods is that data mining allows you to search for interesting details and information without needing a prior hypothesis. Different kinds of pattern can be discovered by using different data mining techniques.
For exploratory analysis, these data mining techniques are usually more flexible, powerful and efficient than statistical techniques.
The following data mining techniques are the most frequently used:
- Artificial Neural Networks
- Nearest Neighbour method
Climate can be defined as the long-term effect of the sun’s radiation on the earth’s varied atmosphere and surface as it rotates. The day-to-day changes in a specific area make up the weather, but climate is the effect these changes have in the long term.
Thermometers, barometers, rain gauges and similar instruments are used to measure weather, but statistics is used in order to study climate. This kind of statistical analysis is now carried out quite efficiently by computers. The downside is that the climate cannot be summarised by a simple method, since a simple summary does not show the true impact. Gauging the true impact weather has on the climate requires analysis of yearly, monthly and daily patterns. The constant change in weather conditions is a very important factor in day-to-day living. It even influences the economy, since agriculture is a big contributor to the economy and is quite dependent on weather conditions.
For these reasons, there has been a need to mitigate the damage that an inability to predict the weather may do to the economy. This has led to an influx of data mining techniques to help predict weather conditions.
The prediction of weather has been researched extensively over the years, because changes in the climate have been found to affect the population directly. Weather prediction can be classified into two categories: the Numerical and the Empirical approach. The Empirical approach is the collection of present weather conditions through ground observation, observation from satellites, etc. Once these observations are made, they are forwarded to meteorological centres where they are analysed. Using computers, the analysed readings are then converted into multidimensional maps.
Using these maps, scientists make predictions about the changes that will occur in the different regions of the map over a certain period.
The Numerical approach refers to the use of mathematical equations over climatic variables to make a prediction.
In order to reduce the detrimental impact that incorrect weather forecasting has on the economy, some major techniques have been developed. The techniques this review will focus on are:
- Group Method Data Handling
- Decision Tree
- Neural Networks
- K-means clustering analysis 
These methods have been identified through a critical review of scholarly works. P. Buryan and A. Abraham (2007) described a self-organising modelling technique called enhanced Group Method Data Handling (e-GMDH) as an effective technique for forecasting weather.
Onwubolu et al. used this same model to predict pressure, daily temperature and monthly rainfall. Their results showed that the predicted temperature had an error of +/- 1.5, while the monthly rainfall predictions were similar to those of various other studies. The advantage of the GMDH method is its ability to process large amounts of data. Its disadvantage lies in its poor generalisation and over-fitting of data.
Similarly, Olaiya and A.B. Adeyemo (2012) investigated the use of Decision Tree algorithms and Neural Networks in the prediction of different weather conditions.
Meteorological data was collected and then classified using classifier algorithms, which were compared using standard measures of performance. The results obtained were found to be accurate when compared with actual weather data. The advantage of the Artificial Neural Network is that it can process different memory components, parameters and much more to find hidden patterns. Its disadvantage is that the technique is extremely complex and requires very accurate input data in order to produce an accurate weather forecast.
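To make the idea of standard measures of performance concrete, the short Python sketch below computes accuracy, precision and recall for a classifier that labels days as "rain" or "dry". The labels are invented for illustration and the function names are hypothetical; the studies reviewed do not state which implementation they used.

```python
from collections import Counter

def confusion_counts(actual, predicted, positive):
    """Count true/false positives and negatives for one class of interest."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts

def accuracy(actual, predicted):
    """Fraction of all predictions that match the observed label."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)

def precision(actual, predicted, positive):
    """Of the days predicted as `positive`, the fraction that really were."""
    c = confusion_counts(actual, predicted, positive)
    flagged = c["tp"] + c["fp"]
    return c["tp"] / flagged if flagged else 0.0

def recall(actual, predicted, positive):
    """Of the days that really were `positive`, the fraction that were found."""
    c = confusion_counts(actual, predicted, positive)
    real = c["tp"] + c["fn"]
    return c["tp"] / real if real else 0.0

# Invented example: eight observed days against a classifier's output.
actual    = ["rain", "rain", "dry", "dry", "rain", "dry",  "dry", "rain"]
predicted = ["rain", "rain", "dry", "dry", "rain", "rain", "dry", "rain"]

print(accuracy(actual, predicted))           # 0.875
print(precision(actual, predicted, "rain"))  # 0.8
print(recall(actual, predicted, "rain"))     # 1.0
```

Accuracy and precision answer different questions, so a classifier can score well on one and poorly on the other. This is why comparisons of techniques usually report more than one measure.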
As mentioned above, their research found that Decision Tree and Neural Network algorithms are efficient in discovering the relationships between weather parameters. The same research also found that prediction of future weather conditions was very much possible.
A decision tree can be described as a support tool for data mining. It is easy to understand and interpret. Petre went further, using Classification and Regression Trees (CART), which are based on this concept. The CART technique involves rules that are based on statistical information and on data sets describing the data generated during modelling.
Using this technique, accurate predictions of average temperature can be made months in advance.
The advantage of this technique is that it is based on simple predictive modelling and has interactive graphical properties. Its main disadvantage is its exhaustive search approach.
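The exhaustive search behind a CART-style tree can be illustrated with a single split. The sketch below is a minimal Python example, assuming one numeric feature (humidity) and Gini impurity as the splitting criterion; the readings, labels and function names are made up for illustration and are not taken from the studies reviewed.

```python
def gini(labels):
    """Gini impurity: 0 when all labels agree, higher when they are mixed."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(rows, labels, feature):
    """Try every midpoint between neighbouring values of one feature and
    return (threshold, weighted_impurity) for the purest split found."""
    values = sorted({r[feature] for r in rows})
    best = (None, float("inf"))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [y for r, y in zip(rows, labels) if r[feature] <= threshold]
        right = [y for r, y in zip(rows, labels) if r[feature] > threshold]
        # Weight each side's impurity by the share of rows it holds.
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best

# Invented humidity readings (%) with the weather observed on each day.
rows = [{"humidity": h} for h in (55, 60, 65, 80, 85, 90)]
labels = ["dry", "dry", "dry", "rain", "rain", "rain"]

threshold, impurity = best_split(rows, labels, "humidity")
print(threshold, impurity)  # 72.5 0.0
```

A full tree repeats this search for every feature at every node, which is where the cost of the exhaustive approach comes from. The resulting rule ("humidity above 72.5 means rain") also shows why such trees are easy to read.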
Forecasting weather conditions is often described as challenging because it requires the analysis of various multidimensional and nonlinear data sets. To combat this, Taksande and Mohod used a data mining approach called Frequent Growth Algorithm to perform weather prediction. They considered five general algorithms: neural network, k-nearest neighbour, support vector machine, classification, and classification and regression. This method of forecasting weather conditions was used for recognising and deleting unnecessary data. Research carried out using this method found it to be 90% accurate in predicting weather. The advantage of this method is that it is very scalable: it can be used for various probabilities and across different populations. The disadvantage is that it requires a lot of time and has a workload threshold.
Figure 1: A model of weather forecasting through Growth Algorithm. 
Further research on weather attributes by D. Chauhan, Shimla, and J. Thakur (2014) showed consistency with the results derived by A.B. Adeyemo and F. Olaiya. This research also showed that using K-means clustering and decision tree analysis to predict possible weather conditions is more accurate, as it solves the problem quickly by deleting unnecessary data.
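As an illustration of the clustering step, the following is a minimal sketch of K-means (Lloyd's algorithm) in plain Python. It assumes two features per day, temperature and humidity, with invented readings and hand-picked starting centroids; it is not the implementation used in the study above.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, centroids, iterations=10):
    """Plain K-means. `centroids` is the starting guess (a list of tuples);
    returns (final_centroids, cluster_index_for_each_point)."""
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(old)  # leave an empty cluster where it is
        if new_centroids == centroids:
            break  # nothing moved, so the clustering has converged
        centroids = new_centroids
    return centroids, assignments

# Invented (temperature in C, humidity in %) readings for six days.
points = [(30, 40), (32, 35), (31, 45), (18, 85), (17, 90), (19, 80)]
centroids, assignments = kmeans(points, [points[0], points[3]])

print(assignments)  # [0, 0, 0, 1, 1, 1]
print(centroids)    # [(31.0, 40.0), (18.0, 85.0)]
```

The two groups that emerge (hot and dry days, cool and humid days) could then be labelled and passed to a decision tree. In practice the features would normally be scaled first so that one measurement does not dominate the distance.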
From the scholarly works reviewed, an argument can be made that Neural Networks and Decision Tree algorithms are the most popular data mining techniques used in weather forecasting.
From the above, one can deduce that data mining is a powerful tool that enables the scrutiny and prediction of significant data from different databases.
From the articles reviewed, one can also deduce that good weather prediction using data mining techniques requires the following:
- The chosen data mining technique needs to be able to manage real-time data and to co-ordinate data from various databases. Inbuilt algorithms in the chosen technique would also aid accurate predictions.
- The data mining technique needs to be capable of handling text analysis in several formats, such as PDF files, Word files, etc. This aids in recognising redundancies in the available data.
- It must be interactive through the use of graphical analysis and image classification, and must be capable of classifying outcomes into groups; for weather forecasting, these could be rainy, cloudy or sunny.
- It is an added advantage if the technique is scalable, i.e. it can be used for both long-term and short-term predictions. It also needs to be cost-effective, independent, time-efficient and desirable.
To conclude, this literature review covered research by several researchers on accurate weather prediction using different data mining techniques.
Data mining techniques in weather forecasting can process very large amounts of multi-dimensional data and recognise hidden links and patterns within them.
Of the data mining techniques reviewed, based on their advantages and disadvantages, the decision tree was found to be the most efficient technique for weather forecasting.
This took into account the cost, the time efficiency and the accuracy of its predictions. Decision tree within the Artificial Neural Network has been shown to be 87% accurate in prediction and 98% precise, which is why it is the most suitable technique to use in comparison to the others.
The researchers who used this data mining technique recognised that the prediction models have room for further improvement. One improvement would be to use better techniques for classifying and characterising the data. Using a larger body of accurate data collected over a long period of time could be one way of doing this.