# Weather forecasting using support vector machines



Weather prediction is the practice of forecasting weather conditions for a future time at a particular location or area. Historically, various techniques were used to predict the weather, based on observation of environmental and meteorological elements such as clouds, sunlight and animal behaviour. These forecasts were rarely scientific or accurate. An early motivation for weather prediction was the desire of rulers of empires to plan wars of conquest and military campaigns.

Accurate weather forecasts are used in areas such as defence, manufacturing and agriculture. The main purpose behind the weather organizations in England, France, Germany and the United States was to provide warnings of forthcoming storms (Craft). Farmers can use weather information to identify optimal production levels for their crops.

The support vector machine (SVM) is a learning algorithm introduced by Boser, Guyon and Vapnik [Jason weston]. It is a well-motivated algorithm, developed from statistical learning theory. Support vector machines are based on the concept of decision planes that define decision boundaries [Support Vector Machines.]. An SVM constructs a multi-dimensional hyperplane that optimally separates a given data set into two categories. Using a kernel function, SVMs are an alternative training method for polynomial, radial basis function and multilayer perceptron classifiers, in which the weights of the network are found by solving a quadratic programming problem with linear constraints, rather than by solving a non-convex, unconstrained minimization problem as in standard neural network training (dterg). Support vector machines have been applied successfully in various fields such as bioinformatics, text mining and financial fraud detection.

In this essay, artificial neural network generally refers to the multilayer perceptron network, an implementation of a multilayer feed-forward network with three layers: an input layer, a hidden layer and an output layer. The multilayer perceptron is another well-known algorithm in the data mining field and is used in many areas, including finance. This technique is more useful for long-term forecasts, months and seasons ahead.

The key research interest of weather prediction using support vector machines is to analyse the accuracy of the forecast results and compare them with the results forecast using a multilayer perceptron network. Compared to traditional methods, both techniques produce highly accurate results. In general, short-term prediction is more accurate than long-term prediction, and the methodologies used for long-term weather prediction differ from those used for short-term prediction.

## Research Problem

Several data mining techniques are used for weather prediction. The support vector machine is one of the more recent. Its performance is very high in various applications, such as text mining and some financial applications. Applying support vector machines to weather prediction is expected to improve the accuracy of the results. To assess the performance of the support vector machine, weather prediction using a multilayer perceptron network will be used as a baseline for comparison.

## Research Question

In order to measure the performance of support vector machines for prediction, MATLAB will be used as a tool to produce predictions. This research addresses the following questions.

What is meant by support vector classification and support vector regression?

Why is the support vector machine better than the multilayer perceptron in theory?

How do the support vector machine and the multilayer perceptron compare in MATLAB simulations?

## Project Intention

The main intention of this research is to find a methodology for weather prediction using support vector machines and to analyse the advantages of the new approach.

## Project Goal

The aim of this research is to analyse the performance and accuracy of weather prediction using support vector machines. A MATLAB simulation will be developed with publicly available weather information to support this research and will be used for further analysis.

## LITERATURE REVIEW

Historically, weather prediction was based on observation and individual experience. The Sustainable Development Institute of the College of Menominee Nation examined traditional weather forecasting methods and found a positive correlation between traditional methods and modern scientific methods (M.Balisacan,). According to this research, meteorological weather forecasts in the Philippines are concerned with a wide area, while traditional methods are still in practice in local areas outside the cities.

Evangeline and Criselda categorised their findings about traditional methods as: onset of rainy seasons (medium range), upcoming rain (short range), typhoon or flood (adverse weather) and seasonal outlook. The exodus of ants from their caves, fruit of bangles and fruit of physic nut are some indicators of the onset of the rainy season, while dragonflies flying low, bamboo buds and dogs excreting waste in the middle of the road are some indications of upcoming rain (M.Balisacan,). Evangeline and Criselda state that a luminous ring around the moon, earthworms coming out of the soil en masse, long parallel bands of feathery clouds and visible sea water evaporation are among the best traditional indicators of a typhoon. In general, the reliability of traditional methods for weather prediction is reported to be high, but this finding is biased because the information was sought from knowledgeable people. In the modern world, these techniques are inappropriate because identifying a person with very good domain knowledge is challenging and there is no theoretical proof for any of the traditional methods.

Weather prediction began to employ advanced technologies and techniques during and immediately after World War II. British radar, installed in the late 1930s to monitor enemy aircraft, gave excellent returns from raindrops at certain wavelengths (5 to 10 centimetres) (Encyclopedia Britannica). Consequently, radar became a forecaster's tool. The next breakthrough was the introduction of satellites into the weather prediction process. An early weather satellite, Applications Technology Satellite 3 (ATS 3), was launched on Nov. 5, 1967 (National weather service). Subsequently, Synchronous Meteorological Satellites, Geostationary Operational Environmental Satellites (GOES) and Polar Orbiting Satellites were sent to space by NASA to provide weather forecasting services (National weather service). In addition to these advanced technologies, some advanced techniques were introduced. Numerical weather prediction is a well-known forecasting technique, and it employs some data mining techniques, such as the multilayer perceptron, to improve the prediction results. These modern approaches enhance the performance of weather prediction.

Jason weston states that the multilayer perceptron is one type of artificial neural network model, developed by constructing an input-output mapping without explicit derivation of the model equation [Multilayer perceptron]. The multilayer perceptron neural network is used in many areas, such as pattern classification, prediction and optimization. It has at least three layers: an input layer, one or more hidden layers and an output layer, and each layer consists of many identical interconnected neurons. In a multilayer perceptron network, learning occurs by adjusting the connection weights after each data item has been processed. The size of the error in the output (the difference between the expected outcome and the actual outcome) determines the size of the weight adjustment in a perceptron. Learning is carried out through the back propagation training algorithm [Support Vector Machines]. The multilayer perceptron is prone to local minima and model overfitting.
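The error-driven weight adjustment described above can be sketched for a single neuron. This is a minimal Python illustration of the delta rule, not the essay's MATLAB code and not full back propagation; the function name `update_weights`, the learning rate `lr` and the sample values are assumptions made for the example.

```python
def update_weights(weights, inputs, expected, lr=0.1):
    """Return weights adjusted in proportion to the output error."""
    # Actual output of a single linear neuron: weighted sum of its inputs.
    actual = sum(w * x for w, x in zip(weights, inputs))
    # Error is the expected outcome minus the actual outcome.
    error = expected - actual
    # Each weight moves in the direction that reduces the error.
    return [w + lr * error * x for w, x in zip(weights, inputs)]


# Hypothetical training item: inputs (1.0, 2.0) should produce 3.0.
weights = [0.0, 0.0]
for _ in range(50):
    weights = update_weights(weights, [1.0, 2.0], expected=3.0)
output = sum(w * x for w, x in zip(weights, [1.0, 2.0]))
```

With these assumed values, repeating the update on the same item drives the neuron's output towards the expected value.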

A support vector machine performs classification by mapping the data set into a feature space of higher dimensionality and constructing an optimal hyperplane there. The basic idea is that the original data is mapped to a new data set, on which the support vector machine then performs the separation. In Figure 1, the inputs (dots in input space) are mapped into feature space, where classification is straightforward.

Figure 1: Inputs mapped from input space into feature space

Support vectors can also be used for regression. Support vector regression is a nonlinear generalization of the Generalized Portrait algorithm developed in Russia in the 1960s [Debasish B, Srimanta P and Dipak C]. It uses structural risk minimization (Burges. C) rather than empirical risk minimization (Y.Radhika and M.Shashi). Support vector regression obtains better generalization from a limited number of learning patterns, and reducing the generalization error improves prediction performance. The application areas of support vector machines are expanding: they are used in 3D object recognition, financial problems, text mining and more. Radar systems use support vector machines for automatic target recognition [SVM Application List]. This research focuses on the applicability of support vector machines to weather prediction.

## Weather prediction

Weather forecasting is a complex business which predicts various types of weather states. Temperature, atmospheric visibility, pressure, type and amount of cloud, wind direction and wind speed are collected by national weather centres. The World Meteorological Organization is an agency of the United Nations which coordinates and regulates the weather observation process. The organization provides training to support accurate weather data and enables free and unrestricted exchange of data (WMO in brief). According to the regulations of the World Meteorological Organization, weather information is generally collected every three hours, and every half hour at special locations such as airports and harbours. The collected data is used for prediction.

Good knowledge of isobaric patterns, including anticyclones, fronts, depressions and high pressure ridges, is vital for better prediction (Forecasting ). In general, weather observers record minimum temperature, maximum temperature, average temperature, dew point temperature and so on. The minimum and maximum temperatures are used for prediction. The dew point temperature is the temperature at which the air cannot hold any more water vapour. It is always less than or equal to the air temperature. In general, if the dew point temperature is high, you can feel the thickness of the air when you breathe (Weather Questions). The difference between the dew point temperature and the air temperature affects the visibility of the air.

If the difference becomes small, dew, fog or clouds begin to form. The relative humidity is 100% when the air temperature is equal to the dew point temperature. Fog can be further classified into radiation fog, advection fog, steam fog and frontal fog. Atmospheric visibility varies depending on the type of fog. Clouds are categorised into low clouds (under 10,000 feet), middle clouds (10,000 to 20,000 feet), high clouds (over 20,000 feet) and towering clouds (up to 60,000 feet) (pjbartlett). Low clouds are further classified into cumulus, stratus, nimbostratus and stratocumulus clouds. Middle clouds can be altocumulus or altostratus clouds. Cirrus, cirrostratus, cirrocumulus and contrail clouds are high clouds, and swelling cumulus and cumulonimbus clouds are towering clouds.
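The relationship between air temperature, dew point and relative humidity can be illustrated numerically. The sketch below uses the Magnus approximation for saturation vapour pressure, which is an assumption introduced here (the essay does not name a formula); the coefficients 6.1094, 17.625 and 243.04 are one commonly used set, and the function names are hypothetical.

```python
import math


def saturation_vapour_pressure(temp_c):
    """Approximate saturation vapour pressure in hPa (Magnus formula)."""
    return 6.1094 * math.exp(17.625 * temp_c / (temp_c + 243.04))


def relative_humidity(air_temp_c, dew_point_c):
    """Relative humidity in percent from air temperature and dew point."""
    return (100.0 * saturation_vapour_pressure(dew_point_c)
            / saturation_vapour_pressure(air_temp_c))


rh_saturated = relative_humidity(20.0, 20.0)  # dew point equals air temperature
rh_drier = relative_humidity(20.0, 10.0)      # dew point 10 degrees below
```

Under this approximation, equal air and dew point temperatures give 100% relative humidity, and the value falls as the gap between them widens.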

Observing and recording weather information is a key responsibility of weather centres. These records are vital for weather forecasters. Various types of instruments are in use to measure and record the weather. The following table lists the instruments and their purposes in weather centres.

| Instrument | Purpose |
| --- | --- |
| Barometer | Measures the air pressure |
| Psychrometer | Measures relative humidity |
| Thermometer | Measures the air temperature |
| Anemometer | Measures wind speed |
| Wind vane | Indicates the direction the wind is blowing |
| Rain gauge | Measures the amount of liquid precipitation |
| Automatic Weather Station (AWS) | Records daily weather conditions |

Table: Weather instruments, summarised from http://www.pjbartlett.co.uk

## Support Vector Machine

The support vector machine is a supervised learning algorithm rooted in statistical learning theory. It is well suited to building decision support systems. In data mining, the support vector machine has become well known for its performance in various applications. Its applications fall into two areas: classification and regression. Support vector classification is a learning technique which attempts to separate different groups in a data set. Support vector regression is based on support vectors and is used for prediction by fitting a real-valued function. In support vector classification, the output can be either one or zero, but in support vector regression the output is a real number. Support vector machines are based on the structural risk minimization principle, a form of regularization that provides capacity control to prevent overfitting (Burbidge R and Buxton B).

Classification is a data mining task that belongs to supervised learning. Supervised learning uses two data sets, inputs and outputs, both of which are known. In unsupervised learning, by contrast, latent variables are used in the model. The main application of classification is to evaluate whether a new data point belongs to a particular group or not. Many decision-making systems use various classification techniques and algorithms to reach better decisions.

Figure 2: Optimal separating hyperplane

In general, many possible classifiers can separate two classes. In the above example (Figure 2), there are many linear classifiers which can separate the data. The margin between the closest points and the linear classifier is maximised for the classifier drawn in bold green. This classifier is the optimal separating hyperplane, and the nearest data points are called support vectors. The training set is:

$D = \{(x_1, y_1), \ldots, (x_k, y_k)\}, \quad x_i \in \mathbb{R}^n, \quad y_i \in \{1, -1\}$

For the above data, the support vector machine finds a hyperplane which classifies the data into two classes. Here, the $x_i$ are the input values and the $y_i$ are the labels or targets.

The hyperplane has the form $w^T x + b = 0$, where $w$ is the weight vector and $b$ is a constant.

The optimal hyperplane, or optimal separating hyperplane, is the linear classifier which separates the classes with the maximum margin.


In support vector classification, if the training data is linearly separable, the data can be classified by computing $w^T x$. The basic idea is that the SVM predicts '1' if and only if $w^T x > 0$, and '0' if and only if $w^T x < 0$. If $w^T x$ is much greater than zero, the SVM predicts '1' with more confidence, and vice versa.

Figure 2 shows that many hyperplanes can separate the classes. By minimising $\|w\|^2$, the optimal separating hyperplane can be found for the given training set. Here, $\|w\|$ is the Euclidean norm of the vector $w$. Minimising this norm maximises the distance, measured normal to the hyperplane, between the hyperplane and the nearest data points. This distance is called the margin.
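The decision rule and the link between the norm of the weight vector and the margin can be sketched in a few lines of Python. This is an illustrative sketch only: the weight vector `(1, 0)` and bias `-2` are hypothetical values, the function names are assumptions, and the margin formula $2 / \|w\|$ assumes the usual scaling in which the support vectors satisfy $|w^T x + b| = 1$.

```python
import math


def decision_value(w, x, b=0.0):
    """Signed score w.x + b; its sign gives the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w, x, b=0.0):
    """Predict 1 when the score is positive, otherwise 0."""
    # A score of exactly zero lies on the hyperplane; it is mapped to 0 here.
    return 1 if decision_value(w, x, b) > 0 else 0


def margin_width(w):
    """Width of the margin, 2 / ||w||, for a canonically scaled hyperplane."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


# Hypothetical hyperplane x1 = 2, i.e. w = (1, 0) and b = -2.
w, b = (1.0, 0.0), -2.0
label_right = predict(w, (3.0, 1.0), b)
label_left = predict(w, (1.0, 0.0), b)
width = margin_width(w)
```

A smaller $\|w\|$ gives a larger `margin_width`, which is why minimising the norm yields the maximum-margin hyperplane.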

Linear separation is not possible for all data. In this situation, the support vector machine introduces a feature space: a higher-dimensional space into which the original data set is mapped. A data set that is not linearly separable can become linearly separable once it is mapped into the feature space. Here, $\Phi$ is a function which maps the original data into the feature space. By choosing an appropriate function, linear separation becomes easy.

A support vector machine maps the data into a feature space in which the data is linearly separable even though the original data is inseparable. To map the data, the support vector machine needs a function, called the kernel function. The kernel function transforms the original data non-linearly into higher-dimensional data. Figure 4 depicts the nonlinear mapping of the original data into a feature space.
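To make the idea of a kernel concrete, the following Python sketch shows that a degree-2 polynomial kernel gives the same value as an ordinary dot product taken after an explicit mapping into a three-dimensional feature space. The mapping `phi`, the kernel choices and the `gamma` parameter are assumptions chosen for illustration; they are not taken from the essay's MATLAB implementation.

```python
import math


def dot(u, v):
    """Ordinary dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def phi(x):
    """Explicit feature map for 2-D input matching the degree-2 kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def poly_kernel(x, z):
    """Homogeneous polynomial kernel of degree 2: (x . z)^2."""
    return dot(x, z) ** 2


def rbf_kernel(x, z, gamma=1.0):
    """Radial basis function kernel: exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


x, z = (1.0, 2.0), (3.0, 4.0)
in_feature_space = dot(phi(x), phi(z))  # dot product after explicit mapping
via_kernel = poly_kernel(x, z)          # same value without mapping
```

The kernel lets the algorithm work in the feature space without ever computing `phi` explicitly, which matters for kernels such as the RBF kernel whose feature space is infinite-dimensional.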

Figure 4: Mapping data into feature space

## Support vector classification example

Consider the following vectors:

The first sample is positively labelled and the second one is negatively labelled. The two samples are linearly separable (see Figure 5).

First, the support vectors have to be identified. In Figure 5, three vectors accurately discriminate between the positively labelled and negatively labelled classes. These vectors are the support vectors, denoted $S_1$, $S_2$ and $S_3$.

By inspection, the data is linearly separable (see Figure 5), so a linear SVM is applicable. The size of the margin is generally set to 1. Each support vector is augmented with a 1, which acts as a bias input. With the identity function as the mapping $\Phi$, the numerical values of $\alpha_1$, $\alpha_2$ and $\alpha_3$ can be calculated by solving the following equations.

$\alpha_1 \Phi(S_1) \cdot \Phi(S_1) + \alpha_2 \Phi(S_2) \cdot \Phi(S_1) + \alpha_3 \Phi(S_3) \cdot \Phi(S_1) = -1$

$\alpha_1 \Phi(S_1) \cdot \Phi(S_2) + \alpha_2 \Phi(S_2) \cdot \Phi(S_2) + \alpha_3 \Phi(S_3) \cdot \Phi(S_2) = 1$

$\alpha_1 \Phi(S_1) \cdot \Phi(S_3) + \alpha_2 \Phi(S_2) \cdot \Phi(S_3) + \alpha_3 \Phi(S_3) \cdot \Phi(S_3) = 1$

Figure 5: Red squares are positive examples and blue diamonds are negative examples

Computing the dot products gives the following equations.

$2\alpha_1 + 5\alpha_2 + 5\alpha_3 = -1$

$5\alpha_1 + \alpha_2 + 13\alpha_3 = 1$

$\alpha_1 + 13\alpha_2 + \alpha_3 = 1$

Solving the above system of linear equations gives the numeric values of $\alpha_1$, $\alpha_2$ and $\alpha_3$, of which $\alpha_1$ is negative.

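To calculate the weight vector, the augmented support vectors are combined, each weighted by its coefficient: $\tilde{w} = \sum_i \alpha_i \Phi(S_i)$.

The vectors were augmented with a bias value of 1, so the bias has to be separated from the weight vector: the last component of $\tilde{w}$ is the bias $b$, and the remaining components form the weight vector $w$. The separating hyperplane is then $w^T x + b = 0$.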
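The procedure of this example (augment the support vectors, form the dot products, solve for the coefficients, then recover the weight vector and bias) can be traced end to end in Python. The support vectors `(1, 0)`, `(3, 1)` and `(3, -1)` and their labels are hypothetical values chosen for illustration, so the numbers differ from those in the equations above; the helper `solve` is a small exact Gauss-Jordan solver written for this sketch.

```python
from fractions import Fraction


def dot(u, v):
    """Ordinary dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def solve(matrix, rhs):
    """Solve a small linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    rows = [[Fraction(v) for v in row] + [Fraction(r)]
            for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a non-zero pivot and move it into place.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [a - factor * p for a, p in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]


# Hypothetical support vectors: one negative sample, two positive samples.
support = [(1, 0), (3, 1), (3, -1)]
labels = [-1, 1, 1]

# Augment each vector with a bias input of 1.
augmented = [s + (1,) for s in support]

# One equation per support vector: sum_j alpha_j (S_j . S_i) = y_i.
gram = [[dot(sj, si) for sj in augmented] for si in augmented]
alphas = solve(gram, labels)

# Weighted sum of augmented vectors; the last component is the bias.
w_aug = [sum(a * s[k] for a, s in zip(alphas, augmented)) for k in range(3)]
w, b = w_aug[:2], w_aug[2]
```

For these assumed vectors the result is `w = [1, 0]` and `b = -2`, so the separating hyperplane is the vertical line $x_1 = 2$, midway between the negative sample and the two positive samples.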

## Support vector regression

Regression is a well-known technique in predictive analytics. Various regression models are available, such as linear regression, logistic regression, multinomial logit and probit models, and time series models. Support vector regression is one type of regression model, a modified version of the support vector machine. Artificial neural networks have been used for stock market prediction over recent decades (Wang K, Kovacs G, Wozny and Fang M, 2006). Support vector regression has been replacing the artificial neural network because of the strong explanatory power of its results, its very good generalization and its resistance to overfitting. Support vector regression is now a well-known tool in financial time series forecasting and is expected to be a good tool for weather prediction.

The objective of support vector regression is to fit a function that is as flat as possible to the given data points. Minimising the norm of the weight vector ensures the flatness of the function (Farag A and Mohamed R M, 2004). Many researchers have found that the dual formulation is used in practice to solve many real-world problems. The dual formulation extends the support vector regression algorithm so that it can solve nonlinear problems.
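Support vector regression is usually formulated with an epsilon-insensitive loss, in which deviations smaller than a tolerance are ignored. The essay does not spell this loss out, so the Python sketch below is an assumption based on the standard formulation; the function name and sample temperatures are hypothetical, and the default `epsilon=0.1` simply mirrors the value reported in the sample output later in the essay.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Loss that is zero inside the epsilon tube and linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


# Hypothetical temperatures in degrees.
inside_tube = epsilon_insensitive_loss(20.0, 20.05)   # within tolerance
outside_tube = epsilon_insensitive_loss(20.0, 21.0)   # 1.0 off, 0.9 penalised
```

Points that fall inside the tube contribute nothing to the loss, so only the points on or outside it influence the fitted function.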

## Dual Problem and Quadratic programming

The dual problem is formulated by building a Lagrange function $L$ from the objective of the minimization problem together with its constraints. This introduces a dual set of variables, the Lagrange multipliers, which allow the constraints to be handled together with the minimization problem. The Lagrange multipliers are non-negative.

To obtain the saddle point condition, the partial derivatives of the function $L$ with respect to the primal variables are taken and set equal to 0. These conditions allow the primal variables to be eliminated.
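For reference, the standard epsilon-insensitive formulation of support vector regression is written out below. It is given here as an assumption about the formulation the essay follows: the symbols $C$, $\varepsilon$, the slack variables $\xi_i, \xi_i^*$ and the multipliers $\alpha_i, \alpha_i^*, \eta_i, \eta_i^*$ are the conventional ones and are not defined in the essay itself.

```latex
% Primal problem (standard epsilon-insensitive SVR)
\min_{w,\,b,\,\xi,\,\xi^*} \quad \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{k} (\xi_i + \xi_i^*)
\quad \text{subject to} \quad
\begin{cases}
y_i - \langle w, x_i \rangle - b \le \varepsilon + \xi_i \\
\langle w, x_i \rangle + b - y_i \le \varepsilon + \xi_i^* \\
\xi_i,\ \xi_i^* \ge 0
\end{cases}

% Lagrange function, with multipliers \alpha_i, \alpha_i^*, \eta_i, \eta_i^* \ge 0
L = \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{k} (\xi_i + \xi_i^*)
  - \sum_{i=1}^{k} (\eta_i \xi_i + \eta_i^* \xi_i^*)
  - \sum_{i=1}^{k} \alpha_i \left( \varepsilon + \xi_i - y_i + \langle w, x_i \rangle + b \right)
  - \sum_{i=1}^{k} \alpha_i^* \left( \varepsilon + \xi_i^* + y_i - \langle w, x_i \rangle - b \right)

% Saddle point conditions: derivatives with respect to the primal variables vanish
\frac{\partial L}{\partial b} = \sum_{i=1}^{k} (\alpha_i^* - \alpha_i) = 0, \qquad
\frac{\partial L}{\partial w} = w - \sum_{i=1}^{k} (\alpha_i - \alpha_i^*)\, x_i = 0, \qquad
\frac{\partial L}{\partial \xi_i} = C - \alpha_i - \eta_i = 0, \qquad
\frac{\partial L}{\partial \xi_i^*} = C - \alpha_i^* - \eta_i^* = 0
```

The second condition expresses $w$ as a weighted sum of the training inputs, which is what allows the inner products to be replaced by a kernel function in the nonlinear case.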

## Multi layer perceptron network

The multilayer perceptron is a neural network consisting of three types of layers (input layer, hidden layer and output layer) and is an advanced version of the standard linear perceptron. The back propagation algorithm is used to train the network. A multilayer perceptron consists of multiple layers of nodes which are fully connected in a directed graph. Multilayer perceptrons employ a nonlinear activation function. The application area of the multilayer perceptron is not limited to computational neuroscience and parallel distributed processing.

The key element of an artificial neural network is the perceptron. A single perceptron can perform a simple binary operation, but it cannot represent the XOR function (Seung S). The XOR function can be represented only by a multilayer network, because XOR can be built from the simple Boolean operations AND, OR and NOT. A multilayer network with n layers has n+1 perceptrons. Generally, multilayer perceptrons are representationally powerful.
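The construction of XOR from simpler Boolean units can be shown with a tiny two-layer network of threshold perceptrons. The weights and biases below are hand-picked, hypothetical values (no training is involved), and the step activation is an assumption chosen to keep the example easy to follow.

```python
def step(value):
    """Threshold activation: fire (1) only for positive input."""
    return 1 if value > 0 else 0


def perceptron(inputs, weights, bias):
    """Single perceptron: weighted sum of inputs plus bias, then threshold."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)


def xor_network(a, b):
    """XOR built from two hidden perceptrons and one output perceptron."""
    h_or = perceptron((a, b), (1, 1), -0.5)     # fires for a OR b
    h_nand = perceptron((a, b), (-1, -1), 1.5)  # fires for NOT (a AND b)
    return perceptron((h_or, h_nand), (1, 1), -1.5)  # AND of the two


truth_table = [xor_network(a, b) for a in (0, 1) for b in (0, 1)]
```

Each hidden unit draws one straight line through the input space; the output unit combines the two half-planes, which a single perceptron cannot do on its own.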

## RESEARCH METHOD

Experiment-based and opinion-based research methods are the most suitable for analysing the support vector machine for weather prediction. According to researchers in the data mining field, the support vector machine performs well for prediction and classification. There is less research on support vector machines for weather prediction than for other fields such as finance. Therefore, an empirical analysis was developed to compare the support vector machine with the multilayer perceptron (MLP).

Weather prediction is a process involving many activities. The traditional and modern approaches to weather prediction differ significantly. For example, satellite information and numerical modelling are vital parts of modern weather prediction. In this research, the history of weather prediction highlights the need for, and the significance of, weather prediction for every human being. That topic describes the traditional weather prediction model, including remarks about the traditional approaches. The remarks state the impact of research on new techniques for weather prediction.

Most weather services employ the multilayer perceptron technique for processing their results, and its performance is good in many cases. This research contains a detailed study of the multilayer perceptron network, since it is important to understand the concept behind it. Many researchers have worked on the multilayer perceptron network and its applications in weather prediction. The performance of the multilayer perceptron network is investigated on the basis of this research.

The support vector machine is a newer technique in the data mining field and has been successful in many areas. Understanding the support vector machine is vital for this research. Many resources about support vector machines are available; www.kernel-machines.org provides a great deal of information. After the topic of the multilayer perceptron network, this research investigates support vector machines. Support vector machines are successful in many areas such as biometric information, image processing, financial applications and text mining.

There is less research on the application of support vector machines to weather prediction than to finance, where a sufficient number of studies have been carried out. Apart from the theoretical research, a simulation program will be developed in MATLAB for the support vector machine and the multilayer perceptron network. The results will be analysed and used to support the thesis statement.

## Data Collection

Data collection is another challenging part of this research. The relevant data cannot be obtained from users through a questionnaire or other means, and satellite information is not available for security reasons. Therefore, primary data will not be used for this research.

Alternatively, some free databases are available, so secondary information can be used for this research. The difficulty is to select the best one available for free, because the quality of secondary information is difficult to assess. In this research, US-based and UK-based databases are preferred for the simulation. For example, the NOAA Satellite and Information Service is a US-based database. Selecting data from a trusted organization supports the quality of the historic weather information, and accurate information supports a reliable conclusion from the simulation.

After careful research on public weather databases, the following databases were selected for this research.

Missouri Historical Agricultural Weather Database: http://agebb.missouri.edu/weather/history/index.asp

Weather base : http://www.weatherbase.com

World Climate : http://www.worldclimate.com/cgi-bin/grid.pl?gr=N51W000

National Weather Service, US: http://www.srh.noaa.gov/bmx/

Met Office, UK: http://www.metoffice.gov.uk/weather/uk/nw/nw_forecast_weather.html

The Missouri Historical Agricultural Weather Database provides daily and hourly information. Various types of weather states, such as atmospheric temperature and precipitation, are available in this database. Its disadvantage is that the information is only available for Missouri. Weatherbase contains weather records for more than 16,439 cities worldwide [Weather Base]. World Climate is very similar to Weatherbase. Both databases cover a wide area, but their main disadvantage is that they provide only daily averages. The National Weather Service, US and the Met Office, UK are very good resources with varied information, but the sheer volume of information makes them complex and ambiguous to use. The Missouri Historical Agricultural Weather Database is selected for the simulation because the geographical region is not a main factor in this research, whereas hourly information is useful.

## Approach for atmospheric temperature prediction

Various methods are used to predict atmospheric temperature. The advection equation for temperature is one method; it states that the rate of change of temperature with time depends on the advection of temperature in the east-west and north-south directions by the wind [How do we predict Weather and Climate?]. Time series analysis is another simple mechanism for prediction. Support vector regression has become a more popular technique in recent years. In this research, prediction based on support vector regression will be compared with the multilayer perceptron algorithm.
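The advection statement above corresponds to the following standard horizontal temperature advection equation. The symbols are assumptions introduced here, since the essay gives the equation only in words: $T$ is temperature, $t$ is time, $x$ and $y$ are the eastward and northward coordinates, and $u$ and $v$ are the eastward and northward wind components.

```latex
\frac{\partial T}{\partial t} = -u \frac{\partial T}{\partial x} - v \frac{\partial T}{\partial y}
```

For example, a wind blowing from a warmer region towards the observer ($u > 0$ with temperature decreasing eastward, so $\partial T / \partial x < 0$) gives a positive right-hand side, meaning the local temperature rises.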

Steve Gunn, a senior lecturer at the University of Southampton, developed a MATLAB function for support vector regression. The implementation is free for academic purposes and will be used in this project. One month of daily temperatures from 2010 is extracted and used as the training data set (or training input). The daily temperatures of the following 10 days are used as test data to validate the regression function. Using the regression function developed by the support vector machine, the daily temperature of the following five days will be predicted and the accuracy of the predicted values analysed.
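One way to analyse the accuracy of predicted temperatures against the held-out test days is with simple error measures. The Python sketch below computes the mean absolute error and root mean squared error; the choice of these two metrics, the function names and the five sample values are assumptions for illustration and are not results from the essay's simulation.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the average squared difference."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical observed and predicted daily temperatures for five test days.
actual = [20.0, 22.0, 19.0, 21.0, 18.0]
predicted = [21.0, 21.0, 19.0, 23.0, 18.0]

mae = mean_absolute_error(actual, predicted)
rmse = root_mean_squared_error(actual, predicted)
```

The same two functions can be applied to the predictions of both the support vector regression and the multilayer perceptron, giving a like-for-like comparison on identical test days.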

## RESULTS AND FINDINGS

Sample Result for Support vector regression

The following sample data was used to develop the support vector regression.

Training Set X = [1; 2; 3; 4; 5; 6; 7; 8; 9; 10; 11; 12; 13; 14; 15; 16; 17; 18; 19; 20; 21; 22; 23; 24; 25; 26; 27; 28; 29; 30];

Training Set Y = [15.6; 23.3; 28.3; 23.2; 29.0; 31.2; 31.4; 17.9; 20.3; 23.7; 9.5; 5.6; 5.5; 17.5; 20.8; 16.2; 32.5; 19.6; 10.6; 10.9; 9.5; 21.3; 14.0; 19.1; 23.1; 15.5; 27.0; 30.8; 29.9; 19.0];

Figure: Average temperature vs. time graph in MATLAB

Command line output:

    ----------------------------------------------------
    ------ ONLINE SVR ------------------------------
    ----------------------------------------------------
    C: 10
    Epsilon: 0.1
    KernelType: RBF
    KernelParam: 30
    Number of Samples Trained: 30
    > Support Samples: 21
    > Error Samples: 8
    > Remaining Samples: 1
    ----------------------------------------------------
