Stock Trading using Computational Intelligence
Abstract
Computational intelligence has been widely used in recent years in many areas, such as speech recognition, image analysis, adaptive control and time series prediction. This research explores the usefulness of neural networks and support vector machines in financial markets. Two popular stock market indexes are studied: the Hong Kong Hang Seng Index and the Dow Jones Transportation Index. The performance of the neural network and the support vector machine is evaluated in two dimensions: forecasting error and trading profit.
A popular technical indicator, the percentage price oscillator (PPO), is selected as both training input and output. The predictive models use the PPO of the previous 8 days to forecast the PPO of the future 5 days. Empirical results on the Hong Kong Hang Seng Index show that the trading system based on a multilayer perceptron optimized with a genetic algorithm (MLPGA) obtains 6.71 times the original capital from 1997-1-29 to 2007-3-8, a total of 2500 trading days, while the trading system based on support vector regression optimized by a genetic algorithm (SVRGA) generates 5.705 times the original capital over the same period. In contrast, a conventional non-predictive trading system produces only 2.064 times the starting equity, and the "Buy and Hold" strategy returns 1.605 times. A recently published fuzzy trading system provides 5.781 dollars of final equity for 1 dollar of initial investment.
The two intelligent trading systems are evaluated further. A back test using the same parameters and the same assumptions on the Dow Jones Transportation Index further supports the robustness of the proposed trading systems: the MLPGA trading system provides 4.87 times the initial capital and the SVRGA trading system obtains 5.168 times. Both intelligent trading systems again outperform the conventional trading system, which generates 2.805 dollars for 1 dollar of investment.
Acknowledgements
I am very grateful to my final year project supervisor, Associate Professor Wang Lipo, and would like to take this opportunity to thank him for his patient and insightful guidance throughout the project. Professor Wang always offered detailed and valuable explanations and suggestions in our discussions, and taught me a great deal about doing research. Not only did Professor Wang enlighten me academically, he also arranged meetings with industry professionals with whom I could discuss this project. Again, I would like to express my sincere appreciation to Professor Wang.
Zhu Ming
April, 2010.
List of Figures
Fig 2‑1 A multi layer neural network with L layers
Fig 2‑2 Maximum-margin hyperplane and margins for an SVM trained with samples from two classes
Fig 2‑3 Genetic Algorithm flowchart, with a maximum of 100 generations
Fig 2‑4 One point crossover
Fig 2‑5 Roulette-wheel selection
Fig 3‑1 Dow Jones Industrial Average price, with EMA plotted
Fig 3‑2 Using single EMA
Fig 3‑3 Using two EMA to make decision
Fig 3‑4 A predictive trading system
Fig 3‑5 Structure of GA optimized MLP
Fig 4‑1 Training performance of MLP
Fig 4‑2 MSE for out of sample data
Fig 4‑3 Linear regression for trained neural network
Fig 4‑4 Linear regression for out of sample data
Fig 4‑5 Equity curve for intelligent and conventional trading systems
Fig 4‑6 Trading signal of NN+GA trading system
Fig 4‑7 Trading signal of conventional trading system
Fig 4‑8 MSE for GA+SVR model
Fig 4‑9 Equity curve for GA+SVR trading system and conventional trading system
Fig 4‑10 Comparison of 4 trading systems
Fig 4‑11 Equity curves of different trading system on DJT
List of Tables
Table 3‑1 Settings for GA and NN
Table 3‑2 Settings for GA and SVR
Table 4‑1 Data distributions for training and testing neural network
Table 4‑2 Total return for different prediction time horizon
Table 4‑3 Trading performance comparison
Chapter 1
Introduction
1.1 Background
Analyzing the stock market is one of the most important and fascinating issues in finance, as it is closely related to the profitability of investment. There are two main types of analysis in financial markets: technical analysis and fundamental analysis. Fundamental analysis is based on the premise that a stock, bond, fund, commodity, or a market as a whole has an underlying intrinsic value. By analyzing fundamental characteristics, such as assets, liabilities, income, supply or demand, this value can be determined [11]. Fundamental analysts normally use a trading strategy called "Buy and Hold", since they tend to buy the stocks of undervalued companies or of companies with great growth potential. They believe that the share price will eventually rise because the company is growing, and hence they hold the stocks for a relatively long time. Technical analysis, on the other hand, holds that the market price reflects all relevant information, such as news and events, so price is the only information that needs to be analyzed. From this perspective, history repeats itself, which makes it possible to trade for profit. Therefore, technical analysis employs only historical data to build models for future investment.
Over the past decade, computational intelligence, for example neural networks (NN) [10], has been widely used in stock trading. It gives investors the opportunity to combine information gathered from fundamental analysis and technical analysis when making trading decisions. Two main types of input data have been used. One type, prices or technical indicators, belongs to technical analysis. The other type includes macroeconomic indices and information related to a specific company, such as the interest rate and the P/E ratio.
Many pioneering scholars have focused on minimizing the mean square error (MSE) in price direction prediction as well as on achieving paper profits in trading financial markets. Patel et al. [10] use a hierarchical coevolutionary fuzzy system (HiCEFS) to predict a technical indicator and hence build a prudent trading strategy. Testing this model with real-world data from the Hong Kong Hang Seng Index and NOL stock on the Singapore Exchange, they achieved a final return of 14.251 times the original capital on NOL stock in 2329 trading days and 5.781 times the original capital on the Hang Seng Index in 2461 trading days.
1.2 Objectives and Scope
The objective of this project is to explore and examine the usefulness of computational intelligence in stock trading on the Hong Kong Hang Seng Index and the Dow Jones Transportation Index. The intelligent trading system, built in MATLAB, analyzes historical data and generates buy or sell signals for any given time series.
The main objectives are as follows:
1. Apply the intelligent trading system to the Hong Kong Hang Seng Index to generate buy and sell signals. The intelligent trading system is constructed with either a neural network optimized by a genetic algorithm or a support vector machine optimized by a genetic algorithm.
2. Examine the entry and exit signals generated by the intelligent trading system and by a non-intelligent trading system, and compare their empirical trading profits.
3. Compare the trading performance of the intelligent trading system with other researchers' work, using the same data and trading rules.
4. Further validate the trading system's performance by applying the proposed system to the Dow Jones Transportation Average Index, and compare the trading profits with those of the non-intelligent trading system.
1.3 Organisation
This report is organized into 5 chapters:
Chapter 1 provides background on financial markets and on other researchers' accomplishments in applying computational intelligence to financial markets. It also states the project objectives and scope in detail.
Chapter 2 introduces the background knowledge for this project: neural networks, support vector machines and genetic algorithms.
Chapter 3 describes the proposed methodology of this project. It introduces the technical indicators, the inputs to the intelligent trading system and the architecture of the trading system. In addition, it provides the settings for each intelligent prediction model, as well as the data preparation for these models.
Chapter 4 presents the empirical results of trading the Hong Kong Hang Seng Index and the Dow Jones Transportation Average Index. Furthermore, it compares the results with a non-intelligent trading system as well as with the "buy and hold" strategy.
Chapter 5 summarizes the project and outlines future work.
Chapter 2
Literature Review
2.1 Artificial Neural Networks
An artificial neural network (ANN) is inspired by the structure and functions of biological neural networks, and is expressed using mathematical models. It consists of an interconnected group of artificial neurons and processes information using a connectionist approach to computation. In most cases an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network during the learning phase. Modern neural networks are nonlinear statistical data modeling tools. They are usually used to model complex relationships between inputs and outputs or to find patterns in data. Neural networks are considered highly parallel systems that can learn from past data and apply the learned knowledge to new data.
2.1.1 Multilayer Perceptron Neural Networks
There are various ANN structures; the multilayer perceptron (MLP) is one of them. It is a feedforward network with a layered structure. Each layer consists of units which receive their input from units in the layer directly below and send their output to units in the layer directly above. There are no connections within a layer (Fig 2‑1). The inputs are fed into the first layer and each input is associated with a weight. The outputs of the first layer serve as the inputs of the second layer, and so on until the final output is calculated. In standard notation, the output of layer $l$ is described as:
$$\mathbf{y}^{(l)} = f^{(l)}\left(\mathbf{W}^{(l)}\,\mathbf{y}^{(l-1)} + \mathbf{b}^{(l)}\right), \qquad l = 1, \dots, L,$$
in which $\mathbf{y}^{(0)}$ is the network input, $\mathbf{W}^{(l)}$ and $\mathbf{b}^{(l)}$ are the weights and biases of layer $l$, and $f^{(l)}$ is its activation function.
Information in an MLP network moves only in the forward direction, from the input nodes through the hidden layers to the output layer. There are no loops in an MLP network.
Fig 2‑1 A multi layer neural network with L layers
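The layer-by-layer computation of an MLP can be sketched in a few lines of Python. This is only an illustration: the function names, the example weights and the choice of a tanh hidden layer followed by a linear output layer are assumptions made here, not the network used in this project.

```python
import math


def layer_forward(inputs, weights, biases, activation):
    """Compute one layer: activation(W x + b), with W given as a list of rows."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def mlp_forward(x, layers):
    """Propagate x through a list of (weights, biases, activation) layers."""
    signal = x
    for weights, biases, activation in layers:
        signal = layer_forward(signal, weights, biases, activation)
    return signal


def identity(value):
    return value


# Hypothetical network: 2 inputs, 2 tanh hidden units, 1 linear output unit.
example_layers = [
    ([[0.5, -0.5], [1.0, 1.0]], [0.0, 0.0], math.tanh),
    ([[1.0, 2.0]], [0.1], identity),
]
output = mlp_forward([1.0, 1.0], example_layers)
```

Each layer only reads the output of the layer before it, which is the feedforward property described above.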
2.1.2 Back Propagation
Back propagation is a common method of teaching artificial neural networks how to perform a given task. It was first described by Arthur E. Bryson and Yu-Chi Ho in 1969 [14]. Back propagation is a supervised learning method and an implementation of the delta rule. It requires a teacher that knows, or can calculate, the desired output for any given input. In other words, the desired output must be provided in order to calculate the errors. The errors propagate backwards from the output nodes to the inner nodes and from the inner nodes to the input nodes. Hence back propagation is a method for calculating the gradient of the network error with respect to the network's modifiable weights, whether in the input layer or in a hidden layer.
In short, the back propagation technique can be summarized as follows:
1. Present a training sample to the neural network.
2. Compare the network's output to the desired output from that sample. Calculate the error in each output neuron.
3. For each neuron, calculate what the output should have been, and a scaling factor, how much lower or higher the output must be adjusted to match the desired output. This is the local error.
4. Adjust the weights of each neuron to lower the local error.
5. Assign "blame" for the local error to neurons at the previous level, giving greater responsibility to neurons connected by stronger weights.
6. Repeat from step 3 on the neurons at the previous level, using each one's "blame" as its error.
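The steps above can be sketched for a very small network. The example below is a hedged illustration, not the training code of this project: it assumes one input, a tanh hidden layer, a linear output unit, plain per-sample gradient descent on the squared error, and hypothetical toy data.

```python
import math
import random


def train_tiny_mlp(samples, hidden=3, lr=0.1, epochs=200, seed=0):
    """Train a 1-input, tanh-hidden, linear-output MLP by per-sample backpropagation.

    Returns the mean squared error before training and after every epoch.
    """
    rng = random.Random(seed)
    w1 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]  # input -> hidden weights
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]  # hidden -> output weights
    b2 = 0.0

    def forward(x):
        h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
        return h, sum(w2[j] * h[j] for j in range(hidden)) + b2

    def mean_squared_error():
        return sum((forward(x)[1] - t) ** 2 for x, t in samples) / len(samples)

    history = [mean_squared_error()]
    for _ in range(epochs):
        for x, target in samples:
            h, y = forward(x)            # present the sample and compute the output
            delta_out = y - target       # error at the linear output unit
            for j in range(hidden):
                # Error assigned to hidden unit j: passed back through w2[j]
                # and scaled by the tanh derivative 1 - h^2.
                delta_hidden = delta_out * w2[j] * (1.0 - h[j] ** 2)
                # Move each weight against its error gradient.
                w2[j] -= lr * delta_out * h[j]
                w1[j] -= lr * delta_hidden * x
                b1[j] -= lr * delta_hidden
            b2 -= lr * delta_out
        history.append(mean_squared_error())
    return history


# Hypothetical toy data: learn y = 0.5 * x on five points.
samples = [(-1.0, -0.5), (-0.5, -0.25), (0.0, 0.0), (0.5, 0.25), (1.0, 0.5)]
history = train_tiny_mlp(samples)
```

The hidden-layer error is computed before `w2[j]` is updated, so each sample's update uses the weights that produced its output.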
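2.1.3 Levenberg-Marquardt Algorithm
The Levenberg-Marquardt algorithm is used for training the neural network, that is, for modifying the weights of each layer. It interpolates between the Gauss-Newton algorithm and the method of gradient descent. It is more robust than the Gauss-Newton algorithm, which means that in many cases it finds a solution even if it starts very far from the final minimum. On the other hand, for well-behaved functions and reasonable starting parameters, it tends to be a bit slower than the Gauss-Newton algorithm. In its standard form, the Levenberg-Marquardt update can be expressed as [15]
$$\mathbf{w}_{k+1} = \mathbf{w}_k - \left(\mathbf{J}^{T}\mathbf{J} + \mu\,\mathbf{I}\right)^{-1}\mathbf{J}^{T}\mathbf{e},$$
where $\mathbf{w}_k$ is the weight vector at iteration $k$, $\mathbf{e}$ is the vector of network errors, $\mathbf{J}$ is the Jacobian of the errors with respect to the weights, $\mu$ is the damping factor and $\mathbf{I}$ is the identity matrix.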
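The interpolation between Gauss-Newton and gradient descent can be illustrated on a problem much smaller than a neural network. The sketch below fits the hypothetical model y = a * exp(b * x) to noise-free data. The damping schedule (divide by 10 after a successful step, multiply by 10 otherwise) is a common textbook choice assumed here, and the code uses the Jacobian of the model with residuals defined as target minus model output.

```python
import math


def levenberg_marquardt_exp_fit(xs, ys, a=1.0, b=0.0, damping=1e-3, iterations=100):
    """Fit y = a * exp(b * x) by Levenberg-Marquardt with a simple damping schedule."""

    def sse(a_, b_):
        return sum((y - a_ * math.exp(b_ * x)) ** 2 for x, y in zip(xs, ys))

    current = sse(a, b)
    for _ in range(iterations):
        # Jacobian columns of the model with respect to a and b, and residuals.
        ja = [math.exp(b * x) for x in xs]
        jb = [a * x * math.exp(b * x) for x in xs]
        r = [y - a * math.exp(b * x) for x, y in zip(xs, ys)]
        # Solve (J^T J + damping * I) step = J^T r for the 2x2 system.
        m11 = sum(v * v for v in ja) + damping
        m12 = sum(u * v for u, v in zip(ja, jb))
        m22 = sum(v * v for v in jb) + damping
        g1 = sum(u * v for u, v in zip(ja, r))
        g2 = sum(u * v for u, v in zip(jb, r))
        det = m11 * m22 - m12 * m12
        step_a = (m22 * g1 - m12 * g2) / det
        step_b = (m11 * g2 - m12 * g1) / det
        trial = sse(a + step_a, b + step_b)
        if trial < current:
            a, b, current = a + step_a, b + step_b, trial
            damping /= 10.0  # successful step: behave more like Gauss-Newton
        else:
            damping *= 10.0  # failed step: behave more like gradient descent
    return a, b, current


# Hypothetical data generated from a = 2, b = 0.5.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
fit_a, fit_b, fit_sse = levenberg_marquardt_exp_fit(xs, ys)
```

A large damping value shrinks the step towards a short gradient-descent step, while a small value recovers the Gauss-Newton step, which is why the method tolerates poor starting points.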
2.2 Support Vector Machine
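Support Vector Machine (SVM) is a relatively new learning method developed from statistical learning theory. Compared with traditional statistics, statistical learning theory does not assume infinite samples, but rather focuses on estimation from small samples. The basic idea of the support vector machine is to find a hyperplane which separates the d-dimensional data perfectly into its two classes. SVM is a supervised learning method which maps the input space to the output space (Fig 2‑2).
Given a training set $(\mathbf{x}_i, y_i)$, $i = 1, \dots, l$, where $\mathbf{x}_i \in \mathbb{R}^{n}$ and $y_i \in \{1, -1\}$, the support vector machine requires the solution of the following minimization problem, written here in its standard soft-margin form [17]:
$$\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \; \frac{1}{2}\mathbf{w}^{T}\mathbf{w} + C\sum_{i=1}^{l}\xi_i \quad \text{subject to} \quad y_i\left(\mathbf{w}^{T}\phi(\mathbf{x}_i) + b\right) \ge 1 - \xi_i, \quad \xi_i \ge 0,$$
where $\phi$ maps the inputs into a higher-dimensional feature space, $\xi_i$ are slack variables and $C > 0$ is the penalty parameter of the error term.
Fig 2‑2 Maximum-margin hyperplane and margins for an SVM trained with samples from two classes.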
2.2.1 Support Vector Regression
Support vector regression (SVR), the use of the support vector machine for regression, was proposed in 1996 by Vladimir Vapnik, Harris Drucker, Chris Burges, Linda Kaufman and Alex Smola [18]. The model produced by a support vector machine for classification depends only on a subset of the training data, called the support vectors, because the cost function for building the model does not care about training points that lie beyond the margin. Similarly, the model produced by SVR depends only on a subset of the training data, because the cost function for building the model ignores any training data close to the model prediction.
Given a training set $(\mathbf{x}_i, y_i)$, $i = 1, \dots, l$, the target of SVR is to find a linear function that minimizes the discrepancy between the desired output and the predicted output. The optimal regression function has the same form as in SVM.
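Several kernel functions are commonly used in SVR, including the linear, polynomial, radial basis function and sigmoid kernels. Their respective formulas are as follows [23]:
- Linear: $K(\mathbf{x}_i, \mathbf{x}_j) = \mathbf{x}_i^{T}\mathbf{x}_j$
- Polynomial: $K(\mathbf{x}_i, \mathbf{x}_j) = \left(\gamma\,\mathbf{x}_i^{T}\mathbf{x}_j + r\right)^{d}$, $\gamma > 0$
- Radial Basis Function (RBF): $K(\mathbf{x}_i, \mathbf{x}_j) = \exp\left(-\gamma\,\lVert \mathbf{x}_i - \mathbf{x}_j \rVert^{2}\right)$, $\gamma > 0$
- Sigmoid: $K(\mathbf{x}_i, \mathbf{x}_j) = \tanh\left(\gamma\,\mathbf{x}_i^{T}\mathbf{x}_j + r\right)$
Here, $\gamma$, $r$ and $d$ are kernel parameters.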
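The four kernels can be written directly as small Python functions. This is a sketch using the standard textbook definitions; the function names and the parameter names `gamma`, `r` and `d` with their default values are assumptions made for illustration.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    return dot(u, v)


def polynomial_kernel(u, v, gamma=1.0, r=0.0, d=3):
    return (gamma * dot(u, v) + r) ** d


def rbf_kernel(u, v, gamma=1.0):
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(u, v, gamma=1.0, r=0.0):
    return math.tanh(gamma * dot(u, v) + r)
```

The RBF kernel, which is the one used later for the SVR model, depends only on the distance between the two points and equals 1 when they coincide.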
SVM and SVR have some advantages over neural networks. For instance, they are less prone to overfitting the training data, since the model depends only on the support vectors. However, the parameters of SVR still affect the final results, even though SVR has far fewer parameters than an NN. The main parameters in SVR are the width of the error-insensitive tube around the regression function [19] and the balance between training errors and model complexity.
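The roles of these two parameters can be made concrete with a short sketch. The names `epsilon` (half-width of the tube) and `c` (trade-off constant) follow common SVR notation and are assumed here; the functions illustrate the standard epsilon-insensitive cost and are not the solver used in this project.

```python
def epsilon_insensitive_loss(target, prediction, epsilon):
    """Zero inside the tube of half-width epsilon, linear outside it."""
    return max(0.0, abs(target - prediction) - epsilon)


def svr_objective(weights, targets, predictions, epsilon, c):
    """Regularised SVR cost: 0.5 * ||w||^2 + c * (sum of tube violations)."""
    complexity = 0.5 * sum(w * w for w in weights)
    violations = sum(
        epsilon_insensitive_loss(t, p, epsilon) for t, p in zip(targets, predictions)
    )
    return complexity + c * violations
```

Predictions that fall inside the tube contribute nothing to the cost, which is why only the points on or outside the tube end up as support vectors.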
2.3 Genetic Algorithm
Genetic algorithm (GA) is a search technique for finding exact or approximate solutions to optimization and search problems. It is considered a global search heuristic. GA uses techniques inspired by evolutionary biology such as inheritance, mutation, selection, and crossover.
A typical genetic algorithm requires:
1. a genetic representation of the solution domain
2. a fitness function to evaluate the solution domain
In GA, abstract representations of candidate solutions, called chromosomes, evolve toward better solutions of an optimization problem. Solutions are represented with some encoding method, such as binary encoding. A fitness function is a particular type of objective function that quantifies the optimality of a solution so that a particular chromosome may be ranked against all the other chromosomes. The evolution usually starts from a population of randomly generated individuals. In each generation, the fitness of every individual in the population is evaluated. Based on their fitness, the fittest individuals are selected and, through reproduction, crossover or mutation, form a new population. The new population is then used in the next iteration of the algorithm. Commonly, the algorithm terminates when either a maximum number of generations has been produced or a satisfactory fitness level has been reached for the population. A common genetic algorithm is shown in Fig 2‑3.
Fig 2‑3 Genetic Algorithm flowchart, with a maximum of 100 generations
2.3.1 Operators of Genetic Algorithm
When generating the next generation of solutions, GA uses genetic operators: crossover and/or mutation. For each new solution to be produced, a pair of "parent" solutions is selected for breeding from the pool selected previously. By producing a "child" solution through crossover and mutation, a new solution is created which typically shares many of the characteristics of its "parents".
Crossover selects genes from the parent chromosomes and creates new offspring. One common way is to use a single crossover point on both parents' strings: all data beyond that point is swapped between the two parents. One-point crossover is illustrated in Fig 2‑4.
Fig 2‑4 One point crossover
There are other ways to perform crossover; for example, two crossover points could be chosen. Crossover can be rather complicated and depends heavily on the encoding of the chromosome. In some cases, GA performance can be enhanced by trying out other crossover techniques.
After crossover is performed, mutation takes place. The purpose of mutation in GA is to preserve and introduce diversity. Mutation helps the search avoid local minima and keeps the chromosomes in the population from becoming too similar to each other, so that the evolution can continue. Mutation changes the new offspring randomly. For binary encoding, a common way is to switch a few randomly chosen bits from 1 to 0 or from 0 to 1.
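Both operators are short to write down for binary-encoded chromosomes. The sketch below is an illustration with hypothetical function names; it assumes chromosomes are equal-length lists of 0 and 1 and that mutation flips each bit independently with a fixed probability.

```python
import random


def one_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length bit lists at a random cut point."""
    point = rng.randrange(1, len(parent_a))
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


def bit_flip_mutation(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in chromosome]


rng = random.Random(42)
child_a, child_b = one_point_crossover([0] * 8, [1] * 8, rng)
mutant = bit_flip_mutation(child_a, 0.02, rng)
```

With an all-zero and an all-one parent, each child starts with one parent's bits and ends with the other's, which makes the cut point easy to see.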
2.3.2 Selection in Genetic Algorithm
Selection chooses individual genomes from a population for breeding the next generation. There are various selection algorithms, such as roulette-wheel selection, rank selection and tournament selection.
Roulette-wheel selection chooses parents according to their fitness. A chromosome with higher fitness has a higher chance of being selected. The fitness level is used to associate a probability of selection with each individual chromosome. This algorithm can be imagined as a roulette wheel in a casino, where a larger slice has a higher probability of being chosen, as shown in Fig 2‑5. If $f_i$ is the fitness of individual $i$ in the population, its probability of being selected is $p_i = f_i / \sum_{j=1}^{N} f_j$, where $N$ is the number of individuals in the population.
Fig 2‑5 Roulette-wheel selection
Tournament selection involves running several "tournaments" among a few individuals chosen at random from the population. The winner of each tournament (the one with the best fitness) is selected. Selection pressure is easily adjusted by changing the tournament size. If the tournament size is larger, weak individuals have a smaller chance to be selected.
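The two selection schemes can be sketched as follows. The function names are hypothetical, and the code assumes non-negative fitness values where a larger value means a better individual.

```python
import random


def roulette_wheel_select(fitnesses, rng):
    """Return an index drawn with probability fitness / total fitness."""
    total = sum(fitnesses)
    threshold = rng.random() * total
    running = 0.0
    for index, fitness in enumerate(fitnesses):
        running += fitness
        if threshold < running:
            return index
    return len(fitnesses) - 1  # guard against floating-point round-off


def tournament_select(fitnesses, tournament_size, rng):
    """Return the fittest index among `tournament_size` randomly drawn individuals."""
    contenders = rng.sample(range(len(fitnesses)), tournament_size)
    return max(contenders, key=lambda index: fitnesses[index])
```

Increasing `tournament_size` raises the selection pressure, because a weak individual then has to survive a comparison against more competitors.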
Chapter 3
Intelligent Trading System Design
3.1 Technical Analysis
Technical analysts seek to identify price patterns and trends in financial markets and attempt to exploit those patterns [20]. They search for archetypal patterns, such as the well-known head and shoulders or double top reversal patterns, study indicators such as moving averages, and look for forms such as lines of support, resistance, channels, and more obscure formations such as flags, pennants or balance days. In this project, only indicators are studied, since they are quantitative and do not require ambiguous identification.
Among all the technical indicators, the moving average is considered the simplest and one of the most useful. It is popular because it reveals trends by smoothing prices, and investors can profit by following those trends. The exponential moving average (EMA), one of the moving average indicators, is considered more adaptive since it puts more weight on recent prices, e.g., today's close price, and less weight on earlier days. In its standard form, the EMA is calculated as:
$$\mathrm{EMA}_t = \alpha\,P_t + (1 - \alpha)\,\mathrm{EMA}_{t-1}, \qquad \alpha = \frac{2}{N + 1},$$
where $P_t$ is the close price on day $t$ and $N$ is the number of days in the EMA period.
The long-term EMA of 45 days and the short-term EMA of 15 days are plotted together with the close price of the Dow Jones Industrial Average Index in Fig 3‑1. All data and figures are provided by Yahoo Finance.
Fig 3‑1 Dow Jones Industrial Average price, with EMA plotted.
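The EMA recursion can be sketched in Python as follows. This is a minimal illustration that assumes the common smoothing factor 2 / (period + 1) and seeds the average with the first price; the function name `ema` is hypothetical.

```python
def ema(prices, period):
    """Exponential moving average with smoothing factor 2 / (period + 1)."""
    alpha = 2.0 / (period + 1)
    values = [prices[0]]  # assumption: seed the average with the first price
    for price in prices[1:]:
        values.append(alpha * price + (1.0 - alpha) * values[-1])
    return values
```

A shorter period gives a larger smoothing factor, so a 15-day EMA follows recent prices more closely than a 45-day EMA.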
There are many ways of using the EMA; two common ones are introduced here. First, investors can take a long position, i.e., buy the stock index, when the close price is above the EMA, and take a short position when the close price is below the EMA. An example using a 30-day EMA on the Dow Jones Industrial Average is shown in Fig 3‑2. Although there are some whipsaws in the middle, a single EMA is helpful to investors when making buy or sell decisions.
Fig 3‑2 Using single EMA
Another way of using the EMA is to take a long position (buy) when the short-term EMA is above the long-term EMA, and a short position when the short-term EMA is below the long-term EMA. An example using a 15-day EMA and a 45-day EMA is illustrated in Fig 3‑3. As the chart shows, this method is effective: it takes large profits and suffers small losses.
Fig 3‑3 Using two EMA to make decision
It is clear that the EMA can help investors identify the trend. However, being able to discover the trend is not enough; trading rules must be established to take profits from the trend.
Charts and technical indicators alone are not sufficient, since they have some serious disadvantages. For example, we do not know whether a technical indicator can bring investors consistent long-term profits, nor how many shares we should buy or sell. Without more information on these questions, investors may not dare to trade with real money. A quantitative trading system based on these indicators can overcome these shortcomings. A well-established trading system is able to tell when to buy and when to sell, as well as how many shares to buy and sell. In addition, a trading system can provide back-testing results, such as the equity curve or the maximum drawdown, which present the trading performance to investors. Therefore, in this project, a quantitative trading system is built and tested.
This trading system uses a technical indicator named the Percentage Price Oscillator (PPO), which in its standard form is calculated as:
$$\mathrm{PPO} = \frac{\mathrm{EMA}_{\text{short}} - \mathrm{EMA}_{\text{long}}}{\mathrm{EMA}_{\text{long}}} \times 100\%$$
A buy signal is triggered if PPO is greater than 0, in other words, when the short-term EMA is above the long-term EMA. A sell signal is triggered if PPO is less than 0, which means the long-term EMA is above the short-term EMA. This is a typical trend-following system, designed to catch major trends for profit while suffering only small losses when significant trends are absent from the market.
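The PPO rule can be sketched as below. The helper names are hypothetical, the EMA uses the common smoothing factor 2 / (period + 1) seeded with the first price (an assumption), and the default periods of 15 and 45 days are the ones quoted for this trading system.

```python
def ema(prices, period):
    """Exponential moving average seeded with the first price (an assumption)."""
    alpha = 2.0 / (period + 1)
    values = [prices[0]]
    for price in prices[1:]:
        values.append(alpha * price + (1.0 - alpha) * values[-1])
    return values


def ppo(prices, short_period=15, long_period=45):
    """Percentage gap between the short-term and the long-term EMA."""
    short_ema = ema(prices, short_period)
    long_ema = ema(prices, long_period)
    return [100.0 * (s - l) / l for s, l in zip(short_ema, long_ema)]


def ppo_signals(ppo_values):
    """+1 (buy) when PPO > 0, -1 (sell) when PPO < 0, 0 when PPO is exactly zero."""
    return [(value > 0) - (value < 0) for value in ppo_values]
```

On a steadily rising series the short-term EMA stays above the long-term EMA, so every signal after the first day is a buy.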
3.2 Computational Intelligence in Trading
With the PPO trading system, there is a lag between the time when a trend starts and the time when the trading system detects it. Failing to compensate for this lag has been a major disadvantage of traditional trading systems (without prediction). An intelligent trading system attempts to predict the PPO in the near future, so as to enter the market before the trend starts and close the position before the market falls. The input of the intelligent trading system studied in this paper is the PPO of the last 8 days and the output is the PPO of the future 5 days. The intelligent model is either an MLP optimized by GA or an SVM optimized by GA. A transaction cost and slippage of 0.2% are included when calculating profits, as indicated in Fig 3‑4.
Fig 3‑4 A predictive trading system.
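Building the training pairs for such a predictive model amounts to sliding a window over the PPO series. The sketch below is one possible reading: it assumes the target is the single PPO value 5 days after the last day in the 8-day input window, and the function name `make_windows` is hypothetical.

```python
def make_windows(series, lookback=8, horizon=5):
    """Pair each run of `lookback` values with the value `horizon` steps later."""
    inputs, targets = [], []
    for end in range(lookback, len(series) - horizon + 1):
        inputs.append(series[end - lookback:end])    # the last `lookback` known values
        targets.append(series[end + horizon - 1])    # value `horizon` days after them
    return inputs, targets


# Hypothetical series whose values equal their day index, to make the offsets visible.
example_inputs, example_targets = make_windows(list(range(20)))
```

With this toy series the first window covers days 0 to 7 and its target is day 12, five days after the last input day.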
3.3 Experimental Settings
3.3.1 GA optimized neural network
In this project, a feedforward MLP with one hidden layer is used. The number of hidden neurons is set to 30 by trial and error. The Levenberg-Marquardt algorithm is used to train the MLP. The initial weights of the neural network are determined by GA. The settings for the NN+GA model are listed in Table 3‑1.
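| Group | Setting | Value |
| --- | --- | --- |
| GA settings | Population size | 300 |
| GA settings | Maximum generation | 800 |
| GA settings | Stop criterion | Maximum generation reached |
| GA settings | Probability of mutation | 0.02 |
| Neural network settings | Layers | Single hidden layer with 30 neurons |
| Neural network settings | Transfer function | tansig, purelin |
| Neural network settings | Training | Levenberg-Marquardt |
| Neural network settings | Performance | MSE (mean square error) |

Table 3‑1 Settings for GA and NN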
Using a genetic algorithm to determine the initial weights and biases is essential, since they have a great impact on the generalization ability of the neural network. If the weights and biases are initialized with random numbers that happen to be far away from a good solution, or near a local optimum, the neural network may not reach good performance; being trapped in a local extremum is common. On the other hand, appropriate initialization puts the weights and biases near a good solution, and hence gives the neural network a high chance of reaching a better outcome.
In this project, genetic algorithm is chosen to provide the initial weights and bias for neural network. The structure of using GA to optimize MLP is shown in Fig 3‑5. The fitness in GA is based on the error of predicted output and desired output, shown as below
Where is the desired output and is predicted output.
3.3.2 GA optimized SVR
The main parameters in SVR are the width of the error-insensitive tube around the regression function [14] and the balance between training errors and model complexity. In this paper, GA is used to determine the best SVR parameters. The structure of the GA optimized SVR is the same as that of the GA optimized MLP: GA tries to minimize the difference between the desired output and the predicted output. The settings for the GA optimized SVR model are listed in Table 3‑2.
Fig 3‑5 Structure of GA optimized MLP
| Group | Setting | Value |
| --- | --- | --- |
| GA settings | Population size | 30 |
| GA settings | Maximum generation | 200 |
| GA settings | Stop criterion | Maximum generation reached |
| GA settings | Probability of mutation | 0.05 |
| GA settings | Probability of crossover | 0.4 |
| SVR settings | Kernel function | Radial basis function |

Table 3‑2 Settings for GA and SVR
3.4 Preprocessing Input Data
Once the appropriate raw input data has been selected (in this case, the PPO of the previous 8 days), it must be preprocessed; otherwise, the neural network will not produce accurate forecasts. The decisions made in this phase of development are critical to the performance of a network.
Normalization is commonly used to distribute the input data evenly and scale it into an acceptable range for the network. Knowledge of the domain is important in choosing preprocessing methods to highlight underlying features in the data, which can increase the network's ability to learn the association between inputs and outputs.
In normalizing data, the goal is to ensure that the statistical distribution of values for each net input and output is roughly uniform. In addition, the values should be scaled to match the range of the input neurons. This means that along with any other transformations performed on network inputs, each input should be normalized as well.
In this project, the training inputs are normalized by mapping their minimum and maximum values to -1 and 1. This method assumes that the input has only finite real values and that the elements are not all equal, as indicated below.
$$y = \frac{(y_{\max} - y_{\min})\,(x - x_{\min})}{x_{\max} - x_{\min}} + y_{\min}$$
Here $y_{\max}$ is 1 and $y_{\min}$ is -1; $x_{\max}$ is the largest value of the training input, while $x_{\min}$ is the smallest; $x$ stands for each individual training datum, and $y$ is the normalized training datum.
The testing set should be scaled to the same range as the training set. However, the largest and smallest values of the testing set are not available, since these data are assumed to be unknown in the trading simulation. Therefore, the testing data are scaled using the parameters of the training input data. Specifically, $x_{\max}$ and $x_{\min}$ are still the largest and smallest values in the training data set.
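The scaling procedure, including the reuse of the training minimum and maximum on the testing set, can be sketched as follows. The function names are hypothetical and the numbers are toy values chosen for illustration.

```python
def fit_minmax(training_values):
    """Record the training minimum and maximum used for scaling."""
    return min(training_values), max(training_values)


def apply_minmax(values, x_min, x_max, y_min=-1.0, y_max=1.0):
    """Map [x_min, x_max] linearly onto [y_min, y_max]."""
    scale = (y_max - y_min) / (x_max - x_min)
    return [(value - x_min) * scale + y_min for value in values]


# Hypothetical example: fit on the training data, reuse the parameters on the test data.
train = [2.0, 4.0, 6.0]
test = [3.0, 8.0]
x_min, x_max = fit_minmax(train)
train_scaled = apply_minmax(train, x_min, x_max)
test_scaled = apply_minmax(test, x_min, x_max)
```

Because the testing data are scaled with the training parameters, a test value outside the training range maps outside the interval from -1 to 1, as the value 8.0 does here.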
Chapter 4
Results and Evaluation
This chapter presents the experimental results for the two intelligent trading models, the GA optimized MLP and the GA optimized SVR. In addition, it introduces some evaluation criteria and evaluates the prediction models against them. Furthermore, it analyzes the return on capital and the maximum drawdown, and compares them with another publication as well as with the conventional trading method.
4.1 Experimental Data
This intelligent trading system uses the Hong Kong Hang Seng Stock Index (HSI) from 1986-12-31 to 1997-1-28, a total of 2500 daily close prices, as the in-sample training data, and the HSI from 1997-1-29 to 2007-3-8, a total of 2500 daily prices, as the out-of-sample testing data. All HSI data were obtained from Yahoo Finance (http://finance.yahoo.com/q/hp?s=^HSI).
The in-sample data used to train the neural network are separated into three sets: training, validation, and testing. In this project, we divide the input data randomly such that the first 60% of the samples are assigned to the training set, the next 20% to the validation set, and the last 20% to the test set. Table 4‑1 summarizes the distribution of the experimental data.
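| Data set | Subset | Distribution (%) | Distribution (data) |
| --- | --- | --- | --- |
| Training data | Training set | 60% | 1500 |
| Training data | Validation set | 20% | 500 |
| Training data | Test set | 20% | 500 |
| Training data | Total | 100% | 2500 |
| Testing data | | 100% | 2500 |

Table 4‑1 Data distributions for training and testing neural network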
4.2 GA Optimized MLP Trading System
4.2.1 Forecasting Performance
The GA optimized MLP model is used to predict the PPO of the future 5 days. The performance of this predictive model is evaluated by the mean square error (MSE), which in its standard form is expressed as
$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(t_i - y_i\right)^{2},$$
where $t_i$ is the target output, $y_i$ is the predicted output and $N$ is the number of samples.
The MSE of the forecast is 0.0087 for the out-of-sample data and 0.00213 for the in-sample data. In both cases the MSE is relatively small, which means the prediction is acceptable. Fig 4‑2 plots the difference between the desired output and the predicted output: although there are some large prediction errors, most of the forecasts are acceptable.
Fig 4‑1 Training performance of MLP
The training results of the neural network can be further evaluated by linear regression. The best network is indicated by a correlation coefficient r close to unity (r ≈ 1). Fig 4‑4 shows the linear regression for the out-of-sample data, for which r is 0.91227. Although this is nearly 8% lower than for the in-sample data, the model can still be considered a well-trained neural network.
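Both evaluation measures are straightforward to compute. The following sketch uses the standard definitions of the mean square error and of the Pearson correlation coefficient; the function names are hypothetical and the numbers in the usage lines are toy values, not results from this project.

```python
import math


def mean_squared_error(targets, predictions):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


def correlation_coefficient(targets, predictions):
    """Pearson correlation r between targets and predictions."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_p = sum(predictions) / n
    covariance = sum((t - mean_t) * (p - mean_p) for t, p in zip(targets, predictions))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_p = sum((p - mean_p) ** 2 for p in predictions)
    return covariance / math.sqrt(var_t * var_p)


# Toy values for illustration only.
toy_mse = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
toy_r = correlation_coefficient([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

The correlation coefficient is insensitive to scale and offset, so it complements the MSE, which measures the absolute size of the errors.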
Fig 4‑2 MSE for out of sample data
4.2.2 Empirical Trading Results and Evaluation
The PPO of the future 5 days is selected as the desired output after careful consideration. In principle, forecasting over a longer time horizon produces more profit, because it allows earlier entry and earlier exit. On the other hand, the longer the time horizon, the harder it is to predict, which increases the chance of a wrong prediction and decreases the profit. Table 4‑2 shows the total return of investing 1 dollar for different prediction time horizons.
| Prediction time horizon | Total return |
| --- | --- |
| No prediction | 2.064 |
| Predict future 3 days PPO | 4.357 |
| Predict future 5 days PPO | 6.910 |
| Predict future 7 days PPO | 5.464 |

Table 4‑2 Total return for different prediction time horizon
In this experiment, reinvesting all capital is selected as the money management strategy, in which the trading system would reinvest all the profit and initial capital for next buy or sell decision.
Fig 4‑3 Linear regression for trained neural network
Fig 4‑4 Linear regression for out of sample data
The proposed trading system assumes that it is possible to enter the market at the close price of the same day on which the trading signal is triggered. In addition, it assumes that the initial capital is 1 dollar and that fractional units of the HSI can be bought or sold. The PPO is calculated with a short-term EMA of 15 days and a long-term EMA of 45 days.
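A back test under these assumptions can be sketched as follows. Several details are assumptions made here rather than facts from this report: the sketch is long-only (it holds cash after a sell signal instead of going short), a signal of 0 leaves the position unchanged, and the 0.2% cost is charged on the traded value at both entry and exit. The function name is hypothetical.

```python
def backtest_long_only(prices, signals, cost=0.002, initial_capital=1.0):
    """Equity curve when all capital is reinvested on each buy signal.

    Trades are filled at the close price of the signal day; fractional units
    of the index are allowed.
    """
    cash, units = initial_capital, 0.0
    equity = []
    for price, signal in zip(prices, signals):
        if signal > 0 and units == 0.0:      # enter with all available capital
            units = cash * (1.0 - cost) / price
            cash = 0.0
        elif signal < 0 and units > 0.0:     # exit and hold cash
            cash = units * price * (1.0 - cost)
            units = 0.0
        equity.append(cash + units * price)
    return equity


# Toy example: buy at 100, hold at 110, sell at 121.
toy_equity = backtest_long_only([100.0, 110.0, 121.0], [1, 1, -1])
```

Because the whole equity is reinvested on every entry, profits compound from trade to trade, and the cost is paid twice per round trip.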
The equity curves of the proposed intelligent trading system are shown in Fig 4‑5, together with the equity curves of the conventional trading system and the "buy and hold" strategy for comparison. The predictive MLP+GA model achieves 6.71 times the original capital from 1997128 to 200738, while over the same period the nonpredictive trading system achieves only 2.064 for a 1 dollar investment and the "buy and hold" strategy generates 1.605 as final capital. In comparison, Huang, Pasquier, and Quek [10] use a hierarchical coevolutionary fuzzy system (HiCEFS) to achieve 5.781 times the original capital on the Hang Seng Index over the same trading days.
Fig 4‑5 Equity curve for intelligent and conventional trading systems
Sample trading signals are shown in Fig 4‑6 and Fig 4‑7. The predictive trading system clearly enters and exits the market earlier than the trading system without prediction. For example, the NN+GA trading system enters the market on day 61 at price 13030 and exits on day 138 at price 15600, taking a profit of 2570 points. The trading system without prediction enters the market on day 65 at price 13630 and exits on day 144 at price 13710, taking a profit of only 80 points. This is why the predictive model performs better than the trading system without prediction. However, using prediction has a disadvantage: during non-trending periods, the proposed trading system may make wrong predictions and hence suffer losses. Around day 400, for instance, the trading system without prediction holds its position while the intelligent model makes a wrong prediction, and the investment incurs some losses.
Fig 4‑6 Trading signal of NN+GA trading system
Fig 4‑7 Trading signal of conventional trading system
Another important criterion for evaluating a trading system is the maximum drawdown (MDD). MDD is defined as the maximum cumulative loss from a market peak to the following trough [22].
The trading system using NN+GA suffers an MDD from 3.079 dollars to 2.443 dollars, which is 20.65% of the peak capital. In contrast, the trading system without prediction has an MDD from 1.705 dollars to 1 dollar, which is 41.34% of the peak capital. The "buy and hold" strategy suffers an MDD from 1 dollar to 0.466 dollar, a 53.4% drop from the peak capital. Thus the NN+GA trading system reduces the risk involved. As shown in Fig 4‑5, the capital of the conventional trading system without prediction is back to the original 1 dollar after 1276 trading days, which may shake investors' resolve to keep following the system. In the NN+GA trading system, on the other hand, the MDD occurs from day 903 to day 930, a span of only 27 trading days, which makes it easier for investors to keep following the trading system.
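The MDD figures above are expressed as a fraction of the peak capital. A minimal sketch of that calculation on an equity curve is given below; the function name `max_drawdown` is hypothetical, and the sample curve only reuses the NN+GA peak and trough values quoted above with made-up points around them.

```python
def max_drawdown(equity):
    """Return (drawdown fraction, peak value, trough value) for an equity curve.

    The drawdown is the largest drop from a running peak to a later trough,
    expressed as a fraction of that peak.
    """
    peak = equity[0]
    worst = 0.0
    worst_peak = worst_trough = equity[0]
    for value in equity:
        if value > peak:
            peak = value
        drawdown = (peak - value) / peak
        if drawdown > worst:
            worst, worst_peak, worst_trough = drawdown, peak, value
    return worst, worst_peak, worst_trough

# Peak 3.079 and trough 2.443 are the NN+GA values from the text;
# the other points are illustrative.
curve = [1.0, 2.0, 3.079, 2.8, 2.443, 3.5]
print(max_drawdown(curve))
```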
All the trading records are listed in appendix A.
4.3 GA Optimized SVR Trading System
4.3.1 Forecasting Performance
The performance of forecasting the future 5 days PPO using GA-optimized SVR is evaluated in terms of MSE. The MSE is 0.0058 for out-of-sample data, compared with 0.0087 for the GA-optimized NN model on the same data. In other words, GA-optimized SVR has a smaller MSE, that is, better forecasting accuracy. Fig 4‑8 plots the difference between the desired output and the predicted output.
However, better forecasting does not guarantee better profitability. A wrong prediction at a market top or bottom brings larger losses than a wrong prediction elsewhere.
4.3.2 Empirical Trading Results and Evaluation
The same assumptions are made as for the GA+NN trading system, and again a 15-day EMA and a 45-day EMA are used to form the PPO. The equity curve of the GA+SVR trading system is shown in Fig 4‑9, together with the equity curve of the conventional trading system for comparison. The GA+SVR trading system achieves 5.705 times the original capital.
Fig 4‑8 MSE for GA+SVR model
Although this predictive model does not achieve as much profit as the GA+NN model, it has its own advantages. First, the SVR model provides consistent performance after each training session. Second, in terms of prediction accuracy, the GA+SVR model has smaller prediction errors than the GA+NN model. Last, it trades less frequently than the GA+NN model, which gives investors a choice of trading system to suit them: the GA+NN model may be more suitable for active traders, while less active investors could adopt the GA+SVR model.
The GA+NN trading system, the GA+SVR trading system, the conventional trading system and the "buy and hold" strategy are compared in Table 4‑3, and their four equity curves are plotted together in Fig 4‑10.
Fig 4‑9 Equity curve for GA+SVR trading system and conventional trading system
Trading System                Final Equity  MDD     Win ratio  Trading times  Long position times  Short position times
GA+NN trading system          6.71          20.65%  49.6%      127            63                   64
GA+SVR trading system         5.705         28.5%   44.8%      67             30                   37
Conventional trading system   2.064         41.34%  48.7%      41             20                   21
Buy and hold strategy         1.605         53.4%   100%       1              1                    0
Table 4‑3 Trading performance comparison
Fig 4‑10 Comparison of 4 trading systems
4.4 Further Evaluation
In designing a trading system, one of the most important issues is to avoid over-fitting the system to the back-testing data (curve fitting). The more a system is bent around past data to improve its back-tested performance, the less likely it is to trade profitably in the future. Past performance approximates future performance only to the extent that the system is not curve-fitted. There are several ways to guard against the curve-fitting trap. One is to back test over a long enough period: the longer the historical period over which a system trades profitably, the more robust it is. Another effective guard is to make sure that the system works in many markets using the same parameters. Hence, the trading system is further evaluated by applying it to the Dow Jones Transportation Index (DJT).
The in-sample training data runs from 1968920 to 197895, 2500 trading days in total, and the out-of-sample testing data runs from 197896 to 1988726, also 2500 trading days. All data is from Yahoo Finance (http://finance.yahoo.com/q?s=^DJT). The assumptions are the same as for trading the HSI with the intelligent trading systems. The equity curves of the GA+NN and GA+SVR trading systems are shown in Fig 4‑11, together with the equity curve of the conventional trading system for comparison. The GA+SVR trading system achieves 5.168 times the original capital and the predictive GA+NN model achieves 4.87 times, while over the same period the nonpredictive trading system achieves only 2.805 for a 1 dollar investment.
Fig 4‑11 Equity curves of different trading system on DJT
The GA+NN and GA+SVR trading systems again outperform the conventional trading system on the DJT. This further supports the claim that computational intelligence can enhance the performance of a conventional trading system. In addition, the proposed intelligent trading systems, whether using GA+NN or GA+SVR, remain profitable in different markets, such as the DJT and the HSI.
Chapter 5
Conclusion
In this project, a predictive trading system is proposed and tested on real market data from the Hong Kong Hang Seng Index, with the Dow Jones Transportation Index used for cross-market validation. A neural network optimized by GA and support vector regression optimized by GA are implemented as the predictive models in the trading system. The trading rules are mainly based on a technical indicator, the percentage price oscillator (PPO). The predictive model therefore uses the PPO of the last 8 days as input to predict the PPO of the future 5 days, and trading decisions are made on the predicted PPO.
The testing period is 10 years, which is long enough to reduce the possibility of curve fitting. On the HSI, the proposed predictive trading system produces around 3 times the final equity of the conventional trading system without prediction, and on the DJT around 2 times.
Despite the promising profits generated by the trading system, further improvements can be considered as future research, such as applying the system to emerging markets, for example the China stock market, or applying a better money management strategy. Furthermore, due to the randomness introduced by GA, the neural network may not be trained well in every session. Effective ways to assure reasonable performance for each training session remain to be studied.
References
[1] E. F. Fama, "The Behavior of Stock Market Prices," Journal of Business, vol. 38, pp. 34-105, 1965.
[2] A. P. N. Refenes, A. N. Burgess, and Y. Bentz, "Neural networks in financial engineering: A study in methodology," IEEE Transactions on Neural Networks, vol. 8, no. 6, pp. 1222-1267, 1997.
[3] K.-Y. Chen and C.-H. Ho, "An Improved Support Vector Regression Modeling for Taiwan Stock Exchange Market Weighted Index Forecasting," in ICNN&B '05: International Conference on Neural Networks and Brain, vol. 3, 2005.
[4] L. Cao and F. Tay, "Support Vector Machine with adaptive parameters in financial time series forecasting," IEEE Transactions on Neural Networks, vol. 14, no. 6, pp. 1506-1518, 2003.
[5] P. B. Patel and T. Marwala, "Forecasting closing price indices using neural networks," in International Conference on Systems, Man and Cybernetics, pp. 2351-2356, Oct 8-11, 2006, Taipei, Taiwan.
[6] S. H. Lee, H. J. Kim, and J. S. Lim, "Forecasting short term KOSPI time series based on NEWFM," in Advanced Language Processing and Web Information Technology (ALPIT), pp. 303-307, July 2007.
[7] B. Doeksen, A. Abraham, J. Thomas, and M. Paprzycki, "Real stock trading using soft computing models," in Information Technology: Coding and Computing (ITCC), 2005, vol. 2, pp. 162-167.
[8] A. S. Chen, M. T. Leung, and H. Daouk, "Application of neural networks to an emerging financial market: Forecasting and trading the Taiwan Stock Index," Computers and Operations Research, vol. 30, no. 6, pp. 901-923, May 2003.
[9] K. K. Ang and C. Quek, "Stock Trading Using RSPOP: A Novel Rough Set-Based Neuro-Fuzzy Approach," IEEE Transactions on Neural Networks, vol. 17, no. 5, pp. 1301-1315, 2006.
[10] H. M. Huang, M. Pasquier, and C. Quek, "Financial Market Trading System With a Hierarchical Coevolutionary Fuzzy Predictive Model," IEEE Transactions on Evolutionary Computation, vol. 13, no. 1, pp. 56-70, 2009.
[11] H. Bandy, Quantitative Trading Systems, Blue Owl Press, 2007.
[12] B. Krose and P.V.D Smagt, Introduction to Neural Network. The University of Amsterdam, 1996.
[13] S. Russell and P. Norvig. Artificial Intelligence A Modern Approach. p. 578.
[14] A. E. Bryson and Yu-Chi Ho, Applied Optimal Control: Optimization, Estimation, and Control. Xerox College Publishing, p. 481.
[15] P. N. Bahrun and M. N. Taib, "Selected Malaysia Stock Predictions using Artificial Neural Network," in International Colloquium on Signal Processing & Its Applications (CSPA), 2009, pp. 428-431.
[16] Lipo Wang (ed.), Support Vector Machines: Theory and Applications. Berlin, Springer, 2005.
[17] C.W. Hsu, C.C. Chang, and C.J. Lin, A practical guide to support vector classification, Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan, 2003. [Online]. Available: http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf
[18] H. Drucker, C. J. C. Burges, L. Kaufman, A. Smola, and V. Vapnik, "Support Vector Regression Machines," in Advances in Neural Information Processing Systems 9, NIPS 1996, pp. 155-161, MIT Press.
[19] A. J. Smola and B. Scholkopf, "A tutorial on support vector regression," NeuroCOLT2 Technical Report NC2-TR-1998-030, 2003.
[20] John J. Murphy, Technical Analysis of the Financial Markets, New York Institute of Finance, 1999, pages 15,2431.
[21] M. Magdon-Ismail and A. Atiya, "Maximum Drawdown," Risk Magazine, vol. 17, no. 10, pp. 99-102, October 2004.
[22] M. Magdon-Ismail, A. Atiya, A. Pratap, and Y. Abu-Mostafa, "On the Maximum Drawdown of a Brownian Motion," Journal of Applied Probability, vol. 41, no. 1, pp. 147-161, March 2004.
Appendix
The trading details of the NN+GA trading system on the HSI are listed below. The exit price and the entry price on the same day differ, because slippage and commissions are taken into account.
 enter short position at price 12414.3 at trading day 45
 exit short position at price 13020.8 at trading day 61
 enter long position at price 13033.8 at trading day 61
 exit long position at price 15598.9 at trading day 138
 enter short position at price 15583.3 at trading day 138
 exit short position at price 15547.2 at trading day 139
 enter long position at price 15562.7 at trading day 139
 exit long position at price 15534 at trading day 140
 enter short position at price 15518.5 at trading day 140
 exit short position at price 14776.8 at trading day 165
 enter long position at price 14791.6 at trading day 165
 exit long position at price 14810.8 at trading day 166
 enter short position at price 14796 at trading day 166
 exit short position at price 10525.5 at trading day 244
 enter long position at price 10536 at trading day 244
 exit long position at price 10232 at trading day 254
 enter short position at price 10221.8 at trading day 254
 exit short position at price 10671 at trading day 255
 enter long position at price 10681.7 at trading day 255
 exit long position at price 11151.6 at trading day 295
 enter short position at price 11140.4 at trading day 295
 exit short position at price 10968.3 at trading day 296
 enter long position at price 10979.3 at trading day 296
 exit long position at price 10977.5 at trading day 297
 enter short position at price 10966.5 at trading day 297
 exit short position at price 8189.25 at trading day 395
 enter long position at price 8197.44 at trading day 395
 exit long position at price 7849.96 at trading day 397
 enter short position at price 7842.11 at trading day 397
 exit short position at price 7701.61 at trading day 408
 enter long position at price 7709.31 at trading day 408
 exit long position at price 7946.04 at trading day 409
 enter short position at price 7938.09 at trading day 409
 exit short position at price 7837.61 at trading day 410
 enter long position at price 7845.45 at trading day 410
 exit long position at price 7883.46 at trading day 411
 enter short position at price 7875.58 at trading day 411
 exit short position at price 7564.54 at trading day 412
 enter long position at price 7572.1 at trading day 412
 exit long position at price 7744.72 at trading day 413
 enter short position at price 7736.98 at trading day 413
 exit short position at price 8506.79 at trading day 415
 enter long position at price 8515.3 at trading day 415
 exit long position at price 9499.5 at trading day 488
 enter short position at price 9490 at trading day 488
 exit short position at price 9913.58 at trading day 511
 enter long position at price 9923.49 at trading day 511
 exit long position at price 12436.9 at trading day 567
 enter short position at price 12424.4 at trading day 567
 exit short position at price 12346.9 at trading day 568
 enter long position at price 12359.3 at trading day 568
 exit long position at price 12409.2 at trading day 569
 enter short position at price 12396.8 at trading day 569
 exit short position at price 12308.5 at trading day 570
 enter long position at price 12320.8 at trading day 570
 exit long position at price 12059.3 at trading day 571
 enter short position at price 12047.2 at trading day 571
 exit short position at price 12471.6 at trading day 575
 enter long position at price 12484.1 at trading day 575
 exit long position at price 13093.7 at trading day 609
 enter short position at price 13080.6 at trading day 609
 exit short position at price 13473.8 at trading day 616
 enter long position at price 13487.3 at trading day 616
 exit long position at price 13591 at trading day 617
 enter short position at price 13577.4 at trading day 617
 exit short position at price 13254.3 at trading day 618
 enter long position at price 13267.6 at trading day 618
 exit long position at price 13167.1 at trading day 619
 enter short position at price 13153.9 at trading day 619
 exit short position at price 13566.7 at trading day 629
 enter long position at price 13580.3 at trading day 629
 exit long position at price 13214.4 at trading day 652
 enter short position at price 13201.2 at trading day 652
 exit short position at price 13322.1 at trading day 677
 enter long position at price 13335.4 at trading day 677
 exit long position at price 15574.6 at trading day 730
 enter short position at price 15559 at trading day 730
 exit short position at price 15275.3 at trading day 732
 enter long position at price 15290.6 at trading day 732
 exit long position at price 15167.5 at trading day 735
 enter short position at price 15152.4 at trading day 735
 exit short position at price 15917.8 at trading day 738
 enter long position at price 15933.7 at trading day 738
 exit long position at price 15653.9 at trading day 741
 enter short position at price 15638.2 at trading day 741
 exit short position at price 15789.8 at trading day 742
 enter long position at price 15805.6 at trading day 742
 exit long position at price 16491.4 at trading day 785
 enter short position at price 16474.9 at trading day 785
 exit short position at price 16850.7 at trading day 787
 enter long position at price 16867.6 at trading day 787
 exit long position at price 16487.7 at trading day 788
 enter short position at price 16471.2 at trading day 788
 exit short position at price 15278.3 at trading day 793
 enter long position at price 15293.6 at trading day 793
 exit long position at price 15367.1 at trading day 795
 enter short position at price 15351.8 at trading day 795
 exit short position at price 15900.1 at trading day 824
 enter long position at price 15916 at trading day 824
 exit long position at price 16629.8 at trading day 893
 enter short position at price 16613.2 at trading day 893
 exit short position at price 15820.8 at trading day 930
 enter long position at price 15836.6 at trading day 930
 exit long position at price 15504.8 at trading day 932
 enter short position at price 15489.3 at trading day 932
 exit short position at price 15329.6 at trading day 955
 enter long position at price 15344.9 at trading day 955
 exit long position at price 15024.5 at trading day 959
 enter short position at price 15009.5 at trading day 959
 exit short position at price 15188 at trading day 960
 enter long position at price 15203.2 at trading day 960
 exit long position at price 14659.3 at trading day 962
 enter short position at price 14644.7 at trading day 962
 exit short position at price 15436.5 at trading day 971
 enter long position at price 15452 at trading day 971
 exit long position at price 15527.4 at trading day 999
 enter short position at price 15511.8 at trading day 999
 exit short position at price 13718.1 at trading day 1046
 enter long position at price 13731.9 at trading day 1046
 exit long position at price 13600.8 at trading day 1048
 enter short position at price 13587.2 at trading day 1048
 exit short position at price 13585.1 at trading day 1050
 enter long position at price 13598.7 at trading day 1050
 exit long position at price 13636.6 at trading day 1052
 enter short position at price 13623 at trading day 1052
 exit short position at price 13459.2 at trading day 1057
 enter long position at price 13472.6 at trading day 1057
 exit long position at price 13721.3 at trading day 1058
 enter short position at price 13707.5 at trading day 1058
 exit short position at price 13878 at trading day 1059
 enter long position at price 13891.8 at trading day 1059
 exit long position at price 13174.4 at trading day 1066
 enter short position at price 13161.2 at trading day 1066
 exit short position at price 13703.4 at trading day 1071
 enter long position at price 13717.1 at trading day 1071
 exit long position at price 13523.3 at trading day 1075
 enter short position at price 13509.8 at trading day 1075
 exit short position at price 10609.3 at trading day 1176
 enter long position at price 10619.9 at trading day 1176
 exit long position at price 11209.4 at trading day 1219
 enter short position at price 11198.2 at trading day 1219
 exit short position at price 11013.6 at trading day 1220
 enter long position at price 11024.6 at trading day 1220
 exit long position at price 10964.1 at trading day 1221
 enter short position at price 10953.1 at trading day 1221
 exit short position at price 11003 at trading day 1253
 enter long position at price 11014 at trading day 1253
 exit long position at price 10863.1 at trading day 1265
 enter short position at price 10852.2 at trading day 1265
 exit short position at price 11032.9 at trading day 1269
 enter long position at price 11044 at trading day 1269
 exit long position at price 10878 at trading day 1270
 enter short position at price 10867.2 at trading day 1270
 exit short position at price 11217.2 at trading day 1281
 enter long position at price 11228.4 at trading day 1281
 exit long position at price 11359.8 at trading day 1311
 enter short position at price 11348.4 at trading day 1311
 exit short position at price 11312.5 at trading day 1312
 enter long position at price 11323.9 at trading day 1312
 exit long position at price 11402.4 at trading day 1313
 enter short position at price 11391 at trading day 1313
 exit short position at price 9787.49 at trading day 1411
 enter long position at price 9797.28 at trading day 1411
 exit long position at price 9560.46 at trading day 1415
 enter short position at price 9550.9 at trading day 1415
 exit short position at price 9655.36 at trading day 1419
 enter long position at price 9665.02 at trading day 1419
 exit long position at price 9613.84 at trading day 1424
 enter short position at price 9604.23 at trading day 1424
 exit short position at price 9865.65 at trading day 1427
 enter long position at price 9875.52 at trading day 1427
 exit long position at price 9656.46 at trading day 1448
 enter short position at price 9646.8 at trading day 1448
 exit short position at price 9834.08 at trading day 1465
 enter long position at price 9843.91 at trading day 1465
 exit long position at price 9552.02 at trading day 1470
 enter short position at price 9542.47 at trading day 1470
 exit short position at price 9155.57 at trading day 1544
 enter long position at price 9164.73 at trading day 1544
 exit long position at price 13024.1 at trading day 1753
 enter short position at price 13011 at trading day 1753
 exit short position at price 12326.9 at trading day 1811
 enter long position at price 12339.2 at trading day 1811
 exit long position at price 12050.7 at trading day 1817
 enter short position at price 12038.6 at trading day 1817
 exit short position at price 12185.5 at trading day 1824
 enter long position at price 12197.7 at trading day 1824
 exit long position at price 12285.8 at trading day 1827
 enter short position at price 12273.5 at trading day 1827
 exit short position at price 12220.1 at trading day 1828
 enter long position at price 12232.4 at trading day 1828
 exit long position at price 11939.4 at trading day 1837
 enter short position at price 11927.5 at trading day 1837
 exit short position at price 12123.6 at trading day 1840
 enter long position at price 12135.8 at trading day 1840
 exit long position at price 12395.1 at trading day 1841
 enter short position at price 12382.7 at trading day 1841
 exit short position at price 12320.2 at trading day 1842
 enter long position at price 12332.5 at trading day 1842
 exit long position at price 12852.4 at trading day 1907
 enter short position at price 12839.5 at trading day 1907
 exit short position at price 13054.7 at trading day 1910
 enter long position at price 13067.7 at trading day 1910
 exit long position at price 13712 at trading day 1958
 enter short position at price 13698.3 at trading day 1958
 exit short position at price 13578.3 at trading day 1976
 enter long position at price 13591.8 at trading day 1976
 exit long position at price 13555.8 at trading day 1977
 enter short position at price 13542.2 at trading day 1977
 exit short position at price 13845.6 at trading day 1981
 enter long position at price 13859.5 at trading day 1981
 exit long position at price 13772 at trading day 1997
 enter short position at price 13758.2 at trading day 1997
 exit short position at price 13941.5 at trading day 1999
 enter long position at price 13955.4 at trading day 1999
 exit long position at price 13890.9 at trading day 2001
 enter short position at price 13877 at trading day 2001
 exit short position at price 13906.9 at trading day 2002
 enter long position at price 13920.8 at trading day 2002
 exit long position at price 13832.5 at trading day 2004
 enter short position at price 13818.7 at trading day 2004
 exit short position at price 13776.5 at trading day 2008
 enter long position at price 13790.2 at trading day 2008
 exit long position at price 13603.6 at trading day 2009
 enter short position at price 13590 at trading day 2009
 exit short position at price 13750.2 at trading day 2029
 enter long position at price 13764 at trading day 2029
 exit long position at price 13627 at trading day 2044
 enter short position at price 13613.4 at trading day 2044
 exit short position at price 13867.1 at trading day 2053
 enter long position at price 13880.9 at trading day 2053
 exit long position at price 14847.8 at trading day 2144
 enter short position at price 14832.9 at trading day 2144
 exit short position at price 14629.5 at trading day 2169
 enter long position at price 14644.1 at trading day 2169
 exit long position at price 14627.4 at trading day 2170
 enter short position at price 14612.8 at trading day 2170
 exit short position at price 14788 at trading day 2172
 enter long position at price 14802.8 at trading day 2172
 exit long position at price 15542.1 at trading day 2249
 enter short position at price 15526.5 at trading day 2249
 exit short position at price 15519.8 at trading day 2250
 enter long position at price 15535.3 at trading day 2250
 exit long position at price 15720.4 at trading day 2251
 enter short position at price 15704.6 at trading day 2251
 exit short position at price 15729 at trading day 2252
 enter long position at price 15744.8 at trading day 2252
 exit long position at price 16313.4 at trading day 2294
 enter short position at price 16297 at trading day 2294
 exit short position at price 15805.5 at trading day 2295
 enter long position at price 15821.3 at trading day 2295
 exit long position at price 15864.6 at trading day 2296
 enter short position at price 15848.7 at trading day 2296
 exit short position at price 16326.7 at trading day 2324
 enter long position at price 16343 at trading day 2324
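The listing above can be processed mechanically. The following minimal sketch parses lines in the format used above into completed round trips and computes a win ratio. The win criterion (a long wins when the exit price is above the entry price, a short when it is below) is an assumption, since the definition behind the win ratio in Table 4‑3 is not restated here, and the function names are hypothetical.

```python
import re

LINE_RE = re.compile(
    r"(enter|exit) (long|short) position at price ([0-9.]+) at trading day (\d+)"
)

def parse_round_trips(log_text):
    """Pair each 'enter' line with the following 'exit' line of the same side.

    Returns a list of (side, entry_price, exit_price, entry_day, exit_day).
    A position that is still open at the end of the log is ignored.
    """
    trips = []
    open_positions = {}
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if not match:
            continue
        action, side = match.group(1), match.group(2)
        price, day = float(match.group(3)), int(match.group(4))
        if action == "enter":
            open_positions[side] = (price, day)
        elif side in open_positions:
            entry_price, entry_day = open_positions.pop(side)
            trips.append((side, entry_price, price, entry_day, day))
    return trips

def win_ratio(trips):
    """Fraction of round trips that closed at a better price than they opened."""
    if not trips:
        return 0.0
    wins = 0
    for side, entry_price, exit_price, _, _ in trips:
        if side == "long" and exit_price > entry_price:
            wins += 1
        elif side == "short" and exit_price < entry_price:
            wins += 1
    return wins / len(trips)

# The first four lines of the listing above.
sample = """
 enter short position at price 12414.3 at trading day 45
 exit short position at price 13020.8 at trading day 61
 enter long position at price 13033.8 at trading day 61
 exit long position at price 15598.9 at trading day 138
"""
trips = parse_round_trips(sample)
print(trips)
print(win_ratio(trips))
```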