# The Importance Of Statistics In Economic Forecasts Finance Essay


Statistics helps analyse observed data and draw conclusions that can be generalized or used to forecast future behaviour. It makes it possible to quantify and communicate the important features of the data. Descriptive statistics refers to the description and analysis of those features: measures of central tendency, dispersion, and distribution shape, which are used to describe the population or the sample.

A population is defined as 'all' the data; a sample is a subset of the population. In general, statistical analysis is based on samples, because obtaining all the data is rarely feasible given limited money and time. The analysis is usually expressed in terms of sample statistics, which are quantities computed from, or used to describe, a sample.

Inferential statistics, or statistical inference, deals with the forecast, estimation or judgment of population parameters. After drawing a sample and computing the sample statistics, it is important to infer whether those statistics are good estimators of the population parameters.

Descriptive statistics, inferential statistics, hypothesis testing and regression analysis are considered in this assignment. Two sets of data are analysed through descriptive statistics, followed by inferential statistics (constructing confidence intervals and hypothesis tests to make inferences about the population), and the assignment concludes with a regression analysis to evaluate the relationship between the two variables.

These exercises help in analysing the tendency of the stock vis-à-vis the market index in terms of returns and can help in quantitative risk management.

Descriptive Statistics

A set of price data was drawn for both variables, starting on 30.12.1999 and ending on 06.05.2008. The daily returns for both variables were computed as the percentage change from the previous day's price to the current day's price.

Summary statistics are used to summarize a set of observations, in order to communicate as much as possible as simply as possible through:

(i) A measure of location, or central tendency, such as the arithmetic mean, median, mode, or interquartile mean

(ii) A measure of statistical dispersion like the standard deviation, variance, range, or interquartile range, or absolute deviation

(iii) A measure of the shape of the distribution like skewness or kurtosis

Measures of central tendency locate the centre of the observations. The mean is the most common measure because it is the number that balances out all the positive and negative deviations; the median is the middle observation of the sorted data set; and the mode is the most frequent observation. Together, these three measures help identify the point around which most of the data is gathered. However, it is relevant to know not only where most of the observations lie but also how dispersed they are around the centre. Measures of dispersion indicate the scattering of data values. The more dispersed the returns, the higher their volatility and therefore the higher the risk. For our data, the stock returns are a less risky asset than the market returns because they have a lower standard deviation.

Prices: Excel was used to compute the statistics for the Stock and the Market Index prices. Prices can take only a countable number of possible values and so are treated as a discrete random variable.

Returns: In contrast, returns are not restricted to a countable set of values. They are treated as a continuous random variable, and technically the range of possible outcomes can be any point on the real line.

Distribution: Looking at the histograms of the stock and market prices (Graph 3 and Graph 4), it is evident that the random variable (Prices) does not follow a well-defined distribution. The histogram of the Market (Graph 4) has two humps, giving the impression of being bimodal, and a significant amount of data is gathered in the right-hand tail. In contrast, the histogram of stock returns displays a bell shape: most of the data is gathered around the centre and appears more or less symmetrical. In fact, the shape of the histogram resembles that of a normal distribution. Returns are therefore treated as normally distributed, and a normal distribution is completely described by two parameters (mean and standard deviation).

Skewness: A perfect normal distribution will have a skewness of zero (0). This means that mean = median = mode. For our set of data we found that stock returns are positively skewed (tilted a little bit to the left) as the mode<median<mean; and that market returns are negatively skewed (tilted a little bit to the right) as the mean<median<mode.

Graphs

Graph 1: Stock Price and Market Price

Graph 2: Histogram of Stock Returns

Graph 3: Histogram of Stock Prices

Graph 4: Histogram of Market Prices

Answer to Q.1.1: Descriptive Statistics for Returns

Daily simple returns are defined by the equation $r_t = P_t / P_{t-1} - 1$. Here the returns were computed using natural logarithms, $r_t = \ln(P_t / P_{t-1})$, as this is more suitable for stock prices.
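The two return definitions above can be compared with a short sketch. The price list and the function names are hypothetical and serve only as an illustration; the actual series used in this assignment is not reproduced here.

```python
import math

def simple_returns(prices):
    # r_t = P_t / P_{t-1} - 1 for each consecutive pair of prices
    return [p1 / p0 - 1 for p0, p1 in zip(prices, prices[1:])]

def log_returns(prices):
    # r_t = ln(P_t / P_{t-1}) for each consecutive pair of prices
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

# Hypothetical closing prices, for illustration only
prices = [100.0, 101.0, 99.5, 100.2]
simple = simple_returns(prices)
logr = log_returns(prices)

# Log returns add up across days to the log of the total price ratio
total_log_return = sum(logr)
```

For small daily moves the two definitions give nearly the same number; the additive property of log returns is one reason they are often preferred for price series.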

Measures of Central Tendency

The arithmetic mean for the returns is -0.00034723.

The median is the middle value of the sorted data array. Here the middle value is 0, so the median is 0.

The mode is the value that occurs most frequently in the data set. So, the mode is 0.

Measures of Statistical Dispersion

The deviation of each return from the arithmetic mean was computed. Each deviation was squared (amplifying larger deviations and making negative values positive). The squares were then summed and, to obtain an average, the sum was divided by the number of values. Taking the non-negative square root of this quotient, which converts squared units back to regular units, gives a standard deviation of 0.02640066.
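The steps just described can be written out as a minimal sketch. The return values are hypothetical, and the `sample` switch is our addition: it divides by n - 1 instead of n, which is the convention we understand Excel's sample standard deviation to use. With 2098 observations the two conventions differ only marginally.

```python
import math
import statistics

def std_dev(values, sample=True):
    mean = sum(values) / len(values)
    # Square each deviation from the mean
    squared = [(v - mean) ** 2 for v in values]
    # Divide by n - 1 (sample) or n (population), then take the square root
    divisor = len(values) - 1 if sample else len(values)
    return math.sqrt(sum(squared) / divisor)

# Hypothetical daily returns, for illustration only
returns = [0.010, -0.020, 0.005, 0.000, -0.015]
sd_sample = std_dev(returns)
sd_population = std_dev(returns, sample=False)
```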

The sample variance for the stock is 0.00069699.

The range for ungrouped data is equal to the value of the largest observation minus the value of the smallest observation in the data set. So, the range is 0.41768732.

Measure of Distribution

The excess kurtosis is 12.95, so the distribution in this instance is leptokurtic, which indicates a sharply peaked curve with fat tails.
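As a rough guide to how the shape measures are obtained, the sketch below implements the bias-adjusted sample skewness and excess kurtosis formulas that, to our understanding, spreadsheet functions such as SKEW and KURT use. The function names and the test data are ours. A normal distribution has excess kurtosis 0, so positive values indicate fatter tails than normal and negative values thinner tails.

```python
import math

def _mean_and_sd(x):
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))
    return mean, sd

def sample_skewness(x):
    n = len(x)
    mean, sd = _mean_and_sd(x)
    return n / ((n - 1) * (n - 2)) * sum(((v - mean) / sd) ** 3 for v in x)

def sample_excess_kurtosis(x):
    n = len(x)
    mean, sd = _mean_and_sd(x)
    fourth = sum(((v - mean) / sd) ** 4 for v in x)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth - correction

# Symmetric, flat illustrative data: zero skewness, negative excess kurtosis
data = [1.0, 2.0, 3.0, 4.0, 5.0]
skew = sample_skewness(data)
kurt = sample_excess_kurtosis(data)
```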

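A confidence level of 95% was used in this instance, which yielded 0.00113034. The key point to note is that the target confidence level (95% in this example) is the given parameter; the output of the calculation (0.00113034) is the margin of error for the mean at that confidence level, that is, the critical t value multiplied by the standard error (about 1.96 × 0.00057638). The 95% confidence interval for the mean is therefore the sample mean plus or minus 0.00113034.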

In summary, the dataset provided (91ZR) yields the following values:

Mean -0.00034723

Standard error 0.00057638

Median 0

Mode 0

Standard deviation 0.02640066

Variance of the sample 0.00069699

Kurtosis 12.9536459

Skewness -0.77546017

Range 0.41768732

Minimum -0.22569045

Maximum 0.19199688

Sum -0.72848517

Count 2098

Confidence Level (95.0%) 0.00113034

Answer to Q.1.2 Subsamples

First subsample starts 01.03.2001 and ends 02.06.2003

Measures of Central Tendency

The arithmetic mean for the returns is -0.002378607.

The median is -0.00272476.

The mode is 0.

Measures of Statistical Dispersion

The standard deviation of the set obtained is 0.043799319.

The sample variance for the stock is 0.00191838.

The range is 0.417687322.

Measure of Distribution

The distribution is negatively skewed. With an excess kurtosis of 4.47, the distribution is leptokurtic.

Confidence level of 95% was considered which yielded the value 0.003619297.

Statistical characteristics of first subsample

Mean -0.002378607

Standard Error 0.001842651

Median -0.00272476

Mode 0

Standard Deviation 0.043799319

Sample Variance 0.00191838

Kurtosis 4.467273635

Skewness -0.52183745

Range 0.417687322

Minimum -0.225690446

Maximum 0.191996876

Sum -1.343912895

Count 565

Confidence Level (95.0%) 0.003619297

Second subsample starts 01.03.2004 and ends 19.05.2006

Measures of Central Tendency

The arithmetic mean for the returns is 0.000591647.

The median is 0.000552676.

The mode is 0.

Measures of Statistical Dispersion

The standard deviation of the set obtained is 0.013118442.

The sample variance for the stock is 0.000172094.

The range is 0.10174261.

Measure of Distribution

The distribution is positively skewed. The distribution is leptokurtic.

Confidence level of 95% was considered which yielded the value 0.001084024.

Statistical characteristics of second subsample

Mean 0.000591647

Standard Error 0.000551897

Median 0.000552676

Mode 0

Standard Deviation 0.013118442

Sample Variance 0.000172094

Kurtosis 1.633550449

Skewness 0.35133905

Range 0.10174261

Minimum -0.042709097

Maximum 0.059033513

Sum 0.334280487

Count 565

Confidence Level (95.0%) 0.001084024

Comparison between the first and second subsample:

| Statistical characteristics | First subsample | Second subsample |
| --- | --- | --- |
| Mean | -0.002378607 | 0.000591647 |
| Standard Error | 0.001842651 | 0.000551897 |
| Median | -0.00272476 | 0.000552676 |
| Mode | 0 | 0 |
| Standard Deviation | 0.043799319 | 0.013118442 |
| Sample Variance | 0.00191838 | 0.000172094 |
| Measure of Distribution | | |
| Kurtosis | 4.467273635 | 1.633550449 |
| Skewness | -0.52183745 | 0.35133905 |
| Range | 0.417687322 | 0.10174261 |
| Minimum | -0.225690446 | -0.042709097 |
| Maximum | 0.191996876 | 0.059033513 |
| Sum | -1.343912895 | 0.334280487 |
| Count | 565 | 565 |
| Confidence Level (95.0%) | 0.003619297 | 0.001084024 |

Two questions need to be answered. First, is the mean return for subsample 1 significantly different from that of subsample 2? Second, is the variance of subsample 1 significantly different from the variance of subsample 2?

Hypothesis tests help answer both.

Test concerning two means (stock):

1. Hypotheses: H0: μ1 = μ2 (same as μ1 - μ2 = 0)

Ha: μ1 ≠ μ2 (same as μ1 - μ2 ≠ 0)

2. Significance level: α = 0.05

3. Test statistic: critical Z = ±1.96

4. Actual Z: $Z = (\bar{x}_1 - \bar{x}_2) / \sqrt{s_1^2/n_1 + s_2^2/n_2}$, so Z = -1.54 for the subsample means and variances tabulated above

5. Conclusion: |Z| < critical Z, i.e., 1.54 < 1.96. We cannot reject the null hypothesis and conclude that the means of the two subsamples are not significantly different at the 95% confidence level.
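The test on two means can be sketched with Python's standard library. The inputs are the subsample means, variances and counts from the comparison table above; the function name is ours, and the normal approximation is an assumption that is reasonable with 565 observations per subsample.

```python
import math
from statistics import NormalDist

def two_sample_z(mean1, var1, n1, mean2, var2, n2):
    # Standard error of the difference between two independent sample means
    se = math.sqrt(var1 / n1 + var2 / n2)
    z = (mean1 - mean2) / se
    # Two-tailed p-value from the standard normal distribution
    p_two_tail = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tail

z, p = two_sample_z(-0.002378607, 0.00191838, 565,
                    0.000591647, 0.000172094, 565)
z_crit = NormalDist().inv_cdf(0.975)   # about 1.96 for alpha = 0.05
reject_h0 = abs(z) > z_crit            # False: |z| is about 1.54
```

With these inputs the statistic agrees with the z value of about -1.544 shown in the Excel z-test output further down.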

t-Test: Paired Two Sample for Means

| | Variable 1 | Variable 2 |
| --- | --- | --- |
| Mean | -0.002378607 | 0.000591647 |
| Variance | 0.00191838 | 0.000172094 |
| Observations | 565 | 565 |
| Pearson Correlation | 0.018704833 | |
| Hypothesized Mean Difference | 0 | |
| df | 564 | |
| t Stat | -1.552171589 | |
| P(T<=t) one-tail | 0.060591052 | |
| t Critical one-tail | 1.647559816 | |
| P(T<=t) two-tail | 0.121182103 | |
| t Critical two-tail | 1.964178939 | |

Test concerning the equality of two variances (stock):

1. Hypotheses: H0: $\sigma_1^2 = \sigma_2^2$

Ha: $\sigma_1^2 \neq \sigma_2^2$

2. Significance level: α = 0.05

3. Test statistic: critical F(n1 - 1, n2 - 1, α): F(564, 564, α) ≈ 1

4. F value: $F = s_1^2 / s_2^2 = 2.706$

5. Conclusion: Because 2.706 > 1, we reject the null hypothesis and conclude that the variances of the two subsamples are significantly different at the 95% confidence level.

t-Test: Two-Sample Assuming Equal Variances

| | Variable 1 | Variable 2 |
| --- | --- | --- |
| Mean | -0.002378607 | 0.000591647 |
| Variance | 0.00191838 | 0.000172094 |
| Observations | 565 | 565 |
| Pooled Variance | 0.001045237 | |
| Hypothesized Mean Difference | 0 | |
| df | 1128 | |
| t Stat | -1.544171052 | |
| P(T<=t) one-tail | 0.061413692 | |
| t Critical one-tail | 1.646205603 | |
| P(T<=t) two-tail | 0.122827385 | |
| t Critical two-tail | 1.962069236 | |

We also need to test whether the mean market return for subsample 1 is significantly different from the mean market return for subsample 2, and whether the variance of subsample 1 is significantly different from the variance of subsample 2.

Test concerning two means (market):

1. Hypotheses: H0: μ1 = μ2 (same as μ1 - μ2 = 0)

Ha: μ1 ≠ μ2 (same as μ1 - μ2 ≠ 0)

2. Significance level: α = 0.05

3. Test statistic: critical Z = ±1.96

4. Actual Z: $Z = (\bar{x}_1 - \bar{x}_2) / \sqrt{s_1^2/n_1 + s_2^2/n_2}$, where Z = -1.54

5. Conclusion: Because -1.54 > -1.96, we cannot reject the null hypothesis, so we conclude that the two subsample means are not significantly different at the 95% confidence level. The market mean of subsample 1 is statistically indistinguishable from the market mean of subsample 2.

t-Test: Two-Sample Assuming Unequal Variances

| | Variable 1 | Variable 2 |
| --- | --- | --- |
| Mean | -0.00237861 | 0.000591647 |
| Variance | 0.00191838 | 0.000172094 |
| Observations | 565 | 565 |
| Hypothesized Mean Difference | 0 | |
| df | 664 | |
| t Stat | -1.54417105 | |
| P(T<=t) one-tail | 0.061511668 | |
| t Critical one-tail | 1.647151685 | |
| P(T<=t) two-tail | 0.123023335 | |
| t Critical two-tail | 1.963543053 | |

Test concerning the equality of two variances (market):

1. Hypotheses: H0: $\sigma_1^2 = \sigma_2^2$

Ha: $\sigma_1^2 \neq \sigma_2^2$

2. Significance level: α = 0.05

3. Test statistic: critical F(n1 - 1, n2 - 1, α): F(564, 564, α) ≈ 1

4. F value: $F = s_1^2 / s_2^2 = 6.01$

5. Conclusion: Because 6.01 > 1, we can reject the null hypothesis and conclude that the variances of the two subsamples are significantly different at the 95% confidence level.

z-Test: Two Sample for Means

| | Variable 1 | Variable 2 |
| --- | --- | --- |
| Mean | -0.002378607 | 0.000591647 |
| Known Variance | 0.00191838 | 0.000172094 |
| Observations | 565 | 565 |
| Hypothesized Mean Difference | 0 | |
| z | -1.54417101 | |
| P(Z<=z) one-tail | 0.061273455 | |
| z Critical one-tail | 1.644853627 | |
| P(Z<=z) two-tail | 0.12254691 | |
| z Critical two-tail | 1.959963985 | |

Answer to Q.2.1: Inference and Hypothesis Testing

Suppose the daily returns of "Stock X" follow a Student t distribution. What is the probability that a return exceeds +0.5%?

To calculate this probability we assume that the sample of stock returns follows a normal distribution, $X \sim N(\mu, \sigma^2)$, based on the central limit theorem, which states that for large samples (n > 30) the sampling distribution of the mean closely approximates the normal distribution, even if the parent population is not normally distributed. For this exercise we can apply the central limit theorem, as our sample has 2098 observations.

Before calculating the probability, we have to convert the normal distribution into a standard normal distribution, $Z \sim N(0, 1)$. We do this by calculating the standard normal variable (Z).

The original formula for computing the Z variable from a sample is:

$$Z = \frac{x - \bar{x}}{s / \sqrt{n}}$$

However, if we look up the resulting Z value (15.94) in the cumulative probability tables for a standard normal distribution, from now on called 'Z-tables', the table only runs up to 4, which means that the implied probability for N(15.94) is 1. If this were true, then P(X > 0.5%) would be zero (i.e., 1 - 1 = 0%), so there would be no chance of a return greater than 0.5%. But the histogram computed in point 1 shows several outcomes above 0.5%.

The problem arises when the standard deviation is divided by the square root of n ($\sqrt{n}$). In this case, the sample size is so large that we can assume we are working with the entire population. When the data is treated not as a sample but as the whole population, we have computed the population's mean and standard deviation, and Z becomes:

$$Z = \frac{x - \mu}{\sigma} = \frac{0.5\% - \mu}{\sigma} = 0.45$$

Now the Z variable is more coherent. Looking up 0.45 in the Z-table gives the area between the mean and Z:

P(0 < Z < 0.45) = 0.1736 = 17.36%

P(X > 0.5%) = P(Z > 0.45) = 0.5 - 0.1736 = 0.3264

So, the probability that a return exceeds +0.5% is 32.64%.
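The table lookup can be cross-checked numerically. The sketch below uses Python's standard library and takes Z = 0.45 from the text; the function name is ours. It distinguishes the area between the mean and Z (the figure a half-table reports, 0.1736) from the upper tail beyond Z, which is 0.5 minus that area.

```python
from statistics import NormalDist

def prob_above(z):
    # Upper-tail probability P(Z > z) for a standard normal variable
    return 1 - NormalDist().cdf(z)

z = 0.45
area_mean_to_z = NormalDist().cdf(z) - 0.5   # about 0.1736, the Z-table entry
p_tail = prob_above(z)                        # about 0.3264, i.e. 0.5 - 0.1736
```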

What is the probability that a return is between -2% and +2%?

When we compute the Z variable with the formula for sample data, we again obtain an incoherent result. Looking up N(60.2) in the Z-table gives a probability of 1, and from our previous evaluation we know that this is incorrect. Once again, the difference arises because we are assuming that we are working with a sample. If we work with the population instead, the Z statistic is:

$$Z = \frac{x - \mu}{\sigma} = \frac{2\% - \mu}{\sigma} = 1.64$$

The Z variable is now more coherent, and looking this value up in the Z-table we find that the area between the mean and Z = 1.64 is 0.4495 = 44.95%. Treating the interval as symmetric about the mean,

P(-2% < X < 2%) = 2 × 44.95% = 89.9%

In conclusion, the probability that a return is between -2% and +2% is approximately 90%, which is a coherent result based on our prior knowledge of the histogram.
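Doubling the table area of 0.4495 can be checked with a short sketch. It assumes, as the calculation above does, that the interval is symmetric about the mean so that the lower bound standardizes to -1.64; the function name is ours.

```python
from statistics import NormalDist

def prob_between(z_low, z_high):
    # P(z_low < Z < z_high) for a standard normal variable
    std_normal = NormalDist()
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)

p_inside = prob_between(-1.64, 1.64)   # about 0.899, i.e. 2 x 0.4495
```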

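Give a 95% confidence interval for the population mean

If we assume that the returns of the stock follow a t-distribution, a 95% confidence interval is constructed as:

$$\bar{x} \pm t_{\alpha/2,\, n-1} \cdot \frac{s}{\sqrt{n}}$$

The sample mean return is -0.00034723 with a standard deviation of 0.02640066. With 2098 observations there are n - 1 = 2097 degrees of freedom, which is effectively ∞ in the t-table, and α = 1 - 95% = 5%, so α/2 = 0.025.

We can look up in the table that for a two-sided confidence level of 95% and ∞ degrees of freedom the value is 1.96. With the standard error of 0.00057638, the interval is approximately -0.00034723 ± 0.00113, that is, from about -0.00148 to 0.00078.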
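The interval can be reproduced from the summary statistics alone. This is a minimal sketch: it uses the normal critical value in place of the t value, which is an assumption that matters little with 2097 degrees of freedom, and the function name is ours.

```python
import math
from statistics import NormalDist

def mean_confidence_interval(mean, sd, n, level=0.95):
    # Standard error of the mean
    se = sd / math.sqrt(n)
    # Two-sided critical value, about 1.96 for a 95% level
    crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = crit * se
    return mean - half_width, mean + half_width, se

low, high, se = mean_confidence_interval(-0.00034723, 0.02640066, 2098)
half_width = (high - low) / 2
```

The half-width comes out at roughly 0.00113, in line with the "Confidence Level (95.0%)" figure in the summary table, and the interval contains zero.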

Is the mean of the daily returns of "Stock X" significantly different from the mean of the daily returns of "Market Index"?

Test concerning two means:

Stock returns

Market return

1. Hypotheses: H0: μ1 = μ2 (same as μ1 - μ2 = 0)

Ha: μ1 ≠ μ2 (same as μ1 - μ2 ≠ 0)

2. Significance level: α = 0.05

3. Test statistic (two-tail test): critical Z = ±1.96

4. Actual Z: $Z = (\bar{x}_1 - \bar{x}_2) / \sqrt{s_1^2/n_1 + s_2^2/n_2}$, where Z = -1.20

5. Conclusion: Because -1.20 is greater than -1.96 (that is, |Z| < 1.96), we cannot reject the null hypothesis and conclude that the means of the two series (market vs stock) are not significantly different at the 95% confidence level. In other words, the mean of the daily returns of the stock is statistically indistinguishable from the mean of the daily returns of the market.

2.2 Would your answer be different if you did not know what kind of distribution the population follows?

Yes. The probability of any outcome of a random phenomenon is the proportion of times the outcome would occur in a very long series of repetitions, where each repetition is an independent trial (or an independent event). Two random events A and B are independent if knowing that A occurs does not influence the probability that B occurs. Similarly, a trial is independent if its outcome does not influence the outcome of any other trial.

If we did not know what distribution the population follows, we could still assume, by the central limit theorem, that the sample mean follows a normal distribution, provided the data consists of a large number of observations (usually more than 30). The central limit theorem is useful for statistical inference and regression because it allows us to make precise probability statements about the population mean using the sample mean, whatever the distribution of the population: the theorem states that the sample mean follows an approximately normal distribution for large samples. Hence, for very large samples, knowing the population distribution is irrelevant, because we can simply assume that the sample mean follows a normal distribution.

When the variance of the population is unknown, it is prudent to assume that the sample follows a t-distribution and to use a t-reliability factor to construct confidence intervals and hypothesis tests, as the t-distribution has fatter tails and so gives a more conservative result. Yet for more than 120 degrees of freedom the t-distribution table gives practically the same values as the normal distribution, which is what actually happened in our exercise. For t0.025 and ∞ degrees of freedom, the t-distribution table gives a t-reliability factor of 1.96, the same as Z0.025. A different result is possible with fewer observations (fewer than 120), because then the t-reliability factor would be marginally different from the normal one (e.g., t0.025, 100 = 1.984 ≠ 1.96).

Regression

Consider the daily returns of Stock X as the dependent variable and the daily returns of Market Index as the independent variable in a simple regression. Compute the estimates for a and b in y = a + bx for the two previously specified subsamples; Give a graphical representation of your results

Regression analysis is concerned with measuring the way in which one variable is related to another. In this case we want to evaluate whether the stock returns are explained by, or related to, the market returns. The objective of a simple regression analysis is to arrive at the linear equation that best describes the relationship between the two variables x and y. Here, x corresponds to the market data and y to the stock data.

Results of the simple regression analysis for the samples: for the linear equation y = a + bx, a = -0.0033 and b = 7.2352. Therefore, the estimated relationship between market returns and stock returns is Y = -0.0033 + 7.2352X.

Assuming a linear relationship, for every one-unit increase in the market return, the stock return can be expected to increase by 7.2352 units on average. With no change in the market, the stock return is likely to be -0.0033. The positive slope of the line means that the relationship between the two variables is positive (i.e., not inverse): when the market return increases, the stock return also tends to increase.
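For reference, the least-squares estimates of a and b in y = a + bx can be computed by hand as sketched below. The market and stock series are hypothetical and constructed so that the true line is known; the function name is ours.

```python
def ols_fit(x, y):
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and sum of cross-products of x and y
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept
    return a, b

# Hypothetical returns: the stock moves exactly 2x the market plus 0.001
market = [0.010, -0.020, 0.005, 0.000, 0.015]
stock = [2 * m + 0.001 for m in market]
a, b = ols_fit(market, stock)
```

With real return data the points scatter around the fitted line, and the quality of the fit has to be judged separately through the standard errors and R^2.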

Statistics of regression

| Statistic | Value |
| --- | --- |
| Multiple correlation coefficient (Multiple R) | 0.02485623 |
| Coefficient of determination R^2 | 0.00061783 |
| Adjusted R^2 | 0.0001408 |
| Standard error | 0.02639705 |
| Observations | 2097 |

Analysis of variance (ANOVA)

| | Degrees of freedom | Sum of squares | Mean square | F | Significance F |
| --- | --- | --- | --- | --- | --- |
| Regression | 1 | 0.000902472 | 0.000902472 | 1.29515878 | 0.255229555 |
| Residual | 2095 | 1.459804875 | 0.000696804 | | |
| Total | 2096 | 1.460707347 | | | |

| | Coefficient | Standard error | t statistic | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Intercept | -0.00335416 | 0.002716546 | -1.23471521 | 0.21707498 | -0.00868157 | 0.00197325 | -0.00868157 | 0.001973249 |
| Market Return | 7.2352 | 6.3575 | 1.138050428 | 0.25522956 | -5.23252 | 1.9703 | -5.23252 | 1.97028 |

Test the significance of estimates (are your estimates of a different from zero, are your estimates of b different from zero?)

Testing significance of b:

When testing the significance of the estimate b, we test the null hypothesis H0: B = 0, that is, that the slope of the population regression line is zero. The alternative hypothesis is that the slope of the population line is greater than zero, Ha: B > 0.

To test this hypothesis we compare the 'b' obtained in the regression analysis with the hypothesised value 'B'.

1. Hypotheses: H0: B = 0

Ha: B > 0

2. Significance level: α = 0.05

3. Test statistic (one-tail test): critical t(n - 2) = critical t(565 - 2) = 1.64

4. $t = (b - B)/s_b = (7.2352 - 0)/6.3575 = 1.1380$

5. Conclusion: Because 1.1380 < 1.64, we cannot reject the null hypothesis. At the 5% significance level there is no evidence that the slope is greater than zero, so information about the market (x) does not reliably help us predict values of the stock (y).

An alternative way of testing the hypothesis is the p-value, which provides more precise information about the strength of the evidence. The smaller the p-value, the stronger the evidence against the null hypothesis. The regression output reports a two-tailed p-value of 0.2552 for the b estimate, which is well above 0.05, so the evidence that B differs from 0 is weak.

Testing the significance of a:

When testing the significance of the estimate 'a', we test the null hypothesis H0: A = 0, that is, that the intercept of the regression line is zero. The alternative hypothesis is that the intercept of the population line is different from zero, Ha: A ≠ 0.

To test this hypothesis we compare the 'a' obtained in the regression analysis with the hypothesised value 'A'.

1. Hypotheses: H0: A = 0

Ha: A ≠ 0

2. Significance level: α = 0.05

3. Test statistic (two-tail test): critical t(n - 2) = critical t(565 - 2) = ±1.96

4. $t = (a - A)/s_a = (-0.0033 - 0)/0.00271 = -1.2177$

5. Conclusion: Because |-1.2177| < 1.96, we cannot say that the population intercept is different from zero at the 95% confidence level. In other words, we cannot reject the null hypothesis that the intercept (a) is equal to zero.

6. P-value: For this estimate the regression output reports a p-value of 0.217, which is greater than our level of significance (α = 0.05). This is consistent with not rejecting the null hypothesis.
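The t statistics and p-values for both coefficients can be recomputed from the estimates and standard errors in the regression output. The sketch below uses the normal distribution as a stand-in for the t-distribution, an assumption that is close with roughly 2000 degrees of freedom; the function names are ours.

```python
from statistics import NormalDist

def t_stat(estimate, std_error, hypothesised=0.0):
    # t = (estimate - hypothesised value) / standard error
    return (estimate - hypothesised) / std_error

def two_tail_p_normal(t):
    # Two-tailed p-value using the normal approximation to the t-distribution
    return 2 * (1 - NormalDist().cdf(abs(t)))

t_b = t_stat(7.2352, 6.3575)             # slope, about 1.138
t_a = t_stat(-0.00335416, 0.002716546)   # intercept, about -1.235
p_b = two_tail_p_normal(t_b)             # about 0.255
p_a = two_tail_p_normal(t_a)             # about 0.217
significant = [p < 0.05 for p in (p_a, p_b)]
```

Both p-values are well above 0.05, in line with the probability column of the regression output.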

| | Coefficient | Standard error | t statistic | Probability |
| --- | --- | --- | --- | --- |
| Intercept a | -0.0033542 | 0.002716546 | -1.23471521 | 0.21707498 |
| Coefficient b | 7.2352 | 6.3575 | 1.138050428 | 0.25522956 |

From the table above, the estimated coefficients differ from zero, but the differences are not statistically significant. The significance of the estimates was assessed with a Student's t-test at the 95% confidence level. The probability column shows a p-value of about 21% for coefficient a and about 25% for coefficient b. For the coefficients to be significant at the 95% confidence level, these p-values would have to be below 5%. So, we conclude that the estimates are not significant.

How would you measure and qualify the explanatory power of these regressions? Comment your results.

One way of measuring the power of these regressions, or the strength of the relationship between the two variables, is to look at R^2, the coefficient of determination. The coefficient of determination measures how much of the total variation is explained by the independent variable. For the data considered, the coefficient of determination is 0.000617832, which means that less than 1% of the variation in the stock returns is explained by the variation in the market, while more than 99% is unexplained. We will test the significance of r to see whether the correlation is significantly different from zero.

Sample 1 - Testing significance of r

1. Hypotheses: H0: ρ = 0

Ha: ρ ≠ 0

2. Significance level: α = 0.05

3. Test statistic (two-tail test) with n - 2 degrees of freedom: critical t(n - 2) = critical t(565 - 2) = 1.96

4. $t = (r - 0)/s_r$, where $s_r = \sqrt{(1 - r^2)/(n - 2)}$, so t = (0.0248 - 0)/0.0310 = 0.8

5. Conclusion: Because 0.8 < 1.96, we cannot reject the null hypothesis and say that, at the 95% confidence level, the correlation between the market and the stock returns is not significant.

Statistics of regression

| Statistic | Value |
| --- | --- |
| Multiple correlation coefficient (Multiple R) | 0.02485623 |
| Coefficient of determination R^2 | 0.00061783 |
| Adjusted R^2 | 0.0001408 |
| Standard error | 0.02639705 |
| Observations | 2097 |

The table above shows the poor explanatory power of the regression. The multiple correlation coefficient is close to zero, which means that there is almost no relationship between the movements of the market index and those of the stock. The coefficient of determination (R squared) is also almost zero: only 0.06% of the variation in the stock can be explained by movements in the market index. So, we cannot draw conclusions about movements in the stock based on movements in the market index.
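The coefficient of determination and the F statistic follow directly from the sums of squares in the analysis of variance table. A minimal sketch, with function names of our own choosing and the inputs copied from that table:

```python
def r_squared(ss_regression, ss_total):
    # Share of the total variation explained by the regression
    return ss_regression / ss_total

def f_statistic(ss_regression, df_regression, ss_residual, df_residual):
    # Ratio of the mean square of the regression to the mean square of the residuals
    return (ss_regression / df_regression) / (ss_residual / df_residual)

r2 = r_squared(0.000902472, 1.460707347)
f = f_statistic(0.000902472, 1, 1.459804875, 2095)
explained_pct = 100 * r2   # about 0.06 (percent of variation explained)
```

The results reproduce the reported R^2 of about 0.00062 and F of about 1.295.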

Conclusion

In conclusion, the statistical analysis of Market Index and Stock X indicates that prices do not follow a distinguishable distribution, while returns follow a normal distribution.

The Market Index price has a mean of US$417.58 and a standard deviation of US$90.65. The market index price also shows a bimodal frequency distribution, with a significant amount of observations occurring at the upper tail. The data is positively skewed and the mode (429) < median (415) < mean (417).

The Stock X price has a mean of US$242.42 and a standard deviation of US$123.25. Compared to the market index, with its standard deviation of US$90.65, stock X has a higher variation in price. The frequency distribution graph shows a unimodal set of observations but, again, a significant number of observations occur at the upper extreme. The data is negatively skewed and the mode (468) > median (220) > mean (242).

Stock Returns and Market Returns follow a normal distribution which is completely described by two parameters, mean and standard deviation. Stock returns ~N(-0.05%, 1.09%) and Market Returns ~N(0.01%, 0.69%), but the first one has positively skewed returns, whereas the market is negatively skewed.

The central limit theorem gives us the flexibility of not having to know the distribution that the population follows: for very large samples this is irrelevant, because we can assume that the sample mean follows a normal distribution. If the population variance is unknown, it is better to assume that the sample follows a t-distribution and to construct confidence intervals and hypothesis tests based on a t-reliability factor. When the number of observations exceeds 120, however, the choice between a t-factor and a Z-factor makes little difference, because the t-distribution converges to the normal distribution as the degrees of freedom grow.

For the two time-frame analysis, we concluded that the first time-frame period was marked by a higher volatility for both variables relative to the second time-frame period. The volatility of the stock X returns in period 1 was significantly different than the one experienced in the second period. For market returns, period 1 was also marked by a higher volatility than period 2, and the variance of the two subsamples was significantly different.

For a normal distribution, approximately 95% of all observations fall in the interval μ ± 2σ. For our stock X returns data, the probability that a return lies between -2% and 2% was approximately 90% (2 × 44.95%).

The regression analysis showed that the line of best fit for the samples is described by the linear equation Y = -0.0033 + 7.2352X, with an intercept (a) of -0.0033 and a slope (b) of 7.2352. The tests of significance for our estimates showed that neither the intercept nor the slope is significantly different from zero at the 95% confidence level. Additionally, only 0.06% of the variation in the stock returns is explained by the market, so we can say that the stock returns are not significantly explained by the variation in the market returns.