Statistics Essays - Histogram

2015 words (8 pages) Essay

1st Jan 1970 Statistics

Disclaimer: This work has been submitted by a university student.

Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of UKEssays.com.


We use Excel to generate a box plot representing both the original and the corrected sets of data. The result is the following diagram:

The different methods of diagrammatic representation of statistical data are the bar chart, the histogram, the stem-and-leaf plot and the line plot. The bar chart is more appropriate for data from a discrete distribution that are summarised using a frequency distribution. A histogram is often used for representing data from a continuous variable which are summarised as a grouped frequency distribution. A histogram is therefore similar to a bar chart, but is used to present continuous data. A stem-and-leaf plot gives a visual representation similar to the histogram but has the advantage that it does not lose the detail of the individual data points in the grouping. All these diagrams serve to examine the general shape of the distribution of the data and help in making conjectures about the values of quantities such as the median, the mean or the interquartile range. The last one, the line plot, is often appropriate for smaller data sets, and can be useful, for example, to check whether two data sets have a common variance.

We denote by  and  the mean of the original set and the corrected set respectively. Then we have:


i.e. .


i.e. .

Since we have an even number of observations, the median in this case is the midpoint of the two middle observations. That is:

For the original set the median is ;
For the corrected set the median is .

The standard deviation of each data set is given by , where ,  are the different values in each data set. Hence:

For the original set, , and for the corrected set .

The lower quartile is defined to be the observation one quarter of the way through the ordered data, counting from below, and the upper quartile is the same but counting from above. The interquartile range is simply the difference between the upper and the lower quartile. We have the results in the following table.

                       Original set    Corrected set
Lower quartile         3.3925          3.3925
Upper quartile         3.815           3.7475
Interquartile range    0.4225          0.355
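The summary statistics discussed above can be sketched in a few lines of Python. This is only an illustration: the `sample` values are hypothetical (the essay's data sets are not reproduced here), the function name `summarise` is made up for this sketch, and the quartile rule used (`method="exclusive"`, which interpolates at position (n+1)/4) is an assumption that may differ from the rule the essay applied. Note also that `statistics.stdev` divides by n - 1; `statistics.pstdev` would be the choice if the intended formula divides by n.

```python
import statistics

def summarise(data):
    """Return the mean, median, sample standard deviation and quartiles of data."""
    values = sorted(data)
    # quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
    q1, _, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),   # midpoint of the two middle values when n is even
        "stdev": statistics.stdev(values),     # sample standard deviation (n - 1 denominator)
        "lower_quartile": q1,
        "upper_quartile": q3,
        "iqr": q3 - q1,                        # interquartile range
    }

# Hypothetical data, for illustration only.
sample = [3.1, 3.3, 3.4, 3.6, 3.7, 3.8, 3.9, 4.0]
summary = summarise(sample)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Running the same function on the original and on the corrected data would give the two columns of the table, provided the same quartile convention is used.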

Question 2

Theoretically, the fact that 9 and 12 can be made up in as many ways as 10 and 11 means that both sets of numbers should have the same probability of appearing. The first thing that should be noted here is that this is true if and only if, when we throw a die, all the numbers have the same probability of appearing. This is not always the case in practice, where we need to allow for considerations such as the non-uniformity of the surface on which the die is thrown, the angle and the velocity at which it is thrown, and even any deformation of the die, all of which have an effect on the number that we will get. This problem thus highlights the impossibility of probability being an absolutely precise science, as opposed to the other branches of mathematics.
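The equal-likelihood premise can be examined by brute-force enumeration. The sketch below assumes the question concerns the total shown by three fair six-sided dice; the number of dice is not stated in this text, so that setting is an assumption, and the function name `sum_counts` is invented for the example. It counts ordered outcomes, which is the appropriate count when every face of every die is equally likely.

```python
from collections import Counter
from itertools import product

def sum_counts(num_dice=3, sides=6):
    """Count the ordered outcomes that produce each total of num_dice fair dice."""
    faces = range(1, sides + 1)
    return Counter(sum(roll) for roll in product(faces, repeat=num_dice))

counts = sum_counts()          # assumption: three dice
total_outcomes = 6 ** 3        # 216 equally likely ordered outcomes

for total in (9, 10, 11, 12):
    print(total, counts[total], counts[total] / total_outcomes)
```

Under this assumption, 9 and 12 each arise from 25 ordered outcomes while 10 and 11 each arise from 27, even though each of the four totals corresponds to six unordered combinations of faces. With ideal fair dice, then, the number of unordered combinations alone does not determine the probability of a total.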

Question 3

  1. Let $p$ denote the probability that a film processed on machine X is of good quality. The quality of a film is independent of the quality of all the films processed before it. Thus the probability that three films randomly chosen from a batch coming from machine X are all of good quality is simply $p^3$.
  2. Let us denote by $A$ the event “the batch came from machine X” and by $B$ the event “the three films are all of good quality”. Clearly, what we are asking for is the probability that $A$ and $B$ occur at the same time, which is the probability that the three films are all of good quality and the batch came from machine X. Using the theory of conditional probabilities, we have:

\begin{displaymath}P(A \cap B) = P(B \mid A)P(A).\end{displaymath}

Here $P(A)$ is the proportion of all films that are processed on machine X, and $P(B \mid A)$ is simply the probability that we calculated above. Thus $P(B \mid A) = p^3$. Hence:

\begin{displaymath}P(A \cap B) = p^3 P(A).\end{displaymath}

Question 4

At each question only two things can happen:
1-the student can answer the question correctly, and we denote by $p$ the probability that this happens;
2-or the student can choose one of the wrong outcomes among the five possible, and we denote by $q$ the probability that this happens.

Obviously we must have $p + q = 1$. Given that only five outcomes are available at each question, only one of which is correct, we have $p = 1/5$ and $q = 4/5$.

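The experiment that consists in answering a single question can therefore be viewed as a Bernoulli experiment with parameter $p = 1/5$. Hence, taking the whole multiple-choice examination can be viewed as a Binomial experiment with parameters $(n, p)$, where $n$ denotes the number of questions. Let $X$ be the random variable representing the number of correct answers achieved by the student. Clearly, $X$ follows a Binomial distribution with parameters $(n, p)$. Writing $k$ for the number of correct answers needed to pass, the probability that the student passes the test is $P(X \geq k)$, which is equivalent to $1 - P(X \leq k - 1)$. But:

\begin{displaymath}P(X \leq k - 1) = \sum_{j=0}^{k-1} P(X = j),\end{displaymath}

where for each $j$, $P(X = j) = \binom{n}{j} p^{j} (1 - p)^{n - j}$.

Hence,
\begin{displaymath}P(X \geq k) = 1 - \sum_{j=0}^{k-1} \binom{n}{j} \left(\frac{1}{5}\right)^{j} \left(\frac{4}{5}\right)^{n-j}.\end{displaymath}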

This gives us , and thus the probability that the student passes the test is .
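The binomial tail probability used in this question can be evaluated with a short function. This is a minimal sketch: the function name `pass_probability` and the parameter names `n` (number of questions) and `k` (correct answers needed to pass) are introduced here for illustration, and the values in the usage line are hypothetical rather than those of the original examination.

```python
from math import comb

def pass_probability(n, k, p=0.2):
    """Return P(X >= k) for X ~ Binomial(n, p), computed as 1 - P(X <= k - 1)."""
    q = 1 - p
    # Sum the probabilities of scoring 0, 1, ..., k - 1 correct answers.
    below_pass = sum(comb(n, j) * p**j * q**(n - j) for j in range(k))
    return 1 - below_pass

# Hypothetical example: 20 questions, at least 10 correct answers to pass.
print(pass_probability(20, 10))
```

Computing the complement keeps the sum short when the pass mark is low; for a high pass mark, summing the upper tail directly would involve fewer terms.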

Question 5

Bayes’ Formula
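Let E, F be subsets of some sample space S, and let $F^c$ be the complement of F in S. We can express E as
\begin{displaymath}E = EF \cup EF^{c}\end{displaymath}

because in order for a point to be in E it must be either in E and F or in E but not in F. As $EF$ and $EF^c$ are mutually exclusive we can write
\begin{displaymath}P(E) = P(EF) + P(EF^{c}) = P(E \mid F)P(F) + P(E \mid F^{c})P(F^{c}).\end{displaymath}
Applying this to the conditional probability equation gives
\begin{displaymath}P(F \mid E) = \frac{P(EF)}{P(E)} = \frac{P(E \mid F)P(F)}{P(E \mid F)P(F) + P(E \mid F^{c})P(F^{c})}.\end{displaymath}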
Consider the following problem:

We have three boxes labelled U1, U2 and U3. Each of them contains a mix of white and red balls. The proportion of white balls in each of them is as follows: 30% for U1, 60% for U2, 40% for U3.
We draw one ball from U1; if it is a white ball then we draw a ball from U2, otherwise we draw a ball from U3.
We would like to find the probability that the first draw gives a red ball knowing that the second draw has given a white ball.

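We denote by $A_i$ the event “the second draw is made in the box Ui” and by $B$ the event “the second draw gives a white ball”.

Clearly, if the first draw gives a red ball, then the second can be made only in U3. Thus the probability that the first draw gives a red ball knowing that the second draw has given a white one is exactly the same as the probability that the second ball comes from U3 knowing that it is a white ball, which is nothing other than $P(A_3 \mid B)$. Using Bayes’ formula, we have

\begin{displaymath}P(A_3 \mid B) = \frac{P(B \mid A_3)P(A_3)}{P(B \mid A_2)P(A_2) + P(B \mid A_3)P(A_3)}. \qquad (1)\end{displaymath}

It can easily be seen that $A_2$ and $A_3$ are mutually exclusive, as the second draw cannot happen in both U2 and U3 simultaneously. Also, since the second draw can happen only in U2 or in U3, $A_2 \cup A_3$ covers all the possibilities for where the second draw can happen. That is why
\begin{displaymath}P(B) = P(B \mid A_2)P(A_2) + P(B \mid A_3)P(A_3).\end{displaymath}

The top of the fraction (1) is simply an application of the definition of conditional probability: $P(A_3 \cap B) = P(B \mid A_3)P(A_3)$.

Hence, with $P(A_2) = 0.3$, $P(A_3) = 0.7$, $P(B \mid A_2) = 0.6$ and $P(B \mid A_3) = 0.4$:
\begin{displaymath}P(A_3 \mid B) = \frac{0.4 \times 0.7}{0.6 \times 0.3 + 0.4 \times 0.7} = \frac{0.28}{0.46} \approx 0.609.\end{displaymath}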
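The three-box calculation can be checked with exact fractions. The sketch below uses only the proportions given in the problem statement (30%, 60% and 40% white balls); the dictionary `white` and the function name are hypothetical names chosen for this illustration.

```python
from fractions import Fraction

# Proportion of white balls in each box, as stated in the problem.
white = {"U1": Fraction(30, 100), "U2": Fraction(60, 100), "U3": Fraction(40, 100)}

def first_red_given_second_white(white):
    """Bayes' formula: P(second draw from U3 | second ball is white)."""
    p_from_u2 = white["U1"]          # first ball white, so second draw is from U2
    p_from_u3 = 1 - white["U1"]      # first ball red, so second draw is from U3
    numerator = white["U3"] * p_from_u3
    # Total probability that the second ball is white.
    evidence = white["U2"] * p_from_u2 + numerator
    return numerator / evidence

result = first_red_given_second_white(white)
print(result, float(result))   # 14/23, roughly 0.609
```

Using `Fraction` avoids rounding, so the answer comes out as the exact ratio 14/23 instead of a truncated decimal.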
