The Histogram

Histograms can be used to display entire data sets, must like box plots, however they use bars to display how many data points fall into various data ranges within the set. As a result, histograms roughly approximate the true frequency distribution of a data set, and the narrower the data ranges used are, the better the approximation. The ranges into which the data is grouped are referred to as classes, or bins, as they act as containers where each data point can be allocated to, hence ‘filling up’ the bin.

As such, histograms are most often used for discrete data sets with a wide range. For example, histograms could be used to indicate the earnings distribution of the UK. The histogram could show who earns less that £10,000, who earns £10,000 to £20,000 and so on, with a maximum bin for those earning £100,000 or more. The histogram would then plot the different ranges on the x axis and the number of data points in each bin on the y axis. However, for a large data set such as the population of the UK, it is possible to use the percentage of data points in each bin as the scale for the y axis. Alternatively, by using the probability that any one point will be in a certain bin as the scale for the y-axis, the histogram can be used to display the probability distribution of the data.

MBA Writing Service

Our professionally qualified writers are available to produce most types of academic work on the subject of statistics. From assignments and coursework to full dissertations we are bound to have a service to suit your needs:

Our Writing Services

Histograms are generally used to display the rough shape and degree of symmetry of any frequency distribution, and if it resembles any defined distribution such as Binomial or Normal. The shape of the distribution can also indicate any measurement bias in the data, by comparing the data to the known distribution. For example, if a study of a sample of the UK population showed that most of them earned above £100,000 a year, this would indicate that the sample was biased towards high earners, as in fact a very low percentage of highly achieving people in the UK earn this each year.

The usefulness of a histogram is intrinsically linked to the number and width of the bins used for the data. As discussed above, using more bins will tend to reveal the exact nature of the underlying distribution of the data. However, if the bins are made too narrow them the small number of data points in each bin will mean that random variations in the data will blur the distribution. As such, a histogram should be constructed with a variety of different bin widths to give a good idea of the pattern of the data, as well as any random variations which may affect it.

Related Content

On top of our MBA help guides we also have a range of free resources covering the topic of statistics: