A set of statistical techniques used to describe and summarize a dataset. These techniques give a preliminary look at the central tendency, spread, and shape of the data and may be graphical, tabular, or numerical in nature.

Bar charts, histograms, and box plots are common graphical tools; tables may be used to enumerate the frequencies in different cells. A numerical summary of the dataset includes measures of central tendency (mean, median, mode) and dispersion (variance, standard deviation, range, interquartile range) and may also include measures of the shape of the distribution (skewness, kurtosis).
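As a rough illustration of the numerical summary described above, the following Python sketch computes each measure with the standard library only. The sample values and the function name `describe` are made up for this example. The sketch assumes population (divide-by-n) formulas for variance, skewness, and kurtosis, and the "inclusive" quartile convention; statistical packages differ on both choices, so their output may not match exactly.

```python
import statistics

# Made-up sample, used only for illustration.
data = [2, 4, 4, 5, 7, 9, 11]


def describe(values):
    """Return common descriptive statistics for a list of numbers.

    Assumes population (divide-by-n) formulas and the "inclusive"
    quartile method; other conventions give slightly different numbers.
    """
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")

    # Third and fourth central moments, used for the shape measures.
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n

    return {
        # Central tendency
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        # Dispersion
        "variance": statistics.pvariance(values),
        "std_dev": sd,
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        # Shape
        "skewness": m3 / sd ** 3,
        "excess_kurtosis": m4 / sd ** 4 - 3,
    }


summary = describe(data)
for name, value in summary.items():
    print(f"{name}: {value}")
```

A positive skewness here reflects the longer right tail of the sample; a symmetric sample would give a value near zero.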


Some software packages display a five-number summary to describe the dataset, which lists:

Minimum, the smallest value in the data set
First Quartile, the value below which the bottom 25% of the data points fall
Second Quartile, or Median, the value that splits the data points into two halves
Third Quartile, the value above which the top 25% of the data points fall
Maximum, the largest value in the data set
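
The five-number summary listed above can be sketched in Python as follows. The function name `five_number_summary` and the sample values are hypothetical, and the quartiles use the "inclusive" method of `statistics.quantiles` (Python 3.8 or later); software packages use several different quartile conventions, so this is one reasonable choice rather than a definitive one.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers.

    Quartiles use the "inclusive" interpolation method; this is an
    assumption, and other packages may report slightly different cut-offs.
    """
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


# Made-up sample, used only for illustration.
scores = [1, 3, 5, 7, 9, 11, 13, 15, 17]
result = five_number_summary(scores)
print(result)  # (1, 5.0, 9.0, 13.0, 17)
```

These are the same five values that a box plot draws: the box spans the first to third quartile, the line inside it marks the median, and the whiskers typically reach toward the minimum and maximum.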

