Cheatography

Step 3 Cheat Sheet (DRAFT) by

Step 3: Summarize your data with descriptive statistics

This is a draft cheat sheet. It is a work in progress and is not finished yet.

Inspect your data

Frequency distribution
shown with frequency tables and bar charts for a single variable; use a scatter plot to view the relationship between two variables.
Normal distribution
means that your data are symmetrically distributed around a center where most values lie, with the values tapering off at the tail ends.
Skewed distribution
is asymmetric and has more values on one end than the other. Keep the shape of the distribution in mind, because only some descriptive statistics are appropriate for skewed distributions (for example, the median is usually a better summary than the mean).
Outliers
are extreme values that differ from most other data points in a dataset. They can have a big impact on your statistical analyses and skew the results of any hypothesis tests.
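
As a rough sketch of this inspection step, the snippet below builds a frequency table, prints it as a text bar chart, and flags possible outliers. The `responses` data and the helper names `frequency_table` and `flag_outliers` are made up for illustration, and the 1.5 × IQR fence is a common rule of thumb assumed here, not something this cheat sheet prescribes.

```python
from collections import Counter
from statistics import quantiles

# Hypothetical survey responses (illustrative data only).
responses = [3, 4, 4, 5, 5, 5, 6, 6, 7, 21]


def frequency_table(values):
    """Return (value, count) pairs sorted by value."""
    return sorted(Counter(values).items())


def flag_outliers(values, k=1.5):
    """Flag values outside the k * IQR fences (a rule of thumb, assumed here)."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Text bar chart: one '#' per occurrence of each value.
for value, count in frequency_table(responses):
    print(f"{value:>3} | {'#' * count}")

print("possible outliers:", flag_outliers(responses))
```

In this made-up data set the value 21 sits far from the rest and is the only one flagged. Whether to keep, correct, or drop a flagged value is a judgment call that depends on the study.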
 

Calculate measures of central tendency

Measures of central tendency describe where most of the values in a data set lie.
Mode:
the most frequent response or value in the data set. To find the mode, order your data set from lowest to highest and find the response that occurs most often.
Median:
the value in the exact middle of the data set when ordered from low to high. To find the median, order the response values from smallest to largest and take the one in the middle. If there are two values in the middle (an even number of observations), the median is their mean.
Mean:
the sum of all values divided by the number of values. To find the mean, simply add up all response values and divide the sum by the total number of responses. The total number of responses or observations is called N.
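
The three measures above can be sketched with Python's standard `statistics` module. The `scores` list is invented for illustration; `multimode` is used so that ties for the most frequent value are all returned.

```python
from statistics import mean, median, multimode

# Hypothetical response values (illustrative data only).
scores = [2, 3, 3, 5, 7, 10]

n = len(scores)             # N, the number of observations
modes = multimode(scores)   # list of the most frequent value(s)
middle = median(scores)     # averages the two middle values when N is even
average = mean(scores)      # sum of all values divided by N

print("N:", n)
print("mode(s):", modes)
print("median:", middle)
print("mean:", average)
```

With these values the mode is 3, the median is 4 (the mean of the two middle values, 3 and 5), and the mean is 30 / 6 = 5.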
 

Calculate measures of variability

Measures of variability tell you how spread out the values in a data set are. Four main measures are often reported:
Range:
the difference between the highest and lowest values in the data set. To find the range, subtract the lowest value from the highest value.
Interquartile range:
the range of the middle half of the data set. To find it, subtract the first quartile (Q1) from the third quartile (Q3).
Standard deviation:
roughly the average distance between each value in your data set and the mean. The standard deviation (s) is the square root of the variance and tells you, on average, how far each score lies from the mean. The larger the standard deviation, the more variable the data set is.
Variance:
the square of the standard deviation. The variance (s²) is the average of squared deviations from the mean (for a sample, the sum of squared deviations is divided by n − 1 rather than N). Variance reflects the degree of spread in the data set. The more spread out the data, the larger the variance is in relation to the mean.
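
A minimal sketch of the four measures, again using the standard `statistics` module. The `scores` sample is invented, and the choice of the "inclusive" quartile method is an assumption: several quartile conventions exist and they can give slightly different interquartile ranges.

```python
from statistics import quantiles, stdev, variance

# Hypothetical sample of scores (illustrative data only).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(scores) - min(scores)   # highest minus lowest

# Quartile cut points; the 'inclusive' method is one convention among several.
q1, _, q3 = quantiles(scores, n=4, method="inclusive")
iqr = q3 - q1                             # spread of the middle half

s2 = variance(scores)   # sample variance: sum of squared deviations / (n - 1)
s = stdev(scores)       # sample standard deviation: square root of s2

print("range:", value_range)
print("IQR:", iqr)
print("variance:", s2)
print("standard deviation:", s)
```

`variance` and `stdev` treat the data as a sample and divide by n − 1. If the data cover the whole population, `pvariance` and `pstdev` from the same module divide by N instead.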