AP Stat Test Cheat Sheet (DRAFT) by

AP Stat concepts (non-formula)

This is a draft cheat sheet. It is a work in progress and is not finished yet.

Five Number Summary

Minimum, Q1, Median, Q3, Maximum

IQR and Outliers

IQR: Q3-Q1
Low Outlier: below Q1 - 1.5(IQR)
High Outlier: above Q3 + 1.5(IQR)
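
A minimal Python sketch of the five number summary and the 1.5(IQR) outlier rule. It assumes the median-of-halves way of finding Q1 and Q3 (calculators and software may use slightly different quartile rules), and the data and function names are made up for illustration.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]            # values below the median position
    upper = xs[half + n % 2:]    # values above the median position
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

def outlier_fences(data):
    """Return (low_fence, high_fence) from the 1.5 * IQR rule."""
    _, q1, _, q3, _ = five_number_summary(data)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

scores = [2, 4, 5, 7, 8, 9, 10, 12, 30]  # made-up data
low, high = outlier_fences(scores)
outliers = [x for x in scores if x < low or x > high]
print(five_number_summary(scores), (low, high), outliers)
```

Any value below the low fence or above the high fence is flagged as an outlier.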

Boxplot

Histograms

Regular (frequency): equal-width bins, used to determine shape
Relative: divide each count by n to get a %
Cumulative: counts values ≤ each bin's upper bound, last bar = n, never used for shape
Cumulative relative: percentage based, 50% marks the median
Relative location = (# of values below)/n
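
A short Python sketch of how the histogram types relate, starting from plain bin counts. The counts are made up for illustration.

```python
from itertools import accumulate

counts = [3, 5, 8, 4]          # made-up frequency in each bin
n = sum(counts)                # total number of values

relative = [c / n for c in counts]               # share of values in each bin
cumulative = list(accumulate(counts))            # running total, last entry = n
cumulative_relative = [c / n for c in cumulative]  # running share, last entry = 1

print(relative)
print(cumulative)
print(cumulative_relative)
```

The bin where the cumulative relative frequency first reaches 50% is the bin that contains the median.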

Transformations

Addition (shift): affects center and position, not spread
Adds to sample mean (x bar), M, Q1, Q3; IQR and sigma stay the same
Multiplication (scale): affects center and spread
Multiplies mean (x bar), M, Q1, Q3, IQR, and sigma
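
A quick Python check of these rules on made-up data. Adding a constant moves the center but leaves the spread alone, while multiplying by a positive constant scales both.

```python
from statistics import mean, median, pstdev

data = [2, 4, 4, 6, 9]                 # made-up data
shifted = [x + 10 for x in data]       # add a constant
scaled = [x * 3 for x in data]         # multiply by a positive constant

for name, values in [("original", data), ("shifted", shifted), ("scaled", scaled)]:
    print(name, mean(values), median(values), round(pstdev(values), 3))
```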

"­Des­cribe the distri­but­ion­"

Center: Mean or median
Spread: Standard deviation, IQR, or range
Shape: Symmetric or skewed

Shape

Density Curves

-Always on or above x-axis
-Area = 1
-Has mean and median
-Tails can extend infinitely (as in a Normal curve)
-A single point has NO AREA (probability 0)
Median: equal areas point (point that divides curve in half)
Mean: balance point (where curve would balance if solid)
Symmetric curve: Mean = Median

"­Des­cribe the relati­ons­hip­" (Scatter plot)

Direction: +/- or neither (analyzed from L to R)
Form: linear or not
Strength: how correlated (see below)
Very strong: 100-91
Strong: 90-85
Moderately strong: 84-75
Moderately weak: 74-70
Weak: below 70
 

Residual Plots

Random scatter (no pattern): a line is a good model for the data
Clear pattern (such as a curve): a line is NOT a good model for the data
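
A small Python sketch of where residuals come from, assuming the usual definition residual = observed y minus predicted y from the least-squares line. The data are deliberately curved (y = x squared), so the residuals show a pattern.

```python
def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

xs = [1, 2, 3, 4, 5]
ys = [1, 4, 9, 16, 25]  # curved on purpose

slope, intercept = least_squares(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
print(residuals)  # positive, negative, negative, negative, positive: a U shape
```

The U shape in the residuals is the pattern that says a line is not a good model here.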

Coefficients

R2 (Coefficient of determination): represents the percentage of the variation in the y-variable that can be attributed to its linear relationship with the x-variable
Ex: if r-squared for the regression between x and y is .73, we can say that x accounts for 73% of the variation in y
R (Correlation coefficient): strength and direction of the linear relationship on a scale of -1 ≤ r ≤ 1
-1: perfectly linear (negative slope)
0: no linear relationship
1: perfectly linear (positive slope)
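
A minimal Python sketch that computes r and r-squared by hand. The function name and the data are illustrative assumptions, not part of the original sheet.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient r between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

hours = [1, 2, 3, 4, 5]          # made-up explanatory values
score = [52, 60, 61, 70, 77]     # made-up response values

r = correlation(hours, score)
r_squared = r ** 2
print(round(r, 3), round(r_squared, 3))
```

Swapping the two lists gives the same r, since the formula treats x and y symmetrically.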

Correlation

X and Y variable assignment doesn't matter
Quantitative values only
Non-resistant (affected by outliers)

Mini Tab!

Scatterplot Vocab

X-variable: explanatory/independent variable (cause)
Y-variable: response/dependent variable (effect)
Extrapolation: predicting outcome outside of the domain
Interpolation: predicting inside the domain [lowest X, highest X]

Sampling Methods

Simple random sample (SRS): every group of n individuals has equal probability of being selected
Ex: Hat method, calc, table of random digits*
Stratified random sample: take a random sample from each subgroup (good for comparison)
Cluster sample: randomly pick a few subgroups and sample everyone in each chosen subgroup
Systematic random sampling: select a sample using a system (like every 3rd) from a random starting point
*ignore a number if it is not a label in the population, skip it if it repeats
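
A rough Python sketch of the four sampling methods on a made-up population of 40 labeled individuals. The strata, cluster sizes, and seed are assumptions chosen only to keep the example small and repeatable.

```python
import random

rng = random.Random(0)                 # seeded so the sketch is repeatable
population = list(range(1, 41))        # labels 1..40

# SRS: every group of 8 labels is equally likely.
srs = rng.sample(population, 8)

# Stratified: random sample of 4 from EACH subgroup (two made-up strata).
strata = {"A": population[:20], "B": population[20:]}
stratified = [x for group in strata.values() for x in rng.sample(group, 4)]

# Cluster: randomly pick 2 whole clusters of 5 and take everyone in them.
clusters = [population[i:i + 5] for i in range(0, 40, 5)]
cluster_sample = [x for c in rng.sample(clusters, 2) for x in c]

# Systematic: random start, then every 5th individual.
start = rng.randrange(5)
systematic = population[start::5]

print(srs, stratified, cluster_sample, systematic, sep="\n")
```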

Bad Sampling

Voluntary response: people choose whether to respond, so extreme opinions are overrepresented
Convenience: chooses easiest individuals to reach
Under-coverage: some groups of the population aren't reached or accessible

Bad surveying

Non-response: selected individuals don't provide data or won't talk to you
Response: inaccurate answers (such as lying)
Poor wording: question wording leans respondents toward a biased answer
 

Principles of Experimental Design

Comparison: Use a design that compares two or more treatments
Random Assignment: Use chance to assign experimental units to treatments (balances effects of other variables)
Control: Keep other variables that might affect the response the same for all groups
Replication: Use enough experimental units in each group so that any differences in the effects of the treatments can be distinguished from chance differences between the groups
Completely randomized:
-The treatments are assigned to all the experimental units completely by chance
-Control group: group that receives an inactive treatment or an existing baseline treatment
-Placebo effect: response to a dummy treatment
-Double-blind experiment: neither the subjects nor those who interact with them and measure the response variable know which treatment a subject received
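
A minimal Python sketch of random assignment in a completely randomized design, assuming 12 hypothetical subjects split evenly between a treatment group and a control group.

```python
import random

rng = random.Random(1)                                # seeded for repeatability
units = [f"subject{i}" for i in range(1, 13)]         # hypothetical units

shuffled = rng.sample(units, len(units))              # random order of all units
treatment_group = shuffled[:6]                        # first half gets the treatment
control_group = shuffled[6:]                          # second half is the control

print(treatment_group)
print(control_group)
```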

Experimental Design

Matched pairs:
-Randomized block experiment in which each block consists of a matching pair of similar experimental units
-Chance is used to determine which unit in each pair gets each treatment
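
A small Python sketch of the matched pairs idea, with hypothetical pair labels. Within each pair, a coin flip decides which unit gets the treatment and which gets the control.

```python
import random

rng = random.Random(7)  # seeded only so the sketch is repeatable
pairs = [("A1", "A2"), ("B1", "B2"), ("C1", "C2")]  # hypothetical matched units

assignment = {}
for first, second in pairs:
    # Coin flip: swap the order half the time.
    if rng.random() < 0.5:
        first, second = second, first
    assignment[first] = "treatment"
    assignment[second] = "control"

print(assignment)
```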

Law of Large Numbers

As n becomes large the sample mean approaches the population mean
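
A quick Python simulation of the idea, assuming fair six-sided die rolls (population mean 3.5). The seed and sample sizes are arbitrary choices.

```python
import random

rng = random.Random(42)  # seeded so the sketch is repeatable

def mean_of_rolls(n):
    """Sample mean of n simulated fair die rolls."""
    return sum(rng.randint(1, 6) for _ in range(n)) / n

small = mean_of_rolls(10)        # can land far from 3.5
large = mean_of_rolls(100_000)   # should land very close to 3.5
print(small, large)
```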

Binomials

1. Each observation falls into one of just two categories – “success” or “failure”
2. The procedure has a fixed number of trials – (n)
3. The observations must be independent – result of one does not affect another
4. The probability of success (p) remains the same for each observation
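
When the four conditions hold, the chance of exactly k successes follows the standard binomial formula P(X = k) = C(n, k) p^k (1 - p)^(n - k). The formula is not on this sheet, so treat this Python sketch (with an illustrative function name) as a supplement.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k): probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Example: exactly 2 heads in 4 fair coin flips.
print(binomial_pmf(2, 4, 0.5))
```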

Geometric

(Binomial conditions 1, 3, and 4 still apply; the number of trials is NOT fixed)
Instead of 2: the variable of interest is the number of trials required to obtain the first success*
*Geometric is also called a “waiting-time” distribution
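
A matching Python sketch for the geometric case, using the standard formula P(X = k) = (1 - p)^(k - 1) p, meaning k - 1 failures followed by the first success. The function name is illustrative.

```python
def geometric_pmf(k, p):
    """P(X = k): probability the first success happens on trial k (k >= 1)."""
    return (1 - p) ** (k - 1) * p

# Example: first head on the 3rd fair coin flip.
print(geometric_pmf(3, 0.5))
```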

Error

Type I: Rejecting the Ho when it is actually true (a false positive); probability = alpha
Type II: Failing to reject the Ho when it is actually false (a false negative); probability = beta

Central Limit Theorem

As n becomes large, the sampling distribution of the sample mean (x bar) is approximately normal, whatever the shape of the population
(rule of thumb: n≥30)
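
A rough Python simulation of the theorem, assuming a strongly right-skewed population (exponential with mean 1 and standard deviation 1) and samples of size n = 30. The checks rely on two standard facts that are not on this sheet: the sample means should have standard deviation near sigma/sqrt(n), and about 68% of a Normal distribution lies within one standard deviation of its mean.

```python
import random
from statistics import mean, stdev

rng = random.Random(3)   # seeded so the sketch is repeatable
n = 30                   # sample size

def sample_mean():
    """Mean of one random sample of size n from a skewed population."""
    return sum(rng.expovariate(1.0) for _ in range(n)) / n

sample_means = [sample_mean() for _ in range(2000)]
center = mean(sample_means)    # expect about 1
spread = stdev(sample_means)   # expect about 1 / sqrt(30), roughly 0.18

# Share of sample means within one standard deviation of the center.
within_one_sd = sum(abs(m - center) <= spread for m in sample_means) / len(sample_means)
print(center, spread, within_one_sd)
```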