Cheatography

# Hypothesis Testing Cheat Sheet by mmmmy

This cheat sheet gives a basic introduction to hypothesis testing in statistics and summarizes the most common tests.

### Introduction

| Term | Definition |
| --- | --- |
| Statistical hypothesis | A hypothesis that is testable on the basis of observed data modeled as the realized values taken by a collection of random variables |
| Statistical hypothesis testing | A statistical way of testing an assumption about a population parameter |

### Steps of formulating a hypothesis

1. State the two hypotheses: the null hypothesis and the alternative hypothesis
2. Set the significance level, usually α = 0.05
3. Carry out the hypothesis test: calculate the test statistic and the corresponding P-value
4. Compare the P-value with the significance level, then decide whether to reject or fail to reject the null hypothesis
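
The final step, comparing the P-value with the significance level, can be sketched in a few lines of Python. The function name `decide` and the convention of rejecting when the P-value is less than or equal to α are assumptions made for illustration; some texts use a strict inequality.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Compare a P-value with the significance level alpha.

    Assumed convention: reject the null hypothesis when p_value <= alpha.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie in [0, 1]")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return "reject H0" if p_value <= alpha else "fail to reject H0"


print(decide(0.03))  # reject H0
print(decide(0.20))  # fail to reject H0
```

Note that the outcome is phrased as "fail to reject" rather than "accept": a large P-value does not show that the null hypothesis is true.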

### Errors in Testing

| Error type | Description | Notation | Probability of correct inference |
| --- | --- | --- | --- |
| Type I error | Reject the null when the null is true | α = P(Type I error) (significance level) | 1 - α (confidence level) |
| Type II error | Fail to reject the null when the null is false | β = P(Type II error) | 1 - β (= power) |

### Chi-Square Test

| Type | Description |
| --- | --- |
| Test for independence | Tests whether two categorical variables are independent |
| Test for homogeneity | Tests whether two or more subgroups of a population share the same multinomial distribution |
| Goodness of fit | Tests whether a multinomial model for the population distribution (P1, ..., Pm) fits the data |

The tests for independence and for homogeneity share the same test statistic and degrees of freedom; they differ only in the design of the experiment.

Assumptions
1. one or two categorical variables
2. independent observations
3. mutually exclusive outcomes
4. large n and no more than 20% of expected counts < 5
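
The last assumption can be checked mechanically once the expected counts are known. The sketch below is illustrative only: the function name `expected_counts_ok` and its parameters are hypothetical, and it takes the expected counts as a flat list.

```python
def expected_counts_ok(expected, threshold=5.0, max_share=0.20):
    """Return True if no more than max_share of expected counts fall below threshold."""
    if not expected:
        raise ValueError("expected must contain at least one count")
    small = sum(1 for e in expected if e < threshold)
    return small / len(expected) <= max_share


print(expected_counts_ok([15, 15, 15, 15]))    # True: no small cells
print(expected_counts_ok([3, 4, 20, 30, 40]))  # False: 40% of cells are below 5
```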

### F-test

| Term | Description |
| --- | --- |
| ANOVA (analysis of variance) | Compares the means of two or more continuous populations |
| One-way layout | A test that allows one to make comparisons between the means of two or more groups of data |
| Two-way layout | A test that allows one to make comparisons between the means of two or more groups of data, where two independent variables are considered |

Assumptions
1. each observation y is normally distributed
2. the variance of each treatment group is the same
3. all observations are independent

### T-test

| Type | Hypothesis tested |
| --- | --- |
| Two Sample T-test | Whether two independent groups have different means |
| Paired T-test | Whether one group has different means at different times |
| One Sample T-test | Whether the mean of a single group differs from a known mean |

Assumptions
1. observations are independent
2. data are normally distributed
3. the groups being compared have a similar amount of variance

### One sample T-test

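$$t = \frac{m - \mu}{s / \sqrt{n}}$$

where
m = the mean of the sample
μ = the known mean being tested against
s = the standard deviation of the sample
n = the sample size
degrees of freedom = n - 1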
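
A minimal Python sketch of the one-sample statistic, assuming the standard form t = (m - μ) / (s / √n) with m, s and n as defined above. The function name `one_sample_t` and the argument `mu0` (the known mean being tested against) are illustrative names, not part of the original sheet.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t, degrees_of_freedom) for a one-sample t-test."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    m = statistics.mean(sample)    # sample mean
    s = statistics.stdev(sample)   # sample standard deviation (n - 1 denominator)
    t = (m - mu0) / (s / math.sqrt(n))
    return t, n - 1


t, df = one_sample_t([1, 2, 3, 4, 5], mu0=2)
print(f"t = {t:.4f}, df = {df}")
```

The P-value is then read from the t distribution with the returned degrees of freedom, for example from a table or a statistics library.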

### Paired T-test statistics

$$t = \frac{m}{s / \sqrt{n}}$$

where
m = the mean of the differences between the two paired sets of data
n = the number of differences (pairs)
s = the standard deviation of the differences between the two paired sets of data
degrees of freedom = n - 1
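
The paired statistic is a one-sample statistic computed on the pairwise differences and tested against a mean difference of 0. The sketch below assumes the standard form t = m / (s / √n); the function name `paired_t` and the argument names `first` and `second` are hypothetical.

```python
import math
import statistics


def paired_t(first, second):
    """Return (t, degrees_of_freedom) for a paired t-test on second - first."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    m = statistics.mean(diffs)    # mean of the differences
    s = statistics.stdev(diffs)   # standard deviation of the differences
    return m / (s / math.sqrt(n)), n - 1


t, df = paired_t([10, 12, 9, 11], [11, 14, 12, 15])
print(f"t = {t:.4f}, df = {df}")
```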

### Independent two-sample T-test statistics

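$$t = \frac{m_A - m_B}{\sqrt{\dfrac{s_p^2}{n_A} + \dfrac{s_p^2}{n_B}}}, \qquad s_p^2 = \frac{\sum (x - m_A)^2 + \sum (x - m_B)^2}{n_A + n_B - 2}$$

where
$m_A$, $m_B$ = the means of groups A and B respectively
$n_A$, $n_B$ = the sizes of groups A and B respectively
$s_p^2$ = the pooled variance of the two groups
degrees of freedom = $n_A + n_B - 2$ (given that the two samples have the same variance)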
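
A short Python sketch of the equal-variance (pooled) version of the statistic, which is the case the degrees of freedom above refer to. The pooled-variance form used here is the standard textbook one and is an assumption about what the sheet intends; the function name `pooled_two_sample_t` is illustrative.

```python
import math
import statistics


def pooled_two_sample_t(group_a, group_b):
    """Return (t, degrees_of_freedom) assuming both groups share one variance."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    m_a, m_b = statistics.mean(group_a), statistics.mean(group_b)
    # Sums of squared deviations around each group's own mean.
    ss_a = sum((x - m_a) ** 2 for x in group_a)
    ss_b = sum((x - m_b) ** 2 for x in group_b)
    df = n_a + n_b - 2
    pooled_var = (ss_a + ss_b) / df
    t = (m_a - m_b) / math.sqrt(pooled_var / n_a + pooled_var / n_b)
    return t, df


t, df = pooled_two_sample_t([1, 2, 3], [2, 4, 6])
print(f"t = {t:.4f}, df = {df}")
```

If the two variances cannot be assumed equal, this statistic and its degrees of freedom no longer apply as written.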

### Test of independence and homogeneity

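$$\chi^2 = \sum_{r=1}^{R} \sum_{c=1}^{C} \frac{(O_{r,c} - E_{r,c})^2}{E_{r,c}}$$

where
$O_{r,c}$ = the observed count in row r, column c
$E_{r,c} = (N_r \cdot N_c)/n$ = the expected count, with $N_r$ and $N_c$ the row and column totals and n the grand total
df = (R - 1) * (C - 1)
C = number of columns
R = number of rows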
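
As a sketch, the statistic for a contingency table can be computed from the row totals, column totals and grand total alone. The code assumes the usual Pearson form, the sum of (observed - expected)² / expected over all cells; the function name `chi_square_independence` and the list-of-rows input format are illustrative choices.

```python
def chi_square_independence(table):
    """Return (chi_square, degrees_of_freedom) for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)  # grand total
    stat = 0.0
    for r, row in enumerate(table):
        for c, observed in enumerate(row):
            expected = row_totals[r] * col_totals[c] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


stat, df = chi_square_independence([[10, 20], [20, 10]])
print(f"chi-square = {stat:.4f}, df = {df}")
```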

### Goodness of fit test

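$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$

where:
O = observed count in each category
E = expected count in each category under the model
n = number of categories
k = dimension of the parameter (number of parameters estimated from the data)
df = n - 1 - k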
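
A minimal sketch of the goodness-of-fit computation, assuming the usual Pearson statistic, the sum of (O - E)² / E over the categories, with expected counts obtained from the model probabilities. The function name `chi_square_gof` and the arguments `probs` and `k` are hypothetical; `k` defaults to 0 for a fully specified model with no estimated parameters.

```python
def chi_square_gof(observed, probs, k=0):
    """Return (chi_square, degrees_of_freedom) for a goodness-of-fit test.

    observed: observed counts per category
    probs:    model probabilities per category (should sum to 1)
    k:        number of parameters estimated from the data
    """
    if len(observed) != len(probs):
        raise ValueError("observed and probs must have the same length")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probs must sum to 1")
    total = sum(observed)
    expected = [total * p for p in probs]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1 - k
    return stat, df


stat, df = chi_square_gof([18, 22, 20, 40], [0.25, 0.25, 0.25, 0.25])
print(f"chi-square = {stat:.2f}, df = {df}")
```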

### Carrying out a one-way ANOVA test

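| Quantity | Meaning | Formula |
| --- | --- | --- |
| SST | total variation | $\sum_{i=1}^{I} \sum_{j=1}^{J} (Y_{ij} - \bar{Y}_{..})^2$ |
| SSW | within-group (intra-group) variation | $\sum_{i=1}^{I} \sum_{j=1}^{J} (Y_{ij} - \bar{Y}_{i.})^2$ |
| SSB | between-group (inter-group) variation | $\sum_{i=1}^{I} \sum_{j=1}^{J} (\bar{Y}_{i.} - \bar{Y}_{..})^2$ |

Here $Y_{ij}$ is observation j in treatment i, $\bar{Y}_{i.}$ is the mean of treatment i, and $\bar{Y}_{..}$ is the overall mean of Y.

Null hypothesis: the differential effect of every treatment group is 0
Alternative hypothesis: not all differential effects are 0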

SST = SSW + SSB

Test statistic:

$$F_{I-1,\,I(J-1)} = \frac{SSB/(I-1)}{SSW/\bigl(I(J-1)\bigr)}$$

where
I = number of different treatments
J = number of observations within each treatment
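
The whole calculation can be sketched in Python for a balanced design, that is, I treatments with J observations each, as defined above. The function name `one_way_anova` and the returned dictionary keys are illustrative, and the sums of squares are computed as squared deviations summed over every observation.

```python
def one_way_anova(groups):
    """Return SST, SSW, SSB and the F statistic for a balanced one-way layout.

    groups: list of I treatments, each a list of J observations.
    """
    i_treat = len(groups)
    j_obs = len(groups[0])
    if i_treat < 2 or j_obs < 2:
        raise ValueError("need at least two treatments and two observations each")
    if any(len(g) != j_obs for g in groups):
        raise ValueError("this sketch assumes the same J in every treatment")

    grand_mean = sum(sum(g) for g in groups) / (i_treat * j_obs)
    group_means = [sum(g) / j_obs for g in groups]

    sst = sum((y - grand_mean) ** 2 for g in groups for y in g)
    ssw = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g)
    # Each treatment mean is counted once per observation, hence the factor J.
    ssb = j_obs * sum((m - grand_mean) ** 2 for m in group_means)

    df_between = i_treat - 1
    df_within = i_treat * (j_obs - 1)
    f_stat = (ssb / df_between) / (ssw / df_within)
    return {"SST": sst, "SSW": ssw, "SSB": ssb, "F": f_stat,
            "df": (df_between, df_within)}


result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(result)
```

The returned values satisfy SST = SSW + SSB up to floating-point rounding, which is a useful sanity check on the arithmetic.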