Cheatography
https://cheatography.com
This cheat sheet gives a basic introduction to statistical hypothesis testing and summarizes the most common tests
Introduction
Statistical hypothesis | a hypothesis that is testable on the basis of observed data, modeled as the realized values of a collection of random variables
Statistical hypothesis testing | a statistical procedure for testing an assumption about a population parameter
steps of hypothesis testing
1. state the two hypotheses: the null hypothesis and the alternative hypothesis
2. set the significance level, usually α = 0.05
3. carry out the test: calculate the test statistic and the corresponding p-value
4. compare the p-value with the significance level: reject the null hypothesis if p-value < α, otherwise fail to reject it
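The four steps can be sketched in code. The example below is only an illustration: it swaps in a two-sided one-sample z-test with a known population standard deviation (not one of the tests on this sheet) because its p-value can be computed with Python's standard library alone. The function name `z_test_two_sided`, the value of `sigma`, and the data are made up.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_test_two_sided(sample, mu0, sigma, alpha=0.05):
    """Walk through the four steps for a two-sided one-sample z-test.

    Assumes the population standard deviation `sigma` is known.
    """
    # Step 1: H0: population mean == mu0, H1: population mean != mu0.
    # Step 2: the significance level alpha is an input (0.05 by default).
    n = len(sample)
    # Step 3: test statistic and two-sided p-value under H0.
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Step 4: reject H0 only when the p-value is below alpha.
    return z, p_value, p_value < alpha

sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.4, 5.7, 5.9]  # made-up measurements
z, p_value, reject_null = z_test_two_sided(sample, mu0=5.0, sigma=0.5)
print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject_null}")
```

When the decision comes back `False`, the usual reading is "fail to reject the null", not "accept the null".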
Errors in Testing
Error type | Description | Notation | Correct inference
Type I error | reject the null when the null is true | α = P(Type I error) (significance level) | 1 - α (confidence level)
Type II error | fail to reject the null when the null is false | β = P(Type II error) | 1 - β (= power)
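To make the link between α, β and power concrete, here is a minimal sketch for a one-sided z-test with known standard deviation. The z-test and the function name `z_test_power` are assumptions chosen for illustration, since the normal distribution is available in Python's standard library.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(delta, sigma, n, alpha=0.05):
    """Power (1 - beta) of a one-sided z-test.

    delta is the true mean minus the null mean; sigma is assumed known.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)  # H0 is rejected when z > z_crit
    shift = delta * sqrt(n) / sigma         # mean of z when the true shift is delta
    beta = std_normal.cdf(z_crit - shift)   # P(fail to reject H0 | shift is delta)
    return 1 - beta

power_small_n = z_test_power(delta=0.5, sigma=1.0, n=10)
power_large_n = z_test_power(delta=0.5, sigma=1.0, n=40)
# With delta = 0 the null is true, so the rejection probability is alpha itself.
type_i_rate = z_test_power(delta=0.0, sigma=1.0, n=10)
print(power_small_n, power_large_n, type_i_rate)
```

With α held fixed, a larger sample lowers β and therefore raises the power.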
Chi-Square Test
Type | Description
Test for independence | tests whether two categorical variables are independent
Test of homogeneity | tests whether two or more subgroups of a population share the same distribution of a categorical variable
Goodness of fit | tests whether a multinomial model for the population distribution (P1, ..., Pm) fits the data
The test for independence and the test of homogeneity share the same test statistic and degrees of freedom; they differ only in the design of the experiment
Assumptions
1. one or two categorical variables
2. independent observations
3. outcomes mutually exclusive
4. large n and no more than 20% of expected counts < 5
F-test
ANOVA | compares the means of two or more continuous populations
One-way layout | compares the means of two or more groups defined by a single factor
Two-way layout | compares the means of two or more groups when two factors (independent variables) are considered
Assumptions about data:
1. the observations y in each group are normally distributed
2. every treatment group has the same variance
3. all observations are independent
T-test
Type | Hypothesis
Two-sample t-test | whether two independent groups have different means
Paired t-test | whether one group has different means at two different times
One-sample t-test | whether the mean of a single group differs from a known mean
Assumptions about data
1. observations are independent
2. data are normally distributed
3. the groups being compared have a similar amount of variance


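One-sample t-test statistic
$t = \frac{m - \mu}{s / \sqrt{n}}$
where
m = the mean of the sample
μ = the known mean the sample is tested against
n = the size of the sample
s = the standard deviation of the sample
degrees of freedom = n - 1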
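A minimal sketch of the one-sample statistic, assuming the usual form t = (m - μ) / (s / √n) with s the sample standard deviation. The function name `one_sample_t` and the data are made up.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu):
    """Return the one-sample t statistic and its degrees of freedom."""
    n = len(sample)
    m = mean(sample)   # sample mean
    s = stdev(sample)  # sample standard deviation (n - 1 in the denominator)
    t = (m - mu) / (s / sqrt(n))
    return t, n - 1

# Made-up data tested against a known mean of 4.0.
t_stat, df = one_sample_t([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0], mu=4.0)
print(t_stat, df)
```

The p-value then comes from a t distribution with `df` degrees of freedom, which needs a table or a statistics library.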
Paired t-test statistic
$t = \frac{m}{s / \sqrt{n}}$
where
m = the mean of the differences between the two paired sets of data
n = the number of differences (pairs)
s = the standard deviation of the differences between the two paired sets of data
degrees of freedom = n - 1
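The paired statistic is a one-sample statistic on the differences, tested against 0. A minimal sketch, assuming the form t = m / (s / √n); the function name `paired_t` and the measurements are made up.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Return the paired t statistic and its degrees of freedom."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)    # number of pairs
    m = mean(diffs)   # mean of the differences
    s = stdev(diffs)  # sample standard deviation of the differences
    return m / (s / sqrt(n)), n - 1

before = [10.0, 12.0, 9.0, 11.0, 13.0]  # made-up measurements at time 1
after = [12.0, 13.0, 11.0, 15.0, 14.0]  # the same subjects at time 2
t_stat, df = paired_t(before, after)
print(t_stat, df)
```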
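Independent two-sample t-test statistic
$t = \frac{m_A - m_B}{\sqrt{s^2 / n_A + s^2 / n_B}}$, with pooled variance $s^2 = \frac{\sum (x - m_A)^2 + \sum (x - m_B)^2}{n_A + n_B - 2}$
where
m_A, m_B = the means of groups A and B respectively
n_A, n_B = the sizes of groups A and B respectively
degrees of freedom = n_A + n_B - 2 (assuming the two samples have the same variance)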
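A minimal sketch of the equal-variance (pooled) two-sample statistic. It assumes the pooled variance is the sum of squared deviations in both groups divided by nA + nB - 2; the function name `pooled_two_sample_t` and the data are made up.

```python
from math import sqrt
from statistics import mean

def pooled_two_sample_t(group_a, group_b):
    """Return the pooled-variance two-sample t statistic and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    m_a, m_b = mean(group_a), mean(group_b)
    # Sums of squared deviations from each group's own mean.
    ss_a = sum((x - m_a) ** 2 for x in group_a)
    ss_b = sum((x - m_b) ** 2 for x in group_b)
    df = n_a + n_b - 2
    pooled_var = (ss_a + ss_b) / df
    t = (m_a - m_b) / sqrt(pooled_var / n_a + pooled_var / n_b)
    return t, df

t_stat, df = pooled_two_sample_t([1.0, 2.0, 3.0, 4.0, 5.0],
                                 [3.0, 4.0, 5.0, 6.0, 7.0])
print(t_stat, df)
```

If the two groups clearly have different variances, the pooled form no longer applies and Welch's unequal-variance version is the usual alternative.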
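Test of independence and test of homogeneity
$\chi^2 = \sum_{r,c} \frac{(O_{r,c} - E_{r,c})^2}{E_{r,c}}$
where
O_{r,c} = the observed count in row r, column c
E_{r,c} = (N_r * N_c) / n, the expected count (N_r = total of row r, N_c = total of column c, n = total number of observations)
df = (r - 1) * (c - 1), with
c = number of columns
r = number of rows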
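The expected counts and degrees of freedom above can be combined into the usual Pearson statistic, the sum over all cells of (O - E)^2 / E. A minimal sketch; the function name `chi_square_independence` and the 2 x 2 table are made up.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and df for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for r, row in enumerate(table):
        for c, observed in enumerate(row):
            expected = row_totals[r] * col_totals[c] / n  # E = (Nr * Nc) / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Made-up 2 x 2 table of counts.
chi2, df = chi_square_independence([[10, 20], [20, 10]])
print(chi2, df)
```

The same computation serves the test of homogeneity; only the way the data were sampled differs.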
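Goodness-of-fit test
$\chi^2 = \sum \frac{(O - E)^2}{E}$
where:
O = the observed count in a category
E = the expected count in that category under the model
n = the number of categories
k = the number of model parameters estimated from the data
df = n - 1 - k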
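A minimal sketch of the goodness-of-fit computation, assuming the Pearson form (sum of (O - E)^2 / E over categories) and taking df as the number of categories minus 1 minus k. The function name `chi_square_goodness_of_fit` and the die-roll counts are made up.

```python
def chi_square_goodness_of_fit(observed, probabilities, k=0):
    """Pearson chi-square statistic and df for a multinomial model.

    k is the number of model parameters estimated from the data
    (0 when the probabilities are fully specified in advance).
    """
    total = sum(observed)
    expected = [total * p for p in probabilities]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1 - k
    return chi2, df

# Made-up counts from 120 rolls of a die, tested against a fair-die model.
chi2, df = chi_square_goodness_of_fit([18, 22, 20, 25, 15, 20], [1 / 6] * 6)
print(chi2, df)
```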
Carrying out a one-way ANOVA test
SST | total variation | $\sum_{i} \sum_{j} (Y_{ij} - \bar{Y}_{..})^{2}$
SSW | within-group (intragroup) variation | $\sum_{i} \sum_{j} (Y_{ij} - \bar{Y}_{i.})^{2}$
SSB | between-group (intergroup) variation | $J \sum_{i} (\bar{Y}_{i.} - \bar{Y}_{..})^{2}$
Null hypothesis: the differential effect of every treatment group is 0
Alternative hypothesis: not all differential effects are 0
SST = SSW + SSB
test statistic:
$F_{I-1,\, I(J-1)} = \frac{SSB / (I - 1)}{SSW / [I(J - 1)]}$
where
$Y_{ij}$ = observation j in treatment i
$\bar{Y}_{i.}$ = mean of treatment i
$\bar{Y}_{..}$ = overall mean of Y
I = number of different treatments
J = number of observations within each treatment
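The decomposition and the F statistic can be sketched for a balanced layout (I treatments with J observations each). The between-group sum is taken as J times the sum of squared deviations of the treatment means from the overall mean, which is what makes SST = SSW + SSB hold. The function name `one_way_anova` and the data are made up.

```python
from statistics import mean

def one_way_anova(groups):
    """Sums of squares and F statistic for a balanced one-way layout."""
    I = len(groups)     # number of treatments
    J = len(groups[0])  # observations per treatment
    if any(len(g) != J for g in groups):
        raise ValueError("this sketch assumes the same J in every treatment")
    all_values = [y for g in groups for y in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    sst = sum((y - grand_mean) ** 2 for y in all_values)
    ssw = sum((y - gm) ** 2 for g, gm in zip(groups, group_means) for y in g)
    ssb = J * sum((gm - grand_mean) ** 2 for gm in group_means)
    f = (ssb / (I - 1)) / (ssw / (I * (J - 1)))
    return {"SST": sst, "SSW": ssw, "SSB": ssb, "F": f, "df": (I - 1, I * (J - 1))}

# Made-up data: three treatments, three observations each.
result = one_way_anova([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
print(result)
```

A large F relative to the F distribution with the returned degrees of freedom is evidence against the null hypothesis that all treatment effects are 0.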
