Cheatography
https://cheatography.com
This cheat sheet gives a basic introduction to hypothesis testing and summarizes common hypothesis tests in statistics
Introduction
Statistical hypothesis | a hypothesis that is testable on the basis of observed data, modeled as the realized values of a collection of random variables |
Statistical hypothesis testing | a statistical procedure for testing an assumption about a population parameter |
steps of hypothesis testing
1. state the two hypotheses: the null hypothesis and the alternative hypothesis |
2. set the significance level, usually α = 0.05 |
3. carry out the test: calculate the test statistic and the corresponding P-value |
4. compare the P-value with the significance level, then decide whether to reject or fail to reject the null hypothesis |
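The four steps can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: it uses the chi-square goodness-of-fit test described later in this sheet, the counts are made up, and the helper name `chi_square_gof` is hypothetical. It relies on the fact that a chi-square variable with 2 degrees of freedom has survival function exp(-x/2), so no statistics library is needed.

```python
import math

def chi_square_gof(observed, probabilities):
    """Return the chi-square goodness-of-fit statistic and its degrees of freedom."""
    total = sum(observed)
    expected = [total * p for p in probabilities]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # assumes no parameters were estimated from the data
    return statistic, df

# Step 1: H0: the three categories are equally likely; H1: they are not.
null_probabilities = [1 / 3, 1 / 3, 1 / 3]

# Step 2: set the significance level.
alpha = 0.05

# Step 3: compute the test statistic and P-value (illustrative counts, not real data).
observed = [30, 50, 40]
statistic, df = chi_square_gof(observed, null_probabilities)
p_value = math.exp(-statistic / 2)  # closed form, valid only because df == 2

# Step 4: compare the P-value with the significance level.
reject_null = p_value < alpha
print(f"chi2 = {statistic:.3f}, df = {df}, p = {p_value:.4f}, reject H0: {reject_null}")
```

With these counts the P-value is above 0.05, so the null hypothesis is not rejected.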
Errors in Testing
Error Types | Description | Denotation | Correct inference |
Type I error | Reject null when null is true | α = P(Type I error) (significance level) | 1 - α (confidence level) |
Type II error | Fail to reject null when null is false | β = P(Type II error) | 1 - β (= power) |
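A small simulation can make α and 1 - β concrete. The sketch below is an assumption-laden illustration: it uses a one-sample T-test on normal samples of size 10, rejects when |t| exceeds 2.262 (the two-sided critical value for 9 degrees of freedom at α = 0.05), and the function names, sample size, and effect size are all hypothetical choices.

```python
import random
import statistics

T_CRIT = 2.262  # two-sided critical value of t with 9 df at alpha = 0.05 (n must be 10)

def rejects_null(sample, mu0):
    """One-sample T-test decision: True if H0 (mean == mu0) is rejected."""
    n = len(sample)
    t = (statistics.mean(sample) - mu0) / (statistics.stdev(sample) / n ** 0.5)
    return abs(t) > T_CRIT

def rejection_rate(true_mean, trials=2000, n=10, seed=0):
    """Fraction of simulated samples for which H0 (mean == 0) is rejected."""
    rng = random.Random(seed)
    rejections = sum(
        rejects_null([rng.gauss(true_mean, 1.0) for _ in range(n)], mu0=0.0)
        for _ in range(trials)
    )
    return rejections / trials

# Null is true: every rejection is a Type I error, so this estimates alpha.
type_i_rate = rejection_rate(true_mean=0.0)
# Null is false: every rejection is a correct inference, so this estimates power = 1 - beta.
power = rejection_rate(true_mean=1.0)
print(f"estimated alpha = {type_i_rate:.3f}, estimated power = {power:.3f}")
```

The first estimate should land close to 0.05; the second depends on the sample size and on how far the true mean is from the null value.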
Chi-Square Test
Types | Description |
Test for independence | tests whether two categorical variables are independent |
Test of homogeneity | tests whether two or more subgroups of a population share the same multinomial distribution |
Goodness of fit | tests whether a multinomial model for the population distribution (P1, ..., Pm) fits our data |
The tests for independence and homogeneity share the same test statistic and degrees of freedom; they differ in the design of the experiment
Assumptions
1. one or two categorical variables
2. independent observations
3. outcomes mutually exclusive
4. large n and no more than 20% of expected counts < 5
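Assumption 4 can be checked directly from a contingency table. The sketch below assumes the expected count for each cell is (row total * column total) / n, as in the test of independence; the function names and the example table are hypothetical.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def small_expected_fraction(table, threshold=5):
    """Fraction of cells whose expected count is below the threshold."""
    cells = [e for row in expected_counts(table) for e in row]
    return sum(e < threshold for e in cells) / len(cells)

table = [[12, 8], [2, 3]]  # illustrative counts, not real data
fraction = small_expected_fraction(table)
assumption_ok = fraction <= 0.20
print(f"{fraction:.0%} of expected counts are below 5; assumption met: {assumption_ok}")
```

Here two of the four expected counts (2.8 and 2.2) fall below 5, so the chi-square approximation would be questionable for this table.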
F-test
ANOVA | comparing the means of two or more continuous populations |
One-way layout | a test that compares the means of two or more groups of data, where one independent variable is considered |
Two-way layout | a test that compares the means of two or more groups of data, where two independent variables are considered |
Assumptions about data:
1. the observations y in each group are normally distributed
2. the variance of each treatment group is the same
3. all observations are independent
T-test
Types | Hypothesis |
Two Sample T-test | whether two independent groups have different means |
Paired T-test | whether one group has different means at different times |
One Sample T-test | whether the mean of a single group differs from a known mean |
Assumptions about data
1. independent
2. normally distributed
3. similar variance within each group being compared
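One sample T-test statistic
$$t = \frac{m - \mu}{s / \sqrt{n}}$$
where
m = the mean of the sample
μ = the known mean being tested against
s = the standard deviation of the sample
n = the size of the sample
degrees of freedom = n - 1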
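A minimal sketch of the one-sample statistic t = (m - μ) / (s / √n) using only the Python standard library. The function name and the data are hypothetical. The P-value is not computed here because it requires the t distribution, which the standard library does not provide.

```python
import math
import statistics

def one_sample_t(sample, mu):
    """Return the one-sample t statistic and its degrees of freedom."""
    n = len(sample)
    m = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    t = (m - mu) / (s / math.sqrt(n))
    return t, n - 1

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data, not real measurements
t, df = one_sample_t(sample, mu=4)
print(f"t = {t:.4f}, df = {df}")
```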
Paired T-test statistic
$$t = \frac{m}{s / \sqrt{n}}$$
where
m = the mean of the differences between the two paired sets of data
n = the number of differences (pairs)
s = the standard deviation of the differences between the two paired sets of data
degrees of freedom = n - 1
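The paired statistic t = m / (s / √n) is a one-sample test applied to the within-pair differences, with a null mean of 0. The sketch below shows this under that assumption; the function name and the before/after values are hypothetical.

```python
import math
import statistics

def paired_t(first, second):
    """Return the paired t statistic and degrees of freedom for two matched samples."""
    differences = [b - a for a, b in zip(first, second)]
    n = len(differences)
    m = statistics.mean(differences)
    s = statistics.stdev(differences)  # standard deviation of the differences
    t = m / (s / math.sqrt(n))
    return t, n - 1

before = [10, 12, 9, 11, 13]  # illustrative data
after = [12, 14, 10, 14, 15]
t, df = paired_t(before, after)
print(f"t = {t:.4f}, df = {df}")
```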
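Independent two-sample T-test statistic
$$t = \frac{m_A - m_B}{\sqrt{S^2/n_A + S^2/n_B}}$$
where
mA, mB = the means of groups A and B respectively
nA, nB = the sizes of groups A and B respectively
S² = the pooled estimate of the common variance: (sum of squared deviations in A + sum of squared deviations in B) / (nA + nB - 2)
degrees of freedom = nA + nB - 2 (given that the two samples have the same variance)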
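A sketch of the equal-variance (pooled) two-sample statistic, assuming the pooled variance is the combined sum of squared deviations divided by nA + nB - 2. The function name and the data are hypothetical, and the sketch does not cover the unequal-variance case.

```python
import math
import statistics

def pooled_two_sample_t(group_a, group_b):
    """Return the pooled-variance two-sample t statistic and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    m_a, m_b = statistics.mean(group_a), statistics.mean(group_b)
    ss_a = sum((x - m_a) ** 2 for x in group_a)
    ss_b = sum((x - m_b) ** 2 for x in group_b)
    df = n_a + n_b - 2
    pooled_var = (ss_a + ss_b) / df  # estimate of the common variance
    t = (m_a - m_b) / math.sqrt(pooled_var / n_a + pooled_var / n_b)
    return t, df

group_a = [4, 6, 8]  # illustrative data
group_b = [1, 2, 3, 4, 5]
t, df = pooled_two_sample_t(group_a, group_b)
print(f"t = {t:.4f}, df = {df}")
```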
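Test of independence and test of homogeneity
$$\chi^2 = \sum_{r,c} \frac{(O_{r,c} - E_{r,c})^2}{E_{r,c}}$$
where
Or,c = observed count in row r, column c
Er,c = (Nr * Nc)/n, the expected count (Nr = total of row r, Nc = total of column c, n = overall total)
df = (R - 1) * (C - 1)
C = number of columns
R = number of rows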
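The statistic sums (observed - expected)² / expected over all cells, with expected counts (row total * column total) / n. The sketch below computes it for a table of any size; the function name and counts are hypothetical. The P-value line is valid only for 1 degree of freedom, where the chi-square tail probability equals erfc(√(x/2)).

```python
import math

def chi_square_independence(table):
    """Return the chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for r, row in enumerate(table):
        for c, observed in enumerate(row):
            expected = row_totals[r] * col_totals[c] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

table = [[20, 30], [30, 20]]  # illustrative 2 x 2 counts
statistic, df = chi_square_independence(table)
p_value = math.erfc(math.sqrt(statistic / 2))  # closed form, valid only because df == 1
print(f"chi2 = {statistic:.3f}, df = {df}, p = {p_value:.4f}")
```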
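Goodness of fit test
$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$
where:
O = observed count in a category
E = expected count in a category under the model
n = number of categories
k = number of model parameters estimated from the data
df = n - 1 - k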
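A short worked example may help with the degrees of freedom. The numbers are invented for illustration: 100 observations fall into n = 3 categories, and the model probabilities (0.25, 0.5, 0.25) are fully specified in advance, so k = 0.

```latex
\begin{align*}
O &= (30,\ 50,\ 20), \qquad E = 100 \times (0.25,\ 0.5,\ 0.25) = (25,\ 50,\ 25) \\
\chi^2 &= \frac{(30-25)^2}{25} + \frac{(50-50)^2}{50} + \frac{(20-25)^2}{25} = 1 + 0 + 1 = 2 \\
df &= n - 1 - k = 3 - 1 - 0 = 2 \\
P\text{-value} &= P(\chi^2_2 > 2) = e^{-2/2} \approx 0.368
\end{align*}
```

If one parameter of the model had instead been estimated from the same data (k = 1), the statistic would be compared against a chi-square distribution with df = 3 - 1 - 1 = 1.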
Carrying out one-way ANOVA test
SST | total variation | $\sum_{i=1}^{I}\sum_{j=1}^{J} (Y_{ij} - \bar{Y}_{..})^2$ |
SSW | within-group (intra-group) variation | $\sum_{i=1}^{I}\sum_{j=1}^{J} (Y_{ij} - \bar{Y}_{i.})^2$ |
SSB | between-group (inter-group) variation | $J \sum_{i=1}^{I} (\bar{Y}_{i.} - \bar{Y}_{..})^2$ |
Null hypothesis: the differential effect of every treatment group is 0 (all treatment means are equal)
Alternative hypothesis: not all differential effects are 0
SST = SSW + SSB
test statistic:
$$F_{I-1,\,I(J-1)} = \frac{SSB/(I-1)}{SSW/(I(J-1))}$$
where
I = number of different treatments
J = number of observations within each treatment
Yij = observation j in treatment i
Ȳi. = mean of treatment i, Ȳ.. = overall mean of Y
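The sums of squares and the F statistic can be computed by hand for a balanced design (the same number J of observations in each of the I treatments). The sketch below assumes that balanced layout; the function name and the data are hypothetical, and the P-value is omitted because it requires the F distribution.

```python
import statistics

def one_way_anova(groups):
    """Return (SST, SSW, SSB, F) for I groups with J observations each."""
    i_count = len(groups)
    j_count = len(groups[0])
    all_values = [y for group in groups for y in group]
    grand_mean = statistics.mean(all_values)
    group_means = [statistics.mean(group) for group in groups]

    sst = sum((y - grand_mean) ** 2 for y in all_values)
    ssw = sum(
        (y - mean) ** 2 for group, mean in zip(groups, group_means) for y in group
    )
    ssb = j_count * sum((mean - grand_mean) ** 2 for mean in group_means)

    # F has (I - 1) and I * (J - 1) degrees of freedom.
    f_stat = (ssb / (i_count - 1)) / (ssw / (i_count * (j_count - 1)))
    return sst, ssw, ssb, f_stat

groups = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]  # illustrative data: I = 3, J = 3
sst, ssw, ssb, f_stat = one_way_anova(groups)
print(f"SST = {sst}, SSW = {ssw}, SSB = {ssb}, F = {f_stat}")
```

For this data SSW + SSB equals SST, which is a quick check that the decomposition was computed correctly.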