
Statistics in Behavioral Sciences Cheat Sheet by Sana_H

Statistics in Behavioral Sciences: parametric and non-parametric tests

Statistics

the branch of mathematics in which data are used descriptively or inferentially to find or support answers to scientific and other quantifiable questions.
It encompasses various techniques and procedures for recording, organizing, analyzing, and reporting quantitative information.

Difference - parametric test & non-parametric test

PROPERTY | PARAMETRIC | NON-PARAMETRIC
assumptions | yes | no
value for central tendency | mean | median/mode
probability distribution | normally distributed | distribution-free
population knowledge | required | not required
used for | interval data | nominal, ordinal data
correlation | Pearson | Spearman
tests | t-test, z-test, F-test, ANOVA | Kruskal-Wallis H test, Mann-Whitney U, chi-square

Correlation Coefficient

a statistical measure of the strength of the relationship between the relative movements of two variables

value ranges from -1 to +1

-1 = perfect negative or inverse correlation
+1 = perfect positive correlation or direct relationship
0 = no linear relationship

Alternatives

PARAMETRIC | NON-PARAMETRIC
one-sample z-test, one-sample t-test | one-sample sign test
one-sample z-test, one-sample t-test | one-sample Wilcoxon signed-rank test
two-way ANOVA | Friedman test
one-way ANOVA | Kruskal-Wallis test
independent-samples t-test | Mann-Whitney U test
one-way ANOVA | Mood's median test
Pearson correlation | Spearman correlation

Paired t-test

to compare the means of two related groups
ex. compare the weight of 20 mice before and after treatment

two conditions:
- pre/post treatment
- two different conditions, ex. two drugs

ASSUMPTIONS
- random selection
- differences normally distributed
- no extreme outliers

FORMULA
t = m / (s/√n)
m = sample mean of the differences
s = SD of the differences
n = number of pairs

df = n - 1
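The formula above can be sketched in plain Python (the pre/post numbers are hypothetical, for illustration only):

```python
import math

def paired_t(before, after):
    """t = m / (s/√n) on the pairwise differences; df = n - 1."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    m = sum(diffs) / n                                         # mean of the differences
    s = math.sqrt(sum((d - m) ** 2 for d in diffs) / (n - 1))  # SD of the differences
    return m / (s / math.sqrt(n)), n - 1

# hypothetical pre/post weights for 5 mice (illustrative numbers only)
t, df = paired_t([20.1, 19.8, 21.0, 20.5, 19.9],
                 [21.2, 20.9, 22.1, 21.4, 21.0])
```

The t statistic is then compared against the t-distribution with df = n - 1.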

t-distribution

aka Student's t-distribution = a probability distribution similar to the normal distribution but with heavier tails
used to estimate population parameters for small samples

Tail heaviness is determined by the degrees of freedom: the t-distribution gives lower probability to the centre and higher probability to the tails than the normal distribution. It also has higher kurtosis, is symmetrical, unimodal, centred at 0, and has a larger spread around 0.

df = n - 1
above ~30 df, the z-distribution is a close approximation

t-score = number of SDs from the mean in a t-distribution
we find:
- upper and lower boundaries
- p-value

TO BE USED WHEN:
- small sample
- population SD is unknown

ASSUMPTIONS
- continuous or ordinal scale
- random selection
- normality (NPC)
- equal SDs for the independent two-sample t-test

Two-sample z-test

to determine whether the means of two independent populations are equal or different
to find out if there is a significant difference between two populations by comparing sample means

requires knowledge of:
population SDs and sample size >30 in each group

eg. compare the performance of 2 students, average salaries, employee performance, IQ scores, etc.

FORMULA:
z = (x̄₁ - x̄₂) / √(s₁²/n₁ + s₂²/n₂)

s = SD

with a hypothesized difference:
z = ((x̄₁ - x̄₂) - (µ₁ - µ₂)) / √(σ₁²/n₁ + σ₂²/n₂)

(µ₁ - µ₂) = hypothesized difference between the population means
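The two formulas above differ only in the hypothesized difference term, so one function covers both (the summary statistics below are hypothetical):

```python
import math

def two_sample_z(x1, x2, s1, s2, n1, n2, hyp_diff=0.0):
    """z = ((x̄₁ - x̄₂) - (µ₁ - µ₂)) / √(s₁²/n₁ + s₂²/n₂)."""
    return (x1 - x2 - hyp_diff) / math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)

# hypothetical summary statistics for two classes' exam scores
z = two_sample_z(78.0, 75.0, 10.0, 12.0, 50, 50)
```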

Point Biserial correlation

measures the relationship between a continuous variable (ratio/interval scale) and a naturally binary variable

rpbi = correlation coefficient

FORMULA:
rpb = (M₁ - M₀)/Sₙ · √(pq)

M₁, M₀ = means of the scores in the two groups
p, q = proportions of cases in the two groups
Sₙ = SD of all scores
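A minimal sketch of the formula, taking Sₙ as the population SD of all scores and a 0/1 label for the binary variable (the data are hypothetical):

```python
import math

def point_biserial(scores, labels):
    """r_pb = (M₁ - M₀)/Sₙ · √(pq); labels are 0/1."""
    n = len(scores)
    g1 = [s for s, l in zip(scores, labels) if l == 1]
    g0 = [s for s, l in zip(scores, labels) if l == 0]
    m1, m0 = sum(g1) / len(g1), sum(g0) / len(g0)
    mean = sum(scores) / n
    sn = math.sqrt(sum((s - mean) ** 2 for s in scores) / n)  # population SD
    p, q = len(g1) / n, len(g0) / n
    return (m1 - m0) / sn * math.sqrt(p * q)

# hypothetical test scores with a pass/fail (1/0) label
r = point_biserial([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1])
```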


z-test

for hypothesis testing
to check whether the means of two populations are equal when the population variance is known

we have knowledge of:
- population SD/variance and/or sample n = 30 or more

if both are unknown -> t-test

can be:
left-tailed
right-tailed
two-tailed

REJECT THE NULL HYPOTHESIS IF THE Z STATISTIC IS STATISTICALLY SIGNIFICANT WHEN COMPARED WITH THE CRITICAL VALUE

z-statistic / z-score = the number representing the result of a z-test

the z critical value divides the graph into acceptance and rejection regions
if the z statistic falls in the rejection region -> H0 can be rejected

TYPES
One-sample z-test

Two-sample z-test

ANOVA

Analysis of Variance
comparing several sets of scores
to test whether the means of 3 or more groups are equal
comparison of variance between and within groups
to check whether sample groups are affected by the same factors and to the same degree
compares differences in the means and variance of distributions

ONE-WAY ANOVA (named for its number of IVs):
a single IV with two or more levels/variations that may have a measurable effect on the DV
compares the means of 2 or more independent groups
aka:
- one-factor ANOVA
- one-way analysis of variance
- between-subjects ANOVA

Assumptions
- independent samples
- equal sample sizes in groups/levels
- normally distributed
- equal variance

The F-test is used to check statistical significance:
a higher F value -> higher likelihood that the observed difference is real and not due to chance
used in field studies, experiments, quasi-experiments
CONDITIONS:
- min 6 subjects
- same number of samples in each group

H0: µ1 = µ2 = µ3 ... = µk, i.e. all population means are equal
Ha: at least one µi is different, i.e. at least one of the k population means is not equal to the others
µi = the population mean of group i
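The between/within variance comparison can be sketched as a one-way F statistic (the group scores are hypothetical):

```python
def one_way_anova_f(*groups):
    """F = MS_between / MS_within for k independent groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total          # grand mean
    means = [sum(g) / len(g) for g in groups]              # group means
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    df_b, df_w = k - 1, n_total - k
    return (ss_between / df_b) / (ss_within / df_w), df_b, df_w

# hypothetical scores for three independent groups
f, df_b, df_w = one_way_anova_f([1, 2, 3], [2, 3, 4], [5, 6, 7])
```

F is then compared to the F-distribution with (df_b, df_w) degrees of freedom.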

Spearman Correlation

non-parametric version of the Pearson correlation coefficient
named after Charles Spearman
denoted by ρ (rho)

determines the strength and direction of the monotonic relationship between two variables measured at ordinal, interval or ratio levels, and whether they are correlated or not

monotonic function = one variable never increases or never decreases as its IV changes
- monotonically increasing = as X increases, Y never decreases
- monotonically decreasing = as X increases, Y never increases
- not monotonic = as X increases, Y sometimes decreases and sometimes increases

for analysis with: ordinal data, continuous data
uses ranks instead of assumptions of normality
aka the Spearman rank-order test

FORMULA:
ρ = 1 - 6Σdᵢ² / (n(n² - 1))

dᵢ = difference between the two ranks of each observation

value ranges from -1 to +1
+1 = perfect association of ranks
0 = no association
-1 = perfect negative association of ranks

the closer the value to 0, the weaker the association

Value Ranges (absolute value)
0 to 0.3 = weak monotonic relationship
0.4 to 0.6 = moderate monotonic relationship
0.7 to 1 = strong monotonic relationship
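The rank-based formula can be sketched in Python; note the simple Σd² form used here assumes no tied ranks (the data are hypothetical):

```python
def spearman_rho(x, y):
    """ρ = 1 - 6Σdᵢ² / (n(n² - 1)); this simple form assumes no tied ranks."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        r = [0] * len(values)
        for rank, i in enumerate(order, start=1):
            r[i] = rank                      # rank 1 = smallest value
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

# hypothetical ordinal scores for 5 participants (no ties)
rho = spearman_rho([1, 2, 3, 4, 5], [1, 3, 2, 5, 4])
```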
 

Parametric and Non-parametric tests

parametric tests assume a fixed set of parameters and make certain assumptions about the distribution of the population

PARAMETRIC - prior knowledge of the population distribution, i.e. NORMAL DISTRIBUTION

NON-PARAMETRIC - no assumptions, do not depend on the population, DISTRIBUTION-FREE tests, values measured at the nominal or ordinal level
easy to apply and understand, low complexity

decision based on: distribution of the population, size of the sample
parametric - mean & sample ≥30 (or known normal distribution)
non-parametric - median/mode & small samples, or regardless of size

Advantages & Disadvantages - NON-PARAMETRIC TESTS

ADVANTAGES
- simple, easy to understand
- no assumptions
- more versatile
- easier to calculate
- hypothesis tested may be more accurate
- small sample sizes are okay
- can be used for all types of data (nominal, ordinal, interval)
- can be used with data having outliers

DISADVANTAGES
- less powerful than parametric tests
- the parametric counterpart, if it exists, is more powerful
- not as efficient as parametric tests
- may waste information
- requires a larger sample to be as powerful as a parametric test
- difficult to compute by hand for large samples
- data may be required in a tabular format that is not readily available

Application

PARAMETRIC TESTS
- quantitative & continuous data
- normally distributed
- data estimated on ratio or interval scales

NON-PARAMETRIC TESTS
- mixed data
- unknown distribution of the population
- different kinds of measurement scales

degrees of freedom

the number of independent values in the data sample that are free to vary

FORMULA:
the number of values in a data set minus 1
df = N - 1

t-test

statistical test to determine whether there is a significant difference between the average scores of two groups
1908 - William Sealy Gosset - Student's t-test and t-distribution
for hypothesis testing

knowledge of:
distribution - normally distributed
no knowledge of the population SD

TYPES:
one-sample t-test - single group
FORMULA:
t = (m - µ) / (s/√n)

SD FORMULAS:
σ = √(Σ(X - µ)² / N)  (population)
s = √(Σ(X - x̄)² / (n - 1))  (sample)

independent two-sample t-test - two groups

paired/dependent samples t-test - significant difference in paired measurements; compares means from the same group at different times (test-retest sample)

H0: no effective difference = the measured difference is due to chance
Ha: two-tailed (means are not equal) / one-tailed (mean is smaller or larger than the hypothesized mean)

PERFORM a two-tailed test: to find out whether two populations differ
one-tailed: one population mean is > or < the other
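The one-sample formula can be sketched as follows, using the sample SD with the n - 1 denominator (the scores and µ are hypothetical):

```python
import math

def one_sample_t(sample, mu):
    """t = (m - µ) / (s/√n) with the sample SD s (n - 1 denominator)."""
    n = len(sample)
    m = sum(sample) / n
    s = math.sqrt(sum((x - m) ** 2 for x in sample) / (n - 1))
    return (m - mu) / (s / math.sqrt(n))

# hypothetical scores tested against a hypothesized mean µ = 12
t = one_sample_t([10, 12, 14, 16, 18], 12)
```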

Independent two-sample t-test

aka unpaired t-test
to compare the means of two independent groups
ex. average weight of males and females

two forms:
- Student's t-test: assumes the SDs are equal
- Welch's t-test: less restrictive, no assumption of equal SDs
both give broadly similar results when the SDs are close

ASSUMPTIONS:
- normally distributed
- SDs are the same (Student's form)
- independent groups
- randomly selected
- independent observations
- measured on an interval or ratio scale

FORMULA:
t = (x̄₁ - x̄₂) / √(s₁²/n₁ + s₂²/n₂)

df = n₁ + n₂ - 2

pooled SD:
S = √((Σ(x₁ - x̄₁)² + Σ(x₂ - x̄₂)²) / (n₁ + n₂ - 2))
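A sketch of the Student's (pooled-SD) form, matching the pooled-SD formula above (the group data are hypothetical):

```python
import math

def independent_t(g1, g2):
    """Student's pooled two-sample t; df = n₁ + n₂ - 2."""
    n1, n2 = len(g1), len(g2)
    m1, m2 = sum(g1) / n1, sum(g2) / n2
    ss1 = sum((x - m1) ** 2 for x in g1)
    ss2 = sum((x - m2) ** 2 for x in g2)
    sp = math.sqrt((ss1 + ss2) / (n1 + n2 - 2))        # pooled SD
    t = (m1 - m2) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

# hypothetical scores for two independent groups
t, df = independent_t([1, 2, 3], [4, 5, 6])
```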

One-sample z-test

to check whether there is a difference between the sample mean and the population mean when the population SD is known

FORMULA:
z = (x̄ - µ) / SE

SE = σ/√n

the z-score is compared to a z-table (which gives the % of area under the NPC between the mean and the z-score); this tells us whether the z-score is due to chance or not

conditions:
knowledge of:
- population mean
- population SD
- simple random sample
- normal distribution

two approaches to reject H0:
- p-value approach - the p-value is the smallest level of significance at which H0 can be rejected; the smaller the p-value, the stronger the evidence
- critical value approach - compare the z statistic to critical values, which mark the boundaries of regions where the statistic is highly improbable to lie (critical/rejection regions)
if the z statistic is in the critical region -> reject H0
based on:
significance level (0.1, 0.05, 0.01), alpha level, Ha
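The one-sample z formula is a one-liner (the IQ-style numbers below are hypothetical):

```python
import math

def one_sample_z(xbar, mu, sigma, n):
    """z = (x̄ - µ) / SE, with SE = σ/√n."""
    return (xbar - mu) / (sigma / math.sqrt(n))

# hypothetical sample mean 103 vs population mean 100, σ = 15, n = 36
z = one_sample_z(103.0, 100.0, 15.0, 36)
```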

Biserial correlation

to measure the relationship between a quantitative variable and a binary variable
given by Pearson - 1909

the biserial correlation coefficient varies between -1 and 1
0 = no association
ex. correlation between IQ scores and pass/fail

a continuous variable and a binary variable (a continuous variable dichotomised to create the binary variable)

rbis or rb = correlation index estimating the strength of the relationship between an artificially dichotomous variable and a true continuous variable

ASSUMPTIONS:
- data measured on a continuous scale
- one variable to be made dichotomous
- no outliers
- approximately normally distributed
- equal variances (SDs)

FORMULA
rb = (M₁ - M₀)/SDt · (pq/y)

M₁ = mean of group 1
M₀ = mean of group 0
p = proportion in group 1
q = proportion in group 0
SDt = total SD
y = ordinate (height) of the normal curve at the point dividing p and q
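A sketch of the formula, computing the ordinate y from the standard normal at the p/q split (the pass/fail means, SD, and split are hypothetical):

```python
import math
from statistics import NormalDist

def biserial(m1, m0, sd_total, p):
    """r_b = (M₁ - M₀)/SDt · (pq/y); y = normal ordinate at the p/q split."""
    q = 1.0 - p
    z = NormalDist().inv_cdf(p)                        # split point on the standard normal
    y = math.exp(-z * z / 2) / math.sqrt(2 * math.pi)  # ordinate (pdf) at that point
    return (m1 - m0) / sd_total * (p * q) / y

# hypothetical IQ means for pass (M₁) and fail (M₀) groups, 50/50 split
rb = biserial(110.0, 100.0, 15.0, 0.5)
```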

Pearson Correlation

measures the strength and direction of a linear relationship between two variables
how two data sets are correlated
gives us information about the slope of the line
denoted by r
aka:
- Pearson's r
- bivariate correlation
- Pearson product-moment correlation coefficient (PPMCC)

cannot determine dependence of variables & cannot assess nonlinear associations

r value interpretation:
-0.1 to -0.3 / 0.1 to 0.3 = weak correlation
-0.3 to -0.5 / 0.3 to 0.5 = average/moderate correlation
-0.5 to -1.0 / 0.5 to 1.0 = strong correlation

FORMULA:
r = (n(Σxy) - (Σx)(Σy)) / √([nΣx² - (Σx)²][nΣy² - (Σy)²])
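The raw-score formula translates directly into code (the paired measurements are hypothetical):

```python
import math

def pearson_r(x, y):
    """r = (nΣxy - ΣxΣy) / √[(nΣx² - (Σx)²)(nΣy² - (Σy)²)]."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    return (n * sxy - sx * sy) / math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))

# hypothetical paired measurements with a perfect linear relationship
r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
```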

Mann-Whitney U test

non-parametric test of the significance of the difference between two independently drawn groups OR to compare outcomes between two independent groups

equivalent to the unpaired t-test

CONDITIONS:
no NPC assumption, small sample sizes (<30) with a minimum of 5 in each group, continuous data (able to take any number in a range), randomly selected samples

aka:
Mann-Whitney Test
Wilcoxon Rank Sum test

H0: the two populations are equal
Ha: the two populations are not equal

denoted by U

FORMULA:
U₁ = n₁n₂ + n₁(n₁+1)/2 - R₁

U₂ = n₁n₂ + n₂(n₂+1)/2 - R₂

R = sum of the ranks of each group
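The U formulas can be sketched as follows; this simple version assumes no tied values, so no tie correction is applied (the group scores are hypothetical):

```python
def mann_whitney_u(g1, g2):
    """U₁ = n₁n₂ + n₁(n₁+1)/2 - R₁; assumes no tied values (no tie correction)."""
    # pool both groups, remembering which group each value came from
    pooled = sorted((v, grp) for grp, g in enumerate((g1, g2)) for v in g)
    # R₁ = sum of the ranks (1..N) held by group 1's values
    r1 = sum(rank for rank, (v, grp) in enumerate(pooled, start=1) if grp == 0)
    n1, n2 = len(g1), len(g2)
    u1 = n1 * n2 + n1 * (n1 + 1) // 2 - r1
    u2 = n1 * n2 - u1                      # U₁ + U₂ = n₁n₂
    return u1, u2

# hypothetical outcome scores for two independent groups
u1, u2 = mann_whitney_u([3, 5, 8], [10, 12, 14])
```

The smaller of U₁ and U₂ is compared against the critical value table.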

