Psychological Statistics Cheat Sheet (DRAFT)

Psychological Statistics Overview of Tests

This is a draft cheat sheet. It is a work in progress and is not finished yet.

First look at the Data

Population
entire group that you want to draw conclu­sions about.
Sample
the specific group that you will collect data from. The size of the sample is always less than the total size of the population.
Mean
average (μ mean of popula­tion; x̄ mean of sample)
Median
middle value (midpoint) that separates the upper half of the sorted sample from the lower half
Mode
most frequently occurring score
Variance
measures dispersion around the mean
Standard Deviation (SD)
square root of the variance (σ SD of population; s SD of sample)
Standard Error (SE)
estimates the SD of the sampling distribution
 
Formula: s/√n
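The measures above can be computed with Python's standard library. This is a minimal sketch: the scores are made up for illustration, and the variance and SD use the sample versions (dividing by n - 1).

```python
import math
import statistics

# Hypothetical sample of scores (invented for this example).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(scores)
mean = statistics.mean(scores)          # x̄, the sample mean
median = statistics.median(scores)      # middle value of the sorted scores
mode = statistics.mode(scores)          # most frequently occurring score
variance = statistics.variance(scores)  # sample variance (divides by n - 1)
sd = statistics.stdev(scores)           # s, square root of the sample variance
se = sd / math.sqrt(n)                  # standard error of the mean, s/√n

print(mean, median, mode, variance, sd, se)
```

For population values (μ, σ), `statistics.pvariance` and `statistics.pstdev` divide by n instead.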
Confidence Intervals (CI)
This is the range of values you expect your estimate to fall between if you redo your test, within a certain level of confidence. Confidence, in statistics, is another way to describe probability.
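As a rough illustration, a confidence interval for a mean can be built as x̄ ± z · s/√n. The sketch below assumes a normal approximation (for small samples a t-distribution is usually preferred); the function name `confidence_interval` and the scores are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def confidence_interval(sample, confidence=0.95):
    """Normal-approximation CI for the mean: x̄ ± z * s/√n."""
    n = len(sample)
    x_bar = mean(sample)
    se = stdev(sample) / math.sqrt(n)               # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95 %
    return x_bar - z * se, x_bar + z * se

# Hypothetical sample of scores (invented for this example).
scores = [2, 4, 4, 4, 5, 5, 7, 9]
low, high = confidence_interval(scores)
print(low, high)
```

Raising the confidence level (for example to 0.99) widens the interval.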
Quanti­tative data
is expressed in numbers and graphs and is analyzed through statistical methods.
Qualit­ative data
is expressed in words and analyzed through interp­ret­ations and catego­riz­ations.
Hypothesis Testing
H0
the null hypothesis of a test always predicts no effect or no relati­onship between variables
H1
altern­ative hypothesis states your research prediction of an effect or relati­onship
Random­isation
completely randomized design
every subject is assigned to a treatment group at random.
 
Ex. Subjects are all randomly assigned a level of phone use using a random number generator.
randomized block design
subjects are first grouped according to a charac­ter­istic they share, and then randomly assigned to treatments within those groups
 
Ex. Subjects are first grouped by age, and then phone use treatments are randomly assigned within these groups.
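The two designs can be sketched in a few lines of Python. The subject IDs, the age blocks and the function names are hypothetical, and keeping the treatment groups equal in size is an added assumption, not something the definitions require.

```python
import random

LEVELS = ["none", "low", "high"]  # phone-use treatments from the examples

def completely_randomized(subjects, rng):
    """Shuffle the subjects, then deal them into treatment groups."""
    shuffled = subjects[:]
    rng.shuffle(shuffled)
    return {s: LEVELS[i % len(LEVELS)] for i, s in enumerate(shuffled)}

def randomized_block(subjects_by_block, rng):
    """Randomize treatments separately within each block (e.g. age group)."""
    assignment = {}
    for members in subjects_by_block.values():
        assignment.update(completely_randomized(members, rng))
    return assignment

rng = random.Random(42)  # fixed seed so the example is reproducible
subjects = [f"S{i}" for i in range(1, 13)]

crd = completely_randomized(subjects, rng)
blocks = {"18-29": subjects[:6], "30+": subjects[6:]}  # hypothetical age blocks
rbd = randomized_block(blocks, rng)
print(crd)
print(rbd)
```

In the block design every age group contains each treatment level equally often, so age cannot be mixed up with the treatment effect.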
Betwee­n-s­ubjects vs. within­-su­bjects
betwee­n-s­ubjects design
AKA indepe­ndent measures design or classic ANOVA design
 
indivi­duals receive only one of the possible levels of an experi­mental treatment.
 
Ex. Subjects are randomly assigned a level of phone use (none, low, or high) and follow that level of phone use throughout the experiment.
within­-su­bjects design
AKA repeated measures design
 
every individual receives each of the experi­mental treatments consec­uti­vely, and their responses to each treatment are measured.
 
Ex. Subjects are assigned consecutively to zero, low, and high levels of phone use throughout the experiment, and the order in which they follow these treatments is randomized.
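For a within-subjects design, the randomized treatment order can be sketched as follows. The subject IDs and the function name `treatment_orders` are hypothetical; each subject simply receives an independently shuffled order of all levels.

```python
import random

LEVELS = ["zero", "low", "high"]  # phone-use levels from the example

def treatment_orders(subjects, rng):
    """Every subject receives all levels, in an independently shuffled order."""
    orders = {}
    for subject in subjects:
        order = LEVELS[:]
        rng.shuffle(order)
        orders[subject] = order
    return orders

orders = treatment_orders(["S1", "S2", "S3", "S4"], random.Random(7))
for subject, order in orders.items():
    print(subject, order)
```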

Different Scales of Measur­ement

Nominal Categories
do not correspond to numerical value
 
Ex. British Team, German Team, ...
Ordinal Measur­ement or Ranks
scores can be ordered from smallest to largest, only a rank order is implied
 
Ex. 1st, 2nd, 3rd, ...
Interval Measur­ement
size of the difference between scores is an indication of magnitude
 
Ex. Bill was 5 seconds behind the winner, ... (equal interval scale of measur­ement - interval of 1 second)
Ratio Measur­ement
like Interval Measur­ement, but allows ratios to be meanin­gfully calculated between scores
 
Ex. Tom took 50 seconds and Bill took 100 seconds -> Tom is twice as fast as Bill

Types of Variables

Dependent Variable
Variables that represent the outcome of the experi­ment.
 
Ex. Any measur­ement of plant health and growth: in this case, plant height and wilting.
Indepe­ndent Variable
Variables you manipulate in order to affect the outcome of an experiment
 
Ex. The amount of salt added to each plant’s water.
Controlled Variable
Variables that are held constant throughout the experi­ment.
 
Ex. The temper­ature and light in the room the plants are kept in, and the volume of water given to each plant.
Confou­nding Variable
A variable that hides the true effect of another variable in your experi­ment. This can happen when another variable is closely related to a variable you are interested in, but you haven’t controlled it in your experi­ment.
 
Ex. Pot size and soil type might affect plant survival as much or more than salt additions. In an experiment you would control these potential confou­nders by holding them constant.
Latent variables
A variable that can’t be directly measured, but that you represent via a proxy.
 
Ex. Salt tolerance in plants cannot be measured directly, but can be inferred from measur­ements of plant health in our salt-a­ddition experi­ment.
Composite variables
A variable that is made by combining multiple variables in an experi­ment. These variables are created when you analyze data, not when you measure it.
 
Ex. The plant health variables (e.g. plant height and wilting) could be combined into a single plant-health score to make it easier to present your findings.
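One common way to build a composite variable is to standardize each measure (z-scores) and average them. The sketch below uses invented plant data and assumes a wilting rating where higher means more wilted, so it is reversed before averaging.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize values to mean 0 and SD 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

# Hypothetical measurements for four plants.
height_cm = [12.0, 15.5, 9.0, 14.0]
wilting = [3, 1, 5, 2]  # assumed rating scale: higher = more wilted

z_height = z_scores(height_cm)
z_wilting = [-z for z in z_scores(wilting)]  # reversed so higher = healthier

# Composite plant-health score: mean of the standardized measures.
health = [mean(pair) for pair in zip(z_height, z_wilting)]
print(health)
```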
Quanti­tative Variables
Discrete/ integer variables
Counts of individual items or values.
 
Ex. Number of students in a class; Number of different tree species in a forest
Continuous variables (aka ratio variables)
Measur­ements of continuous or non-finite values.
 
Ex. Distance, Volume, Age
Categorical Variables
Binary­/di­cho­tomous variables
Yes/no outcomes
Nominal variables
Groups with no rank or order between them.
 
Ex. Species, Names, Colors, Brands
Ordinal variables
Groups that are ranked in a specific order.
 
Ex. Finishing place in a race, Rating scale responses in a survey
 

Sampling

Probab­ility sampling methods
Probab­ility sampling means that every member of the population has a chance of being selected. It is mainly used in quanti­tative research. If you want to produce results that are repres­ent­ative of the whole popula­tion, probab­ility sampling techniques are the most valid choice.
Simple random sampling
every member of the population has an equal chance of being selected. Your sampling frame should include the whole popula­tion.
Systematic sampling
is similar to simple random sampling, but it is usually slightly easier to conduct. Every member of the population is listed with a number, but instead of randomly generating numbers, indivi­duals are chosen at regular intervals.
Stratified sampling
involves dividing the population into subpopulations that may differ in important ways. It allows you to draw more precise conclusions by ensuring that every subgroup is properly represented in the sample. To use this sampling method, you divide the population into subgroups (called strata) based on the relevant characteristic (e.g. gender, age range, income bracket, job role).
Cluster sampling
also involves dividing the population into subgroups, but each subgroup should have similar characteristics to the whole population. Instead of sampling individuals from each subgroup, you randomly select entire subgroups.
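The four probability sampling methods can be sketched with Python's `random` module. The sampling frame of 100 numbered members, the strata, the clusters and all function names are hypothetical; real studies need more care (for example with frame sizes that do not divide evenly).

```python
import random

def simple_random_sample(frame, k, rng):
    """Every member has an equal chance of being selected."""
    return rng.sample(frame, k)

def systematic_sample(frame, k, rng):
    """Random starting point, then every step-th member of the list."""
    step = len(frame) // k
    start = rng.randrange(step)
    return frame[start::step][:k]

def stratified_sample(strata, fraction, rng):
    """Draw the same fraction at random from every stratum."""
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Randomly select entire clusters and keep all of their members."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [m for name in chosen for m in clusters[name]]

rng = random.Random(1)          # fixed seed for reproducibility
frame = list(range(1, 101))     # hypothetical population of 100 numbered members
strata = {"A": frame[:60], "B": frame[60:]}
clusters = {f"site{i}": frame[i * 25:(i + 1) * 25] for i in range(4)}

srs = simple_random_sample(frame, 10, rng)
sys_s = systematic_sample(frame, 10, rng)
strat = stratified_sample(strata, 0.1, rng)
clus = cluster_sample(clusters, 2, rng)
print(srs, sys_s, strat, clus, sep="\n")
```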
Non-pr­oba­bility sampling methods
In a non-pr­oba­bility sample, indivi­duals are selected based on non-random criteria, and not every individual has a chance of being included. This type of sample is easier and cheaper to access, but it has a higher risk of sampling bias. That means the inferences you can make about the population are weaker than with probab­ility samples, and your conclu­sions may be more limited. If you use a non-pr­oba­bility sample, you should still aim to make it as repres­ent­ative of the population as possible. Non-pr­oba­bility sampling techniques are often used in explor­atory and qualit­ative research. In these types of research, the aim is not to test a hypothesis about a broad popula­tion, but to develop an initial unders­tanding of a small or under-­res­earched popula­tion.
Conven­ience sampling
A conven­ience sample simply includes the indivi­duals who happen to be most accessible to the resear­cher. This is an easy and inexpe­nsive way to gather initial data, but there is no way to tell if the sample is repres­ent­ative of the popula­tion, so it can’t produce genera­lizable results.
Voluntary response sampling
Similar to a conven­ience sample, a voluntary response sample is mainly based on ease of access. Instead of the researcher choosing partic­ipants and directly contacting them, people volunteer themselves (e.g. by responding to a public online survey). Voluntary response samples are always at least somewhat biased, as some people will inherently be more likely to volunteer than others.
Purposive sampling
This type of sampling, also known as judgement sampling, involves the researcher using their expertise to select a sample that is most useful to the purposes of the research. It is often used in qualit­ative research, where the researcher wants to gain detailed knowledge about a specific phenomenon rather than make statis­tical infere­nces, or where the population is very small and specific. An effective purposive sample must have clear criteria and rationale for inclusion.
Snowball sampling
If the population is hard to access, snowball sampling can be used to recruit partic­ipants via other partic­ipants. The number of people you have access to “snowb­alls” as you get in contact with more people.

Data Cleansing

Data cleansing involves spotting and resolving potential data incons­ist­encies or errors to improve your data quality.
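A minimal cleansing pass might look like the sketch below. The records, the 1-5 rating scale and the rejection rules are assumptions chosen for illustration; real projects define their own rules.

```python
# Hypothetical raw survey records: (participant_id, age, rating on a 1-5 scale)
raw = [
    ("p01", "23", "4"),
    ("p02", "", "5"),      # missing age
    ("p03", "31", "7"),    # rating outside the 1-5 scale
    ("p01", "23", "4"),    # exact duplicate of the first record
    ("p04", " 45 ", "2"),  # stray whitespace
]

def clean(records):
    """Return (cleaned rows, rejected rows with a reason)."""
    seen, cleaned, rejected = set(), [], []
    for record in records:
        row = tuple(field.strip() for field in record)  # trim whitespace
        if row in seen:
            rejected.append((row, "duplicate"))
            continue
        seen.add(row)
        pid, age, rating = row
        if not age.isdigit() or not rating.isdigit():
            rejected.append((row, "missing or non-numeric value"))
            continue
        if not 1 <= int(rating) <= 5:
            rejected.append((row, "rating out of range"))
            continue
        cleaned.append((pid, int(age), int(rating)))
    return cleaned, rejected

cleaned, rejected = clean(raw)
print(cleaned)
print(rejected)
```

Keeping the rejected rows together with a reason makes it possible to review each exclusion later instead of silently dropping data.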
Type I vs Type II error
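Type I error (false positive)
rejecting the null hypothesis when it is actually true
Type II error (false negative)
failing to reject the null hypothesis when it is actually false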
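The two error types can be illustrated with a small simulation. This sketch assumes a two-sided z-test with known SD of 1, samples of n = 25, α = 0.05, and a true effect of 0.5 when H0 is false; all of these values and the function names are chosen only for illustration.

```python
import math
import random
from statistics import NormalDist, mean

def rejects_h0(sample, mu0=0.0, sigma=1.0, alpha=0.05):
    """Two-sided z-test of H0: population mean equals mu0 (sigma known)."""
    z = (mean(sample) - mu0) / (sigma / math.sqrt(len(sample)))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > z_crit

def rejection_rate(true_mu, rng, trials=2000, n=25):
    """Share of simulated experiments in which H0 is rejected."""
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mu, 1.0) for _ in range(n)]
        hits += rejects_h0(sample)
    return hits / trials

rng = random.Random(0)  # fixed seed for reproducibility

# H0 is true (mean really is 0): every rejection is a false positive.
type_1_rate = rejection_rate(0.0, rng)

# H0 is false (mean is 0.5): every non-rejection is a false negative.
type_2_rate = 1 - rejection_rate(0.5, rng)

print(type_1_rate, type_2_rate)
```

The Type I rate should land near α, while the Type II rate depends on the effect size and sample size.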