Stats exam 3 Cheat Sheet (DRAFT) by

For a psychology statistics exam on the normal distribution, random sampling, and hypothesis testing

This is a draft cheat sheet. It is a work in progress and is not finished yet.

The Normal Distri­bution and Standard Scores

Why is the normal distri­bution important?
1. Many naturally occurring variables (e.g., height, weight, etc.) have distributions which are approximately normal.
2. Many statis­tical tests covered later use normal distri­but­ions.
3. Many sampling distri­butions approx­imate a normal distri­bution with large sample sizes.
Properties of a normal distri­bution
- Unimodal
- Mean is the middlemost score
- Symmetrical (equal on each side of the mean)
- Two inflection points, occurring at X = μ + 1σ and X = μ - 1σ
Area under the normal distri­bution
Calculated in percentages; the total area under the curve = 100%. Broken up into 8 sections, from below -3σ to above +3σ: 0.13, 2.15, 13.59, 34.13, (mean), 34.13, 13.59, 2.15, 0.13
What is the area under the normal curve based on?
The number of standard deviations from the mean. The area for a given number of standard deviations is constant for all normal distributions.
For any score…
If we know how many standard deviations it is away from the mean, we know the area under the curve above and below it
How do we calculate the number of standard deviations?
z = (X-µ)/σ
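The formula above can be sketched as a small Python function. The function name and the example values (a score of 130 on a scale with μ = 100 and σ = 15) are made up for illustration and are not from the course material.

```python
def z_score(x: float, mu: float, sigma: float) -> float:
    """Return how many standard deviations x lies above (+) or below (-) the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical example: raw score 130, population mean 100, population SD 15
print(z_score(130, 100, 15))   # 2.0
print(z_score(85, 100, 15))    # -1.0
```

A positive z means the raw score is above the mean; a negative z means it is below.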

Z Scores

What is a standard (or z) Score?
A z score is a transformed score that designates how many standard deviation units the corresponding raw score is above or below the mean.
What are the properties of z scores?
1. Mean = 0 (μ_z = 0)
2. Standard deviation = 1 (σ_z = 1)
3. Shape of z score distri­bution is the SAME as shape of raw score distribution
-> The relative positions of the scores in the distri­bution do not change either
Column A
Shows the z score
Column B
Area between mean and z
Column C
Area beyond z
Column B and C will always add up to...
0.5000
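As a rough check on the table columns, the two areas can be computed instead of looked up. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `table_columns` is an assumption for illustration, and it treats the table as listing non-negative z values only.

```python
from statistics import NormalDist

def table_columns(z: float):
    """Return (column B, column C) for a z score, rounded to 4 decimals."""
    below = NormalDist().cdf(abs(z))   # area below |z| on the standard normal curve
    col_b = below - 0.5                # column B: area between the mean and z
    col_c = 1.0 - below                # column C: area beyond z
    return round(col_b, 4), round(col_c, 4)

b, c = table_columns(1.00)
print(b, c)              # 0.3413 0.1587
print(round(b + c, 4))   # 0.5
```

Because the mean splits the curve in half, columns B and C together always cover exactly one half of the total area.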
Area under the normal curve based on the number of standard deviations from the mean is...
constant for all normal distri­butions
The scores we calculate are also called
- z scores
- normal scores
- standardized scores
Converting to z scores will...
Standardize any distribution without regard to the original mean or SD
Once it is standardized it will...
Always have a mean of 0 and an SD of 1, which allows for comparison across different distributions
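To illustrate the claim that a standardized distribution always has a mean of 0 and an SD of 1, here is a small sketch. The data set is invented for the example, and the population SD (`pstdev`) is assumed because the z formula on this sheet uses σ.

```python
from statistics import mean, pstdev

def standardize(scores):
    """Convert raw scores to z scores using the population mean and SD."""
    mu = mean(scores)
    sigma = pstdev(scores)
    return [(x - mu) / sigma for x in scores]

raw = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up scores: mean 5, SD 2
z = standardize(raw)
print(z)                         # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
print(mean(z), pstdev(z))        # 0.0 1.0
```

The order and relative spacing of the scores are unchanged; only the scale is different.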

Probab­ility

What are the two types of questions in infere­ntial statis­tics?
1) Hypothesis testing
2) Parameter estimation
Hypothesis testing
We have a hypothesis about a certain population and we wish to test it using a sample drawn from that population.
Parameter estimation
We wish to know the magnitude of a population charac­ter­istic, so we test a sample (e.g., how much salary do students who graduate with a psych degree make in Canada?)
The goal is to...
Infer something about the population based on the information from a sample. Therefore, this sample has to be representative of the population and it must be a random sample.
Random sample
A sample selected from the population that satisfies the following two conditions:
1) Each possible sample has an equal chance of being selected
2) Each member of the population has an equal chance of being selected into the sample.
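One way to draw such a sample in practice is with Python's standard-library `random.sample`, which selects without replacement so that every possible sample of the requested size is equally likely. The population of ID numbers and the fixed seed below are assumptions made only so the sketch is repeatable.

```python
import random

population = list(range(1, 101))        # hypothetical population: ID numbers 1 to 100
rng = random.Random(42)                 # fixed seed only so the example is repeatable
sample = rng.sample(population, k=10)   # simple random sample of 10 members

print(sample)
print(len(sample), len(set(sample)))    # 10 10 (no member is selected twice)
```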
Why do we need random samples?
1) If we wish to generalize to the popula­tion, the sample must be repres­ent­ative of the population.
2) The laws of probab­ility cannot be used if the sample isn't random
Probab­ility
1) Cannot be negative (always between 0 and 1)
- Probability = 0 (event is certain not to occur)
- Probability = 1 (event is certain to occur)
2) Usually expressed as a decimal number (keep 4 decimal places) but can be written as a fraction
Probab­ility can be calculated in two ways...
1) A priori probability
- deduced from reason (i.e., theore­tically based), without observations
2) A posteriori probab­ility
- Calculated based on the actual observ­ations (i.e., empiri­cally based)
A priori
From before
A posteriori
After the fact

A priori probab­ility

A priori probab­ility
Based on reason without actual observ­ations
p(A) = (Number of events classifiable as "A") / (Total number of possible events)
What is the a priori probability of flipping a coin and getting a "head"?
p(A) = 1/2 = 0.5
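The a priori formula can be sketched by listing the possible events and counting those classifiable as "A". This assumes every listed outcome is equally likely; the function name `a_priori` is hypothetical.

```python
from fractions import Fraction

def a_priori(sample_space, is_a):
    """p(A) = events classifiable as A / total possible events (equally likely outcomes)."""
    favourable = sum(1 for outcome in sample_space if is_a(outcome))
    return Fraction(favourable, len(sample_space))

coin = ["head", "tail"]
p_head = a_priori(coin, lambda outcome: outcome == "head")
print(p_head, float(p_head))   # 1/2 0.5
```

No coin is actually flipped here: the probability follows from reasoning about the possible outcomes alone.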

A posteriori probab­ility

A posteriori probability
Based on actual observations
p(A) = (Number of times "A" has actually occurred) / (Total number of occurrences)
If we actually flipped a coin 50 times and got a head 30 times, what is the a posteriori probability of getting a "head"?
p(A) = 30/50 = 0.60
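The same calculation can be sketched from a record of observations. The list below simply encodes the 30 heads and 20 tails from the example; the function name `a_posteriori` is hypothetical.

```python
def a_posteriori(observations, event):
    """p(A) = times A actually occurred / total number of occurrences."""
    return sum(1 for o in observations if o == event) / len(observations)

flips = ["head"] * 30 + ["tail"] * 20   # the 50 observed flips from the example
print(a_posteriori(flips, "head"))      # 0.6
```

Unlike the a priori value of 0.5, this estimate depends on the data and can change with every new set of flips.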

Multip­lic­ation rule for probab­ility

Multip­lic­ation rule
Concerned with determ­ining the probab­ility of joint or successive occurrence of several events
Multiplication rule example: There are two events (event A, event B). We can ask...
1) What is the probability of both A and B happening together?
2) What is the probability of A happening first and B happening second?
p(A)
Probability of A
p(B|A)
Probability of B, given that A has occurred
p(A and B) = p(A)p(B|A)
Indepe­ndent events
Two events are independent if the occurrence of one event has no effect on the probability of occurrence of the other event.
Note: Sampling WITH replacement results in INDEPENDENT EVENTS: p(A and B) = p(A)p(B)
Example question: There are two dice. What is the probab­ility of getting a "­3" on the 1st die and a "­4" on the 2nd die in one roll?
Event A: "3" on the 1st die
- p("3" on the 1st die) = 1/6
Event B: "4" on the 2nd die
- p("4" on the 2nd die | "3" on the 1st die) = 1/6
p(A and B) = (1/6)(1/6) = 1/36 = 0.0278
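The dice answer can be double-checked by listing all 36 equally likely outcomes and comparing the count with the multiplication rule. This is only an illustrative sketch; the variable names are made up.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))     # 36 equally likely (die 1, die 2) pairs
favourable = [o for o in outcomes if o == (3, 4)]   # "3" on the 1st die and "4" on the 2nd

p_enumerated = Fraction(len(favourable), len(outcomes))
p_rule = Fraction(1, 6) * Fraction(1, 6)            # p(A)p(B) for independent events

print(p_enumerated, p_rule)       # 1/36 1/36
print(round(float(p_rule), 4))    # 0.0278
```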
Dependent events
The two events are dependent if the occurrence of one event (e.g., A) has an effect on the probab­ility of occurrence of the other event (e.g., B).
Note: Sampling WITHOUT replacement results in DEPENDENT EVENTS: p(A and B) = p(A)p(B|A)
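To see how replacement changes p(B|A), consider drawing two aces in a row from a standard 52-card deck. This scenario is not from the sheet; it is an assumed example chosen because the counts are easy to verify.

```python
from fractions import Fraction

def p_two_aces(with_replacement: bool) -> Fraction:
    """p(ace on 1st draw and ace on 2nd draw) from a standard 52-card deck."""
    p_a = Fraction(4, 52)                 # p(A): ace on the first draw
    if with_replacement:
        p_b_given_a = Fraction(4, 52)     # deck is restored, so p(B|A) = p(B)
    else:
        p_b_given_a = Fraction(3, 51)     # one ace is gone and one card fewer remains
    return p_a * p_b_given_a              # p(A and B) = p(A)p(B|A)

print(p_two_aces(True))    # 1/169
print(p_two_aces(False))   # 1/221
```

With replacement the second draw ignores the first (independent events); without replacement the first draw changes the deck, so the events are dependent.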

Addition rule for probability

Mutually exclusive events
Two events are mutually exclusive when the occurrence of one precludes the occurrence of the other.
Two events that CANNOT occur together: p(A and B) = 0
Addition rule for probab­ility
Concerned with determ­ining the probab­ility of occurrence of any one of several possible events
- Probab­ility of A or B
p(A or B) = p(A) + p(B) - p(A and B)
Example: What is the probab­ility that you will draw a king or a diamond on the first card from the deck?
Event A: King on the 1st card
- p(king) = 4/52
Event B: Diamond on the 1st card
- p(diamond) = 13/52
A and B: King of diamonds on the 1st card
- p(king and diamond) = 1/52
p(king or diamond) = (4/52) + (13/52) - (1/52)
= 16/52 = 0.3077
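The king-or-diamond example can be checked by building the deck and counting cards directly. This is a sketch under the assumption of a standard 52-card deck; the helper `p` and the rank and suit labels are illustrative names.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = list(product(ranks, suits))   # 52 (rank, suit) pairs

def p(event) -> Fraction:
    """Proportion of cards in the deck for which event(rank, suit) is true."""
    return Fraction(sum(1 for r, s in deck if event(r, s)), len(deck))

p_king = p(lambda r, s: r == "K")
p_diamond = p(lambda r, s: s == "diamonds")
p_both = p(lambda r, s: r == "K" and s == "diamonds")
p_either = p(lambda r, s: r == "K" or s == "diamonds")

print(p_king + p_diamond - p_both == p_either)   # True
print(round(float(p_either), 4))                 # 0.3077
```

Subtracting p(A and B) prevents the king of diamonds from being counted twice.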
Exhaustive sets of events
A set of events is exhaustive if the set includes all of the possible events (e.g., when rolling a die, the set of events of getting a 1, 2, 3, 4, 5, or 6 is exhaustive; when flipping a coin, the set of events of getting a head or a tail is exhaustive)
If a set of events (A, B, C ...) are exhaustive and mutually exclusive
p(A) + p(B) + p(C) + ... = 1
Example (multiplication and addition rules combined): If you have a regular deck of playing cards, what is the probability that at least one of the next three cards will be red (without replacement)?
p(at least 1 out of 3 red) = 1 - p(all 3 black)
= 1 - (26/52)(25/51)(24/50)
= 1 - 0.117647
= 0.8824
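The calculation above relies on "at least one red" and "all three black" being mutually exclusive and exhaustive, so their probabilities sum to 1. A minimal sketch with exact fractions, plus an independent check that counts 3-card hands (the check is an added assumption, not part of the sheet's method):

```python
from fractions import Fraction
from math import comb

# Multiplication rule for dependent events (sampling without replacement)
p_all_black = Fraction(26, 52) * Fraction(25, 51) * Fraction(24, 50)
p_at_least_one_red = 1 - p_all_black

# Independent check: all-black 3-card hands divided by all 3-card hands
p_check = 1 - Fraction(comb(26, 3), comb(52, 3))

print(p_all_black)                           # 2/17
print(round(float(p_at_least_one_red), 4))   # 0.8824
print(p_at_least_one_red == p_check)         # True
```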

Hypothesis Testing

Why can't we just look at the data?
Because of the variability in data, it's very hard to "see" whether a difference between groups or conditions is real (it could have happened due to chance). This is why we need to use inferential stats to test hypotheses, to determine whether there's a real difference between groups or conditions that is due to the IV (or subject variable).
Free throw distra­ctions in Basketball
Do free throw distra­ctions influence the player's ability to succes­sfully make free throws?
Example hypotheses
- Fan distractions affect free throw accuracy (H1)
- Fan distractions do not affect free throw accuracy (H0)
- Free throws are more difficult to make with distractions (H1)
- Free throws are not more difficult to make with distractions (H0)
- Free throws are easier to make with distractions (H1)
- Free throws are not easier to make with distractions (H0)
Null hypothesis
- Hypothesizes no effect
- No difference between groups
- No difference between conditions
- No relationship
NO DIFFERENCE - NO EFFECT
Alternative hypothesis
- Hypothesizes that there will be a difference between groups/conditions and that this difference is due to the independent variable/subject variable
H0 and H1 must be...
mutually exclusive and exhaustive
Decision rule
- There must be criteria by which we will decide if the independent variable really did have an effect (we can use probability)
If the probability is low
We will reject H0 and accept H1
If the probability is not that low
We will not reject H0
α
Threshold
α (alpha) = 0.05 or, for a stricter criterion, 0.01
Type 1 error
Decide to reject the null hypothesis, but the null is actually true
Type 2 error
Decide to keep the null hypothesis, but it actually isn't true
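The decision rule and the two error types can be sketched as follows. The function names are hypothetical, and treating "low" as "at or below α" is a common convention assumed here rather than something the sheet states.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Decision rule: reject H0 only when the probability is at or below alpha."""
    return "reject H0" if p_value <= alpha else "do not reject H0"

def outcome(decision: str, h0_is_true: bool) -> str:
    """Label a decision as correct, a Type 1 error, or a Type 2 error."""
    if decision == "reject H0":
        return "Type 1 error" if h0_is_true else "correct decision"
    return "correct decision" if h0_is_true else "Type 2 error"

print(decide(0.03))                                  # reject H0
print(decide(0.03, alpha=0.01))                      # do not reject H0
print(outcome("reject H0", h0_is_true=True))         # Type 1 error
print(outcome("do not reject H0", h0_is_true=False)) # Type 2 error
```

In a real study we never know whether H0 is true, so `h0_is_true` exists only to show which error goes with which decision.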

Breakdown of Normal Distri­bution Curve

Find perc­entile rank of a particular raw score

Find actual # of cases below a particular z score

 

Sampling with or without replac­ement

Finding area betw­een two raw scores

P of normally distri­buted cont. var. E.g. 1

p(A) = (Area under the curve corresponding to A) / (Total area under the curve)
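For a normally distributed variable the total area is 1, so p(A) is just the area over the interval of interest. A minimal sketch using the standard-library `statistics.NormalDist`; the scale (μ = 100, σ = 15) and the interval 85 to 115 are assumed values for illustration.

```python
from statistics import NormalDist

def area_between(low: float, high: float, mu: float, sigma: float) -> float:
    """Probability that a normally distributed score falls between low and high."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(high) - dist.cdf(low)   # area below high minus area below low

# Hypothetical scale: mean 100, SD 15; interval from z = -1 to z = +1
print(round(area_between(85, 115, 100, 15), 4))   # 0.6827
```

This matches adding the two 34.13% sections on either side of the mean, up to rounding.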
 

Finding area beyond a particular raw score

Finding part­icular raw scores of a given area

P of normally distri­buted cont. var. E.g. 2

 

Finding area below a particular raw score

Find perc­entile point for a given percentage
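Percentile rank and percentile point are inverse lookups: one goes from a raw score to the percentage below it, the other from a percentage back to a raw score. A rough sketch with `statistics.NormalDist`, again assuming a hypothetical scale with μ = 100 and σ = 15:

```python
from statistics import NormalDist

scale = NormalDist(mu=100, sigma=15)   # assumed example distribution

def percentile_rank(x: float) -> float:
    """Percentage of scores falling below raw score x."""
    return 100 * scale.cdf(x)

def percentile_point(pct: float) -> float:
    """Raw score below which pct percent of scores fall (0 < pct < 100)."""
    return scale.inv_cdf(pct / 100)

print(round(percentile_rank(115), 2))    # 84.13
print(round(percentile_point(90), 2))    # 119.22
```

By hand, the second lookup means finding the z score for the given area in the table and then solving X = μ + zσ.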