Classical + Relative
P(A) = N(A)/N(S)
P(A) = f(A)/n |
Joint PMF
p(x,y) = P(X=x, Y=y) = P({X=x}∩{Y=y}) |
Geometric Distribution
X = # of trials until 1st success
X ~ g(p)
f(x) = (1-p)^(x-1)*p, for x=1,2,...
F(x) = 1-(1-p)^x, for x=1,2,...
E[X] = 1/p
V[X] = (1-p)/p^2 |
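A quick numerical check of the geometric formulas may help: the sketch below sums the PMF and compares it with the closed-form CDF. The function names and the value p = 0.2 are illustrative assumptions, not part of the sheet.

```python
def geometric_pmf(x: int, p: float) -> float:
    """P(X = x): the first success occurs on trial x (x = 1, 2, ...)."""
    return (1 - p) ** (x - 1) * p


def geometric_cdf(x: int, p: float) -> float:
    """P(X <= x) = 1 - (1-p)^x."""
    return 1 - (1 - p) ** x


p = 0.2  # assumed success probability, for illustration only

# The CDF at 5 should equal the sum of the PMF over x = 1..5.
partial_sum = sum(geometric_pmf(k, p) for k in range(1, 6))
print(partial_sum, geometric_cdf(5, p))

# Closed-form mean and variance from the sheet.
print("E[X] =", 1 / p, "V[X] =", (1 - p) / p**2)
```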
Continuous Variable
P(a<X<b) = ∫_a^b f(x)dx = F(b)-F(a)
f(x) = F'(x)
F(x) = P(X≤x) = ∫_[-∞]^x f(t)dt
E[X] = ∫x*f(x)dx
V[X] = ∫x^2*f(x)dx-E[X]^2
E[g(X)] = ∫g(x)f(x)dx
V[g(X)] = ∫(g(x))^2*f(x)dx-(E[g(X)])^2 |
Normal Distribution
f(x) = 1/√(2πσ^2)*e^(-(x-μ)^2/(2σ^2)), -∞<x<∞
X ~ N(μ, σ^2)
E[X] = μ
V[X] = σ^2 |
Box Plot
Describe histogram: skewness, uni/bimodal
Constructing Confidence Interval
P = Y/n
Y ~ b(n,p)
Z = (P-p)/√(p(1-p)/n) ~ N(0,1)
E = z_[α/2]√(p(1-p)/n) |
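As a rough sketch of how the interval for a proportion could be computed: the code below plugs the sample proportion into the margin E, which is the usual large-sample approximation and an assumption here, since p itself is unknown. The function name `proportion_ci` and the counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(y: int, n: int, confidence: float = 0.95):
    """Approximate interval p_hat ± z_[α/2]*sqrt(p_hat(1-p_hat)/n)."""
    p_hat = y / n                                # sample proportion P = Y/n
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)      # z_[α/2]
    margin = z * sqrt(p_hat * (1 - p_hat) / n)   # E, with p replaced by p_hat
    return p_hat - margin, p_hat + margin


# Made-up data: 40 successes in 100 trials.
low, high = proportion_ci(y=40, n=100)
print(round(low, 3), round(high, 3))
```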
Sample Correlation
r = cov/(s_x*s_y)
s_x and s_y are sample standard dev.
|
|
Permutations
n! = n(n-1)(n-2)*...*1 if n≥1
= 1 if n=0
nPr = n!/(n-r)! |
Variance
σ^2 = V[X] = Σx^2*f(x)-E[X]^2 |
Standard deviation = sqrt(V[X])
Joint Properties
E[g(X,Y)] = Σ_x Σ_y g(x,y)p(x,y)
E[X] = Σ_x x*p(x)
E[Y] = Σ_y y*p(y)
E[X+Y] = E[X]+E[Y]
Cov[X,Y] = (Σ_x Σ_y x*y*p(x,y))-E[X]E[Y]
V[X+Y] = V[X]+V[Y]+2Cov[X,Y] |
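The joint-PMF identities can be checked on a small table. This is a minimal sketch; the joint PMF values and the helper `expect` are made up for illustration.

```python
# Hypothetical joint PMF p(x, y), stored as {(x, y): probability}.
joint = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}


def expect(g):
    """E[g(X,Y)] as the double sum of g(x,y)*p(x,y) over the table."""
    return sum(g(x, y) * prob for (x, y), prob in joint.items())


ex = expect(lambda x, y: x)
ey = expect(lambda x, y: y)
cov = expect(lambda x, y: x * y) - ex * ey
var_x = expect(lambda x, y: x**2) - ex**2
var_y = expect(lambda x, y: y**2) - ey**2

# Variance of the sum computed directly, to compare with the identity.
var_sum = expect(lambda x, y: (x + y) ** 2) - expect(lambda x, y: x + y) ** 2
print(var_sum, var_x + var_y + 2 * cov)
```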
Poisson Distribution
X = # of events in time [0,1]
p(x) = e^(-μ)*μ^x/x!, for x=0,1,...
X ~ P(μ)
E[X] = V[X] = μ
Approximation: binomial f(x) ≈ p(x) with μ=np (n large, p small)
Process: between [0,t], μ=λt |
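To see how close the Poisson approximation to the binomial can be, the sketch below compares the two PMFs for an assumed case with n = 1000 and p = 0.003 (values chosen only for illustration).

```python
from math import comb, exp, factorial


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Binomial probability of x successes in n trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x: int, mu: float) -> float:
    """Poisson probability e^(-mu) * mu^x / x!."""
    return exp(-mu) * mu**x / factorial(x)


n, p = 1000, 0.003   # assumed: many trials, small success probability
mu = n * p           # matching Poisson mean

# Largest pointwise gap between the two PMFs over x = 0..10.
max_gap = max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, mu)) for x in range(11))
print(max_gap)
```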
Continuous Uniform Distribution
f(x) = 1/(b-a), a≤x≤b
= 0, elsewhere
X ~ U[a,b]
E[X] = (a+b)/2
V[X] = (b-a)^2/12 |
Sample Variance
s^2 = ((Σx_i^2)-n*x̄^2)/(n-1) |
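The shortcut formula for the sample variance can be compared against Python's `statistics.variance`, which also divides by n-1. The data values are made up.

```python
from statistics import mean, variance

data = [2.0, 4.0, 4.0, 5.0, 7.0]  # made-up sample
n = len(data)
x_bar = mean(data)

# Shortcut form: (sum of squares - n * mean^2) / (n - 1).
s2_shortcut = (sum(x**2 for x in data) - n * x_bar**2) / (n - 1)

# Library value for comparison (sample variance, n - 1 denominator).
s2_library = variance(data)
print(s2_shortcut, s2_library)
```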
CLT
Z = (X̄-μ)/(σ/√n)
X̄ ~ N(μ, σ^2/n) approx. for large n ⇒ Z ~ N(0,1) approx. |
Confidence Level
α = P(Z>z_α) = 1-Φ(z_α)
μ ∈ [x̄-E, x̄+E]
σ^2 known: E = z_[α/2]*σ/√n
σ^2 unknown: T = (X̄-μ)/(S/√n) ~ T(n-1)
P(T>t_[α,v]) = α; z_α = t_[α,∞]
E = t_[α/2,n-1]*s/√n
σ^2 unknown, n≥40: (X̄-μ)/(S/√n) ~ N(0,1)
E = z_[α/2]*s/√n
n≥((z_[α/2]*σ)/E)^2 |
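A minimal sketch of the σ-known interval and the sample-size bound, using `statistics.NormalDist` for z_[α/2]. The function names and all input numbers are illustrative assumptions. The t-based case would need t quantiles from a table or another library and is not shown.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_interval(x_bar: float, sigma: float, n: int, confidence: float = 0.95):
    """Interval [x_bar - E, x_bar + E] with E = z_[α/2]*σ/√n (σ known)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


def required_n(sigma: float, margin: float, confidence: float = 0.95) -> int:
    """Smallest integer n with n >= (z_[α/2]*σ/E)^2."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / margin) ** 2)


# Made-up inputs: sample mean 50, σ = 4, n = 64; target margin 0.5.
low, high = z_interval(x_bar=50.0, sigma=4.0, n=64)
n_needed = required_n(sigma=4.0, margin=0.5)
print(round(low, 2), round(high, 2), n_needed)
```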
|
|
Combinations
n = n_1*...*n_k
nCr = (n choose r) = n!/(r!(n-r)!) |
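Python's `math` module (3.8+) provides both counts directly. The short check below relates ordered and unordered selections for an assumed n = 5, r = 2.

```python
from math import comb, factorial, perm

n, r = 5, 2  # illustrative values

n_perm = perm(n, r)  # ordered selections: n!/(n-r)!
n_comb = comb(n, r)  # unordered selections: n!/(r!(n-r)!)

# Each unordered selection can be arranged in r! orders.
print(n_perm, n_comb, n_comb * factorial(r))
```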
Multiplication Rule
P(A∩B) = P(B|A)P(A) = P(A|B)P(B)
= P(A)P(B) if ind. |
Transformation
E[g(X)] = Σg(x)f(x)
V[g(X)] = [Σ(g(x))^2*f(x)]-(E[g(X)])^2 |
Bernoulli Trial
S = {success, failure}, with probabilities {p, q}, q = 1-p
p = P(I=1)
I ~ Ber(p)
E[I] = p
V[I] = p(1-p) |
Negative Binomial Distribution
X = # of trials until rth success
X ~ Nb(r,p)
f(x) = (x-1 choose r-1)*(1-p)^(x-r)*p^r, for x=r,r+1,...
E[X] = r/p
V[X] = r(1-p)/p^2 |
Erlang Distribution
T = time until rth outcome of Poisson process
F(x) = P(T≤x) = 1-P(T>x)
= 1-Σ_[k=0..r-1] e^(-λx)*(λx)^k/k!
E[T] = r/λ
V[T] = r/λ^2 |
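The Erlang CDF is one minus a finite Poisson sum, so it is easy to evaluate directly. In this sketch the function name and parameter values are assumptions; with r = 1 it should reduce to the exponential CDF.

```python
from math import exp, factorial


def erlang_cdf(x: float, r: int, lam: float) -> float:
    """P(T <= x): one minus the Poisson(λx) probability of fewer than r events."""
    if x <= 0:
        return 0.0
    tail = sum(exp(-lam * x) * (lam * x) ** k / factorial(k) for k in range(r))
    return 1 - tail


lam = 0.5  # assumed rate of the Poisson process
print(erlang_cdf(2.0, 1, lam))  # r = 1: same as 1 - e^(-λx)
print(erlang_cdf(2.0, 3, lam))  # waiting for the 3rd event takes longer
```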
Standardization Thm
Z = (X-E[X])/√(V[X])
F(x) = P(X≤x) = Φ((x-μ)/σ)
P(a<X<b) = F(b)-F(a) |
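Standardizing lets any normal probability be computed from Φ alone. The sketch below builds Φ from `math.erf` and evaluates P(a<X<b) for an assumed X ~ N(100, 10^2); the helper names are hypothetical.

```python
from math import erf, sqrt


def phi_cdf(z: float) -> float:
    """Standard normal CDF Φ(z), written in terms of the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def normal_prob(a: float, b: float, mu: float, sigma: float) -> float:
    """P(a < X < b) = Φ((b-μ)/σ) - Φ((a-μ)/σ)."""
    return phi_cdf((b - mu) / sigma) - phi_cdf((a - mu) / sigma)


# Probability of landing within one standard deviation of the mean.
prob = normal_prob(90, 110, mu=100, sigma=10)
print(round(prob, 4))
```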
Percentile
Rank of kth percentile: (n+1)*k/100 = m+p, 0≤p<1
kth percentile = y_m+p(y_[m+1]-y_m)
IQR = q_3-q_1 |
Median is 50th percentile
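The rank rule for percentiles can be sketched as below. Clamping to the smallest or largest observation when the rank falls outside 1..n is an assumption added here, as are the function name and the data.

```python
def percentile(data, k: float) -> float:
    """kth percentile using rank (n+1)*k/100 = m + p with linear interpolation."""
    y = sorted(data)
    n = len(y)
    rank = (n + 1) * k / 100
    m = int(rank)   # integer part of the rank
    p = rank - m    # fractional part, 0 <= p < 1
    if m < 1:       # assumption: clamp below the first observation
        return y[0]
    if m >= n:      # assumption: clamp above the last observation
        return y[-1]
    # y_m in the formula is 1-indexed, so it is y[m - 1] in Python.
    return y[m - 1] + p * (y[m] - y[m - 1])


data = [3, 1, 4, 1, 5, 9, 2, 6]  # made-up sample
median = percentile(data, 50)
iqr = percentile(data, 75) - percentile(data, 25)
print(median, iqr)
```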
Hypothesis
Null hyp. H_0: no change (μ = μ_0)
Alternative hyp. H_1: set according to the question
⇒ Test 1: μ ≠ μ_0; 2: μ > μ_0; 3: μ < μ_0
Confidence interval decision: reject H_0 for H_1 if μ_0 is not in confidence interval
Z_0 or T_0 decision:
σ^2 known: Z_0 = (X̄-μ_0)/(σ/√n) ~ N(0,1)
Test 1: reject if |z_0| > z_[α/2]; 2: z_0 > z_α; 3: z_0 < -z_α
σ^2 unknown: T_0 = (X̄-μ_0)/(S/√n) ~ T_[n-1]
Test 1: |t_0| > t_[α/2,n-1]; 2: t_0 > t_[α,n-1]; 3: t_0 < -t_[α,n-1]
Pop. distribution & σ^2 unknown, n large: use the σ^2-known Z_0 with S in place of σ
p-Value decision: reject if p-value < α
p-value = 2[1-Φ(|z_0|)], test 1 & z-value
= 1-Φ(z_0), test 2 & z-value
= Φ(z_0), test 3 & z-value
= 2P(T>|t_0|), test 1 & t-value
= P(T>t_0), test 2 & t-value
= P(T<t_0), test 3 & t-value |
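The z-based p-value rules for the three tests can be sketched as one function. The name `z_test`, the `alternative` labels, and the sample numbers are assumptions for illustration. The t-based versions need the t distribution, which is not in the standard library.

```python
from math import sqrt
from statistics import NormalDist


def z_test(x_bar: float, mu_0: float, sigma: float, n: int,
           alternative: str = "two-sided"):
    """Return (z_0, p-value) for a mean test with known σ."""
    z0 = (x_bar - mu_0) / (sigma / sqrt(n))
    cdf = NormalDist().cdf                 # Φ
    if alternative == "two-sided":         # test 1: μ ≠ μ_0
        p_value = 2 * (1 - cdf(abs(z0)))
    elif alternative == "greater":         # test 2: μ > μ_0
        p_value = 1 - cdf(z0)
    elif alternative == "less":            # test 3: μ < μ_0
        p_value = cdf(z0)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return z0, p_value


# Made-up data: sample mean 52 from n = 64, testing μ_0 = 50 with σ = 8.
z0, p_value = z_test(x_bar=52.0, mu_0=50.0, sigma=8.0, n=64)
reject = p_value < 0.05  # decision at α = 0.05
print(z0, round(p_value, 4), reject)
```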
|
|
Addition Rules
P(A∩B') = P(A)-P(A∩B)
P(A∪B) = P(A)+P(B)-P(A∩B)
P(A'∩B') = 1-P(A∪B)
P(A∪B∪C) = P(A)+P(B)+P(C)-P(A∩B)-P(A∩C)-P(B∩C)+P(A∩B∩C)
P(A_1∪...∪A_n) = 1-P(A_1'∩...∩A_n') |
Marginal PMF
p(x) = P(X=x) = Σ_y p(x,y)
p(y) = P(Y=y) = Σ_x p(x,y) |
Binomial Distribution
X = # of successes from n trials
X ~ b(n,p)
f(x) = (n choose x)*p^x*(1-p)^(n-x), for x=0,1,...,n
E[X] = np
V[X] = np(1-p) |
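As a sanity check on the binomial formulas, the sketch below computes the mean and variance directly from the PMF for an assumed n = 10, p = 0.3 and compares them with np and np(1-p).

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 10, 0.3  # illustrative values
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

total = sum(pmf)                                   # should be 1
mean_x = sum(x * f for x, f in enumerate(pmf))     # should match n*p
var_x = sum(x**2 * f for x, f in enumerate(pmf)) - mean_x**2
print(total, mean_x, var_x)
```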
Exponential Distribution
Waiting time
X ~ Exp(λ)
f(x) = λe^(-λx), x>0
F(x) = 1-e^(-λx), x>0
E[X] = 1/λ
V[X] = 1/λ^2
Lack of memory: P(X>s+t|X>s) = P(X>t) |
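The lack-of-memory property follows from P(X>x) = e^(-λx). A small numeric check, with assumed values for λ, s and t:

```python
from math import exp


def survival(x: float, lam: float) -> float:
    """P(X > x) = 1 - F(x) = e^(-λx) for an exponential waiting time."""
    return exp(-lam * x)


lam, s, t = 0.5, 2.0, 3.0  # illustrative values

# P(X > s+t | X > s) = P(X > s+t) / P(X > s)
conditional = survival(s + t, lam) / survival(s, lam)
print(conditional, survival(t, lam))
```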
Standard Normal Distribution
Z ~ N(0,1)
PDF: φ(z) = 1/√(2π)*e^(-z^2/2)
CDF: Φ(z) = P(Z≤z) = ∫_[-∞]^z φ(t)dt
Φ(0) = 0.5
P(Z≤-z) = P(Z≥z)
Φ(-z) = 1-Φ(z)
P(a≤Z≤b) = Φ(b)-Φ(a)
P(-a≤Z≤-b) = Φ(a)-Φ(b) |
Linear Combination
Y = Σc_i*X_i, X_i independent
Y ~ N(μ_Y, σ^2_Y) if each X_i is normal
E[Y] = Σc_i*E[X_i]
V[Y] = Σc_i^2*V[X_i]
X̄ = (1/n)ΣX_i
E[X̄] = μ
V[X̄] = σ^2/n |
Sample Covariance
cov = ((Σx_iy_i)-(Σx_i)(Σy_i)/n)/(n-1) |
Line of Best Fit
y = a+Bx
B = ((Σx_iy_i)-(Σx_i)(Σy_i)/n)/((Σx_i^2)-(Σx_i)^2/n) |
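The slope formula translates directly into sums over the data. In this sketch the intercept uses the standard least-squares relation a = ȳ-B*x̄, which is an addition assumed here; the function name and data points are made up.

```python
def fit_line(xs, ys):
    """Least-squares line y = a + B*x from raw sums."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    slope = (sxy - sx * sy / n) / (sxx - sx**2 / n)   # B
    intercept = sy / n - slope * sx / n               # a = ȳ - B*x̄ (assumed standard form)
    return intercept, slope


# Points lying exactly on y = 1 + 2x, so the fit should recover a = 1, B = 2.
a, B = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(a, B)
```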
|