
Foundations of Statistics Sec. 1 & 2 under Shirin Cheat Sheet (DRAFT)

Intended to assist those with a Computer Science (or otherwise unrelated to Statistics) background in the study and completion of the Foundations of Statistics modules with Shirin.

This is a draft cheat sheet. It is a work in progress and is not finished yet.

Basic Mathematical Symbols and Explanations

∑: Sum of the following set/set of values returned from a function. There's usually a variable name and assignment underneath it, and a limit above - this means you are summing the values returned by the function, with the value ranging from the bottom bound to the top bound.
int sum = 0;
for (int i = 0; i <= 10; i++) {
    sum += do_function(i);
}
∏: Product of the following set/set of values returned from a function. There's usually a variable name and assignment underneath it, and a limit above - this means you are getting the product of all the values returned by the function, with the value ranging from the bottom bound to the top bound.
int prod = 1;  /* must start at 1, not 0, or the product is always 0 */
for (int i = 0; i <= 10; i++) {
    prod *= do_function(i);
}
∀: For all/for every instance. Universal quantifier in predicate logic, i.e. the statement holds true in every case. Can be further expanded as ∀i (assuming i is defined in the previously stated function) followed by a set or function, which reads as "the statement holds true for all i in the following set/function".
ℝ: Real numbers, i.e. not imaginary numbers (the sqrt of a negative number) and not infinity. Integers, negatives, floats, doubles etc. are all considered "real numbers".
∫: Integral. Used for finding areas, volumes, central points etc. Not confident in my own summary, please follow this link: https://www.mathsisfun.com/calculus/integration-introduction.html
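For a programmer's intuition, an integral can be approximated by summing the areas of many thin rectangles under the curve. A rough C sketch of this idea (the midpoint rule; the function and bounds are my own illustrative choices):

#include <stdio.h>

/* Approximate the integral of f from a to b using n thin rectangles. */
double integrate(double (*f)(double), double a, double b, int n) {
    double width = (b - a) / n, area = 0.0;
    for (int i = 0; i < n; i++) {
        double mid = a + (i + 0.5) * width; /* midpoint of rectangle i */
        area += f(mid) * width;             /* height * width */
    }
    return area;
}

double square(double x) { return x * x; }

int main(void) {
    /* The integral of x^2 from 0 to 1 is exactly 1/3. */
    printf("%f\n", integrate(square, 0.0, 1.0, 1000));
    return 0;
}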
lim (a→x): The limit, i.e. the value the following function approaches as a approaches x. e.g.
lim (a→-∞) F(a) = 0
means that as a gets arbitrarily close to -∞, the value of F(a) gets arbitrarily close to 0, i.e. the function's lower limit is 0. Can be used to define upper limits with +∞, and limits for discrete variables by specifying their unique upper and lower bounds, e.g.
lim (a→6) f(a) = 1
where 0 <= a <= 6
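For example, take F(a) = 2^a: F(-1) = 0.5, F(-10) ≈ 0.001 and F(-100) ≈ 8×10⁻³¹. The values approach (but never reach) 0, so lim (a→-∞) F(a) = 0.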

Definitions, Properties, Rules and Laws

The Additivity Property
If A∩B = ∅ then P(A∪B) = P(A)+P(B)

If A∩B != ∅ then P(A∪B) = P(A)+P(B)-P(A∩B)

P(Aᶜ) = 1 - P(A)
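A quick worked example (my own) on one fair die roll: let A = {1,2} and B = {2,3}, so A∩B = {2} ≠ ∅ and P(A∪B) = 2/6 + 2/6 - 1/6 = 3/6 = 1/2, while for the disjoint events A = {1,2} and C = {5,6}, P(A∪C) = 2/6 + 2/6 = 2/3.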
The Multiplication Rule
P(A∩B) = P(A|B)P(B)
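e.g. drawing two cards without replacement: P(both aces) = P(2nd ace | 1st ace)P(1st ace) = (3/51)(4/52) = 1/221.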
The Law of Total Probability
Given disjoint events B1, B2, ..., Bm such that
B1 ∪ B2 ∪ ... ∪ Bm = Ω
(i.e. the union of all events B1 through Bm is the same as the entire sample space)
then the probability of a random/arbitrary event A is expressed as...
P(A) = Σ(i=1..m) P(A|Bi)P(Bi)
(i.e. the sum, over all the Bi, of the probability that Bi occurs and that A then occurs given Bi)
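A worked example (my own): pick one of two coins with P(B1) = P(B2) = 1/2, where B1 is fair and B2 has heads on both sides, and let A = "the flip lands heads". Then P(A) = P(A|B1)P(B1) + P(A|B2)P(B2) = (0.5)(0.5) + (1)(0.5) = 0.75.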
Bayes' Rule
Given disjoint events B1, B2, ..., Bm and
B1 ∪ B2 ∪ ... ∪ Bm = Ω
(i.e. the union of all events B1 through Bm is the same as the entire sample space)
then the conditional probability of Bi, given that a random/arbitrary event A occurs, is...
P(Bi|A) = P(A|Bi)P(Bi) / Σ(j=1..m) P(A|Bj)P(Bj)
(i.e. the probability of Bi given that A occurs is calculated by dividing the probability P(A∩Bi) <according to the multiplication rule> by the sum of the probabilities of A intersecting every event in the sample space <again according to the multiplication rule>)
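Continuing the two-coin example from the Law of Total Probability: given that the flip landed heads, the probability the two-headed coin was picked is P(B2|A) = P(A|B2)P(B2) / (P(A|B1)P(B1) + P(A|B2)P(B2)) = 0.5/0.75 = 2/3.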
Properties of the Probability Mass Function aka pmf
All probabilities are positive: fX(x) ≥ 0.

Any event in the distribution (e.g. "scoring between 20 and 30") has a probability of happening between 0 and 1 (e.g. 0% and 100%).

The sum of all probabilities is 100% (i.e. 1 as a decimal): Σ fX(x) = 1.

The probability of an event A is found by adding up the pmf values of the x's in A: P(X ∈ A) = Σ(x∈A) fX(x)
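A rough C sketch of this last property, using the pmf of a fair six-sided die (my own illustrative values):

#include <stdio.h>

int main(void) {
    /* pmf of a fair die: f(x) = 1/6 for x = 1..6 (index 0 unused) */
    double f[7] = {0, 1/6.0, 1/6.0, 1/6.0, 1/6.0, 1/6.0, 1/6.0};

    /* Event A = "roll is even" = {2,4,6}: P(X in A) = sum of f(x) for x in A */
    int A[] = {2, 4, 6};
    double p = 0.0;
    for (int i = 0; i < 3; i++)
        p += f[A[i]];

    printf("P(X in A) = %f\n", p); /* 0.5 */
    return 0;
}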
Properties of the Cumulative Distribution Function aka cdf
1. If a <= b then F(a) <= F(b), i.e. the cdf never decreases.*

2. F(a) is a probability: 0 <= F(a) <= 1, with
lim (a→+∞) F(a) = 1
lim (a→-∞) F(a) = 0
i.e. F(a) will never return a result bigger than 1 or smaller than 0.

3. F is right-continuous:
lim (b→0⁺) F(a+b) = F(a)

*a <= b implies that the event {X <= a} is contained in (a subset of) the event {X <= b}
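A rough C sketch of how a discrete cdf is built from a pmf and used to find the probability between two values (fair-die values again, purely illustrative):

#include <stdio.h>

int main(void) {
    /* pmf of a fair die, and its cdf F(x) = sum of f(k) for k <= x */
    double f[7] = {0, 1/6.0, 1/6.0, 1/6.0, 1/6.0, 1/6.0, 1/6.0};
    double F[7] = {0};
    for (int x = 1; x <= 6; x++)
        F[x] = F[x - 1] + f[x]; /* cumulative sum: non-decreasing, ends at 1 */

    /* P(2 < X <= 5) = F(5) - F(2) */
    printf("P(2 < X <= 5) = %f\n", F[5] - F[2]); /* 3/6 = 0.5 */
    return 0;
}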
Properties of Expectation aka E(X)
E(aX) = aE(X) for any constant a
E(XY) = E(X)E(Y) when X and Y are independent
E(a+bX) = a+bE(X) (linearity)
E(X+Y) = E(X)+E(Y) (linearity)
E[Σ(i=1..n) Xi] = Σ(i=1..n) E[Xi]
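e.g. for one fair die with E(X) = 3.5: E(2X + 1) = 2(3.5) + 1 = 8, and for two independent dice E(X1 + X2) = 3.5 + 3.5 = 7.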
Properties of Variance aka Var(X)
Var(aX) = a²Var(X) for any constant a
Var(a+X) = Var(X) for any constant a
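e.g. tripling a die roll scales the spread, Var(3X) = 9Var(X), while shifting it changes nothing, Var(X + 10) = Var(X).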
 

Probability/Statistics & Set Notation

P(A)
Probability of event A occurring. (Number of ways event A can occur / number of total possible outcomes)
Ω
Sample Space/Universe. P(Ω) = 1
∅
Empty/Null set
P(A∩B)
Probability of A Intersection B
Disjoint/Mutually Exclusive
If A∩B = ∅ then A and B are disjoint (mutually exclusive). Note: disjoint is not the same as independent; A and B are independent when P(A∩B) = P(A)P(B).
P(A∪B)
If A and B are disjoint, P(A∪B) = P(A) + P(B)

If not disjoint, P(A∪B) = P(A) + P(B) - P(A∩B)
Aᶜ
A complement. Everything outside A. P(Aᶜ) = 1 - P(A)
A∈B / A∉B
A is an element of B / A is not an element of B
A: A ∈ B
A such that A is an element of B
n! aka Permutations
Counting method where ORDER matters. n! = n(n-1)(n-2)...(2)(1); the number of ordered arrangements of k items drawn from n is n!/(n-k)! = n(n-1)...(n-k+1), where k = sample size
(n k) aka Combinations
Counting method where order does not matter. (n k) = n!/(k!(n-k)!)
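A rough C sketch of these counting functions (function names are my own; real code should watch for overflow at larger n):

#include <stdio.h>

/* n! = n * (n-1) * ... * 2 * 1 */
unsigned long long factorial(int n) {
    unsigned long long f = 1;
    for (int i = 2; i <= n; i++)
        f *= i;
    return f;
}

/* Ordered arrangements of k items from n: n!/(n-k)! */
unsigned long long permutations(int n, int k) {
    return factorial(n) / factorial(n - k);
}

/* Unordered selections of k items from n: n!/(k!(n-k)!) */
unsigned long long combinations(int n, int k) {
    return factorial(n) / (factorial(k) * factorial(n - k));
}

int main(void) {
    printf("5P2 = %llu\n", permutations(5, 2)); /* 20 */
    printf("5C2 = %llu\n", combinations(5, 2)); /* 10 */
    return 0;
}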
P(A|B) aka Conditional Probability
The probability of A happening, given that B occurs.

If A and B are independent then P(A|B) = P(A), as B has no effect on A. (If A and B are disjoint, they cannot both occur, so P(A|B) = 0.)

In general, for P(B) > 0, P(A|B) = P(A∩B)/P(B); this is how B's effect on the chances of A is computed when they are dependent.

P(A|B) + P(Aᶜ|B) = 1
P(Bi|A)P(A) = P(A|Bi)P(Bi)
Both sides equal P(A∩Bi) by the multiplication rule; rearranging for P(Bi|A) and expanding P(A) with the law of total probability gives Bayes' rule.
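A quick worked example on one fair die: A = "roll a 6", B = "roll is even". P(A|B) = P(A∩B)/P(B) = (1/6)/(1/2) = 1/3, and indeed P(A|B) + P(Aᶜ|B) = 1/3 + 2/3 = 1.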
Independence of more than 2 events
Events A1, A2, ..., Am are independent if
P(A1 ∩ A2 ∩ ... ∩ Am) = P(A1)P(A2)...P(Am)
(i.e. they are independent events if the probability of their joint intersection equals the product of their individual probabilities; strictly, this must also hold for every sub-collection of the events)

A and B are independent, and B and C are independent. This does not mean that A and C are independent, nor does it mean they must be dependent.
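A standard illustration: flip two fair coins and let A = "first coin is heads", B = "second coin is heads", C = "both coins show the same face". Every pair is independent (each pairwise intersection has probability 1/4 = 1/2 × 1/2), yet P(A∩B∩C) = 1/4 ≠ 1/8 = P(A)P(B)P(C), so the three are not mutually independent.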
Random variable aka rv
Any variable whose value is not known prior to the experiment and is subject to chance, aka variability.
Has an associated probability, aka mass.

An rv is a type of mapping function over the whole sample space and is associated with measure theory, i.e. an rv can transform the sample space.
Discrete
There is a countable set of possible outcomes.
Discrete Random Variable
Any function X: Ω→ℝ that takes on some value, e.g. X could be S = sum or M = max applied to a sample space, taking the sum/max of each experiment outcome and constructing a new sample space out of it.
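A rough C sketch of an rv as a mapping, using S = sum of two fair dice: every outcome (d1, d2) in Ω is mapped to a number, and tallying the mapped values builds the new distribution (this is exactly the pmf discussed next):

#include <stdio.h>

int main(void) {
    /* Omega = all 36 equally likely outcomes (d1, d2); S maps each to d1+d2 */
    double pmf[13] = {0}; /* pmf[s] = P(S = s) for s = 2..12 */
    for (int d1 = 1; d1 <= 6; d1++)
        for (int d2 = 1; d2 <= 6; d2++)
            pmf[d1 + d2] += 1.0 / 36.0;

    for (int s = 2; s <= 12; s++)
        printf("P(S = %2d) = %f\n", s, pmf[s]);
    return 0;
}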
Probability Mass Function aka pmf
The pmf of some discrete rv X. Essentially a table/graph displaying the probabilities of all possible values our discrete rv can take. Please refer to "Properties of the Probability Mass Function aka pmf" for more details. Explained here: http://www.statisticshowto.com/probability-mass-function-pmf/
Cumulative Distribution Function aka cdf
The cdf of some discrete rv can be used to determine the probability of values occurring above, below or between given values. Please refer to "Properties of the Cumulative Distribution Function aka cdf" for more details. Explained here: http://www.statisticshowto.com/cumulative-distribution-function/
Continuous
An infinite number of possible values.
Continuous Random Variables
A function X: Ω→ℝ that can take on any value a∈ℝ.
Mass/associated probability is no longer considered for each possible value of X; instead consider the likelihood that X∈(a,b) for a<b.
Probability Density Function aka pdf
The pdf f(x) of a continuous rv X is an integrable function such that...
P(a <= X <= b) = ∫(a to b) f(x)dx
i.e. it is the area under the curve between points a and b, and therefore the probability of a range of values occurring, subject to the conditions on f:
f(x) >= 0 ∀ x∈Ω
∫(-∞ to +∞) f(x)dx = 1, i.e. the complete area under the curve contains all outcomes.

The cdf F(x) of a continuous rv is defined by the formula...
F(x) = ∫(-∞ to x) f(u)du = P(X <= x)
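e.g. for the uniform pdf f(x) = 1 on [0,1] (and 0 elsewhere): P(0.2 <= X <= 0.5) = ∫(0.2 to 0.5) 1 dx = 0.3, and the total area ∫(0 to 1) 1 dx = 1, as required.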
Expectation aka E(X)
The expected value of a random variable. This is found using the following formula when our rv is discrete:
E(X) = Σ(xi∈Ω) xi p(xi)

and the following formula when the rv is continuous:
E(X) = ∫ x f(x)dx

To make this easier to understand: the expected value is simply the mean, calculated as the sum of (each possible value multiplied by its individual probability), i.e. the sum of values weighted by their probabilities.
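A rough C sketch of the discrete formula for a fair die, where E(X) = 3.5:

#include <stdio.h>

int main(void) {
    /* E(X) = sum over all xi of xi * p(xi), here for a fair die */
    double p[7] = {0, 1/6.0, 1/6.0, 1/6.0, 1/6.0, 1/6.0, 1/6.0};
    double ex = 0.0;
    for (int x = 1; x <= 6; x++)
        ex += x * p[x]; /* each value weighted by its probability */
    printf("E(X) = %f\n", ex); /* 3.5 */
    return 0;
}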
Variance aka Var(X)
A method of measuring how far the actual value of an rv may be from the expected value. Given a discrete rv X, the formula is:
Var(X) = Σ(xi∈Ω) xi² p(xi) - (Σ(xi∈Ω) xi p(xi))²

Or, given a continuous rv, use the formula:
Var(X) = ∫ x² f(x)dx - (∫ x f(x)dx)²

In other words, we sum up (the squared values multiplied by their individual probabilities) and finally deduct the squared expected value, i.e. Var(X) = E(X²) - E(X)².
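Extending the expectation sketch above to the discrete variance formula (for a fair die, Var(X) = 35/12 ≈ 2.9167):

#include <stdio.h>
#include <math.h>

int main(void) {
    double p[7] = {0, 1/6.0, 1/6.0, 1/6.0, 1/6.0, 1/6.0, 1/6.0};
    double ex = 0.0, ex2 = 0.0;
    for (int x = 1; x <= 6; x++) {
        ex  += x * p[x];     /* E(X)   */
        ex2 += x * x * p[x]; /* E(X^2) */
    }
    double var = ex2 - ex * ex; /* Var(X) = E(X^2) - E(X)^2 */
    printf("Var(X) = %f\n", var);      /* 2.9167 */
    printf("sd(X) = %f\n", sqrt(var)); /* 1.7078; see Standard Deviation */
    return 0;
}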
Standard Deviation
Another measure, similar to variance, of how far a distribution spreads from the mean, i.e. the actual value vs the expected value. Simply calculated as sqrt(Var(X)). The benefit is that it is expressed in the same units as X, rather than in squared units as variance is.
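e.g. for the fair die above, sd(X) = sqrt(35/12) ≈ 1.71, measured in the same units (pips) as the roll itself.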