The population is the collection of units (people, objects or anything else) that researchers are interested in knowing about. The number of individuals in a population is called the population size. A population may be finite (the number of units is known exactly) or infinite (the number of units is unknown or cannot be counted).
A sample is a smaller collection of units selected from the population, i.e. a finite subset of the individuals in a population. The number of individuals in a sample is called the sample size.
Parameters are the terms that describe the characteristics of a population; statistics are the terms that describe the characteristics of a sample.
Population versus sample:
Definition: collection of all items under study (population); part or portion of the population chosen for study (sample)
Size: N (population); n (sample)
Mean: μ (population); x̄ (sample)
Standard deviation: σ (population); S (sample)
Correlation coefficient: ρ (population); r (sample)
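The difference between the population symbols (N, μ, σ) and the sample symbols (n, x̄, S) can be illustrated with a small sketch. The population of integers 1 to 100, the sample size of 10 and the seed are hypothetical values chosen only for illustration. The notes do not say which divisor S uses; the n − 1 divisor below is an assumption.

```python
import random
import statistics

# Hypothetical population: the integers 1..100 stand in for a measured variable.
population = list(range(1, 101))
N = len(population)

# Population values, computed from every unit.
mu = statistics.mean(population)
sigma = statistics.pstdev(population)  # population SD, divides by N

# Draw a sample of size n without replacement; the seed is arbitrary.
rng = random.Random(0)
n = 10
sample = rng.sample(population, n)

# Sample values, computed from the sample only.
x_bar = statistics.mean(sample)
S = statistics.stdev(sample)  # sample SD, divides by n - 1 (assumption)

print(f"N = {N}, mu = {mu}, sigma = {sigma:.2f}")
print(f"n = {n}, x_bar = {x_bar}, S = {S:.2f}")
```

The sample values change from one sample to the next, while the population values stay fixed.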
In a census, data are collected for each and every unit (person, household, shop, organization, etc.) of the population.
In a sample survey, only a part of the population is studied instead of every unit, and conclusions about the entire population are drawn on that basis.
The sampling frame (source list) is the list of items in the population (universe) from which the sample is to be drawn.
Define the population
Specify the sampling frame
Specify the sampling unit
Select the sampling method
Determine the sample size
Specify the sampling plan
Select the sample
Sample design is the technique or procedure the researcher adopts in selecting items for the sample. It is determined before data are collected and must consider:
the sampling frame
the technique for selecting the sample
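Types of sampling:
Random (probability) sampling: simple random, stratified, systematic, multistage and cluster sampling. Mnemonic: "Simple Stratified System of Multistage Cluster".
Non-random (non-probability) sampling: judgment, snowball, convenience and quota sampling. Mnemonic: "John Snow Convinces Queen".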
Simple Random Sampling
Every individual or item from a frame has the same chance of selection as every other individual or item.
n is used to represent the sample size and N is used to represent the frame size.
Every item in the frame is numbered from 1 to N. The chance that any particular member of the frame is selected on the first draw is 1/N.
Random sample can be obtained by any of the following methods:
Random number method
Random number generator (available in various software packages)
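As a minimal sketch of the random number generator method, the snippet below numbers a frame from 1 to N and draws n items without replacement, so that every item has the same chance of selection. The function name simple_random_sample, the frame size of 50 and the seed are hypothetical choices for illustration.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n items from the frame without replacement, each equally likely."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical frame: items numbered 1..N.
N = 50
frame = list(range(1, N + 1))

sample = simple_random_sample(frame, n=5, seed=42)
print(sample)
```

Fixing the seed makes the draw reproducible; leaving it as None gives a different sample on each run.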
Stratified Random Sampling
used when samples have to be selected from a heterogeneous population, e.g. one made up of males and females, or of educated and uneducated people.
the population is divided into subgroups, or strata, and a simple random sample is taken from each stratum.
each stratum is internally homogeneous but differs from the other strata.
sampling can be either proportionate or disproportionate.
increases a sample’s statistical efficiency.
provides adequate data for analyzing the various subpopulations or strata
enables different research methods and procedures to be used in different strata.
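A proportionate stratified sample can be sketched as follows: group the units by stratum, then take a simple random sample from each stratum in proportion to its size. The function name, the "F"/"M" strata and their sizes are hypothetical. The rounding rule (with at least one unit per stratum) is an assumption, so the total can differ slightly from n when the proportions do not divide evenly.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, n, seed=None):
    """Proportionate stratified sample: simple random sample within each stratum."""
    rng = random.Random(seed)

    # Group the units by stratum.
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    total = len(units)
    sample = []
    for members in strata.values():
        # Allocation proportional to stratum size (rounded, at least 1).
        k = max(1, round(n * len(members) / total))
        sample.extend(rng.sample(members, min(k, len(members))))
    return sample

# Hypothetical population: 60 units in stratum "F" and 40 in stratum "M".
units = [("F", i) for i in range(60)] + [("M", i) for i in range(40)]

sample = stratified_sample(units, lambda u: u[0], n=10, seed=1)
counts = {s: sum(1 for u in sample if u[0] == s) for s in ("F", "M")}
print(counts)
```

A disproportionate design would replace the proportional allocation with stratum sizes chosen by the researcher.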
Systematic Random Sampling
random selection of the first item, followed by the selection of every kth item after it.
The interval k is fixed by dividing the population size by the sample size.
k = Size of population / Size of sample required = N/n
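The interval rule can be sketched in a few lines: compute k = N/n, pick a random start within the first interval, then take every kth item. The function name and the frame of 100 numbered items are hypothetical. Using the integer part of N/n when the division is not exact is an assumption.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Random start in the first interval, then every kth item."""
    N = len(frame)
    if not 0 < n <= N:
        raise ValueError("n must be between 1 and the frame size")
    k = N // n  # sampling interval k = N/n (integer part, an assumption)
    rng = random.Random(seed)
    start = rng.randrange(k)  # 0-based position within the first interval
    return [frame[start + i * k] for i in range(n)]

# Hypothetical frame: items numbered 1..100, sample size 10, so k = 10.
frame = list(range(1, 101))
sample = systematic_sample(frame, n=10, seed=3)
print(sample)
```

With N = 100 and n = 10, the selected items are always 10 apart, whatever the random start.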
Cluster Sampling
involves dividing the population into non-overlapping areas or clusters.
in contrast to stratified random sampling, where each stratum is internally homogeneous, cluster sampling identifies clusters that tend to be internally heterogeneous.
each cluster contains a wide variety of elements and is a miniature, or microcosm, of the population, e.g. cities, homes, colleges, etc.
Often clusters are naturally occurring groups of the population
Two of the foremost advantages are convenience and cost.
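A minimal sketch of one-stage cluster sampling, assuming that every element of each selected cluster is surveyed: choose whole clusters at random, then keep all of their elements. The college names and student labels are hypothetical.

```python
import random

def cluster_sample(clusters, m, seed=None):
    """Select m whole clusters at random and keep every element in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), m)
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units

# Hypothetical clusters: colleges and the students enrolled in each.
clusters = {
    "college_A": ["a1", "a2", "a3"],
    "college_B": ["b1", "b2"],
    "college_C": ["c1", "c2", "c3", "c4"],
    "college_D": ["d1", "d2", "d3"],
}

chosen, units = cluster_sample(clusters, m=2, seed=7)
print(chosen, units)
```

Only the list of clusters is needed up front, not a frame of every individual, which is where the convenience and cost advantages come from.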
Multistage Sampling
a further development of the principle of cluster sampling.
first selecting the clusters and then selecting a specified number of elements from each selected cluster is known as sub-sampling or two-stage sampling.
the clusters that form the units of sampling at the first stage are called first stage units (fsu) or primary sampling units (psu).
the elements within clusters are called second stage units (ssu).
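Two-stage sampling can be sketched by adding a second random draw inside each selected cluster. The function name, the four hypothetical clusters of ten elements each, and the choice of simple random sampling at both stages are assumptions for illustration.

```python
import random

def two_stage_sample(clusters, m, per_cluster, seed=None):
    """Stage 1: pick m clusters (psu). Stage 2: pick elements (ssu) within each."""
    rng = random.Random(seed)
    psus = rng.sample(sorted(clusters), m)
    result = {}
    for psu in psus:
        ssus = clusters[psu]
        # Take the specified number of elements, or all of them if fewer exist.
        result[psu] = rng.sample(ssus, min(per_cluster, len(ssus)))
    return result

# Hypothetical clusters, each holding ten elements.
clusters = {
    name: [f"{name}_{i}" for i in range(10)]
    for name in ("north", "south", "east", "west")
}

two_stage = two_stage_sample(clusters, m=2, per_cluster=3, seed=5)
print(two_stage)
```

Compared with keeping every element of a selected cluster, the second stage reduces the number of elements surveyed per cluster.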
Judgment Sampling
elements for the sample are chosen by the judgment of the researcher.
professional researchers might believe they can select a more representative sample than the random process will provide.
saves time and money.
The sampling error cannot be determined objectively because probabilities are based on nonrandom selection.
Example: Market selection for the construction of consumer price index
Snowball Sampling
subjects are selected based on referrals from other survey respondents.
The researcher identifies a person who fits the profile of subjects wanted for the study. The researcher then asks this person for the names and locations of others who also fit the profile of subjects wanted for the study.
particularly useful when subjects are difficult to locate
survey subjects can be identified cheaply and efficiently
main disadvantage is that it is nonrandom
Convenience Sampling
elements for the sample are selected for the convenience of the researcher.
the researcher typically chooses elements that are readily available, nearby or willing to participate.
For example, a convenience sample of homes for door-to-door interviews might include houses where people are at home, houses with no dogs, houses near the street, first-floor apartments and houses with friendly people.
If a research firm is located in a mall, a convenience sample might be selected by interviewing only shoppers who pass the shop and look friendly.
Quota Sampling
Certain population subclasses, such as age group, gender or geographical region, are used as strata.
instead of randomly sampling from each stratum, the researcher uses a nonrandom sampling method to gather data from each stratum until the desired quota of samples is filled.
a quota is based on the proportions of the subclasses in the population.
an interviewer would begin by asking a few filter questions; if the respondent represents a subclass whose quota has been filled, the interviewer would terminate the interview.
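The filter-question procedure can be sketched as a loop over respondents in the order they are encountered: each respondent is accepted only while the quota for their subclass is still open. The function name, the age groups, the quotas and the respondent labels are hypothetical.

```python
def quota_sample(respondents, subclass_of, quotas):
    """Accept respondents in arrival order until every subclass quota is filled."""
    remaining = dict(quotas)
    accepted = []
    for person in respondents:
        group = subclass_of(person)  # answer to the filter question
        if remaining.get(group, 0) > 0:
            accepted.append(person)
            remaining[group] -= 1
        # Otherwise the quota is already filled and the interview is terminated.
        if not any(remaining.values()):
            break  # all quotas filled
    return accepted

# Hypothetical stream of respondents as (age group, label) pairs.
respondents = [
    ("18-30", "p1"), ("18-30", "p2"), ("18-30", "p3"),
    ("31+", "p4"), ("31+", "p5"),
]
quotas = {"18-30": 2, "31+": 1}

accepted = quota_sample(respondents, lambda p: p[0], quotas)
print(accepted)
```

No random draw appears anywhere in the loop: who ends up in the sample depends entirely on the order in which respondents are met, which is why quota sampling is a non-random method.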