Population and Sample
What is it called when we use characteristics of a sample to describe a ‘population’?
Design- vs Model-based Inference
What are important differences according to Thompson and Hankin et al.?
Benefits or costs of either?
Design vs. Sampling
Experimental Design:
Deliberately perturbing part of a ‘population’ to compare its effect with a part that was not perturbed
Sampling Design:
The process of obtaining a representative sample to characterize a ‘population’ without necessarily perturbing it.
Sample Language
A sample (height in inches):
\(\textbf{y} = [69, 54, 72, 61, 58, 71]\)
A sample unit:
\(y_{2} =54\)
Sample size:
\(n = 6\)
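The same vocabulary can be written out in R. This is a minimal sketch using the height values from the slide; the object names `y` and `n` simply mirror the notation above.

```r
# A sample of heights in inches (values from the slide)
y = c(69, 54, 72, 61, 58, 71)

# A sample unit: the second element of the sample
y[2]

# Sample size: the number of sample units
n = length(y)
n
```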
Common Sample Units
plots/quadrats - small geographic areas used to measure/count plants, seeds, insects, etc.
points - measurements are taken from a set of points established throughout a population
transects - straight-line segments
individual organisms - the organism is the sample unit or the organism defines the location of the sample unit
More Language
What is a statistic?
An estimate of a population parameter from a sample
\[\hat{\mu} = \left(\left(\sum_{i=1}^{n}y_{i}\right)\times \frac{1}{n}\right) = 4.1\]
\(n\) is a sample parameter (the number of sample units in the sample)
\(\hat{\mu}\) is an estimate of the population parameter \(\mu\), produced by the estimator (the mathematical rule for the calculation)
4.1 is a statistic (the specific value computed from this sample)
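To make the estimator/statistic distinction concrete, here is a small sketch that writes the estimator as an R function and applies it to the height sample from the "Sample Language" slide. The function name `mu.hat.fn` is made up for illustration, and the result is for the height data, not the unshown sample that produced 4.1.

```r
# The estimator: a rule that turns any sample into an estimate of the mean
mu.hat.fn = function(y) {
  sum(y) * (1 / length(y))
}

# The height sample (inches) from the "Sample Language" slide
y = c(69, 54, 72, 61, 58, 71)

# The statistic: the specific value the estimator returns for this sample
mu.hat = mu.hat.fn(y)
mu.hat
```

The function is the rule; `mu.hat` is the number that rule produces for one particular sample.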
More Language
\[\mu =\left(\sum_{i=1}^{N}y_{i}\right)\times \frac{1}{N} = 4\]
\(\mu\) is a population parameter (measure of central tendency)
\(N\) is a population parameter (the total number of possible sample units)
\(4\) is the value of the population parameter
Sampling Error
The difference between a sample statistic (specific value) and the true value of a population parameter
Sampling error: 4.1 - 4 = 0.1
Due solely to incomplete enumeration of the population (chance)
A large sample size protects against it
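The claim that larger samples protect against sampling error can be checked with a short simulation. This is a sketch under stated assumptions: the population `pop` is hypothetical (gamma-distributed values invented for illustration), and the sample sizes and number of repetitions are arbitrary choices.

```r
set.seed(1)

# Hypothetical population: 5000 values from an assumed gamma distribution
pop = rgamma(5000, shape = 2, scale = 100)
mu = mean(pop)

# For each sample size, average the absolute sampling error over 2000 samples
sizes = c(10, 50, 200)
mean.abs.error = sapply(sizes, function(n) {
  errors = replicate(2000, mean(sample(pop, n)) - mu)
  mean(abs(errors))
})
names(mean.abs.error) = sizes
mean.abs.error
```

The typical size of the sampling error should shrink as `n` grows, although any single sample can still land far from the truth by chance.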
Sampling Bias
A systematic tendency to select certain sample units, which makes the samples unrepresentative of the target population
Examples in fish and wildlife?
Sampling Variation and Error
Target Population: Weight of all black bears in a region
How would you describe a sampling frame relevant to this target population?
Sampling Variation and Error
Sampling Error
Vector of bear weights
[1] 543.55183 70.43038 343.21377 143.64493 268.94786
# Sample size
n = 50
# Draw one sample of n bears without replacement and estimate the mean
sample.1 = mean(sample(pop.weights, n, replace = FALSE))
# Sampling error: sample mean minus the true population mean
sample.1 - pop.mu.weights
Sampling Variation vs Error
Sampling variation is the process and sampling error is an outcome.
The differences between samples (sampling variation) lead to differences between sample statistics and population parameters (sampling error).
Sampling Variation
Calculate many, many sample means
# Create a function that samples n units and takes the mean
sample.mean.fn = function(target, n) {
  mean(sample(target, n))
}
# Repeat the above function 20000 times
set.seed(54343)
mu.hat = replicate(20000, sample.mean.fn(pop.weights, n))
Sampling Variation
Sampling Bias
We should be very interested in the characteristics of the sampling distribution and error.
Expected Bias = average sample mean - population mean
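The expected bias defined above can be approximated from a simulated sampling distribution. The sketch below is self-contained, so it builds its own hypothetical `pop.weights` (assumed gamma-distributed, not the lecture's bear data) and uses fewer repetitions than the slides to keep it quick.

```r
set.seed(54343)

# Hypothetical population of weights (assumption: gamma-distributed)
pop.weights = rgamma(5000, shape = 2, scale = 100)
pop.mu.weights = mean(pop.weights)

# Sampling distribution of the sample mean for n = 50
n = 50
mu.hat = replicate(5000, mean(sample(pop.weights, n)))

# Expected bias = average sample mean - population mean
expected.bias = mean(mu.hat) - pop.mu.weights
expected.bias

# The same quantity relative to the population mean
expected.bias / pop.mu.weights
```

With simple random sampling and the ordinary sample mean, the expected bias should be close to zero relative to the size of the population mean.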
Sampling Variation
We can summarize the sampling variation into a probability
For n = 50, how likely is it that I’ll be within 10% of the true population parameter?
# Bounds at 10% below and above the true population mean
lower = pop.mu.weights - pop.mu.weights * 0.10
upper = pop.mu.weights + pop.mu.weights * 0.10
# Proportion of the sample means that fall within the bounds
mean(mu.hat > lower & mu.hat < upper)
Sampling Bias
Sampled Population: Weight of harvested black bears in a region that allows food provisioning
Sampling Error and Bias
Sampling Bias
We sample only harvested bears that had access to supplemental food
Sampling Bias
Relative Expected Bias = \(\frac{E(\hat{\mu})-\mu}{\mu}\)
Relative Expected Bias = 0.23
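A simulation shows how a mismatch between the sampled and target populations produces this kind of bias. Everything in the sketch is a hypothetical assumption made for illustration: 30% of bears use provisioned food, those bears are 40% heavier, and only they can be sampled. The result will not match the 0.23 reported above.

```r
set.seed(2)

# Hypothetical target population of N bear weights
N = 5000
provisioned = rbinom(N, 1, 0.3) == 1            # assumed 30% use provisioned food
weights = rgamma(N, shape = 2, scale = 100) *
  ifelse(provisioned, 1.4, 1)                   # assumed 40% heavier if provisioned
mu = mean(weights)

# Sampled population: only provisioned (harvested) bears can be selected
sampled.pop = weights[provisioned]

# Sampling distribution of the mean when sampling from the wrong population
mu.hat = replicate(5000, mean(sample(sampled.pop, 50)))

# Relative expected bias = (E(mu.hat) - mu) / mu
rel.bias = (mean(mu.hat) - mu) / mu
rel.bias
```

Increasing the sample size would not remove this bias, because every sample is drawn from the same unrepresentative subset.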
Biased Estimator
We sample all bears but use a different estimator for the population mean
\[
\hat{\mu} = \left(\sum_{i=1}^{n}\frac{(y_{i})^{0.91}}{1.3}\right)\times \frac{1}{n^{1/2}}
\]
Biased Estimator
Expected Bias = 215.14
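The biased estimator from the formula above can be written as an R function and evaluated by simulation. This sketch uses a hypothetical `pop.weights` (assumed gamma-distributed), so the bias it reports will differ from the 215.14 shown; the function name `biased.mean.fn` is also made up for illustration.

```r
set.seed(4)

# Hypothetical population of weights (assumption: gamma-distributed)
pop.weights = rgamma(5000, shape = 2, scale = 100)
pop.mu.weights = mean(pop.weights)

# The alternative estimator from the slide:
# sum(y^0.91 / 1.3) divided by the square root of n
biased.mean.fn = function(y) {
  sum(y^0.91 / 1.3) / sqrt(length(y))
}

# Representative (simple random) samples, but a different estimator
n = 50
mu.hat.biased = replicate(5000, biased.mean.fn(sample(pop.weights, n)))

# Expected bias = average estimate - population mean
expected.bias = mean(mu.hat.biased) - pop.mu.weights
expected.bias
```

Here the sampling is unbiased and the bias comes entirely from the estimator, which is the contrast with the harvested-bear example.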
Malicious Sampling Bias
Sample fish in streams.
There are 200 streams available to sample. What is this list called?
Let’s consider a situation where we don’t take an equal probability sample
Probability of sampling an occupied cell
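An unequal probability sample of the 200 streams can be sketched with the `prob` argument of `sample()`. The occupancy rate, the threefold preference for occupied streams, and the sample size of 20 are all hypothetical choices for illustration, not values from the lecture.

```r
set.seed(3)

# Hypothetical frame of 200 streams; 1 = occupied by fish, 0 = unoccupied
n.streams = 200
occupied = rbinom(n.streams, 1, 0.4)
psi = mean(occupied)                      # true proportion occupied

# Assumption: occupied streams are three times as likely to be selected
p.select = ifelse(occupied == 1, 3, 1)

# Estimate the proportion occupied from 20 streams, repeated 5000 times
est.equal = replicate(5000, mean(sample(occupied, 20)))
est.unequal = replicate(5000, mean(sample(occupied, 20, prob = p.select)))

# Compare the truth with the average estimate under each design
c(truth = psi, equal = mean(est.equal), unequal = mean(est.unequal))
```

When selection probability depends on occupancy and the estimator ignores that, the naive proportion overstates how many streams are occupied.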