| Pond | egg.mass | strata |
|---|---|---|
| A | 2 | 1 |
| B | 6 | 1 |
| C | 8 | 1 |
| D | 10 | 2 |
| E | 10 | 2 |
| F | 12 | 2 |
Can we improve on the unbiased estimator from simple random sampling (SRS)?
Yes: we can reduce the variance of our estimates!
How? Break our sampling frame into homogeneous parts (strata).


What strata are used in fish and wildlife studies?
We sample \(y_{hi}\), where strata are indexed \(h = 1, \dots, L\) and units within stratum \(h\) are indexed \(i = 1, \dots, n_h\).
\[ \bar{y}_{h} = \hat{\mu}_{h} = \frac{1}{n_h} \sum_{i=1}^{n_h} y_{hi} \]
\[ \hat{\mu}_{st} = \frac{1}{N} \sum_{h=1}^{L} N_{h}\hat{\mu}_{h} \]
\[ \hat{\sigma}^2_{h} = \frac{1}{n_h -1} \sum_{i=1}^{n_h}\left(y_{hi}-\hat{\mu}_{h}\right)^2 \]
\[ \hat{\sigma}^2_{\hat{\mu},st} = \sum_{h=1}^{L} \left(\frac{N_h}{N}\right)^2 \frac{N_h-n_h}{N_h}\frac{\hat{\sigma}^2_{h}}{n_h} \]
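The estimators above can be sketched in Python. This is a minimal illustration, not code from the slides: the function names `stratified_mean` and `stratified_variance` and the dictionary layout are assumptions made for the example, and the sample values are ponds A and B (stratum 1) and D and E (stratum 2) from the egg-mass table.

```python
from statistics import mean, variance

def stratified_mean(samples, N_h):
    """Stratified estimate of the population mean.

    samples: dict mapping stratum label -> list of sampled values
    N_h:     dict mapping stratum label -> number of units in that stratum
    """
    N = sum(N_h.values())
    return sum(N_h[h] * mean(y) for h, y in samples.items()) / N

def stratified_variance(samples, N_h):
    """Estimated variance of the stratified mean, including the finite population correction."""
    N = sum(N_h.values())
    total = 0.0
    for h, y in samples.items():
        n_h = len(y)
        s2_h = variance(y)  # within-stratum sample variance (divisor n_h - 1); needs n_h >= 2
        fpc = (N_h[h] - n_h) / N_h[h]
        total += (N_h[h] / N) ** 2 * fpc * s2_h / n_h
    return total

# Ponds A, B from stratum 1 and D, E from stratum 2
sample = {1: [2, 6], 2: [10, 10]}
sizes = {1: 3, 2: 3}

mu_hat = stratified_mean(sample, sizes)
var_hat = stratified_variance(sample, sizes)
print(mu_hat, round(var_hat, 2))  # 7.0 0.33
```

Because the within-stratum variance uses the divisor \(n_h - 1\), this sketch needs at least two sampled units in every stratum.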
Goal 1: Estimate the mean number of boreal toad egg masses per pond in RMNP.
Goal 2: Use stratification to reduce the sampling variance.
| Pond | egg.mass | strata |
|---|---|---|
| A | 2 | 1 |
| B | 6 | 1 |
| C | 8 | 1 |
| D | 10 | 2 |
| E | 10 | 2 |
| F | 12 | 2 |
How many unique combinations?
| Sample | S1.1 | S1.2 | S2.1 | S2.2 | Mean.S1 | Mean.S2 | Var.S1 | Var.S2 |
|---|---|---|---|---|---|---|---|---|
| 1 | A | B | D | E | 4 | 10 | 8 | 0 |
| 2 | A | B | D | F | 4 | 11 | 8 | 2 |
| 3 | A | B | E | F | 4 | 11 | 8 | 2 |
| 4 | A | C | D | E | 5 | 10 | 18 | 0 |
| 5 | A | C | D | F | 5 | 11 | 18 | 2 |
| 6 | A | C | E | F | 5 | 11 | 18 | 2 |
| 7 | B | C | D | E | 7 | 10 | 2 | 0 |
| 8 | B | C | D | F | 7 | 11 | 2 | 2 |
| 9 | B | C | E | F | 7 | 11 | 2 | 2 |
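The table above can be reproduced by enumerating every way to draw two ponds from each stratum. The sketch below is one way to do that with the standard library; the variable names (`egg_mass`, `strata`, `n_h`, `samples`) are illustrative choices, not taken from the slides.

```python
from itertools import combinations, product
from statistics import mean, variance

egg_mass = {"A": 2, "B": 6, "C": 8, "D": 10, "E": 10, "F": 12}
strata = {1: ["A", "B", "C"], 2: ["D", "E", "F"]}
n_h = {1: 2, 2: 2}  # ponds sampled per stratum

# All ways to pick n_h ponds within each stratum, then every pairing across strata
per_stratum = [list(combinations(strata[h], n_h[h])) for h in sorted(strata)]
samples = list(product(*per_stratum))

print(len(samples))  # (3 choose 2) * (3 choose 2) = 9
for s1, s2 in samples:
    y1 = [egg_mass[p] for p in s1]
    y2 = [egg_mass[p] for p in s2]
    # pond labels, stratum means, and within-stratum sample variances
    print(s1, s2, mean(y1), mean(y2), variance(y1), variance(y2))
```

The rows come out in the same order as the table because `product` varies the stratum 2 selection fastest.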
\[ \hat{\sigma}^2_{\hat{\mu},st} = \sum_{h=1}^{L} \left(\frac{N_h}{N}\right)^2 \frac{N_h-n_h}{N_h}\frac{\hat{\sigma}^2_{h}}{n_h} \]
| S1.1 | S1.2 | S2.1 | S2.2 | Mean.S1 | Mean.S2 | Var.S1 | Var.S2 | Var.mean |
|---|---|---|---|---|---|---|---|---|
| A | B | D | E | 4 | 10 | 8 | 0 | 0.33 |
| A | B | D | F | 4 | 11 | 8 | 2 | 0.42 |
| A | B | E | F | 4 | 11 | 8 | 2 | 0.42 |
| A | C | D | E | 5 | 10 | 18 | 0 | 0.75 |
| A | C | D | F | 5 | 11 | 18 | 2 | 0.83 |
| A | C | E | F | 5 | 11 | 18 | 2 | 0.83 |
| B | C | D | E | 7 | 10 | 2 | 0 | 0.08 |
| B | C | D | F | 7 | 11 | 2 | 2 | 0.17 |
| B | C | E | F | 7 | 11 | 2 | 2 | 0.17 |
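E[Sampling Distribution Variance], stratified sampling = 0.44 (the mean of the Var.mean column above)
E[Sampling Distribution Variance], simple random sampling (for comparison) = 4.26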
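The 0.44 figure for the stratified design can be checked by averaging the variance estimate over all nine possible samples and comparing it with the actual spread of the nine stratified means. This is a hedged sketch with illustrative variable names; it covers only the stratified design with two ponds per stratum.

```python
from itertools import combinations, product
from statistics import mean, pvariance, variance

strata = {1: [2, 6, 8], 2: [10, 10, 12]}  # egg masses by stratum
N = sum(len(y) for y in strata.values())
n_h = 2                                    # ponds sampled per stratum

estimates, var_estimates = [], []
for ys in product(*(combinations(strata[h], n_h) for h in strata)):
    mu_st, v_st = 0.0, 0.0
    for h, y in zip(strata, ys):
        N_h = len(strata[h])
        w = N_h / N
        mu_st += w * mean(y)
        v_st += w ** 2 * (N_h - n_h) / N_h * variance(y) / n_h
    estimates.append(mu_st)
    var_estimates.append(v_st)

expected_var_estimate = mean(var_estimates)  # average of the Var.mean column
true_sampling_var = pvariance(estimates)     # spread of the 9 stratified means
print(round(expected_var_estimate, 2), round(true_sampling_var, 2))  # 0.44 0.44
```

Both quantities equal 4/9, which is what we expect if the variance estimator is unbiased for this design.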
Sample Size per Stratum
Allocate most of our samples to the stratum with the highest variance.
How many possible sample combinations are there?
| Sample | S1.1 | S1.2 | S1.3 | S2.1 | Mean.S1 | Mean.S2 | pop.means |
|---|---|---|---|---|---|---|---|
| 1 | A | B | C | D | 5.333333 | 10 | 7.666667 |
| 2 | A | B | C | E | 5.333333 | 10 | 7.666667 |
| 3 | A | B | C | F | 5.333333 | 12 | 8.666667 |
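The number of possible samples follows from multiplying the within-stratum combination counts. A small sketch, assuming a hypothetical helper named `n_stratified_samples`:

```python
from math import comb, prod

def n_stratified_samples(N_h, n_h):
    """Number of distinct stratified samples: product over strata of C(N_h, n_h)."""
    return prod(comb(N, n) for N, n in zip(N_h, n_h))

print(n_stratified_samples([3, 3], [2, 2]))  # two ponds from each stratum -> 9
print(n_stratified_samples([3, 3], [3, 1]))  # all of stratum 1, one pond from stratum 2 -> 3
```

With all three ponds of stratum 1 in every sample, the only thing that varies is which stratum 2 pond is drawn, hence three samples.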
Reverse the situation: allocate more samples to the least variable stratum.
| Sample | S1.1 | S2.1 | S2.2 | S2.3 | Mean.S1 | Mean.S2 | pop.means |
|---|---|---|---|---|---|---|---|
| 1 | A | D | E | F | 2 | 10.66667 | 6.333333 |
| 2 | B | D | E | F | 6 | 10.66667 | 8.333333 |
| 3 | C | D | E | F | 8 | 10.66667 | 9.333333 |
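To see how the allocation affects precision, the sampling distribution of the stratified mean can be enumerated for each allocation of four ponds. The sketch below is an illustration under the assumption that every possible sample is equally likely; `sampling_distribution` is a hypothetical helper name.

```python
from itertools import combinations, product
from statistics import mean, pvariance

strata = [[2, 6, 8], [10, 10, 12]]  # egg masses in stratum 1 and stratum 2
N = sum(len(y) for y in strata)

def sampling_distribution(allocation):
    """Stratified mean for every possible sample under a given allocation (n_1, n_2)."""
    per_stratum = [combinations(y, n) for y, n in zip(strata, allocation)]
    return [
        sum(len(y) / N * mean(s) for y, s in zip(strata, sample))
        for sample in product(*per_stratum)
    ]

for allocation in [(3, 1), (2, 2), (1, 3)]:
    est = sampling_distribution(allocation)
    # allocation, expected value of the estimator, variance of the estimator
    print(allocation, round(mean(est), 2), round(pvariance(est), 2))
```

In this example every allocation averages to 8, but the variance of the estimator is about 0.22 when three ponds come from the more variable stratum 1, 0.44 for the even split, and about 1.56 when three ponds come from stratum 2.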
What if we ignored the stratification and used the SRS sample mean estimator?
| Sample 1 | Sample 2 | Sample 3 |
|---|---|---|
| 2 | 6 | 8 |
| 10 | 10 | 10 |
| 10 | 10 | 10 |
| 12 | 12 | 12 |
The mean of the three SRS sample means (8.5, 9.5, 10) is 9.3333333, but the population mean is 8.
\(E[\hat{\mu}_{SRS}] \neq \mu\)
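The bias can be checked directly by averaging the unweighted sample means over the three possible samples and comparing with the population mean. This is a small sketch with illustrative variable names; it assumes one pond from stratum 1 and all three ponds from stratum 2, with the stratum 1 pond listed first in each sample.

```python
from statistics import mean

population = [2, 6, 8, 10, 10, 12]
# The three possible samples: one pond from stratum 1, all three from stratum 2
samples = [[2, 10, 10, 12], [6, 10, 10, 12], [8, 10, 10, 12]]

srs_means = [mean(s) for s in samples]  # unweighted sample means
# Stratified means: each stratum holds half the ponds, so N_h / N = 1/2
stratified = [0.5 * s[0] + 0.5 * mean(s[1:]) for s in samples]

print(mean(population))            # population mean
print(srs_means, mean(srs_means))  # expected value of the unweighted mean
print(mean(stratified))            # expected value of the stratified mean
```

The unweighted mean averages to about 9.33 because stratum 2 ponds make up three quarters of every sample but only half of the population; the stratified mean averages to 8.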
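\[ \hat{\mu}_{st} = \frac{1}{L}\sum_{h=1}^L \sum_{i=1}^{n_h} y_{hi}\times \text{weight}_h \]
Here \(\text{weight}_h = 1/n_h\). Dividing by \(L\) works because the strata are equal in size, so \(N_h/N = 1/L\).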
| Stratum | Sample 1 | Sample 2 | Sample 3 | Weight |
|---|---|---|---|---|
| 1 | 2 | 6 | 8 | 1.0000000 |
| 2 | 10 | 10 | 10 | 0.3333333 |
| 2 | 10 | 10 | 10 | 0.3333333 |
| 2 | 12 | 12 | 12 | 0.3333333 |
Sample 1
\[ \hat{\mu}_{st} = \frac{\left(2\times1\right) + \left(10\times1/3 + 10\times1/3 + 12\times1/3 \right)}{2} = \frac{2 + 10.67}{2} \approx 6.33 \]
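The same calculation for Sample 1 can be written as a short weighted sum. This sketch assumes the weight rule \(1/n_h\), which reproduces the Weight column (1 and 1/3), and equal-sized strata; the variable names are illustrative.

```python
# Sample 1: one pond from stratum 1 and three ponds from stratum 2
sample = {1: [2], 2: [10, 10, 12]}

# Assumed weight rule: weight_h = 1 / n_h, matching the Weight column (1 and 1/3)
weights = {h: 1 / len(y) for h, y in sample.items()}

L = len(sample)  # number of strata
# Weighted sum of every sampled value, then divide by the number of strata
mu_st = sum(weights[h] * y_hi for h, y in sample.items() for y_hi in y) / L
print(weights, round(mu_st, 2))
```

If the strata were not the same size, the division by \(L\) would need to be replaced by the stratum proportions \(N_h/N\).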
Quota Sampling / Observer Freedom