In Figure 2.4, the locations of objects (e.g., trees, mines, dwellings) in a study region are given by the centers of “+” symbols. The goal is to estimate the number of objects in the study region.
(a) A random sample without replacement of n = 10 units has been selected from the N = 100 units in the population. Units selected are indicated by shading in Figure 2.4. List the sample data. Use the sample to estimate the number of objects in the figure. Estimate the variance of your estimator.
When the center of a “+” symbol lay on or across the thick black border of a shaded cell, I counted that object as part of the sample.
y <- c(0, 1, 1, 1, 1, 3, 1, 0, 4, 0)
# Total number of cells
N <- 100
# The estimate of the number of objects in the figure.
N*mean(y)
## [1] 120
# Sample Size
n <- 10
# Variance of the estimator (tau) - first option
var_mu.hat = (N-n)/N*var(y)/n
N^2*var_mu.hat
## [1] 1560
# Variance of the estimator (tau) - second option
N*(N-n)*(var(y)/n)
## [1] 1560
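The two options above are the same formula written two ways, since \(N^2 \cdot \frac{N-n}{N} \cdot \frac{s^2}{n} = N(N-n)\frac{s^2}{n}\). As a minimal sketch, the calculation can be wrapped in a small helper. The function name `srs_total` and the object `res_a` are hypothetical and not part of the original solution.

```r
# Hypothetical helper (an assumption, not from the original solution):
# expansion estimate of a population total under simple random sampling
# without replacement, with the estimated variance of that estimator.
srs_total <- function(y, N) {
  n <- length(y)                            # sample size
  tau_hat <- N * mean(y)                    # estimate of the total
  var_tau_hat <- N * (N - n) * var(y) / n   # includes the finite population correction
  c(tau_hat = tau_hat,
    var_tau_hat = var_tau_hat,
    se = sqrt(var_tau_hat))                 # standard error of the total
}

# Sample data from part (a)
res_a <- srs_total(c(0, 1, 1, 1, 1, 3, 1, 0, 4, 0), N = 100)
res_a
```

With the part (a) data this reproduces the estimate of 120 and the variance of 1560; the standard error is the square root of that variance.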
(b) Repeat part (a), selecting another sample of size 10 by simple random sampling (without replacement) and making new estimates. Indicate the positions of the units of the samples on the sketch.
# locations of new sample
newsample <- sample(1:100,10, replace = FALSE)
# object counts observed in the newly selected units
y <- c(1, 0, 1, 1, 3, 1, 1, 0, 2, 2)
# The estimate of the number of objects in the figure.
N*mean(y)
## [1] 120
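# Estimated variance of the sample mean (mu.hat)
var_mu.hat <- ((N-n)/N)*(var(y)/n)
var_mu.hat
## [1] 0.076
# Estimated variance of the estimator of the total (tau.hat)
N^2*var_mu.hat
## [1] 760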
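To mark the selected units on the sketch, the unit numbers have to be turned into grid positions. The sketch below assumes the 100 units form a 10 x 10 grid numbered left to right, top to bottom, starting in the upper left-hand corner. That layout is an assumption about the figure, and the seed and the names `units` and `positions` are only illustrative.

```r
set.seed(1)  # arbitrary seed, used only so the draw can be reproduced
units <- sort(sample(1:100, 10, replace = FALSE))

# Assumed layout: 10 x 10 grid, units numbered row by row from the upper left
positions <- data.frame(
  unit = units,
  row  = (units - 1) %/% 10 + 1,   # grid row, 1 = top
  col  = (units - 1) %% 10 + 1     # grid column, 1 = left
)
positions
```

If the figure numbers its cells differently (for example column by column), only the two index formulas need to change.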
Give the inclusion probability for the unit in the upper left-hand corner. How many possible samples are there? What is the probability of selecting the sample you obtained in part (a)?
# Inclusion probability for a given cell for n = 1
1/N
## [1] 0.01
# The inclusion probability for a given cell for n =10
n/N
## [1] 0.1
# the probability of selecting the sample obtained in a
1/choose(100,10)
## [1] 5.776904e-14
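The inclusion probability n/N can also be checked by counting: of the choose(N, n) equally likely samples, choose(N-1, n-1) contain any one fixed unit. The short check below also prints the number of possible samples, which is roughly 1.73 x 10^13. The names `n_samples` and `n_with_unit` are illustrative.

```r
N <- 100
n <- 10

# Number of possible samples of size n
n_samples <- choose(N, n)
n_samples

# Number of samples that contain one fixed unit (e.g., the upper-left cell)
n_with_unit <- choose(N - 1, n - 1)

# Inclusion probability by counting; agrees with n/N
n_with_unit / n_samples
```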
Consider a small population of N = 5 units, labeled 1, 2, 3, 4, 5, with respective y-values 3, 1, 0, 1, 5. Consider a simple random sampling design with a sample size n = 3. For your convenience, several parts of the following may be combined into a single table.
Give the values of the population parameters \(\mu\), \(\tau\), and \(\sigma^2\). List every possible sample of size n = 3. For each sample, what is the probability that it is the one selected?
y = c(3, 1, 0, 1, 5)
N = length(y)
mu = mean(y)
mu
## [1] 2
tau = N*mu
tau
## [1] 10
sigma2 = var(y)
sigma2
## [1] 4
# Combinations
all.possible.3=utils::combn(y, 3)
all.possible.3
##      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,]    3    3    3    3    3    3    1    1    1     0
## [2,]    1    1    1    0    0    1    0    0    1     1
## [3,]    0    1    5    1    5    5    1    5    5     5
# probability of a given sample
1/choose(N,3)
## [1] 0.1
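Because units 2 and 4 share the value y = 1, listing the samples by y-value alone makes some columns look identical (for example, columns 3 and 6 above). A minimal sketch that enumerates the samples by unit label and then looks up the y-values keeps them distinguishable. The names `labels.3` and `values.3` are illustrative.

```r
y <- c(3, 1, 0, 1, 5)

# Each column holds the unit labels of one possible sample of size 3
labels.3 <- utils::combn(seq_along(y), 3)
labels.3

# Look up the y-values for each sample; same matrix as combn(y, 3)
values.3 <- apply(labels.3, 2, function(s) y[s])
values.3
```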
For each sample, compute the sample mean \(\bar{y}\) and the sample median \(m\). Demonstrate that the sample mean is unbiased for the population mean and determine whether the sample median is unbiased for the population median.
sample.means = apply(all.possible.3,2,mean)
sample.medians = apply(all.possible.3,2,median)
# Unbiasedness of sample mean - Yes
mean(sample.means)-mu
## [1] 0
# Unbiasedness of sample median - No (compare with the population median, not mu)
mean(sample.medians)-median(y)
## [1] 0.6
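As the exercise suggests, the pieces can be combined into a single table. The sketch below is one possible layout, with hypothetical column names. It lists each sample by unit label, its y-values, its selection probability, and its mean and median, then takes the expectation of each statistic over the design.

```r
y <- c(3, 1, 0, 1, 5)
samples <- utils::combn(seq_along(y), 3)   # unit labels, one sample per column

tab <- data.frame(
  units  = apply(samples, 2, paste, collapse = ","),
  values = apply(samples, 2, function(s) paste(y[s], collapse = ",")),
  prob   = 1 / ncol(samples),                          # each sample equally likely
  ybar   = apply(samples, 2, function(s) mean(y[s])),
  m      = apply(samples, 2, function(s) median(y[s]))
)
tab

# Expected values over all possible samples
sum(tab$prob * tab$ybar)   # compare with the population mean, mean(y)
sum(tab$prob * tab$m)      # compare with the population median, median(y)
```

The expected sample mean equals the population mean of 2, while the expected sample median is 1.6 against a population median of 1.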