We will estimate the annual survival probability of European Dipper using capture-recapture data via a hierarchical Bayesian version of the cormack-jolly-seber model.
Load the necessary packages.
library(stringr)
library(rjags)
library(coda)
library(runjags)
Next, read in the data. The first column is the capture-history. The second column indicates males (0) and females (1). The third column indicate females (1) and males (0). The fourth column are just semi-colons…it’s MARK thing.
#Load the Data
dipper = read.table("DIPPER.INP", skip = 2,sep="", colClasses = "character")
head(dipper)
## V1 V2 V3 V4
## 1 1111110 1 0 ;
## 2 1111100 0 1 ;
## 3 1111000 1 0 ;
## 4 1111000 0 1 ;
## 5 1101110 0 1 ;
## 6 1100000 1 0 ;
#The number of individuals marked
nrow(dipper)
## [1] 294
Next, lets manipulate the inputted data to create capture-histories in columns, which we will use to fit our model.
# split column 1 into columns
CH = matrix(as.integer(str_split_fixed(dipper[,1],"",7)),nrow=nrow(dipper))
head(CH)
## [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## [1,] 1 1 1 1 1 1 0
## [2,] 1 1 1 1 1 0 0
## [3,] 1 1 1 1 0 0 0
## [4,] 1 1 1 1 0 0 0
## [5,] 1 1 0 1 1 1 0
## [6,] 1 1 0 0 0 0 0
# Sex variable: female = 1; male = 0
sex = as.integer(dipper$V3)
We can see that there are 7 columns in the capture-history, 1 for each year. So, this study has 7 years of data total.
Next, we need to setup two specialized functions. ‘get.first’ will identify the first occasion each individual was initially captured. The next function will use the capture history to create logical designations of when we know individuals are alive to be used to initialize the state parameter z.
# Create vector with occasion of marking for each individual (row of CH)
get.first <- function(x) min(which(x!=0))
f <- apply(CH, 1, get.first)
z.init <- matrix(NA, nrow = nrow(CH), ncol = ncol(CH))
for(i in 1:dim(z.init)[1]){
z.init[i, f[i]:dim(z.init)[2]] <- 1
z.init[i,f[i]] <- NA
}
Fit a CJS survival model that includes a sex effect on survival probability. Adapt the jags model code in the other file and implementation code that is above.
# MCMC settings
ni <- 15000 # number of iterations
nt <- 2 # number of iterations to thin by
nb <- 5000 # number of iterations to burn (to toss out initially)
na <- 2000 # number of iterations to use to adapt to sample efficiently
nc <- 3 # number of chains
# Parameters monitored
parameters <- c("beta0","beta1", "p","male.phi","female.phi")
inits <- function(){list(beta0 = rnorm(1),
beta1 = rnorm(1),
p = runif(1, 0, 1),
z = z.init)}
# Bundle data
jags.data <- list(y = CH, f = f, nind = dim(CH)[1], n.occasions = dim(CH)[2],
sex=sex)
# Setup the Model
jm <- jags.model(file="cjs2.r", data=jags.data,n.chains=nc,n.adapt=na,inits=inits)
## Compiling model graph
## Resolving undeclared variables
## Allocating nodes
## Graph information:
## Observed stochastic nodes: 848
## Unobserved stochastic nodes: 851
## Total graph size: 4950
##
## Initializing model
# Update the model with the burnin
update(jm, n.iter=nb)
#Fit the model
post2 <- coda.samples(jm, variable.names=parameters, n.iter=ni, thin=nt)
#Look at the results
summary(post2)
##
## Iterations = 7002:22000
## Thinning interval = 2
## Number of chains = 3
## Sample size per chain = 7500
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## beta0 0.28591 0.14296 0.0009531 0.0016464
## beta1 -0.07254 0.19516 0.0013010 0.0021768
## female.phi 0.55289 0.03423 0.0002282 0.0002602
## male.phi 0.57064 0.03484 0.0002323 0.0004010
## p 0.89542 0.02891 0.0001928 0.0003324
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## beta0 0.009695 0.1881 0.28395 0.38121 0.5703
## beta1 -0.455818 -0.2034 -0.07401 0.05912 0.3061
## female.phi 0.486004 0.5294 0.55280 0.57619 0.6195
## male.phi 0.502424 0.5469 0.57051 0.59417 0.6388
## p 0.833283 0.8771 0.89748 0.91611 0.9458
The summary output shows us the quantiles of each model paramter that
was defined in the variable parameters
.
Check that parameters have converged. Show evidence of this by plotting and calculating the gelman-rubin convergence diagnostic, i.e, function gelman.diag.
plot(post2)
gelman.diag(post2)
## Potential scale reduction factors:
##
## Point est. Upper C.I.
## beta0 1 1
## beta1 1 1
## female.phi 1 1
## male.phi 1 1
## p 1 1
##
## Multivariate psrf
##
## 1
We see the diagnostic statistic (gelman-rubin statistics; known as \(\hat{R}\)) are at 1, indicating no sign of lack of convergence. Also, the posterior distribution traceplots look like fuzzy caterpillars and are overlapping, also demonstrating no signs of convergence issues.
Since the parameters have converged, we can combine the chains.
post.combined = runjags::combine.mcmc(post2)
head(post.combined)
## Markov Chain Monte Carlo (MCMC) output:
## Start = 1
## End = 13
## Thinning interval = 2
## beta0 beta1 female.phi male.phi p
## 1 0.57102956 -0.44287437 0.5319950 0.6390007 0.8578579
## 3 0.44030038 -0.12304907 0.5786542 0.6083306 0.8535086
## 5 0.19107304 -0.09355509 0.5243602 0.5476235 0.9525633
## 7 0.16692649 0.06846217 0.5585769 0.5416350 0.9351104
## 9 0.03426400 0.23970981 0.5680682 0.5085652 0.8783200
## 11 0.07074674 0.22307828 0.5729323 0.5176793 0.8846872
## 13 0.15026892 -0.08623061 0.5160041 0.5374967 0.8774565
# This is equivalent to using the do.call function to rbind each matrix of posterior samples across list elements
post.combined <- do.call("rbind", post2)
Since we setup derived parameters in our JAGS model, we simply need to grab these posterior distributions for plotting.
# Female and male survival probability
# Plot posteriors
plot(density(post.combined[,3]),lwd=3,col=1,"Annual Survival")
lines(density(post.combined[,4]),lwd=3,col=2)
legend("topright",lwd=3,col=c(1,2),legend=c("Female","Male"))
Alternativel, if we didn’t specify the derived parameters in JAGS, we can use the estimated posterior parameters of survival (\(\beta_{0}\) and \(\beta_{1}\)) to derive the posterior distributions for the probability of survival for males and females. These are exactly equivalent to the posteriors from above.
beta0 = post.combined[,1]
beta1 = post.combined[,2]
# Posteriors of survival of males and females
male.survival = plogis(beta0)
female.survival = plogis(beta0+beta1)
# Plot posteriors
plot(density(female.survival),lwd=3,col=1,"Annual Survival")
lines(density(male.survival),lwd=3,col=2)
legend("topright",lwd=3,col=c(1,2),legend=c("Female","Male"))
Here we see the posterior distributions of male and female annual survival. They are similar, indicating only a possible small differences in survival between the sexes.
Use Monte Carlo integration on the sex effect parameter to estimate the probability that the male survival is less than female survival.
# The probability of a sex effect- that female survival is less than male survival. We can determine this by how many samples are less than zero.
# But, to evaluate weather male survival is less than female survival, we want to know if beta1 is positive, so...
length(which(beta1>0))/length(beta1)
## [1] 0.3549333
The probability that \(\beta_1 > 0\) indicates there is no evidence of a difference. A high probability, values near 1, would indicate a statistical clear negative effect. However, if we saw probabilities near 0, this would indicate a clear statistical difference between female and male survival probability.