%config InlineBackend.figure_format = 'retina'
# libraries
from pymc.distributions.dist_math import binomln, logpow
from scipy.stats import binom
import arviz as az
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pymc as pm
import seaborn as sns
# plotting parameters
plt.style.use('fivethirtyeight')
plt.rcParams['axes.facecolor'] = 'white'
plt.rcParams['figure.facecolor'] = 'white'
plt.rcParams['axes.spines.left'] = False
plt.rcParams['axes.spines.right'] = False
plt.rcParams['axes.spines.top'] = False
plt.rcParams['axes.spines.bottom'] = False
sns.set_palette("tab10")
# hyperparameters
SEED = 808
RNG = np.random.default_rng(SEED)
Closed capture-recapture
In this notebook, I explore fitting closed-population capture-recapture models in PyMC. Capture-recapture, at least the Lincoln-Petersen estimator, has been around for almost 100 years. Since then, countless varieties of capture-recapture models have been developed for closed populations (Otis et al. 1978).
The basic steps in capture-recapture are: capture several individuals from the population of interest (e.g., via trapping), mark them, then release them. We repeat this process several times, each time noting which individuals we recapture.
| Individual | \(t_1\) | \(t_2\) | \(t_3\) | \(t_4\) |
|---|---|---|---|---|
| 001 | 1 | 1 | 0 | 1 |
| 002 | 0 | 1 | 1 | 1 |
| 003 | 0 | 0 | 1 | 1 |
This produces a capture history for each individual, which allows us to estimate the probability of capture and the number of individuals in the population \(N\).
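To make this concrete, here is a small, purely illustrative simulation of capture histories under a constant capture probability. The values of N_true, T_true, and p_true are made up for the example; in a real dataset, only the animals captured at least once would appear.
# purely illustrative: simulate capture histories for a hypothetical closed population
N_true, T_true, p_true = 200, 4, 0.3
full_history = RNG.binomial(1, p_true, size=(N_true, T_true))
# in practice, we only observe the animals that were captured at least once
observed_history = full_history[full_history.sum(axis=1) > 0]
observed_history.shape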
Model \(M_0\)
I explore fitting the simplest closed capture-recapture model, Model \(M_0,\) through parameter-expanded data augmentation (PX-DA, Royle and Dorazio 2008). The idea with PX-DA is to augment the capture histories with \(M-n\) all-zero capture histories, where \(M\) is a hyperparameter that should be much greater than the true population size \(N,\) and \(n\) is the total number of individuals captured during the study. This allows us to model the data with a zero-inflated binomial distribution (see below).
def augment_history(history, M):
    '''Augment a capture history with all-zero histories.'''
    animals_captured, T = history.shape
    # create M - n all-zero histories
    zero_history_count = M - animals_captured
    zero_history = np.zeros((zero_history_count, T))
    # tack those on to the capture history
    augmented = np.vstack((history, zero_history))
    return augmented
To demonstrate this approach, I use the salamander dataset from Bailey, Simons, and Pollock (2004), as presented in Hooten and Hefley (2019), Chapter 24. These data were collected on two salamander species, the red-cheeked salamander (Plethodon jordani) and the pygmy salamander (Desmognathus wrighti), in Great Smoky Mountains National Park. The salamanders were counted in 15 m by 15 m square plots. There were \(n=92\) red-cheeked and \(n=132\) pygmy salamanders captured during the course of the survey. In this case, I augment each species' history by setting \(M=1500.\)
def get_histories():
    '''Read the salamander data and split the capture histories by species.'''
    # read in salamander data
    sal_data = pd.read_csv('sal_data.csv')
    # labels for the capture history columns
    col_labs = [f'y{t}' for t in range(1, 5)]
    # subset each species before augmenting
    is_pyg = sal_data.spp == 1
    is_red = sal_data.spp == 0
    pyg = sal_data.loc[is_pyg, col_labs].to_numpy()
    red = sal_data.loc[is_red, col_labs].to_numpy()
    return {'pyg': pyg, 'red': red}
def augment_histories(histories, M):
    '''Augment each species' history with all-zero histories, then recombine.'''
    pyg_augmented = augment_history(histories['pyg'], M=M)
    red_augmented = augment_history(histories['red'], M=M)
    # recombine into one history
    history = np.concatenate((pyg_augmented, red_augmented))
    return history
histories = get_histories()
n_red, T = histories['red'].shape
n_pyg, T = histories['pyg'].shape
# summarize into binomial data
M = 1500
history_augmented = augment_histories(histories, M=M)
history_summarized = history_augmented.sum(axis=1)
For this model, I use the pm.ZeroInflatedBinomial class, just as I did in the occupancy notebook. That said, the parameters mean something different here. First, \(p\) represents the probability of capturing a given individual on a given survey occasion. Second, \(\psi\) represents a mysterious entity known as the inclusion probability: the probability that an individual from the hypothetical superpopulation \(M\) is included in the population of interest \(N.\) We can then simulate the posterior distribution of \(N\) using \(M\) and the posterior draws of \(\psi\) and \(p.\)
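Written out, the summarized count \(y_i\) (the number of occasions, out of \(T,\) on which augmented individual \(i\) was captured) follows a zero-inflated binomial distribution,
\[
P(y_i \mid p, \psi) = \psi \, \mathrm{Binomial}(y_i \mid T, p) + (1 - \psi) \, I(y_i = 0),
\]
where the extra point mass at zero absorbs the augmented pseudo-individuals that are not part of the population.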
In this example, I combine the two species into one pm.Model object, making use of coords. That said, the parameters for each species are treated as independent. In other words, this is a “no-pooling” model.
# index for each species
species_idx = np.repeat([0, 1], M)
# coordinates identifying the parameters for each species
coords = {'species': ['pygmy', 'red_cheeked']}
with pm.Model(coords=coords) as M0:
    # priors for the capture and inclusion probabilities
    psi = pm.Beta('psi', 0.001, 1, dims='species')
    p = pm.Uniform('p', 0, 1, dims='species')
    # likelihood for the summarized data
    pm.ZeroInflatedBinomial(
        'history',
        p=p[species_idx],
        psi=psi[species_idx],
        n=T,
        observed=history_summarized
    )
pm.model_to_graphviz(M0)
MarginalMixture refers to the zero-inflated binomial distribution.
with M0:
    M0_idata = pm.sample()
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [psi, p]
Sampling 4 chains for 1_000 tune and 1_000 draw iterations (4_000 + 4_000 draws total) took 7 seconds.
ax = az.plot_trace(M0_idata, figsize=(8,4), var_names=['psi', 'p']);
plt.subplots_adjust(hspace=0.4)
For faster sampling, it's better to fit each species in its own model. On my machine, the single-species models finish sampling in 2-3 seconds, compared to 15-20 seconds for the two-species model. That said, the two-species model is somewhat more convenient.
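For reference, here is a minimal sketch of what a single-species version of \(M_0\) could look like, using the pygmy salamanders alone (the model name M0_pyg is mine):
# sketch: fit M0 to the pygmy salamanders only
pyg_summarized = augment_history(histories['pyg'], M=M).sum(axis=1)
with pm.Model() as M0_pyg:
    psi = pm.Beta('psi', 0.001, 1)
    p = pm.Uniform('p', 0, 1)
    pm.ZeroInflatedBinomial('history', p=p, psi=psi, n=T, observed=pyg_summarized)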
Of course, the trace plots lack our true parameter of interest: the population size \(N.\) We can simulate the posterior of \(N\) as a derived quantity, using \(M,\) \(n,\) and the posterior draws of \(\psi\) and \(p.\)
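An augmented individual has an all-zero history either because it is not part of the population (probability \(1-\psi\)) or because it is part of the population but evaded capture on all \(T\) occasions. By Bayes' rule, the probability that such an individual is actually in the population is
\[
\Pr(\text{in population} \mid \text{all-zero history}) = \frac{\psi (1 - p)^T}{\psi (1 - p)^T + (1 - \psi)}.
\]
Drawing a binomial sample of size \(M - n\) with this probability, then adding \(n,\) yields a posterior draw of \(N.\)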
# az.extract flattens the chains
posterior = az.extract(M0_idata)
psi_samps = posterior.psi.values
p_samps = posterior.p.values
# joint probability of being in the population but never being detected
p_if_present = psi_samps * binom.pmf(0, n=T, p=p_samps)
# total probability of an all-zero capture history
p_total = p_if_present + (1 - psi_samps)
# simulate the number of undetected animals in each population
number_undetected_pyg = RNG.binomial(M - n_pyg, p_if_present[0] / p_total[0])
number_undetected_red = RNG.binomial(M - n_red, p_if_present[1] / p_total[1])
# simulate N
N_pyg = n_pyg + number_undetected_pyg
N_red = n_red + number_undetected_red
Below, I plot the posterior distributions of \(N\) for both species, adding the estimates from Hooten and Hefley (2019), Chapter 24. Note, however, that they used a different prior for \(\psi.\)
N_hooten = [229.6, 450.9]
fig, ax = plt.subplots(figsize=(6,4))
ax.hist(N_pyg, color='C0', edgecolor='white', alpha=0.9, bins=30, label='Pygmy')
ax.hist(N_red, color='C1', edgecolor='white', alpha=0.9, bins=30, label='Red-cheeked')
ax.axvline(N_hooten[0], linestyle='--', color='black', linewidth=2)
ax.axvline(N_hooten[1], linestyle='--', color='black', linewidth=2)
ax.set_title('Posterior distributions of $N$')
ax.set_ylabel('Number of samples')
ax.legend()
plt.show()
We might expect estimates of capture probability \(p\) and the abundance \(N\) to be somewhat correlated. We can explore this relationship visually by plotting the posterior draws.
# create the plot
fig, ax = plt.subplots(1, 1, figsize=(4, 4))
# add the scatter for each species
labs = ['Pygmy', 'Red-cheeked']
ax.scatter(p_samps[0], N_pyg, s=10, alpha=0.2, label=labs[0])
ax.scatter(p_samps[1], N_red, s=10, alpha=0.2, label=labs[1])
# this removes the opacity for the dots in the legend
leg = ax.legend()
for lh in leg.legend_handles:
    lh.set(sizes=[25], alpha=[1])
# update aesthetics
ax.spines.right.set_visible(False)
ax.spines.top.set_visible(False)
ax.set_ylabel(r'$N$')
ax.set_xlabel(r'$p$')
ax.set_title('Posterior draws')
plt.show()
Model \(M_b\)
Next, I fit model \(M_b,\) which accounts for the possibility that the capture probability changes after an animal is first caught. This could reflect trap happiness, whereby animals become more likely to be trapped after their first capture, or, conversely, subsequent trap avoidance.
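Concretely, the capture probability for individual \(i\) on occasion \(t\) is modeled on the logit scale with a behavioral covariate,
\[
\mathrm{logit}(p_{it}) = \alpha_0 + \alpha_1 x_{it},
\]
where \(x_{it}=1\) if individual \(i\) was captured before occasion \(t\) and \(x_{it}=0\) otherwise. A positive \(\alpha_1\) indicates trap happiness; a negative \(\alpha_1\) indicates trap avoidance.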
Mirroring (Royle and Dorazio 2008, chap. 5), I fit this model to the Microtus dataset reported in (Williams, Nichols, and Conroy 2002, 525). This version of the dataset includes encounter histories of \(n=56\) adult males that were captured on \(T=5\) consecutive days.
# read in the microtus data
microtus = np.loadtxt('microtus.data.txt').astype(int)
# the last column is not relevant
micro_hist = microtus[:,:-1]
n, T = micro_hist.shape
# augment with all zero histories
M = 100
micro_augmented = augment_history(micro_hist, M=M)
# note the occasion when each individual was first seen
first_seen = (micro_hist != 0).argmax(axis=1)
# create the covariate for the behavior effect
behavior_effect = np.zeros((M, T))
for i, f in enumerate(first_seen):
    behavior_effect[i, (f + 1):] = 1
# covariate matrix
x_int = np.ones((M, T))
X = np.stack((x_int, behavior_effect), axis=2)
I use the same custom distribution as in the occupancy notebook, a zero-inflated binomial, except that here the zero-inflation happens at the row level, i.e., over an individual's entire capture history.
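In this case, the likelihood for individual \(i\)'s full history \(y_i = (y_{i1}, \ldots, y_{iT})\) is
\[
P(y_i \mid p_i, \psi) = \psi \prod_{t=1}^{T} p_{it}^{y_{it}} (1 - p_{it})^{1 - y_{it}} + (1 - \psi) \, I\!\left(\sum_{t=1}^{T} y_{it} = 0\right),
\]
which is what the logp function below computes on the log scale.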
def logp(value, n, p, psi):
    # binomial log-likelihood for each individual and occasion
    binom = binomln(n, value) + logpow(p, value) + logpow(1 - p, n - value)
    # sum over occasions for each row's log-likelihood, then exponentiate
    bin_sum = pm.math.sum(binom, axis=1)
    bin_exp = pm.math.exp(bin_sum)
    # zero-inflate at the row (individual) level
    res = pm.math.switch(
        value.sum(axis=1) > 0,
        bin_exp * psi,
        bin_exp * psi + (1 - psi)
    )
    return pm.math.log(res)
coords = {'alpha_coeffs': ['Intercept', 'B_Response']}
with pm.Model(coords=coords) as mb:
    # priors for the capture and inclusion probabilities
    psi = pm.Beta('psi', 0.001, 1)
    Alpha = pm.Normal('Alpha', 0, 2, dims='alpha_coeffs')
    nu = pm.math.dot(X, Alpha)
    p = pm.Deterministic('p', pm.math.invlogit(nu))
    # likelihood
    pm.CustomDist(
        'y',
        1,
        p,
        psi,
        logp=logp,
        observed=micro_augmented
    )
pm.model_to_graphviz(mb)
with mb:
    mb_idata = pm.sample()
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [psi, Alpha]
Sampling 4 chains for 1_000 tune and 1_000 draw iterations (4_000 + 4_000 draws total) took 1 seconds.
az.summary(mb_idata, var_names=['Alpha', 'psi'])
| | mean | sd | hdi_3% | hdi_97% | mcse_mean | mcse_sd | ess_bulk | ess_tail | r_hat |
|---|---|---|---|---|---|---|---|---|---|
| Alpha[Intercept] | 0.104 | 0.248 | -0.348 | 0.569 | 0.006 | 0.005 | 1762.0 | 1564.0 | 1.0 |
| Alpha[B_Response] | 0.625 | 0.290 | 0.071 | 1.152 | 0.007 | 0.006 | 1691.0 | 1678.0 | 1.0 |
| psi | 0.569 | 0.053 | 0.472 | 0.669 | 0.001 | 0.001 | 2214.0 | 1750.0 | 1.0 |
az.plot_forest(mb_idata, var_names=['Alpha'], combined=True, ess=True, figsize=(6,2));
The forest plot indicates some evidence of a weak, positive behavioral response. Note, however, that the 94% credible intervals for the baseline capture coefficient and the behavioral effect overlap considerably.
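To put the coefficients on the probability scale, we can back-transform the posterior draws with the inverse-logit. This is a quick sketch of my own; expit is scipy's inverse-logit function.
from scipy.special import expit
# posterior draws of the coefficients, flattened across chains
alpha_samps = az.extract(mb_idata).Alpha.values
p_first_capture = expit(alpha_samps[0])  # capture probability before first capture
p_recapture = expit(alpha_samps[0] + alpha_samps[1])  # capture probability after first capture
print(p_first_capture.mean(), p_recapture.mean())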
# simulate draws of N
psi_samps = az.extract(mb_idata).psi.values
p_samps = az.extract(mb_idata).p.values
# joint probability of being in the population but never being detected
p_if_present = psi_samps * binom.pmf(0, n=T, p=p_samps)
# total probability of an all-zero capture history
p_total = p_if_present + (1 - psi_samps)
# simulate the number of undetected animals
number_undetected = RNG.binomial(M - n, p_if_present / p_total)
# simulate N
N = n + number_undetected
# create the plot
fig, ax = plt.subplots(figsize=(4, 3))
# bar plot looks a little better than a histogram here imo
N_values, N_counts = np.unique(N, return_counts=True)
ax.bar(N_values, N_counts)
ax.annotate(
    'Number\ndetected $n$',
    ha='left',
    xy=(N_values[0], N_counts[0]),
    color='black',
    xytext=(n + 5, 6.2e5),
    arrowprops=dict(arrowstyle="->", color='black', linewidth=1,
                    connectionstyle="angle3,angleA=90,angleB=0")
)
ax.ticklabel_format(axis='y', style='sci', scilimits=(0,0))
ax.set_ylabel('Number of samples')
ax.set_title('Posterior of $N$')
plt.show()
Most of the posterior density of \(N\) is at \(n,\) the number of animals detected. The discovery curve hints at why this may be the case. It seems that all the voles in the population may have been captured by the end of the study.
# cumulative number of captures, including recaptures
total_seen = micro_hist.sum(axis=0).cumsum()
total_seen = np.insert(total_seen, 0, 0)
# cumulative number of unique voles captured
first_seen = (micro_hist != 0).argmax(axis=1)
newbies = [sum(first_seen == t) for t in range(T)]
total_newbies = np.cumsum(newbies)
total_newbies = np.insert(total_newbies, 0, 0)
fig, ax = plt.subplots(figsize=(5, 3.5))
ax.plot(total_seen, total_newbies)
ax.fill_between(total_seen, total_newbies, alpha=0.2)
ax.set_title('Discovery curve')
ax.set_xlabel('Total voles captured')
ax.set_ylabel('Unique voles captured')
plt.show()
We can also look at the behavioral effect by visualizing the posterior distributions of \(p.\) As we can see, the voles who have been captured before are more likely to be captured again.
first_detection = X[:, :, 1] == 0
p_first_detection = p_samps[first_detection].flatten()
p_seen_before = p_samps[~first_detection].flatten()
fig, ax = plt.subplots(figsize=(5, 3.5))
az.plot_dist(p_first_detection, ax=ax, label='First detection', color='C0')
az.plot_dist(p_seen_before, ax=ax, label='Seen before', color='C1')
ax.set_title('Posterior distributions of $p$')
ax.set_xlim((0,1))
ax.set_yticks([])
ax.legend()
plt.show()
%load_ext watermark
%watermark -n -u -v -iv -w
Last updated: Wed May 14 2025
Python implementation: CPython
Python version : 3.13.2
IPython version : 9.0.2
pandas : 2.2.3
numpy : 2.1.3
matplotlib: 3.10.1
seaborn : 0.13.2
pymc : 5.22.0
arviz : 0.21.0
Watermark: 2.5.0