When to Use the GLMM Estimator
If your pilot always flies at the same time of day — say, 10 AM — your instantaneous count systematically over- or under-represents total daily effort. Early-morning flights catch fewer anglers than are present at peak hours; late-afternoon flights may miss early starters entirely. When flight timing is non-random, the simple expansion (count × h_open / v) inherits this temporal bias.
The GLMM approach (Askey et al. 2018) models how angler counts change
through the day using a quadratic hour effect and a day-level random
intercept: count ~ poly(hour, 2) + (1 | date). Once the
diurnal curve is estimated, the model integrates predicted counts across
the full open-water window — correcting for wherever in the curve the
actual flight fell.
When to use the GLMM estimator:
- Flights always occur in the same part of the day (fixed morning or afternoon schedule).
- You have multiple overflights per day across several days (minimum 8–10 survey days recommended for stable random-effect estimation).
When to use the simple estimator:
- Flight timing is randomly assigned across the open-water window.
- You have only one count per day.
For the basic aerial workflow without GLMM correction, see the aerial surveys vignette.
Example Data
The example_aerial_glmm_counts dataset contains 12
survey days with 4 overflights per day at fixed hours (7, 10, 13, and 16
hours), producing 48 observations. Angler counts follow a diurnal curve
with day-level Poisson variability — representative of a scenario where
a fixed morning-to-afternoon flight schedule is used throughout the
season.
library(tidycreel)
data(example_aerial_glmm_counts)
head(example_aerial_glmm_counts)
#> date day_type n_anglers time_of_flight
#> 1 2024-06-03 weekday 3 7
#> 2 2024-06-03 weekday 30 10
#> 3 2024-06-03 weekday 65 13
#> 4 2024-06-03 weekday 50 16
#> 5 2024-06-06 weekday 5 7
#> 6 2024-06-06 weekday 15 10The four columns are date, day_type,
n_anglers (instantaneous count), and
time_of_flight (decimal hour of each overflight).
Building the Aerial Design
Build an aerial creel_design from the survey dates and
attach the count data. The h_open = 14 argument specifies
the number of hours the fishery is open each day — this enters the final
expansion after the diurnal curve is integrated.
aerial_cal <- unique(example_aerial_glmm_counts[, c("date", "day_type")])
aerial_cal <- aerial_cal[order(aerial_cal$date), ]
design <- creel_design(
aerial_cal,
date = date,
strata = day_type,
survey_type = "aerial",
h_open = 14
)
design <- add_counts(design, example_aerial_glmm_counts)
#> Warning in add_counts(design, example_aerial_glmm_counts): Duplicate PSU values detected in count data.
#> ℹ Found 36 duplicate value(s) in column date.
#> ℹ If multiple counts were taken per day, specify `count_time_col`.
#> ℹ Example: `add_counts(design, counts, count_time_col = count_time)`
#> Warning in svydesign.default(ids = psu_formula, strata = strata_formula, : No
#> weights or probabilities supplied, assuming equal probability
print(design)
#>
#> ── Creel Survey Design ─────────────────────────────────────────────────────────
#> Type: "aerial"
#> Date column: date
#> Strata: day_type
#> Calendar: 12 days (2024-06-03 to 2024-07-06)
#> day_type: 2 levels
#> Counts: 48 observations
#> PSU column: date
#> Count type: "instantaneous"
#> Survey: <survey.design2> (constructed)
#> Interviews: "none"
#> Sections: "none"
#>
#> ── Aerial Survey Design ──
#>
#> Hours open (h_open): 14GLMM Effort Estimation
Call estimate_effort_aerial_glmm() with
time_col = time_of_flight. The default model fits a
negative-binomial GLMM with a quadratic temporal effect and a day-level
random intercept:
glmm_result <- estimate_effort_aerial_glmm(design, time_col = time_of_flight)
#> Warning in theta.ml(Y, mu, weights = object@resp$weights, limit = limit, :
#> iteration limit reached
print(glmm_result)
#>
#> ── Creel Survey Estimates ──────────────────────────────────────────────────────
#> Method: aerial_glmm_total
#> Variance: delta
#> Confidence level: 95%
#>
#> # A tibble: 1 × 7
#> estimate se se_between se_within ci_lower ci_upper n
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int>
#> 1 4399. 381. 381. NA 3653. 5146. 48The model fits the diurnal count curve over all 48 observations, then
numerically integrates the predicted mean count across the full
open-water window (from 0.5 hours before the earliest flight to 0.5
hours after the latest). The integrated area is scaled by
h_open / visibility_correction to convert counts to
angler-hours.
Variance and Confidence Intervals
Two variance methods are available.
Delta method (default): Propagates the fixed-effect
covariance matrix from lme4::vcov() to the derived integral
via a gradient vector. This is fast and analytic, and is the default
when boot = FALSE.
Parametric bootstrap: lme4::bootMer()
re-fits the model under parametric resampling nsim times
and uses the SD of the resulting totals as the SE. This method can give
more accurate CIs for skewed count distributions. Use it for final
production analyses; the delta method is appropriate for exploratory
work.
# Bootstrap CIs — use nboot = 500 for production analyses
glmm_boot <- estimate_effort_aerial_glmm(
design,
time_col = time_of_flight,
boot = TRUE,
nboot = 100L
)
print(glmm_boot)Downstream Estimation
GLMM effort feeds directly into the standard downstream estimators.
Attach interview data and call estimate_catch_rate() and
estimate_total_catch() on the same design object.
# Build a complementary design with matching interview dates
aerial_int_cal <- unique(example_aerial_counts[, c("date", "day_type")])
aerial_int_cal <- aerial_int_cal[order(aerial_int_cal$date), ]
design_int <- creel_design(
aerial_int_cal,
date = date,
strata = day_type,
survey_type = "aerial",
h_open = 14
)
design_int <- add_counts(design_int, example_aerial_counts)
#> Warning in svydesign.default(ids = psu_formula, strata = strata_formula, : No
#> weights or probabilities supplied, assuming equal probability
design_int <- add_interviews(design_int, example_aerial_interviews,
catch = walleye_catch,
effort = hours_fished,
trip_status = trip_status
)
#> ℹ No `n_anglers` provided — assuming 1 angler per interview.
#> ℹ Pass `n_anglers = <column>` to use actual party sizes for angler-hour
#> normalization.
#> Warning: 15 interviews have zero catch.
#> ℹ Zero catch may be valid (skunked) or indicate missing data.
#> ℹ Added 48 interviews: 48 complete (100%), 0 incomplete (0%)
catch_rate <- estimate_catch_rate(design_int)
#> ℹ Using complete trips for CPUE estimation
#> (n=48, 100% of 48 interviews) [default]
print(catch_rate)
#>
#> ── Creel Survey Estimates ──────────────────────────────────────────────────────
#> Method: Ratio-of-Means CPUE
#> Variance: Taylor linearization
#> Confidence level: 95%
#>
#> # A tibble: 1 × 5
#> estimate se ci_lower ci_upper n
#> <dbl> <dbl> <dbl> <dbl> <int>
#> 1 0.413 0.0601 0.295 0.531 48
total_catch <- estimate_total_catch(design_int)
print(total_catch)
#>
#> ── Creel Survey Estimates ──────────────────────────────────────────────────────
#> Method: Total Catch (Effort × CPUE)
#> Variance: Taylor linearization
#> Confidence level: 95%
#> Effort target: sampled_days
#>
#> # A tibble: 1 × 5
#> estimate se ci_lower ci_upper n
#> <dbl> <dbl> <dbl> <dbl> <int>
#> 1 251. 45.3 162. 339. 48Comparison: Simple vs. GLMM Estimator
Both estimators are applied to the same design. The simple estimator
treats each overflight count as a representative sample of the full
open-water window and expands by h_open / v directly. The
GLMM estimator corrects for the fixed-hour flight schedule by
integrating the fitted diurnal curve.
# Simple aerial estimator — no diurnal correction
simple_result <- estimate_effort(design)
# GLMM result from above
# glmm_result already computed
# Side-by-side comparison
comparison <- rbind(
data.frame(
method = "GLMM",
estimate = glmm_result$estimates$estimate,
se = glmm_result$estimates$se,
ci_lower = glmm_result$estimates$ci_lower,
ci_upper = glmm_result$estimates$ci_upper,
stringsAsFactors = FALSE
),
data.frame(
method = "Simple",
estimate = simple_result$estimates$estimate,
se = simple_result$estimates$se,
ci_lower = simple_result$estimates$ci_lower,
ci_upper = simple_result$estimates$ci_upper,
stringsAsFactors = FALSE
)
)
print(comparison)
#> method estimate se ci_lower ci_upper
#> 1 GLMM 4399.453 380.6978 3653.299 5145.607
#> 2 Simple 20370.000 1891.0861 16156.398 24583.602The GLMM estimate corrects for the fact that all flights occurred at fixed hours (7, 10, 13, 16). The simple estimator treats each count as representative of the full open-water window, which inflates or deflates the total depending on where the peak falls in the diurnal curve.
Custom Formula
For surveys where a linear temporal term is sufficient — or where the
analyst prefers to control the model structure directly — pass a custom
formula. The formula must reference the actual count column
(n_anglers) and the time column by its exact name in the
data.
glmm_linear <- estimate_effort_aerial_glmm(
design,
time_col = time_of_flight,
formula = n_anglers ~ time_of_flight + (1 | date)
)
#> boundary (singular) fit: see help('isSingular')
print(glmm_linear)
#>
#> ── Creel Survey Estimates ──────────────────────────────────────────────────────
#> Method: aerial_glmm_total
#> Variance: delta
#> Confidence level: 95%
#>
#> # A tibble: 1 × 7
#> estimate se se_between se_within ci_lower ci_upper n
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int>
#> 1 4282. 374. 374. NA 3548. 5015. 48A linear temporal term reduces flexibility but can improve stability when only a few survey days are available. Use the default quadratic formula when you have 8 or more survey days.
References
Askey, P. J., Parkinson, E. A., Rehill, P., & Post, J. R. (2018). Correcting for non-random flight timing in aerial creel surveys using a generalized linear mixed model. North American Journal of Fisheries Management, 38(5), 1204–1215. https://doi.org/10.1002/nafm.10010
Jones, C. M., & Pollock, K. H. (2012). Recreational survey methods: estimation of effort, harvest, and abundance. Chapter 19 in Fisheries Techniques (3rd ed.), pp. 883–919. American Fisheries Society.
