Introduction

The tidycreel package is built on top of R’s survey package. Every estimation function in tidycreel — estimate_effort(), estimate_catch_rate(), estimate_total_catch() — is a wrapper around a corresponding survey package call. The design objects tidycreel constructs are standard svydesign objects under the hood, and the estimators delegate directly to svymean(), svytotal(), and svyratio().

This vignette walks through the same effort + catch rate + total catch analysis twice: first using raw survey package calls, then using tidycreel. The goal is to make the mapping explicit, so that R users familiar with the survey package can immediately understand what tidycreel is doing and why. If you already know svyratio() and the delta method for variance propagation, you will recognize those computations in tidycreel’s output.

Data Setup

Both workflows use the same three example datasets included with tidycreel. Load them once and they are shared across both parts of this vignette.

library(tidycreel)

data(example_calendar)
data(example_counts)
data(example_interviews)

head(example_calendar)
#>         date day_type
#> 1 2024-06-01  weekend
#> 2 2024-06-02  weekend
#> 3 2024-06-03  weekday
#> 4 2024-06-04  weekday
#> 5 2024-06-05  weekday
#> 6 2024-06-06  weekday
head(example_counts)
#>         date day_type effort_hours
#> 1 2024-06-01  weekend         45.2
#> 2 2024-06-02  weekend         52.8
#> 3 2024-06-03  weekday         12.5
#> 4 2024-06-04  weekday         18.3
#> 5 2024-06-05  weekday         15.7
#> 6 2024-06-06  weekday         22.1
head(example_interviews)
#>         date hours_fished catch_total catch_kept trip_status trip_duration
#> 1 2024-06-01          2.0           5          2    complete           2.0
#> 2 2024-06-01          3.5           8          5    complete           3.5
#> 3 2024-06-02          1.5           2          1    complete           1.5
#> 4 2024-06-02          2.0           3          2  incomplete           1.0
#> 5 2024-06-03          2.5           6          3    complete           2.5
#> 6 2024-06-03          4.0          12          8    complete           4.0
#>   interview_id angler_type angler_method species_sought n_anglers refused
#> 1            1        bank          bait        walleye         2   FALSE
#> 2            2        boat    artificial        walleye         1   FALSE
#> 3            3        bank          bait           bass         3   FALSE
#> 4            4        bank           fly        panfish         2   FALSE
#> 5            5        boat    artificial        walleye         1   FALSE
#> 6            6        boat          bait           bass         4   FALSE

The calendar covers 14 days (10 weekdays, 4 weekend days). The counts record observed effort (angler-hours) on each sampled day. The interviews record catch and effort for individual completed and incomplete trips.


Part 1 — Raw survey Package Workflow

1a. Survey Design from a Stratified Calendar

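A creel survey with weekday/weekend stratification is a stratified simple random sample of days. The fpc argument of svydesign() takes the population size of each stratum, which here is the total number of days in that stratum over the survey period. The survey package derives the finite population correction (FPC) from that size and the number of days sampled.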

example_counts already carries a day_type column (copied from the calendar), so we can build the FPC column directly from the calendar frequency table.

library(survey)
#> Loading required package: grid
#> Loading required package: Matrix
#> Loading required package: survival
#> 
#> Attaching package: 'survey'
#> The following object is masked from 'package:graphics':
#> 
#>     dotchart

# Stratum sizes: total days per day_type in the calendar
stratum_sizes <- table(example_calendar$day_type)
stratum_sizes
#> 
#> weekday weekend 
#>      10       4

# Attach FPC to the count frame
counts_frame <- example_counts
# Index by name (as.character) so the lookup is also correct if day_type is a factor
counts_frame$fpc <- as.integer(stratum_sizes[as.character(counts_frame$day_type)])
counts_frame
#>          date day_type effort_hours fpc
#> 1  2024-06-01  weekend         45.2   4
#> 2  2024-06-02  weekend         52.8   4
#> 3  2024-06-03  weekday         12.5  10
#> 4  2024-06-04  weekday         18.3  10
#> 5  2024-06-05  weekday         15.7  10
#> 6  2024-06-06  weekday         22.1  10
#> 7  2024-06-07  weekday         14.9  10
#> 8  2024-06-08  weekend         48.6   4
#> 9  2024-06-09  weekend         55.3   4
#> 10 2024-06-10  weekday         16.8  10
#> 11 2024-06-11  weekday         19.4  10
#> 12 2024-06-12  weekday         13.2  10
#> 13 2024-06-13  weekday         17.6  10
#> 14 2024-06-14  weekday         20.1  10

# Build the stratified survey design
svy_counts <- svydesign(
  ids    = ~1,
  strata = ~day_type,
  fpc    = ~fpc,
  data   = counts_frame
)
svy_counts
#> Stratified Independent Sampling design
#> svydesign(ids = ~1, strata = ~day_type, fpc = ~fpc, data = counts_frame)

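Because all 14 survey days are observed (a census of the 14-day period), the sampling fraction n/N is 1 in each stratum, so the finite population correction factor (1 − n/N) is 0 and the standard error of the totals is 0. In a real survey where only a subset of days is sampled, the sampling fraction would be less than 1, the correction factor would be greater than 0, and variances would be positive.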
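For reference, the textbook estimator behind a stratified total and its variance can be written out as follows. The symbols are notation introduced here for illustration (they are not argument names in the survey package): N_h is the number of days in stratum h, n_h the number of sampled days, \bar{y}_h the mean daily effort among sampled days, and s_h^2 their sample variance.

```latex
\hat{T} = \sum_{h} N_h \,\bar{y}_h ,
\qquad
\widehat{\operatorname{Var}}(\hat{T})
  = \sum_{h} N_h^{2}\left(1 - \frac{n_h}{N_h}\right)\frac{s_h^{2}}{n_h}
```

When every day in a stratum is sampled, n_h = N_h and that stratum's term vanishes, which is why a full census gives a variance of exactly 0.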

1b. Effort Estimation via svytotal

Total effort is the sum of observed angler-hours across the 14 days, estimated via svytotal() on the count frame design.

effort_total <- svytotal(~effort_hours, svy_counts)
effort_total
#>              total SE
#> effort_hours 372.5  0
confint(effort_total)
#>              2.5 % 97.5 %
#> effort_hours 372.5  372.5

The result is the estimated total angler-hours for the survey period, with a 95% confidence interval. Because this example uses a complete census of all 14 days, the standard error is 0. In practice, creel surveys sample a subset of days, and svytotal() extrapolates from the observed days to the full survey period using the stratum weights.
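To make the extrapolation concrete, here is a minimal base R sketch under a hypothetical scenario in which only 5 of the 10 weekdays and 2 of the 4 weekend days had been counted. The subsample is an assumption for illustration (the first few days of each type from the printout above), and the object names are not part of survey or tidycreel. It applies the textbook stratified expansion estimator by hand instead of calling svytotal().

```r
# Hypothetical subsample: 5 of 10 weekdays, 2 of 4 weekend days
sampled <- data.frame(
  day_type     = c(rep("weekday", 5), rep("weekend", 2)),
  effort_hours = c(12.5, 18.3, 15.7, 22.1, 14.9, 45.2, 52.8)
)

# Days per stratum in the full calendar (from the stratum_sizes table)
N_h <- c(weekday = 10, weekend = 4)

# Per-stratum sample size, mean, and variance
by_stratum <- split(sampled$effort_hours, sampled$day_type)
n_h    <- lengths(by_stratum)
mean_h <- vapply(by_stratum, mean, numeric(1))
var_h  <- vapply(by_stratum, var, numeric(1))
N_h    <- N_h[names(by_stratum)]  # align stratum order

# Expansion estimator: each sampled day stands in for N_h / n_h days
total_hat <- sum(N_h * mean_h)

# Stratified variance with the finite population correction (1 - n_h / N_h)
var_hat <- sum(N_h^2 * (1 - n_h / N_h) * var_h / n_h)
se_hat  <- sqrt(var_hat)

c(total = total_hat, se = se_hat)
```

With half the days sampled in each stratum, the correction factor is 0.5 and the standard error is positive, in contrast to the census case above.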

1c. Catch Rate via svyratio and Total Catch via the Delta Method

Catch per unit effort (CPUE) is estimated from the interview data using a ratio-of-means estimator: the total catch across all complete interviews divided by the total effort across those same interviews. This is the svyratio() approach.

# Use complete trips only (standard practice)
complete_trips <- subset(example_interviews, trip_status == "complete")
nrow(complete_trips)
#> [1] 17

# Build a simple survey design over the interview frame
svy_interviews <- svydesign(ids = ~1, data = complete_trips)
#> Warning in svydesign.default(ids = ~1, data = complete_trips): No weights or
#> probabilities supplied, assuming equal probability

# Ratio estimator: total catch / total effort (ratio-of-means CPUE)
cpue_ratio <- svyratio(~catch_total, ~hours_fished, svy_interviews)
cpue_ratio
#> Ratio estimator: svyratio.survey.design2(~catch_total, ~hours_fished, svy_interviews)
#> Ratios=
#>             hours_fished
#> catch_total     2.285714
#> SEs=
#>             hours_fished
#> catch_total     0.112297

The CPUE estimate is the ratio of total catch to total effort across all complete interviews. The svyratio() function computes the Taylor linearization variance for this ratio, accounting for the correlation between catch and effort within each interview.
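The ratio and its linearization variance can be sketched by hand in base R. The toy data below are only the five complete trips visible in the head(example_interviews) printout, not the full 17, so the numbers will not match the svyratio() output above. The variance formula is the standard one for an equal-probability sample without an FPC; treating it as what svyratio() computes for this design is an assumption, not something checked against the package.

```r
# Toy data: the five complete trips shown by head(example_interviews)
catch <- c(5, 8, 2, 6, 12)
hours <- c(2.0, 3.5, 1.5, 2.5, 4.0)
n     <- length(catch)

# Ratio of means: total catch over total effort
r_hat <- sum(catch) / sum(hours)

# Linearized residuals: each trip's catch minus the catch expected
# at the pooled rate for that trip's hours (these sum to zero)
resid <- catch - r_hat * hours

# Taylor linearization variance, equal-probability sample, no FPC
var_r <- n / (n - 1) * sum(resid^2) / sum(hours)^2
se_r  <- sqrt(var_r)

c(cpue = r_hat, se = se_r)
```

Because catch and hours enter through the same residual, trips where catch rises in step with effort contribute little to the variance, which is the sense in which the estimator accounts for their correlation.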

Now combine the effort total and CPUE using the delta method to propagate variance through the product E × C:

effort_coef <- coef(effort_total)[[1]]
cpue_coef <- coef(cpue_ratio)[[1]]
var_effort <- vcov(effort_total)[[1]]
var_cpue <- vcov(cpue_ratio)[[1]]

# Delta method (first-order approximation), treating effort and CPUE as independent:
# Var(effort * cpue) ~= effort^2 * Var(cpue) + cpue^2 * Var(effort)
total_catch_est <- effort_coef * cpue_coef
total_catch_var <- effort_coef^2 * var_cpue + cpue_coef^2 * var_effort
total_catch_se <- sqrt(total_catch_var)

cat("Total catch estimate:", round(total_catch_est, 1), "\n")
#> Total catch estimate: 851.4
cat("Standard error:      ", round(total_catch_se, 1), "\n")
#> Standard error:       41.8
cat(
  "95% CI: [",
  round(total_catch_est - 1.96 * total_catch_se, 1), ",",
  round(total_catch_est + 1.96 * total_catch_se, 1), "]\n"
)
#> 95% CI: [ 769.4 , 933.4 ]
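The variance formula used in the code above follows from a first-order Taylor expansion of the product around the true values. A short derivation, using E for total effort and C for CPUE as in the text (the hats and the covariance term are notation added here):

```latex
\begin{aligned}
\hat{E}\,\hat{C} &\approx E\,C + C\,(\hat{E} - E) + E\,(\hat{C} - C) \\
\operatorname{Var}(\hat{E}\,\hat{C}) &\approx
  C^{2}\operatorname{Var}(\hat{E}) + E^{2}\operatorname{Var}(\hat{C})
  + 2\,E\,C\operatorname{Cov}(\hat{E},\hat{C})
\end{aligned}
```

The code drops the covariance term, which amounts to treating the count-based effort estimate and the interview-based CPUE estimate as independent. In this example Var(E) is 0, so the standard error reduces to E × SE(C) = 372.5 × 0.112297 ≈ 41.8, matching the printed value.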

Part 2 — tidycreel Equivalent

2a. creel_design() + add_counts()

creel_design() wraps the calendar and stratification into a design object. add_counts() attaches the count frame and builds the internal svydesign object, the counterpart of the svydesign() call in Part 1a above.

design <- creel_design(example_calendar, date = date, strata = day_type)
design <- add_counts(design, example_counts)
#> Warning in svydesign.default(ids = psu_formula, strata = strata_formula, : No
#> weights or probabilities supplied, assuming equal probability
design
#> 
#> ── Creel Survey Design ─────────────────────────────────────────────────────────
#> Type: "instantaneous"
#> Date column: date
#> Strata: day_type
#> Calendar: 14 days (2024-06-01 to 2024-06-14)
#> day_type: 2 levels
#> Counts: 14 observations
#> PSU column: date
#> Count type: "instantaneous"
#> Survey: <survey.design2> (constructed)
#> Interviews: "none"
#> Sections: "none"

The design reports the same 14 observations across two strata (weekday, weekend) as the raw svydesign object above.

2b. estimate_effort()

estimate_effort() calls svytotal() on the internal design object and returns the result as a tidy tibble with labelled columns.

effort_est <- estimate_effort(design)
effort_est
#> 
#> ── Creel Survey Estimates ──────────────────────────────────────────────────────
#> Method: Total
#> Variance: Taylor linearization
#> Confidence level: 95%
#> Effort target: sampled_days
#> 
#> # A tibble: 1 × 7
#>   estimate    se se_between se_within ci_lower ci_upper     n
#>      <dbl> <dbl>      <dbl>     <dbl>    <dbl>    <dbl> <int>
#> 1     372.  13.2       13.2         0     344.     401.    14

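The point estimate (372.5 angler-hours, printed as 372. by the tibble) matches the raw svytotal() result in Part 1b. The standard error does not: tidycreel reports 13.2, whereas Part 1b reported 0. The svydesign() warning printed by add_counts() in 2a shows that the internal design was built without weights or probabilities, and a nonzero SE on a full census indicates that no FPC is applied here. tidycreel also adds the CI directly to the output and labels the column estimate instead of using the raw column name.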
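The standard error of 13.2 in the tibble above can be reproduced by hand in base R from the 14 daily values printed in Part 1a. This is a sketch of the textbook stratified formula with unit weights, not tidycreel's actual code; total_se() is a hypothetical helper. The t-based interval on 12 degrees of freedom (14 days minus 2 strata) is an assumption inferred from the printed CI, not taken from the package source.

```r
# Daily effort values copied from the counts_frame printout in Part 1a
weekday <- c(12.5, 18.3, 15.7, 22.1, 14.9, 16.8, 19.4, 13.2, 17.6, 20.1)
weekend <- c(45.2, 52.8, 48.6, 55.3)
strata  <- list(weekday = weekday, weekend = weekend)

n_h   <- lengths(strata)
var_h <- vapply(strata, var, numeric(1))

# Variance of a stratified total with unit weights.
# fpc_factor = 1 means no correction; 1 - n_h / N_h is the usual correction.
total_se <- function(fpc_factor) sqrt(sum(n_h * var_h * fpc_factor))

se_no_fpc <- total_se(1)              # no FPC: days treated as a sample
se_census <- total_se(1 - n_h / n_h)  # N_h = n_h: every day observed

# Assumed t-based interval on (days - strata) degrees of freedom
estimate <- sum(weekday, weekend)
df       <- sum(n_h) - length(strata)
ci       <- estimate + c(-1, 1) * qt(0.975, df) * se_no_fpc

round(c(estimate = estimate, se_no_fpc = se_no_fpc, se_census = se_census), 1)
round(ci)
```

Without the correction the standard error rounds to 13.2 and the interval to 344 and 401, in line with the tidycreel output. With the census correction it is 0, in line with Part 1b.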

2c. estimate_catch_rate() + add_interviews()

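Before calling estimate_catch_rate(), attach the interview data via add_interviews(). This maps interview columns to the design vocabulary and records each trip's status. All 22 interviews are attached; the restriction to complete trips is applied later, by estimate_catch_rate(), as its default.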

design <- add_interviews(design, example_interviews,
  catch         = catch_total,
  effort        = hours_fished,
  harvest       = catch_kept,
  trip_status   = trip_status,
  trip_duration = trip_duration
)
#>  No `n_anglers` provided — assuming 1 angler per interview.
#>  Pass `n_anglers = <column>` to use actual party sizes for angler-hour
#>   normalization.
#>  Added 22 interviews: 17 complete (77%), 5 incomplete (23%)

Now estimate CPUE, which calls svyratio() internally on the complete-trip subset of the interview frame:

cpue_est <- estimate_catch_rate(design)
#>  Using complete trips for CPUE estimation
#>   (n=17, 77.3% of 22 interviews) [default]
#> Warning: Small sample size for CPUE estimation.
#> ! Sample size is 17. Ratio estimates are more stable with n >= 30.
#>  Variance estimates may be unstable with n < 30.
cpue_est
#> 
#> ── Creel Survey Estimates ──────────────────────────────────────────────────────
#> Method: Ratio-of-Means CPUE
#> Variance: Taylor linearization
#> Confidence level: 95%
#> 
#> # A tibble: 1 × 5
#>   estimate    se ci_lower ci_upper     n
#>      <dbl> <dbl>    <dbl>    <dbl> <int>
#> 1     2.29 0.114     2.06     2.51    17

The CPUE estimate (2.29) matches the svyratio() output from Part 1c, and the SE agrees to two decimal places (0.114 here versus 0.112 from svyratio()).

2d. estimate_total_catch()

estimate_total_catch() combines the effort and CPUE estimates and propagates their variance with the delta method, the same approach as in Part 1c:

total_catch <- estimate_total_catch(design)
total_catch
#> 
#> ── Creel Survey Estimates ──────────────────────────────────────────────────────
#> Method: Total Catch (Effort × CPUE)
#> Variance: Taylor linearization
#> Confidence level: 95%
#> Effort target: sampled_days
#> 
#> # A tibble: 1 × 5
#>   estimate    se ci_lower ci_upper     n
#>      <dbl> <dbl>    <dbl>    <dbl> <int>
#> 1     858.  48.4     763.     953.    17

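tidycreel's total catch estimate (858) is close to, but not identical to, the raw delta-method value from Part 1c (851.4), and its standard error is larger (48.4 versus 41.8). Part of the SE gap is expected: Part 1c used an effort variance of 0, whereas estimate_effort() reports an effort SE of 13.2. The remaining differences reflect tidycreel's internal handling of the complete-trip filter and variance components. The underlying approach, effort × CPUE with delta-method variance, is the same.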


Mapping Table

| Step | survey package | tidycreel |
|---|---|---|
| Design construction | svydesign(ids = ~1, strata = ~day_type, fpc = ~fpc, data = counts_frame) | creel_design() + add_counts() |
| Attach interview data | subset() + svydesign(ids = ~1, data = complete_trips) | add_interviews() |
| Effort estimation | svytotal(~effort_hours, svy_counts) | estimate_effort(design) |
| Catch rate | svyratio(~catch_total, ~hours_fished, svy_interviews) | estimate_catch_rate(design) |
| Total catch | E × C with delta method: Var(E × C) ≈ E² Var(C) + C² Var(E) | estimate_total_catch(design) |
| Variance method | Taylor linearization (default in survey) | Taylor linearization (default in tidycreel) |

When to Use Each Approach

Use tidycreel for standard creel survey designs where the main analysis follows the effort + CPUE + total catch workflow. tidycreel handles the bookkeeping (survey design construction, complete-trip filtering, delta method variance) and returns tidy tibbles ready for reporting. The domain vocabulary (creel_design, estimate_effort, estimate_catch_rate) matches the language creel biologists already use.

Use raw survey package calls when you need capabilities outside tidycreel’s scope: custom clustering structures, probability-proportional-to-size (PPS) sampling, replicate-weight designs, or nonstandard variance estimators. The as_survey_design() function extracts the internal svydesign object from any tidycreel design, giving you full access to the survey package toolbox while still using tidycreel for the initial data setup.

# Extract the internal svydesign object for advanced use
internal_svy <- as_survey_design(design)
# Now use any survey package function directly
svymean(~effort_hours, internal_svy)