Calculates malaria Test Positivity Rate (TPR) at the health facility-month level and applies structured fallback logic to derive proxy values when confirmed or tested data are missing. Input data must be at the health facility-month level, with one observation per facility per reporting month. Returns a validated TPR dataset with quality flags and source tracking.
Usage
calc_tpr(
  data,
  hf_var = "hf_uid",
  adm0_var = NULL,
  adm1_var = "adm1",
  adm2_var = "adm2",
  date_var = "date",
  conf_var = "conf",
  test_var = "test",
  extreme_threshold = c(0.01, 0.99),
  include_flags = FALSE,
  activity_indicators = c("conf", "test"),
  activity_method = 3,
  nonreport_window = 6,
  fallback_method = c("rolling", "adm2", "prev_year", "adm1", "adm0"),
  prev_year_window = 2,
  fallback_triggers = c("missing", "extreme", "low_test")
)
Arguments
- data
Routine health facility data at the facility-month level (data.frame or tibble). Must contain one row per facility per month.
- hf_var
Column name for health facility unique identifier (default: "hf_uid").
- adm0_var
Column name for national/country level. If NULL (default), creates a single "country" value for all records.
- adm1_var
Column name for first administrative level/region (default: "adm1").
- adm2_var
Column name for second administrative level/district (default: "adm2").
- date_var
Column name for date of reporting period (default: "date").
- conf_var
Column name for number of confirmed malaria cases (default: "conf").
- test_var
Column name for number of individuals tested (default: "test").
- extreme_threshold
Numeric vector of length 2 specifying lower and upper bounds for flagging extreme TPR values (default: c(0.01, 0.99)).
- include_flags
Logical; if TRUE, includes all quality flag columns in the output. If FALSE (default), returns only the core TPR variables without flags.
- activity_indicators
Character vector of indicator columns used to determine facility activity. Default is c("conf", "test").
- activity_method
Numeric. Classification method for facility activity (1, 2, or 3). Default is 3 (dynamic activation/inactivation). Used to flag inactive facility-months which are excluded from proxy calculations.
- nonreport_window
Integer. Number of consecutive non-reporting periods before a facility is considered inactive (for method 3). Default is 6.
- fallback_method
Character vector specifying which proxy fallback levels to use and their order. Valid options: "rolling" (3-month rolling average from same facility), "adm2" (district-level), "adm1" (regional-level), "prev_year" (same month previous year), "adm0" (national-level). Default is c("rolling", "adm2", "prev_year", "adm1", "adm0"). Proxies are applied sequentially in the order specified. Set to NULL to disable all fallbacks (raw TPR only).
- prev_year_window
Integer specifying the seasonal window in months for the previous-year fallback. 0 = exact month match only, 1 = +/-1 month window (3-month average), 2 = +/-2 months (5-month average; the default shown in Usage), etc. Maximum value is 6. When the window is greater than 0, all available months within the window from the previous year are averaged, weighted by test counts.
- fallback_triggers
Character vector specifying when to apply proxy fallbacks. Valid options: "missing" (conf/test is NA), "extreme" (TPR outside extreme_threshold), "low_test" (test < 5). Default is c("missing", "extreme", "low_test"). Set to NULL to disable all triggers (keep raw TPR values without replacement). Note: impossible values (conf > test) are always included in fallback along with missing values. Inactive facilities are always excluded.
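The test-weighted seasonal window described under prev_year_window can be illustrated with a minimal base R sketch. The helper name `prev_year_tpr()` and the toy data are hypothetical and not part of the package; the sketch assumes that "weighted by test counts" means pooling counts, which is the same as sum(conf) / sum(test) over the window.

```r
# Hypothetical helper, not part of the package: test-weighted previous-year TPR
# for a single facility, using months within +/- `window` of the target month.
prev_year_tpr <- function(df, target_date, window = 2) {
  target <- as.Date(target_date)
  # Month index (year * 12 + month) keeps windows that cross a year boundary simple
  idx <- function(d) {
    as.integer(format(d, "%Y")) * 12L + as.integer(format(d, "%m"))
  }
  centre <- idx(target) - 12L  # same month, previous year
  in_window <- abs(idx(df$date) - centre) <= window
  usable <- in_window & !is.na(df$conf) & !is.na(df$test) & df$test > 0
  if (!any(usable)) return(NA_real_)
  # Weighting each month's TPR by its test count equals sum(conf) / sum(test)
  sum(df$conf[usable]) / sum(df$test[usable])
}

# Toy history for one facility (illustrative values only)
hf <- data.frame(
  date = as.Date(c("2022-02-01", "2022-03-01", "2022-04-01", "2022-09-01")),
  conf = c(10, 30, 20, 99),
  test = c(100, 100, 200, 100)
)

prev_year_tpr(hf, "2023-03-01", window = 1)  # (10 + 30 + 20) / (100 + 100 + 200)
prev_year_tpr(hf, "2023-03-01", window = 0)  # exact month only: 30 / 100
```

Pooling counts rather than averaging monthly ratios means a month with very few tests contributes little to the proxy, which matches the intent of the weighting.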
Value
A list containing three elements:
data: A tibble with TPR estimates per facility-month including:
- hf_uid: Health facility identifier
- adm0: National/country level
- adm1: First administrative level
- adm2: Second administrative level
- date: Standardised date (first of month)
- year: Year extracted from date
- month: Month extracted from date
- conf: Confirmed cases
- test: Number tested
- tpr: Final validated or proxy TPR (0-1 scale)
- tpr_source: Source of TPR value (facility_raw, proxy_adm2, proxy_adm1, proxy_prev_year, or proxy_adm0)
- flag_tpr_valid: TRUE if raw TPR could be calculated
- flag_tpr_extreme: TRUE if TPR outside extreme_threshold
- flag_tpr_proxy: TRUE if proxy value was used
- flag_tpr_missing: TRUE if no TPR could be assigned
- flag_conf_gt_test: TRUE if conf > test (impossible value)
- flag_zero_test: TRUE if test == 0
- flag_low_test: TRUE if test < 5
- flag_missing_conf: TRUE if conf is NA
- flag_inactive: TRUE if facility-month is inactive
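To make the flag definitions concrete, here is a minimal base R sketch that derives several of them from conf and test. It is an illustration under stated assumptions, not the package's internal code: the column `tpr_raw` is a hypothetical name, and "outside extreme_threshold" is assumed to mean strictly below the lower bound or strictly above the upper bound.

```r
# Illustrative facility-month counts only
df <- data.frame(
  conf = c(10, 2, 60, NA, 0, 0),
  test = c(100, 4, 50, 80, 0, 200)
)
lower <- 0.01  # extreme_threshold[1]
upper <- 0.99  # extreme_threshold[2]

df$flag_missing_conf <- is.na(df$conf)
df$flag_zero_test    <- !is.na(df$test) & df$test == 0
df$flag_low_test     <- !is.na(df$test) & df$test < 5
df$flag_conf_gt_test <- !is.na(df$conf) & !is.na(df$test) & df$conf > df$test

# Assumption: raw TPR is defined only when both counts are present,
# test > 0, and conf does not exceed test
df$flag_tpr_valid <- !df$flag_missing_conf & !is.na(df$test) &
  !df$flag_zero_test & !df$flag_conf_gt_test

# tpr_raw is a hypothetical intermediate column, not a documented output
df$tpr_raw <- ifelse(df$flag_tpr_valid, df$conf / df$test, NA_real_)
df$flag_tpr_extreme <- df$flag_tpr_valid &
  (df$tpr_raw < lower | df$tpr_raw > upper)

df
```

In this toy data the last row (0 confirmed out of 200 tested) has a valid raw TPR of 0, which falls below the lower bound and is therefore flagged as extreme, while the row with conf > test has no valid raw TPR at all.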
Details
The default fallback hierarchy for proxy TPR values is:
Rolling 3-month average from same facility (+/-1 month)
District (adm2) TPR from same month
Same month from previous year (facility-level)
Regional (adm1) TPR from same month
National (adm0) TPR from same month
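The sequential nature of this hierarchy can be sketched in base R as "take the first available proxy in the configured order". The candidate values below are made up, and the source labels are built as "proxy_" plus the level name; the label proxy_rolling in particular is an assumption, since it is not among the documented tpr_source values.

```r
# Hypothetical candidate proxies for four facility-months (NA = not available)
cand <- data.frame(
  rolling   = c(0.20, NA,   NA,   NA),
  adm2      = c(0.25, 0.30, NA,   NA),
  prev_year = c(NA,   0.10, NA,   NA),
  adm1      = c(0.28, 0.28, 0.28, NA),
  adm0      = c(0.35, 0.35, 0.35, NA)
)
fallback_method <- c("rolling", "adm2", "prev_year", "adm1", "adm0")

tpr <- rep(NA_real_, nrow(cand))
tpr_source <- rep(NA_character_, nrow(cand))

# Walk the hierarchy in order; each level only fills rows that are still empty
for (m in fallback_method) {
  fill <- is.na(tpr) & !is.na(cand[[m]])
  tpr[fill] <- cand[[m]][fill]
  tpr_source[fill] <- paste0("proxy_", m)  # label scheme is an assumption
}

data.frame(tpr, tpr_source)
```

Reordering `fallback_method` changes which proxy wins when several are available. A row with no candidate at any level stays NA, which corresponds to the case flag_tpr_missing describes.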
Reporting Rates
This function does not calculate reporting rates. If you need reporting
rates for analysis or filtering, calculate them separately using
sntutils::calculate_reporting_metrics() and join to your data before
calling calc_tpr().
Example:
# Calculate reporting rate at district level
reprate_data <- sntutils::calculate_reporting_metrics(
  data = facility_data,
  vars_of_interest = "conf",
  x_var = "date",
  y_var = "adm2",
  hf_col = "hf_uid",
  key_indicators = c("conf", "test")
)
# Join back to facility data (include all admin levels for proper join)
data_with_reprate <- facility_data |>
  dplyr::left_join(
    reprate_data |> dplyr::select(adm0, adm1, adm2, date, reprate),
    by = c("adm0", "adm1", "adm2", "date")
  )
# calc_tpr() returns a list, so filter its data element rather than the list itself
# (this assumes extra columns such as reprate are carried through to the output)
tpr_result <- calc_tpr(data_with_reprate)
tpr_data <- tpr_result$data |>
  dplyr::filter(reprate >= 0.8)
Examples
# Example with minimal data
# facility_data <- tibble::tibble(
#   hf_uid = c("HF001", "HF002", "HF003"),
#   adm1 = rep("RegionA", 3),
#   adm2 = rep("DistrictX", 3),
#   date = as.Date(c("2023-01-01", "2023-02-01", "2023-01-01")),
#   conf = c(10, 15, NA),
#   test = c(100, 120, 80),
#   report_complete = c(1, 1, 1)
# )
#
# result <- calc_tpr(facility_data)
# tpr_data <- result$data
# tpr_dict <- result$dict
# result$metadata
