
Validate TPR Proxy Quality Using Leave-One-Out Cross-Validation
Source: R/calc_tpr.R
validate_tpr_proxies.Rd
Evaluates the accuracy of TPR proxy estimates by comparing them against actual facility-level TPR values. Uses leave-one-out cross-validation to calculate what each proxy level (adm2, adm1, prev_year, rolling, adm0) would have estimated for facility-months with valid raw TPR data.
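The leave-one-out idea can be illustrated with a minimal base R sketch for the adm2 level. It assumes, purely for illustration, that a proxy TPR is the pooled ratio of confirmed cases to tests across the other facilities in the same district-month; the package's actual aggregation rule is not stated here and may differ. The `toy` data frame and the `tpr_actual` and `tpr_loo_adm2` columns are hypothetical names.

```r
# Toy facility-month data; column names follow the function defaults.
toy <- data.frame(
  hf_uid = c("A", "B", "C", "D", "E"),
  adm2   = c("d1", "d1", "d1", "d2", "d2"),
  year   = 2023,
  month  = 1,
  conf   = c(10, 20, 30, 5, 15),
  test   = c(100, 100, 100, 50, 50)
)

# Actual facility-level TPR (the value the proxy is compared against).
toy$tpr_actual <- toy$conf / toy$test

# Totals per adm2-year-month group, repeated on every row of the group.
grp <- interaction(toy$adm2, toy$year, toy$month, drop = TRUE)
conf_total <- ave(toy$conf, grp, FUN = sum)
test_total <- ave(toy$test, grp, FUN = sum)

# Leave-one-out: remove the focal facility's own counts before pooling
# (assumed pooling rule, not confirmed by the package documentation).
loo_conf <- conf_total - toy$conf
loo_test <- test_total - toy$test
toy$tpr_loo_adm2 <- ifelse(loo_test > 0, loo_conf / loo_test, NA_real_)

print(toy[, c("hf_uid", "adm2", "tpr_actual", "tpr_loo_adm2")])
```

Because each facility's own counts are excluded, the proxy for a facility-month never sees the value it is being compared against, which is what makes the comparison a fair out-of-sample check.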
Usage
validate_tpr_proxies(
  data,
  hf_var = "hf_uid",
  adm0_var = "adm0",
  adm1_var = "adm1",
  adm2_var = "adm2",
  year_var = "year",
  month_var = "month",
  conf_var = "conf",
  test_var = "test",
  min_facilities = 2,
  generate_plots = TRUE
)
Arguments
- data
Output from calc_tpr() with include_flags = TRUE.
- hf_var
Column name for health facility unique identifier. Should match the value used in calc_tpr(). Default is "hf_uid".
- adm0_var
Column name for national/country level. Default is "adm0".
- adm1_var
Column name for first administrative level/region. Default is "adm1".
- adm2_var
Column name for second administrative level/district. Default is "adm2".
- year_var
Column name for year. Default is "year".
- month_var
Column name for month. Default is "month".
- conf_var
Column name for confirmed cases. Default is "conf".
- test_var
Column name for tests. Default is "test".
- min_facilities
Minimum number of facilities in an admin unit to include in validation. Default is 2.
- generate_plots
Logical; if TRUE (default), generates diagnostic plots. Set to FALSE for metrics only.
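As a rough illustration of what the min_facilities threshold implies, the sketch below keeps only admin units with at least that many distinct facilities. It assumes the count is taken per adm2 unit, which is a guess; the function may apply the threshold at other levels or per month. The names `toy`, `n_fac`, `keep_units`, and `eligible` are hypothetical.

```r
# Toy facility list; column names follow the function defaults.
toy <- data.frame(
  hf_uid = c("A", "B", "C", "D"),
  adm2   = c("d1", "d1", "d1", "d2")
)

min_facilities <- 2

# Count distinct facilities per adm2 unit (assumed counting level).
n_fac <- tapply(toy$hf_uid, toy$adm2, function(x) length(unique(x)))

# Keep units that meet the threshold and subset the data accordingly.
keep_units <- names(n_fac)[n_fac >= min_facilities]
eligible <- toy[toy$adm2 %in% keep_units, ]

print(eligible)
```

A unit with a single facility has no peers left once that facility is held out, so a leave-one-out estimate at that level cannot be formed for it.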
Value
A list containing:
- metrics: Data frame with validation metrics for each proxy level
- validation_data: Data frame with actual vs proxy comparisons
- plots: List of ggplot objects (if generate_plots = TRUE):
  - scatter: Actual vs proxy scatterplots
  - error_dist: Error distribution by proxy level
  - mae_by_tests: MAE vs number of tests
  - stability_map: Error patterns by test count and TPR value
- summary: Character vector with key findings
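To make the kind of per-level metric concrete, here is a small base R sketch that computes mean absolute error (MAE) by proxy level from an actual-vs-proxy table. The long layout and the column names `proxy_level`, `tpr_actual`, and `tpr_proxy` are assumptions for illustration only; the real structure of validation_data and the full set of columns in metrics are not documented here.

```r
# Hypothetical actual-vs-proxy comparisons in long format.
val <- data.frame(
  proxy_level = c("adm2", "adm2", "adm1", "adm1"),
  tpr_actual  = c(0.10, 0.20, 0.10, 0.20),
  tpr_proxy   = c(0.12, 0.17, 0.20, 0.05)
)

# Absolute error of each proxy estimate against the actual TPR.
val$abs_error <- abs(val$tpr_proxy - val$tpr_actual)

# Mean absolute error per proxy level.
mae <- aggregate(abs_error ~ proxy_level, data = val, FUN = mean)
names(mae)[names(mae) == "abs_error"] <- "mae"

print(mae)
```

A lower MAE for one level than another on the same facility-months suggests that level is the better fallback when raw TPR is missing.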