What is sntutils?
sntutils is an R package developed by AHADI to support
the Subnational Tailoring (SNT) of malaria
interventions. It bundles the small, repeated operations you find
yourself doing in every country support analysis: reading messy DHIS2
exports, harmonising admin names across vintages of shapefiles,
validating facility coordinates, calculating reporting rates, extracting
climate and population rasters to admin units, and producing plots and
maps in the language the country team works in.
The package is built around four ideas:
- One way in, one way out. read() and write() handle every common format. read_snt_data() and write_snt_data() add hashing and metadata sidecars when you need a reproducible audit trail.
- Tidy outputs. Every function returns a tibble or an sf data frame with stable, documented column names - never a list of lists.
- Country-aware defaults. Functions accept target_language so plots, labels and month names match the team’s working language.
- Small surface area, big leverage. The R/ folder has ~50 source files; this site groups them by what they do, not what they are.
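To make the "hashing and metadata sidecar" idea concrete, here is a minimal base-R sketch of the general technique. It is not how read_snt_data() or write_snt_data() are implemented: the helpers write_with_sidecar() and verify_sidecar(), the .meta file suffix and the toy data are all hypothetical, chosen only for illustration.

```r
# Hypothetical helper (not part of sntutils): write a CSV plus a small
# text sidecar recording the file's MD5 hash and basic metadata.
write_with_sidecar <- function(x, path) {
  utils::write.csv(x, path, row.names = FALSE)
  hash <- unname(tools::md5sum(path))
  sidecar <- paste0(path, ".meta")
  writeLines(
    c(
      paste0("file: ", basename(path)),
      paste0("md5: ", hash),
      paste0("rows: ", nrow(x)),
      paste0("cols: ", ncol(x)),
      paste0("written: ", format(Sys.time(), "%Y-%m-%d %H:%M:%S"))
    ),
    sidecar
  )
  invisible(sidecar)
}

# Hypothetical helper: TRUE if the file still matches the recorded hash.
verify_sidecar <- function(path) {
  meta <- readLines(paste0(path, ".meta"))
  recorded <- sub("^md5: ", "", grep("^md5: ", meta, value = TRUE))
  identical(recorded, unname(tools::md5sum(path)))
}

# Made-up toy data, written to a temporary file.
tmp <- tempfile(fileext = ".csv")
write_with_sidecar(
  data.frame(adm2 = c("Bo", "Kenema"), conf = c(12, 30)),
  tmp
)
verify_sidecar(tmp) # TRUE right after writing
```

If the file is later edited by hand, verify_sidecar() returns FALSE, which is the kind of check an audit trail relies on.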
New to Subnational Tailoring? The AHADI SNT
Code Library is the methodology-level companion to this package -
country examples, analytical reasoning, and the workflow behind every
step sntutils automates. Good starting points: About, For Analysts, and Producing high-quality outputs.
Install
# 1) install pak if needed
install.packages("pak")
# 2) install sntutils from GitHub
pak::pkg_install("ahadi-analytics/sntutils")
System dependencies: sntutils uses sf and
terra, which require GDAL, GEOS and PROJ. On macOS install
them with brew install gdal proj geos; on Ubuntu use the
GDAL PPA. RStudio and recent Posit Workbench builds already ship with
these.
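After installing, one way to confirm that the system libraries were found is to ask sf and terra which versions they are linked against. This is only a quick check, assuming both packages are installed; it skips any package that is missing.

```r
# Collect the GDAL / GEOS / PROJ versions that sf and terra report.
versions <- list()

if (requireNamespace("sf", quietly = TRUE)) {
  # Named character vector including GEOS, GDAL and PROJ versions
  versions$sf <- sf::sf_extSoftVersion()
}

if (requireNamespace("terra", quietly = TRUE)) {
  # GDAL version string used by terra
  versions$terra_gdal <- terra::gdal()
}

print(versions)
```

An empty list means neither package could be loaded, which usually points to a failed install rather than a missing system library.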
The shape of an SNT pipeline
Most SNT projects move through the same stages. sntutils
provides a function (or a small family of functions) at each step:
| Stage | Typical task | Key sntutils functions |
|---|---|---|
| 1. Project setup | Create the AHADI folder skeleton, get standard paths | initialize_project_structure(), setup_project_paths() |
| 2. Ingest | Read CSV / Excel / Stata / RDS / shapefile inputs | read(), read_snt_data() |
| 3. Clean | Parse dates, infer types, standardise admin names | autoparse_dates(), auto_parse_types(), standardize_names(), prep_geonames() |
| 4. Validate | Check facility coordinates and admin geometries | validate_process_coordinates(), validate_process_spatial() |
| 5. Analyse | Reporting rates, consistency, outliers | calculate_reporting_metrics(), consistency_check(), detect_outliers() |
| 6. Enrich | Pull climate / population / DHS, extract to admin units | download_chirps(), download_era5(), download_worldpop(), process_raster_collection() |
| 7. Communicate | Maps, plots, translated labels, compressed PNGs | reporting_rate_plot(), dhis2_map(), translate_text(), compress_png() |
The articles in the Workflows menu walk through each stage in detail. This page just shows the whole pipeline once, end-to-end.
A tiny end-to-end example
Below we read a Sierra Leone DHIS2 sample, standardise its column names, parse its dates, build hashed facility and record IDs, calculate reporting rates by district-month, and draw a plot. The dataset ships with the package.
library(sntutils)
# 1. read - sntutils::read() picks the importer from the file extension
sl_dhis2 <- read(
  system.file("extdata", "sl_exmaple_dhis2.rds", package = "sntutils")
)
# 2. clean - make column names lowercase_with_underscores and parse dates
sl_dhis2 <- sl_dhis2 |>
  standardize_names() |>
  autoparse_dates(date_cols = "date") |>
  dplyr::rename(year_mon = date) |>
  dplyr::mutate(
    hf_uid = vdigest(paste0(adm1, adm2, hf), algo = "xxhash32"),
    record_id = vdigest(paste(hf_uid, year_mon), algo = "xxhash32")
  )
# 3. analyse - reporting rate by district-month
rates <- calculate_reporting_metrics(
  data = sl_dhis2,
  vars_of_interest = c("conf", "pres"),
  x_var = "year_mon",
  y_var = "adm2",
  hf_col = "hf_uid",
  key_indicators = c("allout", "test", "treat", "conf", "pres")
)
tail(rates)
#> # A tibble: 6 × 6
#> year_mon adm2 rep exp reprate missrate
#> <chr> <chr> <int> <int> <dbl> <dbl>
#> 1 2023-12 Moyamba District Council 106 108 0.981 0.0185
#> 2 2023-12 Port Loko City Council 2 2 1 0
#> 3 2023-12 Port Loko District Council 99 103 0.961 0.0388
#> 4 2023-12 Pujehun District Council 96 104 0.923 0.0769
#> 5 2023-12 Tonkolili District Council 109 115 0.948 0.0522
#> 6 2023-12 Western Area Rural District Council 62 64 0.969 0.0312
# 4. visualise - facility-level reporting plot, in English
reporting_rate_plot(
  data = sl_dhis2,
  vars_of_interest = "conf",
  x_var = "year_mon",
  y_var = "adm2",
  hf_col = "hf_uid",
  key_indicators = c("allout", "test", "treat", "conf", "pres")
)
That’s a complete SNT mini-pipeline: read → clean → metric → plot, with no external state and no manual format wrangling. The rest of the articles drill into each step.
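The reprate and missrate columns in the tail(rates) output above are consistent with simple arithmetic on rep and exp. The sketch below recomputes them in base R from the printed counts. Reading rep and exp as reported and expected facility counts is an assumption based on the column names, and how calculate_reporting_metrics() decides which facilities count toward each is not shown here.

```r
# Counts copied from the tail(rates) output above.
check <- data.frame(
  adm2 = c(
    "Moyamba District Council",
    "Port Loko City Council",
    "Port Loko District Council",
    "Pujehun District Council",
    "Tonkolili District Council",
    "Western Area Rural District Council"
  ),
  rep = c(106, 2, 99, 96, 109, 62),
  exp = c(108, 2, 103, 104, 115, 64)
)

# Assumed relationship: rate = reported / expected, missing = 1 - rate.
check$reprate <- check$rep / check$exp
check$missrate <- 1 - check$reprate

# Compare with the reporting rates printed by the package (3 decimals).
printed_reprate <- c(0.981, 1, 0.961, 0.923, 0.948, 0.969)
all.equal(round(check$reprate, 3), printed_reprate) # TRUE
```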
Where to next
- Data I/O and cleaning - every read()/write() shortcut plus name and date harmonisation.
- Spatial validation and mapping - admin geometries, coordinates, shapefile crosswalks, fuzzy facility matching.
- Reporting rates and data quality - the three reporting-rate scenarios, consistency checks and outlier methods.
- Climate and population downloads - CHIRPS, ERA5, MODIS, NASA POWER, WorldPop and DHS, and how to extract them to admin units.
- Project setup and utilities - folder scaffolding, paths, caching, translation and the small numeric helpers.
If something is missing or surprising, please open an issue on GitHub - the package evolves with the SNT support work we do.
