
Summarise population by available admin levels and build a dictionary
Source:R/snt_process_population.R
snt_process_population.RdProduces population summaries at the admin levels present in pop_data
(any of adm0..adm3) and returns a list containing per-level tables, a
bilingual (EN + optional FR) data dictionary, and a record of which
admin levels were found. By default, column types are first inferred
via auto_parse_types() with apply = TRUE and return = "data".
Usage
snt_process_population(
pop_data,
pop_cols = "pop",
translate = TRUE,
language = NULL,
trans_cache_path = if (requireNamespace("here", quietly = TRUE)) {
here::here("cache/translations")
} else {
"cache/translations"
},
infer_types = TRUE
)Arguments
- pop_data
data.frame/tibble with
adm0, optionaladm1/adm2/adm3,year, and population column(s).yearmay be integer, numeric, Date/POSIXt, factor, or character; it is coerced to integer years when feasible.- pop_cols
character vector of population column names to process. Default is "pop" for backward compatibility.
- translate
logical; when TRUE, add a translated label column (default FR unless
languageis provided) using the translation cache.- language
Optional ISO code (e.g., "fr"). When provided, a
label_<language>column is added to the dictionary and placed immediately afterlabel_en.- trans_cache_path
character path to the translation cache directory. default:
here::here("cache/translations")if available, else"cache/translations".- infer_types
logical; when TRUE (default) coerce
pop_datafirst usingauto_parse_types(apply = TRUE, return = "data").
Value
named list with elements (only levels present are included):
pop_data_adm0,pop_data_adm1,pop_data_adm2,pop_data_adm3(aggregated by level and year)data_dictionary(frombuild_dictionary())levels_present(character vector of admin levels detected)
Details
Data uniqueness requirement
Input data must have unique rows per admin-year at the finest level present. If your data contains sex, age, or residence strata, aggregate these upstream before passing to this function. The function will error if duplicates are detected to prevent silent double counting.
Automatic proportion handling
The function automatically detects which population columns contain proportions (all non-NA values <= 1) versus counts. Proportion columns are aggregated using the mean, while count columns are summed. If a column contains mixed data (some values <= 1 and some > 1), the function will abort with an error message.
Examples
if (FALSE) { # \dontrun{
# Basic usage with default "pop" column
example_pop <- tibble::tibble(
adm0 = c("A", "A", "B"),
adm1 = c("X", "X", "Y"),
adm2 = c("P", "Q", "R"),
year = c(2020L, 2020L, 2021L),
pop = c(100, 50, 70)
)
out <- snt_process_population(example_pop)
names(out)
out$levels_present
# Using custom population column names
example_multi_pop <- tibble::tibble(
adm0 = c("A", "A", "B"),
adm1 = c("X", "X", "Y"),
year = c(2020L, 2020L, 2021L),
total_pop = c(100, 50, 70),
under5_pop = c(20, 10, 15)
)
out_multi <- snt_process_population(
example_multi_pop,
pop_cols = c("total_pop", "under5_pop")
)
names(out_multi)
} # }