Incidence adjustment 3: treatment-seeking

Overview

Note: At this point, we assume that you have a data file that includes care seeking behaviour (CSB) estimates:

  • % public, % private, and % not seeking care
  • at operational admin unit level
  • for each year of incidence data

You should also know which age group these data apply to, and have in mind how to handle the other age groups.

Third adjustment: A third level of adjustment (N3) is made to account for differences in care seeking behaviour (CSB) between areas, which affect the number of outpatients observed at the public health facilities (HFs) that generally report routine data to the HMIS. This adjustment can substantially inflate the number of cases already adjusted for testing and reporting rates, and should be interpreted with caution given the underlying assumptions:

  1. The TPR of febrile children who seek care in the private sector, or who do not seek care, is the same as the TPR observed in the public sector. This assumption can be relaxed, for example with the alternate approach using TPR from DHS/MIS data described later on this page.

  2. The patterns of care-seeking behaviour in adults resemble those in children (from whom estimates are collected during the surveys). This assumption can also be relaxed, but doing so is limited by the age disaggregation available in your routine data, and it requires calculating crude cases, adjusted cases 1, and adjusted cases 2 separately for each age group.

  3. CSB does not change through time unless specific relationships are assumed between two survey points. In this workflow that need not hold: CSB through time is expected to have been estimated before reaching adjustment 3. That estimate can be very simple or more elaborate, depending on available data.

  4. The sectors are isolated: a child who reportedly sought care from the private sector, or who did not seek care after a fever, never presents to the public sector. This may not hold if the fever or other symptoms worsen, in which case the same episode could appear more than once in routine data. Only individual-level data can disambiguate such repeat visits; if those data are available, you can attempt to do so. Otherwise it may be difficult to estimate how much multiple counting exists in aggregated routine data.

  5. All districts that belong to the same region are assigned the region’s public, private, or non-seeking rates, unless geospatial models are used to estimate treatment seeking behaviour at more granular units. This is not strictly required: whether region-level estimates are used is determined by the data file brought to adjustment 3.

These assumptions should be reviewed for each country. Not all of them are relevant or required if more granular data or evidence are available (e.g. malaria metrics for the private sector, evidence on care seeking behaviour in adults or its relationship with that in children, or treatment seeking estimates by sector and/or at district level).

The final number of cases after adjusting for testing (N1), for reporting rates (N2), and for care seeking behaviour (N3) is obtained through the formula below. The equation presented here represents the most conservative approach for adjustments, but can be altered as appropriate to represent the reality of case management, surveillance and care seeking patterns in a given context:

N3 = N2 × (1 + (f/e) + (g/e))

Where:

  • N2 is the number of cases corrected for testing and reporting rates, and N3 is the number of cases corrected for testing, reporting, and care seeking rates;
  • e is the fraction of febrile children who sought care from the public sector;
  • f is the fraction of febrile children who sought care from the private sector, which can be weighted to account for the proportion of the private sector that reports to the surveillance system and is therefore already captured in the routine data;
  • g is the fraction of febrile children who did not seek care, which can also be weighted if it is believed that a fraction of the children reported as not seeking care end up seeking care later on.
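The formula can be sketched as a small R function. This is a minimal illustration, not the library's implementation: the function name `adjust_n3` and the weights `w_f` and `w_g` are hypothetical, and the input values are made up for the example.

```r
# Third adjustment: N3 = N2 * (1 + f/e + g/e)
# e, f, g: fractions seeking care in the public sector, in the private
#          sector, and not seeking care
# w_f, w_g: hypothetical weights in [0, 1] applied to f and g (1 = no weighting)
adjust_n3 <- function(n2, e, f, g, w_f = 1, w_g = 1) {
  stopifnot(all(e > 0))  # e is a denominator, so it must be positive
  n2 * (1 + (w_f * f) / e + (w_g * g) / e)
}

# Illustrative values only (not from any survey)
n3 <- adjust_n3(n2 = 1000, e = 0.5, f = 0.2, g = 0.3)
print(n3)

# Down-weighting f, e.g. if half of the private sector is assumed to report already
n3_weighted <- adjust_n3(n2 = 1000, e = 0.5, f = 0.2, g = 0.3, w_f = 0.5)
print(n3_weighted)
```

When e, f, and g sum to 1 and no weights are applied, the multiplier 1 + f/e + g/e simplifies to 1/e, so N3 = N2 / e. This makes clear why small public-sector shares inflate N3 strongly.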

The estimation of N1 and N2 is highly encouraged at the monthly level to ensure that the seasonality patterns within a year are captured in the adjustments. As such, it is recommended that the first and second adjustments are applied to the monthly crude cases before aggregating the crude and adjusted (C, N1 and N2) cases to the annual level to apply the third adjustment and obtain N3.
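The monthly-then-annual order described above can be sketched as follows. Base R is used so the example runs without extra packages. The data frame, its values, and the constant multipliers standing in for the first two adjustments are all assumptions for illustration; in practice N1 and N2 come from the earlier adjustment pages.

```r
# Hypothetical monthly crude cases for one district
monthly <- data.frame(
  adm3 = "District A",
  year = 2022,
  month = 1:12,
  crude_cases = c(120, 110, 90, 80, 60, 50, 70, 150, 200, 220, 180, 140)
)

# Stand-ins for the monthly testing (N1) and reporting (N2) adjustments
monthly$adjcases1 <- monthly$crude_cases * 1.10
monthly$adjcases2 <- monthly$adjcases1 * 1.25

# Aggregate C, N1 and N2 to the annual level before applying the third adjustment
annual <- aggregate(
  cbind(crude_cases, adjcases1, adjcases2) ~ adm3 + year,
  data = monthly,
  FUN = sum
)
print(annual)
```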

Objectives
  • TBD

Step-by-Step Instructions

To skip the step-by-step explanation, jump to the full code at the end of this page.

Step 1: Load required packages and files

Step 1.1: Load packages

The first step is to install and load the libraries required for this section.

# Install pacman only if it's not already installed
if (!requireNamespace("pacman", quietly = TRUE)) {
  install.packages("pacman")
}

# Install (if needed) and load the relevant packages
pacman::p_load(
  readxl,    # import and read Excel files
  ggplot2,   # plotting
  rio,       # import and export files
  gridExtra, # plot arrangements
  here,      # build file paths relative to the project root
  stringr,   # clean up names
  xts,       # return first or last element of a vector
  tidyverse, # functions for data manipulation
  sf,        # simple features for use in mapping
  scales     # calculates "pretty" breaks
)

Step 1.2: Load files

We bring in the annual incidence data saved on the incidence adjustment 2 page, together with the treatment seeking data.

In most countries, regional-level estimates of care seeking behaviour (CSB) exist from DHS/MIS surveys. If the incidence data cover multiple years, it is advisable to use CSB estimates from survey years close to each year of incidence data, since year-on-year health system changes could affect the CSB of the population. Modeled estimates of treatment seeking at operational admin level are highly recommended, if available.
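One simple way to pair each year of incidence data with a nearby survey is to pick the survey year with the smallest gap. The sketch below assumes hypothetical survey and incidence years; how ties are broken (here, in favour of the earlier survey) is a choice to review for your context.

```r
# Hypothetical survey years with CSB estimates, and years of incidence data
survey_years <- c(2014, 2018, 2021)
incidence_years <- 2015:2022

# For each incidence year, pick the survey year with the smallest absolute gap.
# which.min() returns the first minimum, so ties go to the earlier survey.
nearest_survey <- sapply(incidence_years, function(y) {
  survey_years[which.min(abs(survey_years - y))]
})

lookup <- data.frame(year = incidence_years, survey_year = nearest_survey)
print(lookup)
```

The resulting lookup table can then be used to attach the matching survey's CSB estimates to each year of incidence data.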


Step 2: Join incidence data with treatment seeking data

Regional-level care seeking rates are usually defined as the proportion of febrile children who sought care from a public or private health facility, or who did not seek care. They are obtained from the latest household survey conducted in the country with data available from the Demographic and Health Surveys (DHS) program, usually for children under five years old. Geospatial models can be used to estimate treatment seeking behaviour at more granular units, although these estimates are difficult to produce given the various unmeasurable factors associated with care seeking patterns.

Here we join the CSB estimates to the annual incidence data by adm1 and year.

# join the adm1-level treatment seeking data to the annual incidence data;
# several incidence rows can share the same adm1 estimate
ann_inc_data <- inc_data %>%
  left_join(treatment_seeking_adm1,   # treatment seeking data is at adm1
            by = c("adm1", "year"),
            relationship = "many-to-one")
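After a join like the one above, it can help to check for incidence rows that found no CSB estimate, since those rows would carry missing values into the adjustment. The sketch below uses toy data and base R `merge()` as a stand-in for the left join so that it runs without extra packages; the values are illustrative only.

```r
# Toy data; column names mirror the join above, values are made up
inc_data <- data.frame(
  adm1 = c("North", "North", "South"),
  adm3 = c("A", "B", "C"),
  year = 2022,
  adjcases2 = c(500, 300, 800)
)
treatment_seeking_adm1 <- data.frame(
  adm1 = "North", year = 2022,
  CSpub = 0.5, CSpriv = 0.2, CSn = 0.3
)

# all.x = TRUE keeps every incidence row, like a left join
joined <- merge(inc_data, treatment_seeking_adm1,
                by = c("adm1", "year"), all.x = TRUE)

# Rows with no matching adm1-year in the treatment seeking data
unmatched <- joined[is.na(joined$CSpub), c("adm1", "adm3", "year")]
print(unmatched)
```

Unmatched rows often point to spelling differences in admin unit names or to years without a survey estimate.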

Step 2.1: Calculate unreported cases from private sector and non-seekers

Calculate the additional number of cases among those who sought care from the non-formal private sector and among non-seekers.

ann_inc_data <- ann_inc_data  %>%
  mutate(
         # additional cases coming from private sector
         CSpriv_cases = (adjcases2 * CSpriv) / CSpub,

         # additional cases coming from those who did not seek care
         CSn_cases = (adjcases2 * CSn) / CSpub)

Step 3: Calculate annual adjusted cases 3 and adjusted 3 incidence (N3)

Add the additional cases from the previous step to adjusted cases 2 to obtain adjusted cases 3, then calculate the adjusted incidence per 1,000 population.

ann_inc_data <- ann_inc_data  %>%
  mutate(
         # total adjusted 3 case
         adjcases3 = adjcases2 + CSpriv_cases + CSn_cases,

         # adjusted incidence 3
         adjinc3 = adjcases3/pop * 1000)
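Because the calculation divides by CSpub, it may be worth validating the CSB inputs first. The sketch below uses toy values and flags rows where the three shares do not sum to 1 or where CSpub is zero. The 0.01 tolerance and the `csb_ok` column are assumptions for illustration, not part of the original workflow.

```r
# Toy values only; column names follow the code above
ann_inc_data <- data.frame(
  adm3 = c("A", "B", "C"),
  adjcases2 = c(500, 300, 800),
  CSpub = c(0.5, 0.4, 0),
  CSpriv = c(0.2, 0.3, 0.6),
  CSn = c(0.3, 0.3, 0.4)
)

# Shares should sum to 1 (within rounding) and CSpub must be positive,
# otherwise the division by CSpub gives Inf or NaN
share_sum <- with(ann_inc_data, CSpub + CSpriv + CSn)
ann_inc_data$csb_ok <- abs(share_sum - 1) < 0.01 & ann_inc_data$CSpub > 0

# Apply the adjustment only where the inputs pass the check
ann_inc_data$adjcases3 <- with(
  ann_inc_data,
  ifelse(csb_ok, adjcases2 * (1 + CSpriv / CSpub + CSn / CSpub), NA_real_)
)
print(ann_inc_data)
```

Rows that fail the check are left as missing rather than silently producing infinite or implausible case counts.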

Step 2: Alternate approach - Using yearly operational level estimates

Using geospatial modeling techniques, treatment seeking estimates can be obtained for all years and at operational administrative levels. The alternative approach below assumes such data are available. Such data usually include estimates of:

  • the proportion of children seeking care from facilities reporting to DHIS2 (CSpub)
  • the proportion of children seeking care from the private sector or from health facilities not reporting to DHIS2 (CSpriv)
  • the proportion of children seeking care from an unregulated or non-approved source (CSn)
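This approach expects one estimate per operational unit and year, so a quick check for duplicated adm3-year pairs may be useful before joining. The sketch below uses toy data with one deliberate duplicate; the values are illustrative only.

```r
# Toy adm3-level treatment seeking data with one duplicated adm3-year pair
treatment_seeking_adm3 <- data.frame(
  adm3 = c("A", "A", "B", "B", "B"),
  year = c(2021, 2022, 2021, 2022, 2022),
  CSpub = c(0.50, 0.55, 0.40, 0.45, 0.45)
)

# duplicated() flags the second and later occurrences of each adm3-year pair
is_dup <- duplicated(treatment_seeking_adm3[, c("adm3", "year")])
dup_keys <- treatment_seeking_adm3[is_dup, c("adm3", "year")]
print(dup_keys)
```

Duplicates would otherwise multiply incidence rows in the join and inflate case totals.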

Step 2.1: Join incidence data with treatment seeking data

Here we join the annual incidence dataset with the adm3 treatment seeking dataset.

# join the adm3-level treatment seeking data to the annual incidence data
ann_inc_data <- inc_data %>%
  left_join(treatment_seeking_adm3,   # treatment seeking data is at adm3
            by = c("adm3", "year"),
            relationship = "one-to-one")

Step 2.2: Compute expected number of confirmed cases from CSpriv and CSn

Calculate the additional number of confirmed cases among those who sought care from the non-formal private sector and among non-seekers.

ann_inc_data <- ann_inc_data  %>%
  mutate(
         # additional cases coming from private sector
         CSpriv_cases = (adjcases2 * CSpriv) / CSpub,

         # additional cases coming from those who did not seek care
         CSn_cases = (adjcases2 * CSn) / CSpub)

Step 3: Calculate annual adjusted cases 3 and adjusted 3 incidence (N3)

Add the additional cases from the previous step to adjusted cases 2 to obtain adjusted cases 3, then calculate the adjusted incidence per 1,000 population.

ann_inc_data <- ann_inc_data  %>%
  mutate(
         # total adjusted 3 case
         adjcases3 = adjcases2 + CSpriv_cases + CSn_cases,

         # adjusted incidence 3
         adjinc3 = adjcases3/pop * 1000)

Step 3: Alternate approach - Using RDT TPR from DHS/MIS dataset and suspected cases from routine dataset

The alternative approach suggested below is very conservative. It is based on the assumption that the reporting system captures a larger proportion of cases in the public and formal private sector.

The calculation proceeds as follows, using children who had a fever in the past two weeks in the DHS/MIS data for items 1 and 2:

  1. the proportion of febrile children who sought care from the public sector (ml_fev_pub) or the non-formal private sector (ml_fev_priv), or who did not seek care (ml_fev_no_seeking);

  2. the RDT positivity rate among children who sought care from the non-formal private sector and among non-seekers;

  3. the expected number of suspected cases for ml_fev_priv and ml_fev_no_seeking, scaled from the suspected cases observed in the routine data;

  4. the expected number of confirmed cases in each group, obtained by multiplying the suspected cases from (3) by the positivity rates from (2).
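Items 1 and 2 can be sketched on toy child-level records. The column names `sector` and `rdt_pos` are hypothetical, the values are made up, and this example ignores survey weights and design, which a real DHS/MIS analysis would need to account for.

```r
# Toy records for children with fever in the past two weeks (hypothetical columns):
# sector  = where care was sought
# rdt_pos = RDT result (1 = positive, 0 = negative)
fever <- data.frame(
  sector = c("public", "public", "public", "public",
             "private", "private",
             "none", "none", "none", "none"),
  rdt_pos = c(1, 0, 0, 0,
              1, 0,
              1, 1, 1, 0)
)

# (1) unweighted share of febrile children by sector
shares <- prop.table(table(fever$sector))

# (2) unweighted RDT positivity rate within each sector
tpr <- tapply(fever$rdt_pos, fever$sector, mean)

print(shares)
print(tpr)
```

In this toy example the positivity rate differs by sector, which is the situation this alternate approach is meant to accommodate.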

Step 3.1: Compute expected number of suspected cases

Calculate the expected number of suspected cases among children who sought care from the non-reporting private sector and among those who did not seek care.

ann_inc_data <- ann_inc_data |>
  mutate(
    # expected suspected cases from the private sector
    susp_priv = (susp * ml_fev_priv) / ml_fev_pub,

    # expected suspected cases from non-seekers
    susp_no_seek = (susp * ml_fev_no_seeking) / ml_fev_pub)

Step 3.2: Calculate corresponding number of confirmed cases

Find the corresponding number of confirmed cases by applying the respective TPR for each sector.

ann_inc_data <- ann_inc_data |>
  mutate(
    # confirmed cases among non-reporting private sector seekers
    conf_priv = susp_priv * tpr_priv,

    # confirmed cases among non-seekers
    conf_no_seek = susp_no_seek * tpr_no_seek)

Step 3.3: Calculate annual adjusted 3 cases and adjusted incidence

Calculate the adjusted 3 cases and adjusted incidence by adding the expected confirmed cases among private sector seekers and non-seekers from the previous step to adjusted cases 2.

ann_inc_data <- ann_inc_data %>%
    mutate(
        # adj 3 cases
        adjcases3 = adjcases2 + conf_priv + conf_no_seek,

        # adjusted incidence 3
        adjinc3 = adjcases3/pop * 1000)
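The three sub-steps of this alternate approach can be traced on a single toy district. Column names follow the code above; every value is made up for illustration, and base R is used so the sketch runs without extra packages.

```r
# One hypothetical district-year; all values are illustrative
d <- data.frame(
  adm3 = "A", pop = 50000,
  susp = 4000, adjcases2 = 1500,
  ml_fev_pub = 0.5, ml_fev_priv = 0.2, ml_fev_no_seeking = 0.3,
  tpr_priv = 0.3, tpr_no_seek = 0.2
)

# Step 3.1: expected suspected cases outside the reporting public sector
d$susp_priv <- d$susp * d$ml_fev_priv / d$ml_fev_pub
d$susp_no_seek <- d$susp * d$ml_fev_no_seeking / d$ml_fev_pub

# Step 3.2: expected confirmed cases, using each group's own positivity rate
d$conf_priv <- d$susp_priv * d$tpr_priv
d$conf_no_seek <- d$susp_no_seek * d$tpr_no_seek

# Step 3.3: adjusted cases 3 and adjusted incidence per 1,000 population
d$adjcases3 <- d$adjcases2 + d$conf_priv + d$conf_no_seek
d$adjinc3 <- d$adjcases3 / d$pop * 1000

print(d[, c("susp_priv", "susp_no_seek", "conf_priv", "conf_no_seek",
            "adjcases3", "adjinc3")])
```

With these inputs, 1,600 and 2,400 suspected cases are expected among private sector seekers and non-seekers, giving 480 confirmed cases in each group and 2,460 adjusted cases in total.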

Countries are highly encouraged to review the standard approach provided here and adapt the equations and sources of data as they see fit for their context.

#===============================================================================
# End of Script
#===============================================================================
 

©2025 Applied Health Analytics for Delivery and Innovation. All rights reserved