• English
  • Français
  1. 2. Data Assembly and Management
  2. 2.2 Health Facilities Data
  3. Health facility coordinates and point data
  • Code library for subnational tailoring
    English version
  • 1. Getting Started
    • 1.1 About and Contact Information
    • 1.2 For Everyone
    • 1.3 For the SNT Team
    • 1.4 For Analysts
    • 1.5 Producing High-Quality Outputs
  • 2. Data Assembly and Management
    • 2.1 Working with Shapefiles
      • Spatial data overview
      • Basic shapefile use and visualization
      • Shapefile management and customization
      • Merging shapefiles with tabular data
    • 2.2 Health Facilities Data
      • Fuzzy matching of names across datasets
      • Health facility coordinates and point data
      • Determining active and inactive status
    • 2.3 Routine Surveillance Data
      • Routine data extraction
      • DHIS2 data preprocessing
      • Assessing missing data
      • Health facility reporting rate
      • Data coherency checks
      • Outlier detection methods
      • Imputing missing data and correcting outliers
      • Final database
    • 2.4 Stock Data
      • LMIS
    • 2.5 Population Data
      • National population data
      • WorldPop population raster
    • 2.6 National Household Survey Data
      • DHS data overview and preparation
      • Prevalence of malaria infection
      • All-cause child mortality
      • Treatment-seeking rates
      • ITN ownership, access, and usage
    • 2.7 Entomological Data
      • Entomological data
    • 2.8 Climate and Environmental Data
      • Climate and environment data extraction from raster
    • 2.9 Modeled Data
      • Generating spatial modeled estimates
      • Working with geospatial model estimates
      • Modeled estimates of malaria mortality and proxies
      • Modeled estimates of entomological indicators
  • 3. Stratification
    • 3.1 Epidemiological Stratification
      • Incidence overview and crude incidence
      • Incidence adjustment 1: incomplete testing
      • Incidence adjustment 2: incomplete reporting
      • Incidence adjustment 3: treatment-seeking
      • Incidence stratification
      • Prevalence and mortality stratification
      • Combined risk categorization
    • 3.2 Stratification of Determinants of Malaria Transmission
      • Seasonality
      • Access to care
  • 4. Review of Past Interventions
    • 4.1 Case Management
    • 4.2 Routine Interventions
    • 4.3 Campaign Interventions
    • 4.4 Other Interventions
  • 5. Targeting of Interventions
  • 6. Retrospective Analysis
  • 7. Urban Microstratification

On this page

  • Overview
  • Key concepts
    • Coordinate reference systems (CRS)
    • Coordinate formats
    • Common issues with point coordinate data
  • Step-by-Step
    • Step 1: Install and load packages
    • Step 2: Load and prepare input data
    • Step 3: Define, validate, and clean coordinates
      • Step 3.1: Split coordinates column (if needed)
      • Step 3.2: Define coordinate columns and clean data
    • Step 4: Convert to spatial object
    • Step 5: Save cleaned and spatial data
    • Step 6: Create and save health facility maps
      • 6.1: Basic health facility map
      • 6.2: Health facilities by type
      • 6.3: Specific health facility types
  • Summary
  • Full code
  1. 2. Data Assembly and Management
  2. 2.2 Health Facilities Data
  3. Health facility coordinates and point data

Health facility coordinates and point data

Beginner

Overview

It can be useful in SNT to visualize point coordinate data, such as locations of health facilities, community health workers, villages, or survey clusters, on a map. Mapping indicators at their point coordinates can also illustrate spatial heterogeneity at fine scale. This page uses health facility coordinate mapping as an example use case of working with point coordinate data, and the code presented here can be adapted to work with any kind of point coordinates.

More on spatial data

For background information on shapefiles and links to all the spatial data content in the SNT code library, including rasters and point data, please visit Spatial data overview.

Health facility coordinate mapping is the process of visualizing where health facilities are located by plotting their coordinates (latitude and longitude) on maps with administrative boundaries. This spatial visualization shows the geographic distribution of healthcare infrastructure and helps identify areas with adequate health facility coverage versus areas with service gaps.

Objectives
  • Understand the basics of working with point data
  • Visualize the geographic distribution of health facilities across administrative boundaries
  • Show health facility locations spatially by type across administrative boundaries

Key concepts

Coordinate reference systems (CRS)

The coordinate reference system (CRS) defines how geographic locations are represented numerically. For comprehensive background on coordinate reference systems, projections, and spatial data management principles, see the Spatial data overview.

For mapping point coordinates, WGS84 and UTM are the most relevant CRS. Unlike for working with shapefiles, where you will likely want to use WGS84 for all applications, when working with point coordinates UTM may sometimes be more appropriate as it is more accurate for measuring distances and areas.

In the SNT process, health facility data is typically shared by the SNT team in Excel or CSV format, with coordinates provided in longitude and latitude. Therefore, the Geographic Coordinate System (WGS84 / EPSG:4326) is usually used for mapping.

Coordinate formats

Example coordinates in WGS84:

  • Latitude: 8.4820° (North of the equator)
  • Longitude: -13.2334° (West of the Prime Meridian)
  • Location: This represents a point in Freetown, Sierra Leone
  • Format: (lat, lon) = (8.48206, -13.23345)
Note

Note: Another coordinate format called DMS (Degrees-Minutes-Seconds) exists but is less often used in health data. DMS format looks like 8°28’55”N, 13°14’00”W and is mainly found in traditional marine navigation, aviation, and land surveying applications. For more details, refer to the documentation at: Merging Spatial Vectors

Common issues with point coordinate data

  • Missing coordinate values

    Example: A health facility row has a blank entry for longitude or latitude, preventing it from being mapped.

    Solution: Follow up with the SNT team for help in finding missing coordinates. If ultimately not available, filter out facilities with missing coordinates before mapping.

  • Incorrect formats

    Example: Coordinates provided as 8°29'30"N instead of decimal degrees like 8.4917, causing errors during mapping.

    Solution: Convert to decimal format using tools like the measurements package in R.

  • Projection mismatches between datasets

    Example: A shapefile uses UTM Zone 29N projection while facility data uses WGS84, resulting in misaligned map layers that don’t overlay correctly.

    Solution: Before mapping or overlaying spatial layers, ensure all datasets use the same projection. Convert to WGS84 (EPSG:4326) for web mapping or to the appropriate UTM zone for precise measurements.

  • Inaccurate locations due to low precision

    Example: Coordinates given as 8.4, -11.7 instead of 8.412356, -11.736981, placing the facility far from its true location (potentially several kilometers off).

    Solution: Identify all records with low-precision coordinates (e.g., fewer than 5 decimal places) and consult the SNT team for accurate location data. Request that these points be re-shared with coordinates containing at least 5 decimal places, if available. This level of precision is necessary to ensure reliable mapping and prevent any misinterpretation of spatial analyses.

  • Combined coordinate column

    Health facility data is usually in the form of Excel or CSV files containing coordinates. The coordinate data can have either one column with both coordinates or separate longitude and latitude columns:

    Example 1 - Separate longitude and latitude columns:

    | facility_name    | type     | longitude| latitude|
    |------------------|----------|----------|---------|
    | Central Hospital | HOSPITAL | -13.2334 | 8.4820  |
    | Health Post A    | CHC      | -13.1456 | 8.5123  |
    | District Clinic  | CLINIC   | -13.0987 | 8.4567  |

    Example 2 - Combined coordinate column:

    | facility_name    | type     | coordinates          |
    |------------------|----------|----------------------|
    | Central Hospital | HOSPITAL | 8.4820, -13.2334     |
    | Health Post A    | CHC      | 8.5123; -13.1456     |
    | District Clinic  | CLINIC   | 8.4567  -13.0987     |

    Solution: If your data has a combined coordinate column, you will need to separate out the latitude and longitude before analysis or visualization.

Step-by-Step

This section guides users through working with points coordinates data by creating comprehensive health facility maps with professional styling and multiple visualization approaches. The example uses Sierra Leone health facility data with administrative boundaries.

Before working with health facilities coordinates data, verify with the SNT team the following:

  • Coordinate accuracy and completeness (no missing lat/lon values)
  • Facility names and identifiers are consistent and unique
  • Facility type classifications are standardized

To skip the step-by-step explanation, jump to the full code at the end of this page.

Step 1: Install and load packages

Install and load the necessary packages for handling spatial data, reading various file formats, data manipulation, and advanced visualization for health facility mapping.

  • R
  • Python
# =============================================================================
# Step 1: Install and Load Packages
# =============================================================================

# Check if 'pacman' is installed; install it if missing
if (!requireNamespace("pacman", quietly = TRUE)) {
  install.packages("pacman")
}

# Load all required packages using pacman
pacman::p_load(
  sf,           # for reading and handling spatial data
  readxl,       # for reading Excel files
  readr,        # for reading CSVs
  dplyr,        # for data manipulation
  stringr,      # for string cleaning and formatting
  ggplot2,      # for creating plots
  RColorBrewer, # for color palettes
  scales,       # for formatting scales and labels
  here,         # for cross-platform file paths
  tidyr         # for data reshaping and coordinate splitting
)

To adapt the code:

  • Do not modify anything in the code above

Step 2: Load and prepare input data

This step involves loading administrative boundary data along with health facility coordinate datasets and setting up file paths for consistent data access.

In this example, we are loading our points data (health facility coordinates) from an Excel file. If your points data is stored in shapefile format, please follow the steps in Basic shapefile use and visualization to load your data, then proceed directly to visualization in Step 7. Be sure to check that the CRS of your points data and the CRS of your administrative boundary shapefile are the same.

  • R
  • Python
Show the code
# =============================================================================
# Step 2: Load and Prepare Input Data
# =============================================================================

# Define base directories and filenames
shp_dir <- "english/data_r/shapefiles"
hf_dir  <- "english/data_r/health_facility"

# Define filenames
shp_file <- "Chiefdom2021.shp"
hf_file  <- "mfl_hfs.xlsx"

# Compose full file paths for data loading
shp_path <- here::here(shp_dir, shp_file)
hf_path  <- here::here(hf_dir, hf_file)

# Load shapefile and ensure proper CRS
df1 <- sf::st_read(shp_path, quiet = TRUE)

# Check and handle CRS
if (is.na(sf::st_crs(df1))) {
  # If CRS is missing, assign WGS84 (EPSG:4326)
  sf::st_crs(df1) <- 4326
  cat("CRS was undefined - assigned WGS84 (EPSG:4326)\n")
} else {
  # If CRS exists but is not WGS84, transform it
  current_crs <- sf::st_crs(df1)$input
  cat("Current CRS:", current_crs, "\n")
  
  if (sf::st_crs(df1)$epsg != 4326) {
    df1 <- sf::st_transform(df1, crs = 4326)
    cat("Transformed to WGS84 (EPSG:4326) for consistency\n")
  } else {
    cat("Already in WGS84 (EPSG:4326)\n")
  }
}

# Load health facility Excel data
df2 <- readxl::read_excel(hf_path)

# Print data verification
cat("=== DATA VERIFICATION ===\n")
cat("Administrative boundary columns:\n")
print(names(df1))
cat("\nHealth facility data columns:\n")
print(names(df2))
cat("\nSample of health facility data:\n")
print(head(df2, 3))
Output
CRS was undefined - assigned WGS84 (EPSG:4326)
=== DATA VERIFICATION ===
Administrative boundary columns:
[1] "CCODE"      "FIRST_REGI" "FIRST_RE_1" "FIRST_DNAM" "FIRST_DCOD"
[6] "FIRST_CHIE" "geometry"  

Health facility data columns:
[1] "adm1"       "adm3"       "community"  "hf"         "type"      
[6] "Ownership"  "functional" "w_lat"      "w_long"    

Sample of health facility data:
# A tibble: 3 × 9
  adm1    adm3           community hf    type  Ownership functional w_lat w_long
  <chr>   <chr>          <chr>     <chr> <chr> <chr>     <chr>      <dbl>  <dbl>
1 Bombali Mara           Kiampkak… Kiam… MCHP  Governme… Functional  8.61  -12.2
2 Bombali Bombali Sebora Station … Lore… CLIN… Faith Ba… Functional  8.89  -12.0
3 Bombali Bombali Sebora Makama R… Make… HOSP… Governme… Functional  8.87  -12.1

To adapt the code:

  • Lines 6-7: Change directory paths to match the folder structure
  • Lines 10-11: Update filenames to match the actual files

Step 3: Define, validate, and clean coordinates

Step 3.1: Split coordinates column (if needed)

This step is only required if longitude and latitude are combined in a single column. If your data already has separate columns, skip this step.

  • R
  • Python
# =============================================================================
# Step 3: Split Coordinates Column (if needed)
# =============================================================================
# This step is only required if longitude and latitude are combined 
# in a single column. If columns are separate, skip this step.

# Replace separators (; or space) with a comma
df2$coordinates <- gsub("[; ]+", ",", df2$coordinates)

# Split coordinates column by comma into two columns
df2 <- tidyr::separate(df2, col = "coordinates", 
                       into = c("w_lat", "w_long"), 
                       sep = ",", convert = TRUE)

cat("Coordinates successfully split into separate columns\n")
cat("Sample of split coordinates:\n")
print(df2[1:3, c("w_lat", "w_long")])

To adapt the code:

  • Line 8 & 11: Replace "coordinates" with the combined coordinate column name
  • Line 12: Update "w_lat" and "w_long" with the preferred column names
  • Skip this entire step if the data already has separate latitude and longitude columns

Step 3.2: Define coordinate columns and clean data

Define the coordinate columns and clean the health facility data by removing invalid coordinates and ensuring data quality. This code also checks if coordinate columns are within the valid global ranges:

  • Longitude (X): Must be between -180 and 180. Represents East-West position.
  • Latitude (Y): Must be between -90 and 90. Represents North-South position.
  • R
  • Python
# Define longitude (x) and latitude (y) coordinates
df2$x <- df2$w_long
df2$y <- df2$w_lat

# Identify facilities with invalid coordinates for follow-up
facilities_to_review <- df2[
  is.na(df2$x) | is.na(df2$y) | 
  df2$x < -180 | df2$x > 180 | 
  df2$y < -90 | df2$y > 90,
]

# Save the list of facilities to review
write.csv(facilities_to_review, "health_facilities_coordinates_to_review.csv", row.names = FALSE)

# Filter to KEEP only valid, in-range coordinates
facilities_clean <- df2[
  !is.na(df2$x) & !is.na(df2$y) &
  df2$x >= -180 & df2$x <= 180 &
  df2$y >= -90 & df2$y <= 90,
]

# Print cleaning results
cat("=== DATA CLEANING RESULTS ===\n")
cat("Original health facilities:", nrow(df2), "\n")
cat("Clean health facilities:", nrow(facilities_clean), "\n")
cat("Removed facilities:", nrow(df2) - nrow(facilities_clean), "\n")
cat("Facilities saved for review: 'health_facilities_coordinates_to_review.csv' (n =", nrow(facilities_to_review), ")\n")
cat("Coordinate range - Longitude:", min(facilities_clean$x), "to", max(facilities_clean$x), "\n")
cat("Coordinate range - Latitude:", min(facilities_clean$y), "to", max(facilities_clean$y), "\n")
Output
=== DATA CLEANING RESULTS ===
Original health facilities: 1556 
Clean health facilities: 1531 
Removed facilities: 25 
Coordinate range - Longitude: -13.29207 to -10.30744 
Coordinate range - Latitude: 6.96792 to 9.973346 

To adapt the code:

  • Lines 7-8: Update w_long and w_lat to match the longitude and latitude column names

Step 4: Convert to spatial object

Convert the cleaned health facility data to a spatial object for advanced spatial operations and analysis.

Why create a spacial object?

A spatial object is a special type of dataset that understands the points are on a map, not just numbers in a table. This is required for any geographic operation.

You need to complete this step if you plan to:

  • Map the facilities.
  • Calculate distances or catchment areas.
  • Join facility data to administrative boundaries (e.g., which district is each facility in?).
  • Perform any analysis that uses the geographic location of the facilities.
  • R
  • Python
# =============================================================================
# Step 4: Convert to Spatial Object
# =============================================================================

# Convert to spatial object
facilities_sf <- sf::st_as_sf(
  facilities_clean,
  coords = c("w_long", "w_lat"),
  crs = 4326
)

cat("Successfully converted to spatial object\n")
cat("Spatial object summary:\n")
print(facilities_sf[1:3, ])
Output
Successfully converted to spatial object
Spatial object summary:
Simple feature collection with 3 features and 9 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: -12.20295 ymin: 8.610318 xmax: -12.04387 ymax: 8.886622
Geodetic CRS:  WGS 84
# A tibble: 3 × 10
  adm1    adm3           community  hf    type  Ownership functional     x     y
  <chr>   <chr>          <chr>      <chr> <chr> <chr>     <chr>      <dbl> <dbl>
1 Bombali Mara           Kiampkako… Kiam… MCHP  Governme… Functional -12.2  8.61
2 Bombali Bombali Sebora Station R… Lore… CLIN… Faith Ba… Functional -12.0  8.89
3 Bombali Bombali Sebora Makama Ro… Make… HOSP… Governme… Functional -12.1  8.87
# ℹ 1 more variable: geometry <POINT [°]>

To adapt the code:

  • Line 8: Update coordinate column names to match the data

Step 5: Save cleaned and spatial data

Save the cleaned health facility data and spatial objects for future use, verification, and sharing with team members.

  • R
  • Python
# =============================================================================
# Step 5: Save Cleaned and Spatial Data
# =============================================================================

# Create output directory
output <- "english/outputs"
dir.create(output, recursive = TRUE, showWarnings = FALSE)

# Define output filenames and paths
hf <- "facilities.csv"
sf <- "facilities_spatial.shp"

hf_path <- file.path(output, hf)
sf_path <- file.path(output, sf)

# Save cleaned data as CSV
write.csv(facilities_clean, hf_path, row.names = FALSE)

# Save spatial data as shapefile
sf::st_write(facilities_sf, sf_path, append = FALSE, quiet = TRUE)

cat("Data saved successfully:\n")
cat("CSV file:", hf_path, "\n")
cat("Shapefile:", sf_path, "\n")
Output
Data saved successfully:
CSV file: english/outputs/facilities.csv 
Shapefile: english/outputs/facilities_spatial.shp 

To adapt the code:

  • Line 6: Set or change the output directory path
  • Lines 10-11: Update output filenames as needed

Step 6: Create and save health facility maps

Create comprehensive maps displaying health facility locations with different visualization approaches to show the spatial distribution of healthcare infrastructure.

6.1: Basic health facility map

  • R
  • Python
# =============================================================================
# Step 6.1: Basic Health Facility Map
# =============================================================================

# Define map title and output filename
title <- "Health facilities in Sierra Leone"
map   <- "health_facility_locations.png"

# Create the basic map
facilities_map <- ggplot2::ggplot() +
  ggplot2::geom_sf(data = df1, fill = "white",
                   color = "black",
                   size = 0.3) +
  ggplot2::geom_point(
    data = facilities_clean,
    ggplot2::aes(x = x, y = y),
    color = "#47B5FF", size = 0.8, alpha = 0.7
  ) +
  ggplot2::coord_sf() +
  ggplot2::theme_void() +
  ggplot2::theme(
    plot.title = ggplot2::element_text(hjust = 0.5,
                                       size = 14,
                                       face = "bold"),
    plot.margin = ggplot2::margin(10, 10, 10, 10)
  ) +
  ggplot2::labs(title = title)

# Print and save the map
print(facilities_map)

ggplot2::ggsave(
  filename = file.path(output, map),
  plot = facilities_map,
  width = 10,
  height = 7,
  bg = "white",
  dpi = 300
)

cat("Map saved as:", file.path(output, map), "\n")
Output

Map saved as: english/outputs/health_facility_locations.png 

To adapt the code:

  • Lines 6-7: Update the map title and output filename
  • Lines 11, 12 & 13: Update fill, color, and size for boundary styling
  • Line 17: Update color, size, and alpha for health facility points

6.2: Health facilities by type

  • R
  • Python
# =============================================================================
# Step 6.2: Health Facilities by Type
# =============================================================================

# Define map title and output filename for facility type map
title_type <- "Health facility types in Sierra Leone"
map_type   <- "health_facility_by_type.png"

# Create health facility type map
facility_type_map <- ggplot2::ggplot() +
  ggplot2::geom_sf(data = df1, fill = "white", 
                   color = "gray50", size = 0.3) +
  ggplot2::geom_point(
    data = facilities_clean,
    ggplot2::aes(x = x, y = y, color = type),
    size = 0.8, alpha = 0.8
  ) +
  ggplot2::coord_sf() +
  ggplot2::scale_color_brewer(palette = "Set1", name = "Facility Type") +
  ggplot2::theme_void() +
  ggplot2::theme(
    legend.position = "bottom",
    legend.title = ggplot2::element_text(size = 12, face = "bold"),
    legend.text = ggplot2::element_text(size = 10),
    plot.title = ggplot2::element_text(hjust = 0.5, size = 14, face = "bold"),
    plot.margin = ggplot2::margin(10, 10, 10, 10)
  ) +
  ggplot2::labs(title = title_type)

# Print and save the type map
print(facility_type_map)

ggplot2::ggsave(
  filename = file.path(output, map_type),
  plot = facility_type_map,
  width = 10,
  height = 7,
  bg = "white",
  dpi = 300
)

cat("Facility type map saved as:", file.path(output, map_type), "\n")
Output

Facility type map saved as: english/outputs/health_facility_by_type.png 

To adapt the code:

  • Lines 6-7: Update the map title and output filename
  • Line 15: Replace type with the facility type column name
  • Line 19: Update palette for different color schemes

6.3: Specific health facility types

  • R
  • Python
# =============================================================================
# Step 6.3: Faceted Map by Health Facility Type
# =============================================================================

# Define map title and output filename for faceted map
title_facet <- "Health facility distribution by type in Sierra Leone"
map_facet   <- "health_facility_faceted_by_type.png"

# Create faceted map by facility type
faceted_type_map <- ggplot2::ggplot() +
  ggplot2::geom_sf(data = df1, fill = "white", 
                   color = "gray50", size = 0.2) +
  
  ggplot2::geom_sf(data = adm2, fill = NA, color = "gray30", size = 0.3) +

  
  ggplot2::geom_point(
    data = facilities_clean,
    ggplot2::aes(x = x, y = y),
    color = "#47B5FF", size = 0.6, alpha = 0.7
  ) +
  ggplot2::facet_wrap(~ type) +
  ggplot2::coord_sf() +
  ggplot2::theme_void() +
  ggplot2::theme(
    plot.title = ggplot2::element_text(hjust = 0.5, size = 14, face = "bold"),
    strip.text = ggplot2::element_text(face = "bold", size = 10),
    plot.margin = ggplot2::margin(10, 10, 10, 10)
  ) +
  ggplot2::labs(title = title_facet)

# Print the faceted map
print(faceted_type_map)

# Save the faceted map
ggplot2::ggsave(
  filename = file.path(output, map_facet),
  plot = faceted_type_map,
  width = 12,
  height = 9,
  bg = "white",
  dpi = 300
)

cat("Faceted type map saved as:", file.path(output, map_facet), "\n")
Output

Faceted type map saved as: english/outputs/health_facility_faceted_by_type.png 

To adapt the code:

  • Line 6: Change "HOSPITAL" to match your desired facility type
  • Lines 10-11: Update map title and filename accordingly
  • Line 20: Update point color and size as desired

Summary

This streamlined workflow outlines steps to visualize health facility coordinates effectively:

  • Organized file management: Use consistent directory structure and clear variable naming for better code maintainability

  • Flexible coordinate handling: Support both combined and separate coordinate columns with automatic splitting when needed

  • Data quality assurance: Validate coordinate ranges and remove invalid entries before mapping

  • Efficient spatial conversion: Create spatial objects for advanced geographic operations and analysis

Full code

  • R
  • Python

Complete script for mapping health facility coordinates with administrative boundaries

# =============================================================================
# Health facility coordinate mapping and visualization
# =============================================================================
# This script creates comprehensive health facility maps with administrative
# boundaries, showing facility locations and distributions.
# Optimized for SNT (Sub-National Tailoring) processes and health system planning.

# =============================================================================
# Step 1: Install and Load Packages
# =============================================================================

# Check if 'pacman' is installed; install it if missing
if (!requireNamespace("pacman", quietly = TRUE)) {
  install.packages("pacman")
}

# Load all required packages using pacman
pacman::p_load(
  sf,           # for reading and handling spatial data
  readxl,       # for reading Excel files
  readr,        # for reading CSVs
  dplyr,        # for data manipulation
  stringr,      # for string cleaning and formatting
  ggplot2,      # for creating plots
  RColorBrewer, # for color palettes
  scales,       # for formatting scales and labels
  here,         # for cross-platform file paths
  tidyr         # for data reshaping and coordinate splitting
)

# =============================================================================
# Step 2: Load and Prepare Input Data
# =============================================================================

# Define base directories and filenames
shp_dir <- "english/data_r/shapefiles"
hf_dir  <- "english/data_r/health_facility"

# Define filenames
shp_file <- "Chiefdom2021.shp"
hf_file  <- "mfl_hfs.xlsx"

# Compose full file paths for data loading
shp_path <- here::here(shp_dir, shp_file)
hf_path  <- here::here(hf_dir, hf_file)

# Load shapefile and transform CRS to WGS84
df1 <- sf::st_read(shp_path, quiet = TRUE) |>
  sf::st_transform(crs = sf::st_crs(4326))

# Load health facility Excel data
df2 <- readxl::read_excel(hf_path)

# =============================================================================
# Step 3: Split Coordinates Column into Latitude and Longitude (if needed)
# This step is only required if longitude and latitude are combined in a single column
# If columns are separate, skip this step
# =============================================================================

# Replace separators (; or space) with a comma
df2$coordinates <- gsub("[; ]+", ",", df2$coordinates)

# Split coordinates column by comma into two columns
df2 <- tidyr::separate(df2, col = "coordinates", 
                       into = c("w_lat", "w_long"), 
                       sep = ",", convert = TRUE)

df2$x <- df2$w_long
df2$y <- df2$w_lat

# =============================================================================
# Step 4: Clean Health Facility Coordinate Data
# =============================================================================

# Filter to remove missing or out-of-range coordinates
facilities_clean <- df2[
  !is.na(df2$x) & !is.na(df2$y) &
  df2$x >= -180 & df2$x <= 180 &
  df2$y >= -90 & df2$y <= 90,
]

# =============================================================================
# Step 5: Convert to Spatial Object
# =============================================================================

facilities_sf <- sf::st_as_sf(
  facilities_clean,
  coords = c("w_long", "w_lat"),
  crs = 4326
)

# =============================================================================
# Step 6: Save Cleaned and Spatial Data
# =============================================================================

# Create output directory
output <- "english/outputs"
dir.create(output, recursive = TRUE, showWarnings = FALSE)

# Define output filenames and paths
hf <- "facilities.csv"
sf <- "facilities_spatial.shp"

hf_path <- file.path(output, hf)
sf_path <- file.path(output, sf)

# Save cleaned data as CSV
write.csv(facilities_clean, hf_path, row.names = FALSE)

# Save spatial data as shapefile
sf::st_write(facilities_sf, sf_path, append = FALSE, quiet = TRUE)

# =============================================================================
# Step 7: Create and Save Health Facility Maps
# =============================================================================

# Define map titles and output filenames
title_basic <- "Health facilities in Sierra Leone"
title_type  <- "Health facility types in Sierra Leone"
title_hospitals <- "Hospital locations in Sierra Leone"

map_basic <- "health_facility_locations.png"
map_type  <- "health_facility_by_type.png"
map_hospitals <- "hospitals_only.png"

# 8.1: Basic health facility map
facilities_map <- ggplot2::ggplot() +
  ggplot2::geom_sf(data = df1, fill = "white",
                   color = "black",
                   size = 0.3) +
  ggplot2::geom_point(
    data = facilities_clean,
    ggplot2::aes(x = x, y = y),
    color = "#47B5FF", size = 0.8, alpha = 0.7
  ) +
  ggplot2::coord_sf() +
  ggplot2::theme_void() +
  ggplot2::theme(
    plot.title = ggplot2::element_text(hjust = 0.5,
                                       size = 14,
                                       face = "bold"),
    plot.margin = ggplot2::margin(10, 10, 10, 10)
  ) +
  ggplot2::labs(title = title_basic)

# Save basic map
ggplot2::ggsave(
  filename = file.path(output, map_basic),
  plot = facilities_map,
  width = 10,
  height = 7,
  bg = "white",
  dpi = 300
)

# 8.2: Health facilities by type
facility_type_map <- ggplot2::ggplot() +
  ggplot2::geom_sf(data = df1, fill = "white", 
                   color = "gray50", size = 0.3) +
  ggplot2::geom_point(
    data = facilities_clean,
    ggplot2::aes(x = x, y = y, color = type),
    size = 0.8, alpha = 0.8
  ) +
  ggplot2::coord_sf() +
  ggplot2::scale_color_brewer(palette = "Set1", name = "Facility Type") +
  ggplot2::theme_void() +
  ggplot2::theme(
    legend.position = "bottom",
    legend.title = ggplot2::element_text(size = 12, face = "bold"),
    legend.text = ggplot2::element_text(size = 10),
    plot.title = ggplot2::element_text(hjust = 0.5, size = 14, face = "bold"),
    plot.margin = ggplot2::margin(10, 10, 10, 10)
  ) +
  ggplot2::labs(title = title_type)

# Save type map
ggplot2::ggsave(
  filename = file.path(output, map_type),
  plot = facility_type_map,
  width = 10,
  height = 7,
  bg = "white",
  dpi = 300
)

# 8.3: Specific health facility types (hospitals only)
hospitals_only <- facilities_clean[facilities_clean$type == "HOSPITAL", ]
cat("Mapping", nrow(hospitals_only), "hospitals\n")

hospitals_map <- ggplot2::ggplot() +
  ggplot2::geom_sf(data = df1, fill = "white", 
                   color = "gray70", size = 0.3) +
  ggplot2::geom_point(
    data = hospitals_only,
    ggplot2::aes(x = x, y = y),
    color = "purple", size = 1.2, alpha = 0.9
  ) +
  ggplot2::coord_sf() +
  ggplot2::theme_void() +
  ggplot2::theme(
    plot.title = ggplot2::element_text(hjust = 0.5, size = 14, face = "bold"),
    plot.subtitle = ggplot2::element_text(hjust = 0.5, size = 11, color = "gray40"),
    plot.margin = ggplot2::margin(10, 10, 10, 10)
  ) +
  ggplot2::labs(
    title = title_hospitals,
    subtitle = paste("Showing", nrow(hospitals_only), "hospitals")
  )

# Save hospitals map
ggplot2::ggsave(
  filename = file.path(output, map_hospitals),
  plot = hospitals_map,
  width = 10,
  height = 7,
  bg = "white",
  dpi = 300
)

# =============================================================================
# The End. Happy coding!!!
# =============================================================================
 

©2025 Applied Health Analytics for Delivery and Innovation. All rights reserved