gadm-ts/cpp/README.md
2026-03-23 17:35:02 +01:00

gadm-boundaries (C++)

Native C++ boundaries batch generator for the GADM pipeline.
Uses GDAL/GEOS/PROJ for geometry union + GHS raster enrichment (population & built-up surface).

Prerequisites

| Tool | Version |
|------|---------|
| CMake | >= 3.20 |
| C++ compiler | C++20 (MSVC, GCC, or Clang) |
| vcpkg | Latest (set VCPKG_ROOT env var) |

vcpkg dependencies (auto-installed via vcpkg.json): GDAL, GEOS, PROJ, nlohmann-json, CLI11, spdlog, Catch2.

Build

# Configure + build (Windows MSVC)
cmake --preset vcpkg-win
cmake --build build --config Release

The post-build step copies boundaries.exe, runtime DLLs, and proj.db to dist/win-x64/.

Usage

# From the gadm package root (not cpp/)

# All countries, all levels (263 countries x 6 levels)
.\dist\win-x64\boundaries.exe --country=all

# Single country, all levels
.\dist\win-x64\boundaries.exe --country=DEU

# Single country + level
.\dist\win-x64\boundaries.exe --country=DEU --level=0

# Force regeneration (ignore cached files)
.\dist\win-x64\boundaries.exe --country=NGA --force

CLI Options

| Option | Default | Description |
|--------|---------|-------------|
| --country | all | ISO3 code, or all for batch |
| --level | -1 | Admin level 0-5, or -1 for all |
| --cache-dir | cache/gadm | Output directory |
| --gpkg | data/gadm_410-levels.gpkg | GADM GeoPackage |
| --continent-json | data/gadm_continent.json | Continent mapping |
| --pop-tiff | data/ghs/GHS_POP_...tif | GHS population raster |
| --built-tiff | data/ghs/GHS_BUILT_S_...tif | GHS built-up raster |
| --force | false | Regenerate even if cached |

npm Scripts

npm run boundaries:cpp           # --country=all (full batch)
npm run boundaries:cpp -- --country=DEU  # single country

Output

Files written to cache/gadm/boundary_{CODE}_{LEVEL}.json — drop-in replacement for the TS pipeline.
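
As a rough illustration of the naming scheme and the --force behaviour, the cache check could look like the sketch below. The helper names `boundary_cache_path` and `needs_generation` are hypothetical and are not taken from main.cpp; only the file name pattern and the meaning of --force come from this README.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: builds <cache_dir>/boundary_{CODE}_{LEVEL}.json.
fs::path boundary_cache_path(const fs::path& cache_dir,
                             const std::string& code, int level) {
    return cache_dir / ("boundary_" + code + "_" + std::to_string(level) + ".json");
}

// Hypothetical helper: a file is (re)generated when --force is set
// or when no cached file exists yet.
bool needs_generation(const fs::path& file, bool force) {
    return force || !fs::exists(file);
}
```

With the default --cache-dir, `boundary_cache_path("cache/gadm", "DEU", 0)` yields cache/gadm/boundary_DEU_0.json.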

Each feature includes:

  • code — admin region code
  • name — admin region name
  • geometry — GeoJSON (MultiPolygon via WKB-precision GEOS union)
  • ghsPopulation — total population (GHS-POP 2030)
  • ghsPopMaxDensity — peak population density
  • ghsPopCenter — weighted population center [lon, lat]
  • ghsPopCenters — top-N population peaks [[lon, lat, density], ...]
  • ghsBuiltWeight — total built-up surface weight
  • ghsBuiltMax — peak built-up value
  • ghsBuiltCenter — weighted built-up center [lon, lat]
  • ghsBuiltCenters — top-N built-up peaks [[lon, lat, value], ...]
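
To make the total, peak, and weighted-center fields concrete, here is a minimal sketch of how such values can be derived from raster cells that fall inside a region. The types `CellSample` and `WeightedStats` and the function `summarize` are illustrative assumptions, not the structs in types.h, and the plain lon/lat averaging ignores antimeridian wrap-around; the real enrichment may work differently.

```cpp
#include <array>
#include <vector>

// Hypothetical sample type: one raster cell inside the region.
struct CellSample {
    double lon;
    double lat;
    double value;  // e.g. population count or built-up surface of the cell
};

// Hypothetical result type mirroring the total / max / center fields.
struct WeightedStats {
    double total = 0.0;
    double max_value = 0.0;
    std::array<double, 2> center{0.0, 0.0};  // [lon, lat]
};

// Sums cell values, tracks the peak, and computes the value-weighted
// mean position. Cells with non-positive values are skipped.
WeightedStats summarize(const std::vector<CellSample>& cells) {
    WeightedStats s;
    double sum_lon = 0.0;
    double sum_lat = 0.0;
    for (const CellSample& c : cells) {
        if (c.value <= 0.0) continue;
        s.total += c.value;
        sum_lon += c.lon * c.value;
        sum_lat += c.lat * c.value;
        if (c.value > s.max_value) s.max_value = c.value;
    }
    if (s.total > 0.0) {
        s.center = {sum_lon / s.total, sum_lat / s.total};
    }
    return s;
}
```

A cell with three times the value of another pulls the center three times as strongly toward itself, which is why the weighted center can sit far from the geometric centroid of a region.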

Architecture

See docs/cpp-port.md for the full spec.

src/
├── main.cpp          # CLI entry, country loop, cache logic, PROJ_DATA setup
├── gpkg_reader.h/cpp # GeoPackage -> features (OGR C API)
├── geo_merge.h/cpp   # Geometry union via WKB roundtrip (GEOS C API)
├── ghs_enrich.h/cpp  # GeoTIFF raster sampling + PIP (GDAL + PROJ)
├── pip.h             # Inline ray-casting point-in-polygon
└── types.h           # BoundaryFeature, BoundaryResult structs
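
For readers unfamiliar with the technique named for pip.h, the following is a generic even-odd ray-casting test on a single ring. It is a sketch under assumptions: the `Point` type and the `point_in_ring` name are hypothetical, and the actual header may handle holes, multipolygons, and edge cases differently.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 2D point type used only for this sketch.
struct Point {
    double x;
    double y;
};

// Even-odd ray casting: cast a horizontal ray from p toward +x and
// count how many ring edges it crosses. An odd count means "inside".
// Works for open or closed rings (a repeated first vertex adds a
// zero-length edge that never counts as a crossing).
inline bool point_in_ring(const Point& p, const std::vector<Point>& ring) {
    const std::size_t n = ring.size();
    if (n < 3) return false;
    bool inside = false;
    for (std::size_t i = 0, j = n - 1; i < n; j = i++) {
        const Point& a = ring[i];
        const Point& b = ring[j];
        // The edge straddles the ray's y only if its endpoints lie on
        // opposite sides; this also guarantees a.y != b.y below.
        const bool straddles = (a.y > p.y) != (b.y > p.y);
        if (straddles) {
            const double x_at_y = a.x + (p.y - a.y) * (b.x - a.x) / (b.y - a.y);
            if (p.x < x_at_y) inside = !inside;
        }
    }
    return inside;
}
```

A polygon with holes can be handled by requiring the point to be inside the outer ring and outside every inner ring.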

Key Design Decisions

  • No OpenMP — GDAL/PROJ/GEOS are not thread-safe for concurrent raster reads + transform creation. Sequential processing is I/O-bound anyway.
  • WKB precision — geometry union uses WKB serialization to avoid floating-point drift from WKT roundtrips.
  • Mollweide projection — uses +proj=moll string directly (not EPSG:54009 which isn't in the PROJ database). Transforms are normalized via proj_normalize_for_visualization for correct lon/lat axis order.
  • Windowed raster I/O — GDAL GDALRasterIO reads only the bbox-clipped window from multi-GB GeoTIFFs, keeping memory bounded.
  • PROJ_DATA auto-discovery — main.cpp sets PROJ_DATA at startup, pointing to the exe directory where proj.db is co-located.
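
The windowed raster I/O decision can be sketched as follows: a bounding box in the raster's CRS is converted into a clamped pixel window, and only that window is then read. This is an illustration under assumptions, not the code in ghs_enrich.cpp: `PixelWindow` and `bbox_to_window` are hypothetical names, and the raster is assumed to be north-up (no rotation terms, negative pixel height) with the standard six-element GDAL geotransform.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <optional>

// Hypothetical window type: offsets and sizes in pixels.
struct PixelWindow {
    int x_off;
    int y_off;
    int x_size;
    int y_size;
};

// Converts a bbox (raster CRS) to a pixel window clamped to the raster.
// gt follows the GDAL geotransform layout:
//   gt[0] = top-left x, gt[1] = pixel width,
//   gt[3] = top-left y, gt[5] = pixel height (negative for north-up).
// Assumes gt[2] == gt[4] == 0. Returns nullopt when the bbox misses
// the raster entirely.
std::optional<PixelWindow> bbox_to_window(const std::array<double, 6>& gt,
                                          int raster_w, int raster_h,
                                          double min_x, double min_y,
                                          double max_x, double max_y) {
    const double w = static_cast<double>(raster_w);
    const double h = static_cast<double>(raster_h);
    // Clamp in double first so the int casts cannot overflow.
    const double c0 = std::clamp(std::floor((min_x - gt[0]) / gt[1]), 0.0, w);
    const double c1 = std::clamp(std::ceil((max_x - gt[0]) / gt[1]), 0.0, w);
    // Row 0 is the top of the raster, so max_y maps to the first row.
    const double r0 = std::clamp(std::floor((max_y - gt[3]) / gt[5]), 0.0, h);
    const double r1 = std::clamp(std::ceil((min_y - gt[3]) / gt[5]), 0.0, h);
    if (c0 >= c1 || r0 >= r1) return std::nullopt;
    return PixelWindow{static_cast<int>(c0), static_cast<int>(r0),
                       static_cast<int>(c1 - c0), static_cast<int>(r1 - r0)};
}
```

The resulting offsets and sizes map onto the nDSXOff, nDSYOff, nDSXSize, and nDSYSize arguments of GDALRasterIO, so the read buffer only needs x_size * y_size elements regardless of how large the GeoTIFF is.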