# gadm-boundaries (C++)

Native C++ boundaries batch generator for the GADM pipeline. Uses GDAL/GEOS/PROJ for geometry union + GHS raster enrichment (population & built-up surface).

## Prerequisites

| Tool | Version |
|------|---------|
| CMake | >= 3.20 |
| C++ compiler | C++20 (MSVC, GCC, or Clang) |
| vcpkg | Latest (set `VCPKG_ROOT` env var) |

vcpkg dependencies (auto-installed via `vcpkg.json`): GDAL, GEOS, PROJ, nlohmann-json, CLI11, spdlog, Catch2.

## Build

### Windows (MSVC)

```bash
cd packages/gadm/cpp
cmake --preset vcpkg-win
cmake --build build --config Release
```

Output: `dist/win-x64/boundaries.exe` + runtime DLLs + `proj.db`

### Linux (Ubuntu)

```bash
# System dependencies (build tools)
sudo apt install cmake ninja-build pkg-config g++ git curl zip unzip tar

# Install vcpkg (if not already)
git clone https://github.com/microsoft/vcpkg.git ~/vcpkg
~/vcpkg/bootstrap-vcpkg.sh
export VCPKG_ROOT=~/vcpkg

# Build
cd packages/gadm/cpp
cmake --preset vcpkg-linux
cmake --build build --config Release
```

Output: `dist/linux-x64/boundaries` + `proj.db`

> **Note:** First build takes 10-20 min as vcpkg compiles GDAL/GEOS/PROJ from source. Subsequent builds are fast.

## Usage

```bash
# From the gadm package root (not cpp/)

# All countries, all levels (263 countries x 6 levels)
# Windows:
.\dist\win-x64\boundaries.exe --country=all
# Linux:
./dist/linux-x64/boundaries --country=all

# Single country, all levels
./dist/linux-x64/boundaries --country=DEU

# Single country + level
./dist/linux-x64/boundaries --country=DEU --level=0

# Force regeneration (ignore cached files)
./dist/linux-x64/boundaries --country=NGA --force
```

### CLI Options

| Option | Default | Description |
|--------|---------|-------------|
| `--country` | `all` | ISO3 code or `all` for batch |
| `--level` | `-1` | Admin level 0-5, or -1 for all |
| `--cache-dir` | `cache/gadm` | Output directory |
| `--gpkg` | `data/gadm_410-levels.gpkg` | GADM GeoPackage |
| `--continent-json` | `data/gadm_continent.json` | Continent mapping |
| `--pop-tiff` | `data/ghs/GHS_POP_...tif` | GHS population raster |
| `--built-tiff` | `data/ghs/GHS_BUILT_S_...tif` | GHS built-up raster |
| `--force` | `false` | Regenerate even if cached |

### npm Scripts

```bash
npm run boundaries:cpp                    # --country=all (full batch, Windows)
npm run boundaries:cpp -- --country=DEU   # single country
```

On Linux, run the binary directly, since the npm script points to the Windows path.

## Output

Files are written to `cache/gadm/boundary_{CODE}_{LEVEL}.json` and are a drop-in replacement for the output of the TS pipeline.

The TS wrapper (`gpkg-reader.ts`) automatically discovers these cache files and serves them through the API. Sub-region queries (e.g. `getBoundary('DEU.2_1', 3)`) are resolved by prefix-filtering the full country file.
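
The prefix-filtering step for sub-region queries can be sketched as follows. This is a minimal illustration, not the wrapper's actual code: the wrapper is TypeScript, and the sketch is in C++ only to match the rest of this package. It assumes GADM-style codes of the form `DEU.2.5.1_1` (dot-separated path plus a `_<version>` suffix); `strip_version`, `is_within`, and `filter_by_prefix` are hypothetical names.

```cpp
#include <string>
#include <string_view>
#include <vector>

// Assumption: codes look like "DEU.2.5.1_1", i.e. a dot-separated path
// followed by "_<version>". Strip the suffix: "DEU.2_1" -> "DEU.2".
std::string_view strip_version(std::string_view code) {
    const auto pos = code.rfind('_');
    return pos == std::string_view::npos ? code : code.substr(0, pos);
}

// True when `code` is `parent` itself or one of its descendants.
// Requiring a '.' right after the prefix keeps "DEU.20..." from
// matching a query for "DEU.2".
bool is_within(std::string_view code, std::string_view parent) {
    const std::string_view base = strip_version(parent);
    const std::string_view c = strip_version(code);
    if (c == base) {
        return true;
    }
    return c.size() > base.size() &&
           c.substr(0, base.size()) == base &&
           c[base.size()] == '.';
}

// Keep only the codes from a full country file that fall under `parent`.
std::vector<std::string> filter_by_prefix(const std::vector<std::string>& codes,
                                          std::string_view parent) {
    std::vector<std::string> out;
    for (const auto& code : codes) {
        if (is_within(code, parent)) {
            out.push_back(code);
        }
    }
    return out;
}
```

Under these assumptions, calling `filter_by_prefix(codes, "DEU.2_1")` on the level-3 codes of the `DEU` file keeps entries such as `DEU.2.5.1_1` and drops `DEU.20.1.1_1` and `DEU.3.1.1_1`.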

Each feature includes:

- `code`: admin region code
- `name`: admin region name
- `geometry`: GeoJSON (MultiPolygon via WKB-precision GEOS union)
- `ghsPopulation`: total population (GHS-POP 2030)
- `ghsPopMaxDensity`: peak population density
- `ghsPopCenter`: weighted population center `[lon, lat]`
- `ghsPopCenters`: top-N population peaks `[[lon, lat, density], ...]`
- `ghsBuiltWeight`: total built-up surface weight
- `ghsBuiltMax`: peak built-up value
- `ghsBuiltCenter`: weighted built-up center `[lon, lat]`
- `ghsBuiltCenters`: top-N built-up peaks `[[lon, lat, value], ...]`

## Architecture

See [docs/cpp-port.md](../docs/cpp-port.md) for the full spec.

```
src/
├── main.cpp           # CLI entry, country loop, cache logic, PROJ_DATA setup
├── gpkg_reader.h/cpp  # GeoPackage -> features (OGR C API)
├── geo_merge.h/cpp    # Geometry union via WKB roundtrip (GEOS C API)
├── ghs_enrich.h/cpp   # GeoTIFF raster sampling + PIP (GDAL + PROJ)
├── pip.h              # Inline ray-casting point-in-polygon
└── types.h            # BoundaryFeature, BoundaryResult structs
```

### Key Design Decisions

- **No OpenMP**: GDAL/PROJ/GEOS are not thread-safe for concurrent raster reads + transform creation. Sequential processing is I/O-bound anyway.
- **WKB precision**: geometry union uses WKB serialization to avoid floating-point drift from WKT roundtrips.
- **Mollweide projection**: uses the `+proj=moll` string directly (not `EPSG:54009`, which isn't in the PROJ database). Transforms are normalized via `proj_normalize_for_visualization` for correct lon/lat axis order.
- **Windowed raster I/O**: GDAL `GDALRasterIO` reads only the bbox-clipped window from multi-GB GeoTIFFs, keeping memory bounded.
- **PROJ_DATA auto-discovery**: `main.cpp` sets `PROJ_DATA` at startup, pointing to the exe directory where `proj.db` is co-located.
- **Cross-platform**: builds on Windows (MSVC) and Linux (GCC/Clang) via vcpkg. Platform-specific code is guarded by `#ifdef _WIN32`.
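
The windowed raster I/O decision above comes down to converting a region's bounding box into a pixel window before reading. The following is a minimal sketch of that arithmetic, assuming a north-up raster with no rotation terms and a bounding box already expressed in the raster's CRS. `PixelWindow`, `BBox`, and `clip_window` are hypothetical names for illustration, not the identifiers used in `ghs_enrich.cpp`.

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <optional>

// Pixel window inside a raster: top-left offset plus size, in pixels.
struct PixelWindow {
    int xoff;
    int yoff;
    int xsize;
    int ysize;
};

// Bounding box in the raster's own CRS.
struct BBox {
    double min_x;
    double min_y;
    double max_x;
    double max_y;
};

// Map a bbox to the smallest pixel window that covers it, clamped to the
// raster extent. `gt` uses the GDAL geotransform layout for a north-up raster:
//   gt[0] = top-left x, gt[1] = pixel width,
//   gt[3] = top-left y, gt[5] = negative pixel height.
// Assumption: gt[2] and gt[4] (rotation terms) are zero.
std::optional<PixelWindow> clip_window(const BBox& box,
                                       const std::array<double, 6>& gt,
                                       int raster_w, int raster_h) {
    // Columns grow with x.
    double col0 = std::floor((box.min_x - gt[0]) / gt[1]);
    double col1 = std::ceil((box.max_x - gt[0]) / gt[1]);
    // Rows grow as y shrinks, because gt[5] is negative.
    double row0 = std::floor((box.max_y - gt[3]) / gt[5]);
    double row1 = std::ceil((box.min_y - gt[3]) / gt[5]);

    // Clamp to the raster so the read never goes out of bounds.
    col0 = std::clamp(col0, 0.0, static_cast<double>(raster_w));
    col1 = std::clamp(col1, 0.0, static_cast<double>(raster_w));
    row0 = std::clamp(row0, 0.0, static_cast<double>(raster_h));
    row1 = std::clamp(row1, 0.0, static_cast<double>(raster_h));

    // No overlap between the bbox and the raster.
    if (col1 <= col0 || row1 <= row0) {
        return std::nullopt;
    }
    return PixelWindow{static_cast<int>(col0), static_cast<int>(row0),
                       static_cast<int>(col1 - col0),
                       static_cast<int>(row1 - row0)};
}
```

The four fields correspond to the offset and size arguments of a `GDALRasterIO` read, so the buffer allocated for a region scales with the window rather than with the full GeoTIFF.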