gadm-ts/docs/geo.md
2026-03-24 10:50:17 +01:00

# Geospatial Architecture & Rendering Strategy
This document outlines the architectural decisions and trade-offs regarding how we process, store, and render GADM administrative boundaries in our web application using MapLibre GL JS.
## The Problem: High-Resolution Boundaries vs. Web Performance
GADM boundaries are highly detailed. A single boundary for a Region/Level 2 area (like a large province in Poland) can easily exceed 5MB as raw GeoJSON.
Sending 5MB+ payloads to a frontend map client to render a basic region outline is an architectural anti-pattern for several reasons:
- **Bandwidth:** High download times break the fluidity of the app, especially on slower connections or mobile devices.
- **Parsing Overhead:** Browsers struggle and block the main thread when parsing JSON payloads of this size.
- **Rendering Lag:** Rendering hundreds of thousands of microscopic, tightly-packed vertices degrades panning and zooming performance because the GPU is forced to process unneeded geometry.
## Evaluation of Architectural Approaches
### 1. Geometry Simplification (Chosen Approach)
Instead of serving 100% full-resolution geometries, we apply a geometry simplification algorithm (such as Douglas-Peucker via `OGR_G_SimplifyPreserveTopology` in GDAL) during the backend C++ pipeline step before the files are saved to the cache.
By using a small tolerance (e.g., `0.001` to `0.005` degrees, roughly 110 to 550 m of latitude), we eliminate up to 90-95% of vertices, namely those that lie essentially on a straight line between their neighbors.
- **Pros:** A 5MB file drops to ~100-200KB with no visually perceptible difference at standard region-view zoom levels. The result is easy to cache and serve over a standard API, and the frontend's geometry-loading code does not change.
- **Cons:** Shared borders between different polygons can sometimes develop tiny gaps or slivers if simplified individually without a shared topological graph/mesh.
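To make the role of the tolerance concrete, here is a minimal sketch of Douglas-Peucker simplification on a single line of `[lon, lat]` points. It is an illustration only, not the GDAL implementation used in the C++ pipeline; the names `LonLat`, `distanceToSegment`, and `simplifyLine` are hypothetical, and distances are treated as planar degrees.

```typescript
// Illustrative Douglas-Peucker on one line; NOT the GDAL code path.
type LonLat = [number, number];

// Distance from p to the closest point on segment a-b (planar, in degrees).
function distanceToSegment(p: LonLat, a: LonLat, b: LonLat): number {
  const dx = b[0] - a[0];
  const dy = b[1] - a[1];
  const lengthSq = dx * dx + dy * dy;
  if (lengthSq === 0) return Math.hypot(p[0] - a[0], p[1] - a[1]);
  // Project p onto the segment and clamp the projection to its endpoints.
  const t = Math.max(0, Math.min(1, ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / lengthSq));
  return Math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy));
}

function simplifyLine(points: LonLat[], tolerance: number): LonLat[] {
  if (points.length <= 2) return points.slice();
  const first = points[0];
  const last = points[points.length - 1];
  // Find the interior vertex farthest from the first-last chord.
  let maxDist = 0;
  let index = 0;
  for (let i = 1; i < points.length - 1; i++) {
    const d = distanceToSegment(points[i], first, last);
    if (d > maxDist) {
      maxDist = d;
      index = i;
    }
  }
  // Everything is within tolerance of the chord: keep only the endpoints.
  if (maxDist <= tolerance) return [first, last];
  // Otherwise keep the farthest vertex and recurse on both halves.
  const left = simplifyLine(points.slice(0, index + 1), tolerance);
  const right = simplifyLine(points.slice(index), tolerance);
  return left.slice(0, -1).concat(right);
}

// The vertex at [1, 0.0005] deviates by less than the tolerance and is dropped;
// the spike at [3, 1] is kept.
const line: LonLat[] = [[0, 0], [1, 0.0005], [2, 0], [3, 1], [4, 0]];
const simplified = simplifyLine(line, 0.001);
console.log(simplified.length); // 4
```

Because each line is simplified in isolation here, two polygons sharing a border could keep different vertices along it, which is exactly the gap/sliver risk noted in the cons above.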
### 2. Mapbox Vector Tiles (MVT / Protocol Buffers)
Instead of returning monolithic GeoJSON boundary files, the server exposes the data as small binary tiles (`.mvt` or `.pbf`) on a zoom/x/y grid, and the map client natively requests only the tiles it needs. The geometries within each tile are aggressively simplified and clipped for that tile's zoom level.
- **Pros:** Phenomenal performance for rendering global, tremendously heavy datasets. Only data intersecting the user's viewport is loaded at any given time.
- **Cons:** High infrastructure overhead. It requires running an active live tile server (like `pg_tileserv` or `martin`) or pre-generating massive pyramidal tile caches (e.g., using `tippecanoe`).
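For context on how a client decides which tiles to request, the sketch below converts a longitude/latitude into tile indices using the standard XYZ (slippy-map) scheme in Web Mercator. The function name `lonLatToTile` is hypothetical, and this is only the addressing arithmetic, not a tile server or MapLibre's internal code.

```typescript
// Standard XYZ tile addressing in Web Mercator (illustrative helper).
function lonLatToTile(lon: number, lat: number, zoom: number): { x: number; y: number; z: number } {
  const n = Math.pow(2, zoom); // number of tiles per axis at this zoom
  const latRad = (lat * Math.PI) / 180;
  const xRaw = Math.floor(((lon + 180) / 360) * n);
  const yRaw = Math.floor(
    ((1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2) * n
  );
  // Clamp so that edge values (e.g. lon = 180) still map to a valid tile.
  const x = Math.min(n - 1, Math.max(0, xRaw));
  const y = Math.min(n - 1, Math.max(0, yRaw));
  return { x, y, z: zoom };
}

// At zoom 1 the world is a 2x2 grid; (0, 0) falls in the south-east tile.
const tile = lonLatToTile(0, 0, 1);
console.log(tile.x, tile.y, tile.z); // 1 1 1
```

Every such tile must either be generated on request or exist in a pre-built cache, which is where the infrastructure cost listed above comes from.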
### 3. TopoJSON
An extension of GeoJSON that encodes *topology* rather than discrete geometries. Shared borders between adjacent regions are only recorded once.
- **Pros:** Can shrink file sizes by up to 80% compared to GeoJSON. Because each shared border is stored and simplified once, adjacent regions cannot develop gaps between them.
- **Cons:** Requires the frontend to bundle and run `topojson-client` to decode the data back into GeoJSON on the fly, since MapLibre consumes GeoJSON rather than TopoJSON.
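The shared-border idea can be sketched as follows: two adjacent squares reference the same arc, one of them in reverse. This is a simplified illustration of the decoding step (delta-encoded arcs plus a scale/translate transform), not the `topojson-client` API; `Coord`, `ArcTransform`, `decodeArc`, and `decodeRing` are hypothetical names, and the sample data is made up.

```typescript
type Coord = [number, number];
interface ArcTransform {
  scale: [number, number];
  translate: [number, number];
}

// Decode one quantized, delta-encoded arc into absolute coordinates.
function decodeArc(arc: Coord[], transform: ArcTransform): Coord[] {
  let x = 0;
  let y = 0;
  return arc.map(([dx, dy]): Coord => {
    x += dx;
    y += dy;
    return [
      x * transform.scale[0] + transform.translate[0],
      y * transform.scale[1] + transform.translate[1],
    ];
  });
}

// Stitch a ring from arc indices; a negative index ~i means "arc i, reversed".
function decodeRing(arcIndices: number[], arcs: Coord[][], transform: ArcTransform): Coord[] {
  const ring: Coord[] = [];
  for (const index of arcIndices) {
    const reversed = index < 0;
    const coords = decodeArc(arcs[reversed ? ~index : index], transform);
    if (reversed) coords.reverse();
    // After the first arc, skip the leading point: it repeats the previous arc's end.
    ring.push(...(ring.length > 0 ? coords.slice(1) : coords));
  }
  return ring;
}

// Made-up topology: arc 0 is the border shared by a left and a right unit square.
const identity: ArcTransform = { scale: [1, 1], translate: [0, 0] };
const arcs: Coord[][] = [
  [[1, 0], [0, 1]],                    // shared border: (1,0) -> (1,1)
  [[1, 1], [-1, 0], [0, -1], [1, 0]],  // rest of left square
  [[1, 0], [1, 0], [0, 1], [-1, 0]],   // rest of right square
];
const leftRing = decodeRing([0, 1], arcs, identity);
const rightRing = decodeRing([2, ~0], arcs, identity);
console.log(JSON.stringify(leftRing));  // [[1,0],[1,1],[0,1],[0,0],[1,0]]
console.log(JSON.stringify(rightRing)); // [[1,0],[2,0],[2,1],[1,1],[1,0]]
```

Both rings derive their common edge from the single stored arc, so any simplification applied to that arc affects both squares identically. In a real frontend, `topojson-client` would perform this decoding.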
## Conclusion & Current Direction
Our use case is specific: **we load boundaries into MapLibre primarily to display outlines and population centers for *selected* GADM regions, rather than rendering the entire administrative globe at once.**
Because we are operating on an "on-demand, selected region" basis, setting up a full-blown Vector Tile infrastructure (Option 2) introduces unnecessary complexity.
**Our strategy is to leverage Option 1: Pre-cached, heavily simplified GeoJSON.**
By applying a GDAL geometry simplification tolerance and limiting coordinate precision to 5 decimal places during the C++ build pipeline, we produce lightweight payloads that MapLibre can ingest natively. This removes the file size bottleneck while preserving our simple file-based caching architecture.
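As an illustration of the precision limit, the sketch below rounds every number in a nested GeoJSON coordinate array to a fixed number of decimals. Five decimal places of a degree correspond to roughly 1.1 m at the equator. The helper `roundCoordinates` is hypothetical and written in TypeScript for readability; the actual pipeline performs this step in C++.

```typescript
// Recursively round all numbers in a nested coordinate array (illustrative helper).
function roundCoordinates(value: unknown, decimals: number): unknown {
  const factor = Math.pow(10, decimals);
  if (typeof value === "number") return Math.round(value * factor) / factor;
  if (Array.isArray(value)) return value.map((v) => roundCoordinates(v, decimals));
  return value; // leave anything that is not a number or array untouched
}

const rounded = roundCoordinates([21.012345678, 52.229675912], 5);
console.log(JSON.stringify(rounded)); // [21.01235,52.22968]
```

Shorter numbers mean fewer bytes per vertex in the serialized GeoJSON, which compounds with the vertex reduction from simplification.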