Geospatial Architecture & Rendering Strategy

This document outlines the architectural decisions and trade-offs regarding how we process, store, and render GADM administrative boundaries in our web application using MapLibre GL JS.

The Problem: High-Resolution Boundaries vs. Web Performance

GADM boundaries are highly detailed. A single boundary for a Region/Level 2 area (such as a large province in Poland) can easily exceed 5 MB as unsimplified GeoJSON.

Sending payloads of 5 MB or more to a frontend map client just to render a basic region outline is an architectural anti-pattern for several reasons:

Bandwidth: Long download times break the fluidity of the app, especially on slower connections or mobile devices.
Parsing Overhead: Parsing a JSON payload of this size is slow, and it blocks the UI whenever it happens on the main thread.
Rendering Lag: Hundreds of thousands of tightly packed vertices, many of them less than a pixel apart, degrade panning and zooming because the renderer has to process geometry that nobody can see.

Evaluation of Architectural Approaches

1. Geometry Simplification (Chosen Approach)

Instead of serving full-resolution geometries, we apply a geometry simplification algorithm (such as the topology-preserving Douglas-Peucker variant exposed by OGR_G_SimplifyPreserveTopology in GDAL) in the backend C++ pipeline before the files are saved to the cache.

With a small tolerance (e.g., 0.001 to 0.005 degrees, roughly 110 to 550 m north-south), the algorithm drops vertices that lie almost on a straight line between their neighbours. On GADM data this can remove 90-95% of the vertices.
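To make the tolerance concrete, here is a minimal sketch of plain Douglas-Peucker on a single polyline. It is not GDAL's implementation (OGR_G_SimplifyPreserveTopology delegates to GEOS and additionally guards against invalid output); the names Point and simplifyLine are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point { double x; double y; };  // hypothetical type: x = lon, y = lat

// Distance from p to the segment a-b (plain point distance if a == b).
static double distanceToSegment(const Point& p, const Point& a, const Point& b) {
    const double dx = b.x - a.x, dy = b.y - a.y;
    const double len2 = dx * dx + dy * dy;
    if (len2 == 0.0) return std::hypot(p.x - a.x, p.y - a.y);
    double t = ((p.x - a.x) * dx + (p.y - a.y) * dy) / len2;
    t = std::fmax(0.0, std::fmin(1.0, t));  // clamp the projection onto the segment
    return std::hypot(p.x - (a.x + t * dx), p.y - (a.y + t * dy));
}

// Recursively mark the vertices between first and last that must be kept.
static void markKept(const std::vector<Point>& pts, std::size_t first,
                     std::size_t last, double tolerance, std::vector<bool>& keep) {
    double maxDist = 0.0;
    std::size_t index = first;
    for (std::size_t i = first + 1; i < last; ++i) {
        const double d = distanceToSegment(pts[i], pts[first], pts[last]);
        if (d > maxDist) { maxDist = d; index = i; }
    }
    if (maxDist > tolerance) {  // the farthest vertex is significant: keep it and split
        keep[index] = true;
        markKept(pts, first, index, tolerance, keep);
        markKept(pts, index, last, tolerance, keep);
    }
}

// Returns the simplified line; endpoints are always preserved.
std::vector<Point> simplifyLine(const std::vector<Point>& pts, double tolerance) {
    assert(tolerance >= 0.0);
    if (pts.size() < 3) return pts;
    std::vector<bool> keep(pts.size(), false);
    keep[0] = true;
    keep[pts.size() - 1] = true;
    markKept(pts, 0, pts.size() - 1, tolerance, keep);
    std::vector<Point> out;
    for (std::size_t i = 0; i < pts.size(); ++i) {
        if (keep[i]) out.push_back(pts[i]);
    }
    return out;
}
```

The recursion is fine for a sketch, but a production version working on rings with hundreds of thousands of vertices would likely want an explicit stack.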

Pros: A 5 MB file drops to ~100-200 KB with no visible difference at standard region-view zoom levels. The result is easy to cache and serve over a standard API, and the frontend loads geometry exactly as before.
Cons: Shared borders between neighbouring polygons can develop tiny gaps or slivers when each polygon is simplified individually rather than through a shared topological graph/mesh.
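The claim that simplification is invisible at region-view zoom levels can be checked with a little arithmetic: convert the tolerance to metres and compare it with the ground size of one screen pixel. The sketch below assumes Web Mercator and a 512-pixel world at zoom 0 (the convention MapLibre uses); the function names and the 111,320 m per degree figure are illustrative approximations.

```cpp
#include <cassert>
#include <cmath>

// Approximate metres covered by one screen pixel in Web Mercator.
// tileSizePx is the width of the whole world at zoom 0 (assumed 512 for MapLibre).
double metersPerPixel(double latDeg, double zoom, double tileSizePx) {
    assert(tileSizePx > 0.0);
    const double kPi = 3.14159265358979323846;
    const double kEquatorMeters = 40075016.686;  // WGS84 equatorial circumference
    return kEquatorMeters * std::cos(latDeg * kPi / 180.0) /
           (tileSizePx * std::exp2(zoom));
}

// Rough north-south ground distance of a tolerance given in degrees.
// East-west distances shrink by cos(latitude), so this is an upper bound.
double toleranceMeters(double toleranceDeg) {
    const double kMetersPerDegreeLat = 111320.0;  // rough average
    return toleranceDeg * kMetersPerDegreeLat;
}

// How many screen pixels the simplification tolerance spans at a given view.
double toleranceInPixels(double toleranceDeg, double latDeg, double zoom,
                         double tileSizePx) {
    return toleranceMeters(toleranceDeg) / metersPerPixel(latDeg, zoom, tileSizePx);
}
```

Under these assumptions a 0.001 degree tolerance stays below one pixel at zoom 7 around 52°N, while a 0.005 degree tolerance spans dozens of pixels at zoom 12, so the simplified outlines would start to look angular if users zoom in that far.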

2. Mapbox Vector Tiles (MVT / Protocol Buffers)

Instead of returning monolithic GeoJSON boundary files, the backend serves the data as small binary tiles (.mvt or .pbf) that the map client requests by zoom level and x/y position. The geometries within each tile are simplified and clipped for that zoom level.
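For reference, the tile a client requests for a given location follows the standard XYZ (slippy map) addressing scheme. The sketch below computes the zoom/x/y cell containing a longitude/latitude pair; TileId and tileForLonLat are hypothetical names, and the exact URL template depends on the tile server.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct TileId { int z; int x; int y; };  // hypothetical type for illustration

// Returns the XYZ tile that contains the given WGS84 coordinate at this zoom.
TileId tileForLonLat(double lonDeg, double latDeg, int zoom) {
    assert(zoom >= 0 && zoom <= 24);
    const double kPi = 3.14159265358979323846;
    const double kMaxLat = 85.05112878;  // Web Mercator latitude limit
    const double lat = std::max(-kMaxLat, std::min(kMaxLat, latDeg));
    const double n = std::ldexp(1.0, zoom);  // 2^zoom tiles per axis
    const double latRad = lat * kPi / 180.0;
    const double fx = (lonDeg + 180.0) / 360.0 * n;
    const double fy = (1.0 - std::asinh(std::tan(latRad)) / kPi) / 2.0 * n;
    const int maxIndex = static_cast<int>(n) - 1;
    // Clamp so that lon = 180 or the latitude limits do not fall off the grid.
    const int x = std::max(0, std::min(maxIndex, static_cast<int>(std::floor(fx))));
    const int y = std::max(0, std::min(maxIndex, static_cast<int>(std::floor(fy))));
    return {zoom, x, y};
}
```

Each zoom level quadruples the number of tiles, which is why pre-generating a full pyramid for detailed boundaries becomes expensive at high zoom levels.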

Pros: Excellent rendering performance for global, very large datasets. Only data intersecting the user's viewport is loaded at any given time.
Cons: High infrastructure overhead. It requires either running a live tile server (like pg_tileserv or martin) or pre-generating large tile pyramids (e.g., using tippecanoe).

3. TopoJSON

An extension of GeoJSON that encodes topology rather than discrete geometries. Shared borders between adjacent regions are only recorded once.

Pros: Can shrink file sizes by up to 80% compared to GeoJSON. Because each shared border is stored and simplified once, adjacent regions cannot develop gaps.
Cons: Requires the frontend to bundle and run topojson-client to decode the data back into GeoJSON on the fly before MapLibre can consume it.
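The decoding step is small in principle. The sketch below illustrates the encoding described in the TopoJSON specification: quantized arcs are delta-encoded, and a ring is a list of arc references where a negative index i means arc ~i traversed in reverse. This is not the topojson-client API; all type and function names are hypothetical, and the sketch is written in C++ only to stay consistent with the pipeline code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct Position { double x; double y; };
struct Transform { double scaleX, scaleY, translateX, translateY; };
using QuantizedArc = std::vector<std::pair<long, long>>;  // first point absolute, rest deltas

// Delta-decode one quantized arc and apply the topology's transform.
std::vector<Position> decodeArc(const QuantizedArc& arc, const Transform& t) {
    std::vector<Position> out;
    out.reserve(arc.size());
    long x = 0, y = 0;
    for (const auto& delta : arc) {
        x += delta.first;
        y += delta.second;
        out.push_back({x * t.scaleX + t.translateX, y * t.scaleY + t.translateY});
    }
    return out;
}

// Stitch a polygon ring from arc references. Two neighbouring regions can
// reference the same arc (one of them reversed), so the border is stored once.
std::vector<Position> stitchRing(const std::vector<int>& refs,
                                 const std::vector<QuantizedArc>& arcs,
                                 const Transform& t) {
    std::vector<Position> ring;
    for (int ref : refs) {
        const std::size_t index = static_cast<std::size_t>(ref < 0 ? ~ref : ref);
        assert(index < arcs.size());
        std::vector<Position> pts = decodeArc(arcs[index], t);
        if (pts.empty()) continue;
        if (ref < 0) std::reverse(pts.begin(), pts.end());
        // After the first arc, skip each arc's first point: it repeats the
        // last point of the previous arc.
        const std::ptrdiff_t start = ring.empty() ? 0 : 1;
        ring.insert(ring.end(), pts.begin() + start, pts.end());
    }
    return ring;
}
```

In a topology with two unit squares side by side, the shared edge would be one arc referenced as 0 by the left square and as -1 by the right square, which is what rules out gaps between them.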

Conclusion & Current Direction

Our use case is narrow: we load boundaries into MapLibre primarily to display outlines and population centers for selected GADM regions, rather than rendering every administrative boundary on the globe at once.

Because we are operating on an "on-demand, selected region" basis, setting up a full-blown Vector Tile infrastructure (Option 2) introduces unnecessary complexity.

Our strategy is to use Option 1: pre-cached, heavily simplified GeoJSON. By applying a GDAL simplification tolerance and limiting coordinate precision to 5 decimal places (roughly 1 m on the ground) during the C++ build pipeline, we produce lightweight payloads that MapLibre can ingest natively. This removes the file-size bottleneck while preserving our simple file-based caching architecture.
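As an illustration of the precision limit, the sketch below formats a coordinate pair with a fixed number of decimal places. It is only a stand-in for whatever the pipeline actually does; when writing through GDAL's GeoJSON driver, the COORDINATE_PRECISION layer creation option is the usual way to get the same effect. The function name formatCoordinate is hypothetical.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Formats a lon/lat pair as a GeoJSON position with a fixed number of decimals.
// Five decimals keep roughly 1 m of precision, well below the simplification
// tolerance, while trimming the 15+ digits a raw double would print.
std::string formatCoordinate(double lon, double lat, int decimals) {
    assert(decimals >= 0 && decimals <= 15);
    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "[%.*f,%.*f]", decimals, lon, decimals, lat);
    return buffer;
}
```

For example, formatCoordinate(21.012228999, 52.229675999, 5) yields "[21.01223,52.22968]". Since coordinates make up almost all of a boundary file, cutting each number from around 17 characters to 8 shrinks the payload noticeably on top of what simplification already saves.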
