@polymech/gadm
Pure TypeScript interface to the GADM v4.1 administrative boundaries database.
Zero Python dependencies — parquet data, tree construction, iterators, and caching all run in Node.js.
Overview
| Feature | Description |
|---|---|
| Database | 356K rows from GADM 4.1, stored as a 6 MB Parquet file |
| Admin Levels | L0 (country) → L5 (municipality/commune) |
| Tree API | Build hierarchical trees, walk with DFS/BFS/level iterators |
| Name Search | Fuzzy search across all levels with Levenshtein suggestions |
| GeoJSON | Fetch boundaries from GADM CDN with corrected names |
| Caching | File-based JSON cache for trees and API results |
| VARNAME | Alternate names / English translations via VARNAME_1..5 columns |
Installation
npm install @polymech/gadm
Internal monorepo — referenced via workspace protocol in package.json.
Acknowledgments & PyGADM Port
This package is a direct Node.js/TypeScript port of the excellent Python library pygadm (which powers the core parquet-based data structure and fetching methodology).
While bringing these capabilities natively to the JavaScript ecosystem, we added several enhancements aimed at web applications and browser performance:
- Aggressive Geometry Simplification: Natively integrates @turf/simplify and @turf/truncate with a configurable resolution parameter (1=full detail, 10=max simplification, default=4). Compresses raw unoptimized 25MB boundary polygons down to ~1MB browser-friendly payloads while rounding all coordinates (geometry + GHS metadata) to 5 decimal places.
- Unified Cascading Caches: Intelligent caching ladders that auto-resolve across global process.env.GADM_CACHE, active process.cwd(), and local workspace ../cache mounts.
- Target-Level Subdivision Extraction: A unified targetLevel API design that distinctly differentiates between extracting an outer merged geographic perimeter vs. an array of granular inner subdivided states natively derived from recursive .merge() operations.
- Smart Pre-cacher Script: Includes boundaries.ts, an auto-resuming build script that iterates downwards to pre-calculate, dissolve, and aggressively compress hierarchy layers 0–5 for instant sub-ms API delivery, bypassing heavy mathematical geometry intersections at runtime.
Quick Start
import { buildTree, walkDFS, findNode, searchRegions, getNames } from '@polymech/gadm';
// Build a tree for Spain
const tree = await buildTree({ admin: 'ESP', cacheDir: './cache/gadm' });
console.log(tree.root.children.length); // 18 (comunidades)
// Find a specific region
const bcn = findNode(tree.root, 'Barcelona');
console.log(bcn?.gid); // ESP.6.1_1
// Walk all nodes
for (const node of walkDFS(tree.root)) {
  console.log(' '.repeat(node.level) + node.name);
}
// Search via wrapper API
const result = await searchRegions({ query: 'France', contentLevel: 2 });
console.log(result.data?.length); // ~101 departments
API Reference
Tree Module
buildTree(opts: BuildTreeOptions): Promise<GADMTree>
Builds a hierarchical tree from the flat parquet data. Results are cached to disk when cacheDir is set.
interface BuildTreeOptions {
  name?: string;     // Region name: "Spain", "Cataluña", "Bayern"
  admin?: string;    // GADM code: "ESP", "DEU.2_1", "FRA.11_1"
  cacheDir?: string; // Path for JSON cache files (optional)
}
Either name or admin must be set (not both).
Throws if the region is not found in the database.
GADMTree and GADMNode
interface GADMTree {
  root: GADMNode;    // Root node of the tree
  maxLevel: number;  // Deepest admin level reached (0–5)
  nodeCount: number; // Total nodes across all levels
}
interface GADMNode {
  name: string;         // Display name: "Barcelona"
  gid: string;          // GADM ID: "ESP.6.1_1"
  level: number;        // Admin level 0–5
  children: GADMNode[]; // Sub-regions (sorted alphabetically)
}
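As a rough illustration of how flat parquet rows can fold into this shape, here is a minimal sketch. It is not the package's implementation: buildTreeFromRows is a hypothetical helper, the types are local copies of the interfaces above, and it assumes every row carries GID_0..GID_5 / NAME_0..NAME_5 columns with empty strings for missing levels. For simplicity it always roots the tree at the country (level 0).

```typescript
// Local copies of the documented shapes so the sketch is self-contained.
interface GADMNode { name: string; gid: string; level: number; children: GADMNode[] }
interface GADMTree { root: GADMNode; maxLevel: number; nodeCount: number }
type GadmRow = Record<string, string>;

// Hypothetical helper: fold flat rows into a tree rooted at the country.
function buildTreeFromRows(rows: GadmRow[]): GADMTree {
  const byGid = new Map<string, GADMNode>();
  let root: GADMNode | undefined;
  let maxLevel = 0;

  for (const row of rows) {
    let parent: GADMNode | undefined;
    for (let level = 0; level <= 5; level++) {
      const gid = row[`GID_${level}`];
      if (!gid) break; // this row stops at a shallower level
      let node = byGid.get(gid);
      if (!node) {
        node = { name: row[`NAME_${level}`] || gid, gid, level, children: [] };
        byGid.set(gid, node);
        if (parent) parent.children.push(node);
        else if (!root) root = node;
      }
      maxLevel = Math.max(maxLevel, level);
      parent = node;
    }
  }
  if (!root) throw new Error('no rows for the requested region');

  // Children are documented as sorted alphabetically.
  byGid.forEach((node) => {
    node.children.sort((a, b) => a.name.localeCompare(b.name));
  });
  return { root, maxLevel, nodeCount: byGid.size };
}
```

Keying nodes by GID makes deduplication cheap: each row repeats its ancestors' IDs, so a map lookup decides whether a node already exists.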
Iterators
All iterators are generators — use for...of or spread into arrays.
| Function | Description |
|---|---|
| walkDFS(node) | Depth-first traversal, top-down |
| walkBFS(node) | Breadth-first, level by level |
| walkLevel(node, level) | Only nodes at a specific admin level |
| leaves(node) | Only leaf nodes (deepest, no children) |
| findNode(root, query) | First DFS match by name or GID (case-insensitive) |
// Get all provinces (level 2) under Cataluña
const provinces = [...walkLevel(tree.root, 2)];
// → [{ name: 'Barcelona', ... }, { name: 'Girona', ... }, ...]
// Count municipalities
const municipios = [...leaves(tree.root)];
console.log(municipios.length); // 955
// Find by GID
const girona = findNode(tree.root, 'ESP.6.2_1');
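To make the traversal semantics in the table concrete, the following sketch re-implements the iterators over a local GADMNode type. These are illustrative stand-ins written from the descriptions above, not the package's source, and the exact matching rules of the real findNode may differ.

```typescript
interface GADMNode { name: string; gid: string; level: number; children: GADMNode[] }

// Depth-first: a parent is yielded before its children.
function* walkDFS(node: GADMNode): Generator<GADMNode> {
  yield node;
  for (const child of node.children) yield* walkDFS(child);
}

// Breadth-first: a growing array doubles as the queue.
function* walkBFS(node: GADMNode): Generator<GADMNode> {
  const queue: GADMNode[] = [node];
  for (let i = 0; i < queue.length; i++) {
    const current = queue[i];
    yield current;
    queue.push(...current.children);
  }
}

// Only nodes whose admin level matches.
function* walkLevel(node: GADMNode, level: number): Generator<GADMNode> {
  for (const n of walkDFS(node)) if (n.level === level) yield n;
}

// Only nodes without children.
function* leaves(node: GADMNode): Generator<GADMNode> {
  for (const n of walkDFS(node)) if (n.children.length === 0) yield n;
}

// First DFS hit whose name or GID equals the query, ignoring case.
function findNode(root: GADMNode, query: string): GADMNode | undefined {
  const q = query.toLowerCase();
  for (const n of walkDFS(root)) {
    if (n.name.toLowerCase() === q || n.gid.toLowerCase() === q) return n;
  }
  return undefined;
}
```

Because the iterators are generators, a caller can stop early (for example with break inside for...of) without visiting the rest of the tree.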
Names Module
getNames(opts: NamesOptions): Promise<NamesResult>
Searches the parquet database for admin areas. Returns deduplicated rows with fuzzy match suggestions on miss.
interface NamesOptions {
  name?: string;         // Search by name
  admin?: string;        // Search by GADM code
  contentLevel?: number; // Target level (0–5), -1 = auto
  complete?: boolean;    // Return all columns up to contentLevel
}
interface NamesResult {
  rows: GadmRow[];   // Matched records
  level: number;     // Resolved content level
  columns: string[]; // Column names in result
}
On miss, throws with Levenshtein-based suggestions:
The requested "Franec" is not part of GADM.
The closest matches are: France, Franca, Franco, ...
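The suggestion list in that message can be produced with a plain edit-distance ranking. The sketch below shows one way to do it; closestMatches is a hypothetical helper name, and the package's own ranking, tie-breaking, and result limit are not documented here, so treat those details as assumptions.

```typescript
// Classic dynamic-programming Levenshtein distance (insert, delete, substitute).
function levenshtein(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: rank candidate names by distance to the query.
// Assumption: comparison is case-insensitive and ties break alphabetically.
function closestMatches(query: string, names: string[], limit = 5): string[] {
  const q = query.toLowerCase();
  return names
    .map((name) => ({ name, d: levenshtein(q, name.toLowerCase()) }))
    .sort((x, y) => x.d - y.d || x.name.localeCompare(y.name))
    .slice(0, limit)
    .map((x) => x.name);
}
```

Only two rows of the distance matrix are kept at a time, which keeps memory proportional to the shorter dimension even when scanning a large name list.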
Items Module
getItems(opts: ItemsOptions): Promise<GeoJSONCollection>
Fetches GeoJSON boundaries from the GADM CDN, with name correction from the local parquet database (workaround for camelCase bug in GADM GeoJSON responses).
interface ItemsOptions {
  name?: string | string[];  // Region name(s)
  admin?: string | string[]; // GADM code(s)
  contentLevel?: number;     // Target level, -1 = auto
  includeOuter?: boolean;    // Also include the containing region's external perimeter
  geojson?: boolean;         // Return geometries instead of just properties (metadata)
}
Supports continent expansion: getItems({ name: ['europe'] }) fetches all European countries.
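Continent expansion can be pictured as a lookup that swaps a continent name for its member country codes before any fetch happens. The sketch below is an assumption-laden illustration: expandContinents is a hypothetical helper, the ContinentMap shape (lower-case continent name mapped to ISO3 codes) is a guess at how data/gadm_continent.json might be laid out, and the sample mapping is deliberately truncated.

```typescript
// Assumed shape for a continent lookup: lower-case continent name -> ISO3 codes.
type ContinentMap = Record<string, string[]>;

// Hypothetical helper: replace continent names with their member country codes,
// passing every other name through unchanged and dropping duplicates.
function expandContinents(names: string[], continents: ContinentMap): string[] {
  const out = new Set<string>();
  for (const name of names) {
    const members = continents[name.toLowerCase()];
    if (members) members.forEach((code) => out.add(code));
    else out.add(name);
  }
  return Array.from(out);
}

// Truncated sample mapping, for illustration only.
const sampleContinents: ContinentMap = { europe: ['DEU', 'ESP', 'FRA'] };
console.log(expandContinents(['europe', 'Japan'], sampleContinents).join(', '));
// prints "DEU, ESP, FRA, Japan"
```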
Wrapper Module (Server API)
Higher-level API designed for HTTP handlers. Includes file-based caching via GADM_CACHE env var (default: ./cache/gadm).
| Function | Description |
|---|---|
| searchRegions(opts) | Search by name, returns metadata or GeoJSON |
| getBoundary(gadmId, contentLevel?, cache?, enrichOpts?, resolution?) | Get GeoJSON boundary for a GADM ID |
| getRegionNames(opts) | List sub-region names with depth control |
Integration Example (Server API)
Here is a real-world example of wrapping the GADM engine inside an HTTP handler (like Hono or Express) to fetch dynamically chunked boundaries and enrich their GeoJSON metadata on the fly:
import { getBoundary } from '@polymech/gadm';
import * as turf from '@turf/turf';
async function handleGetRegionBoundary(c) {
  const id = c.req.param('id'); // e.g. "DEU" or "ESP.6_1"
  const targetLevel = c.req.query('targetLevel'); // e.g. "1" for inner states
  const enrich = c.req.query('enrich') === 'true';
  try {
    // Always pass a radix so the query string is parsed as base 10
    const parsedTargetLevel = targetLevel !== undefined ? parseInt(targetLevel, 10) : undefined;
    // Fetch the boundary FeatureCollection (served from cache when available)
    const result = await getBoundary(id, parsedTargetLevel);
    if ('error' in result) {
      return c.json({ error: result.error }, 404);
    }
    // On-the-fly geometry enrichment
    if (enrich && result.features) {
      for (const feature of result.features) {
        // turf.area returns square meters; convert to whole square kilometers
        const areaSqkm = Math.round(turf.area(feature as any) / 1_000_000);
        feature.properties.areaSqkm = areaSqkm;
        // Bounding box for client camera tracking
        const bbox = turf.bbox(feature as any);
        feature.properties.bbox = bbox;
      }
    }
    return c.json(result, 200);
  } catch (error) {
    // A caught value is not guaranteed to be an Error, so narrow before reading .message
    const message = error instanceof Error ? error.message : String(error);
    return c.json({ error: message }, 500);
  }
}
Data Enrichment (Optional GeoTIFFs)
The GADM engine ships optional enrichers that query European Commission GHSL (Global Human Settlement Layer) GeoTIFFs directly in Node.js and return the estimated population and built-up surface inside any requested boundary.
Because getBoundary() projects bounding boxes to Mollweide (EPSG:54009) and reads only the matching windows from the raw raster files, density statistics are computed on the fly at the rasters' 100 m grid resolution, without a separate PostGIS or QGIS server.
Prerequisites (GHSL Data)
You must download the raw GeoTIFF datasets from the EU JRC Open Data portal and store them locally (e.g. in data/ghs/). Warning: These files are >1GB.
| Dataset | Metric | URL |
|---|---|---|
| GHS_POP | Population (2030 Projections) | GHS_POP_E2030_GLOBE_R2023A_54009_100_V1_0.tif |
| GHS_BUILT_S | Built-up Area / Concrete Surface | GHS_BUILT_S_E2030_GLOBE_R2023A_54009_100_V1_0.tif |
Option 1: Native Wrapper Option (Recommended)
Pass { pop: true, built: true } as the enrichOpts argument of getBoundary(). It discovers the .tif datasets automatically (looking in data/ghs, cache/ghs, and environment variables), scans the density per feature, calculates the population and built-up centers of mass, and appends them to the GeoJSON feature.properties. The result is deep-cloned and cached.
const result = await getBoundary(gadmId, targetLevel, undefined, {
  pop: true,
  built: true
});
// result.features[0].properties will now contain:
// {
// ...
// "population": 5666, <- the standard property overwritten with hyper-accurate bounds
// "ghsPopMaxDensity": 125, <- highest density 100x100m block
// "ghsPopCenter": [2.1019, 41.8130], <- true center of mass (where residents actually live vs geographical center)
// "ghsPopCenters": [ <- up to 5 distinct population clusters [lon, lat, max_density]
// [2.1019, 41.8130, 125]
// ],
// "ghsBuiltWeight": 744080, <- concrete physical size index
// "ghsBuiltCenter": [2.1039, 41.8130], <- true center of concrete (industrial + urban spread)
// "ghsBuiltCenters": [ <- up to 5 distinct concrete clusters [lon, lat, max_density]
// [2.1039, 41.8130, 300]
// ]
// }
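The total, maximum density, and center-of-mass values above boil down to a weighted average over the raster cells that fall inside the boundary. The following sketch shows that arithmetic only. Cell, DensityStats, and summarizeCells are hypothetical names, and averaging directly in lon/lat is a simplifying assumption (the package is described as working in Mollweide coordinates, so its numbers may differ slightly).

```typescript
// One raster cell inside the boundary: center coordinates plus its value
// (residents for GHS_POP, built-up surface for GHS_BUILT_S).
interface Cell { lon: number; lat: number; value: number }

interface DensityStats {
  total: number;                   // sum of all cell values
  maxDensity: number;              // largest single-cell value
  center: [number, number] | null; // value-weighted [lon, lat], null if empty
}

// Round to 5 decimal places, matching the documented coordinate precision.
const round5 = (x: number): number => Math.round(x * 1e5) / 1e5;

// Hypothetical helper: reduce cells to total, peak, and weighted center.
function summarizeCells(cells: Cell[]): DensityStats {
  let total = 0;
  let maxDensity = 0;
  let sumLon = 0;
  let sumLat = 0;
  for (const { lon, lat, value } of cells) {
    if (value <= 0) continue; // skip empty or no-data cells
    total += value;
    maxDensity = Math.max(maxDensity, value);
    sumLon += lon * value;
    sumLat += lat * value;
  }
  const center: [number, number] | null =
    total > 0 ? [round5(sumLon / total), round5(sumLat / total)] : null;
  return { total: Math.round(total), maxDensity, center };
}
```

Weighting by cell value is what pulls the center toward where residents or buildings actually are, rather than toward the geometric middle of the polygon.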
Option 2: Standalone Feature Module
If you already have arbitrary GeoJSON polygons, you can extract the exact same density metrics natively:
import { enrichFeatureWithGHS } from '@polymech/gadm';
const myCustomPolygon = { type: 'Feature', geometry: { ... } };
const stats = await enrichFeatureWithGHS(myCustomPolygon, {
  pop: true
});
console.log(stats.ghsPopulation, stats.ghsPopCenter);
Boundary Geometries & Caching
Fetching complex geospatial polygons (like country borders or district subdivisions) requires merging hundreds of geometries. Doing that per request at runtime is too slow, so @polymech/gadm relies on pre-compiled caches and aggressive size compression.
Resolving Boundary Target Levels
When building interactive user interfaces or fetching boundaries through the top-level API (handleGetRegionBoundary), the returned FeatureCollection granularity is controlled strictly through the targetLevel (or programmatic contentLevel).
- Outer Boundary: Set targetLevel exactly equal to the region's intrinsic level (e.g., targeting Level 0 for Spain). The engine uses turf to dissolve internal geometries and returns a single merged polygon covering the whole region.
- Inner Subdivisions: Provide a targetLevel deeper than the intrinsic level (e.g., targeting Level 1 for Spain). The engine filters for the constituent parts and returns a FeatureCollection in which each sub-region (the 17 Spanish States) is preserved as its own geometry feature.
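The two cases can be told apart from the GADM ID alone, since the IDs used throughout this README gain one dot per admin level ("ESP" is level 0, "ESP.6_1" level 1, "ESP.6.1_1" level 2). The sketch below encodes that rule. gidLevel and boundaryMode are hypothetical helpers, and both the default for a missing targetLevel and the error on a shallower level are assumptions rather than documented behavior.

```typescript
type BoundaryMode = 'outer' | 'inner';

// GADM IDs gain one dot per level: "ESP" (0), "ESP.6_1" (1), "ESP.6.1_1" (2).
function gidLevel(gid: string): number {
  return gid.split('.').length - 1;
}

// Hypothetical helper mirroring the two cases described above.
function boundaryMode(gid: string, targetLevel?: number): BoundaryMode {
  const own = gidLevel(gid);
  // Assumption: with no targetLevel, fall back to the outer boundary.
  const target = targetLevel ?? own;
  if (target < own || target > 5) {
    throw new RangeError(`targetLevel ${target} is not valid for ${gid} (level ${own})`);
  }
  return target === own ? 'outer' : 'inner';
}
```

For example, boundaryMode('ESP', 0) yields 'outer', while boundaryMode('ESP', 1) yields 'inner'.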
Geometry Simplification & Resolution
Both the TypeScript and C++ pipelines apply geometry simplification controlled by a resolution parameter (default: 4):
| Resolution | Tolerance | Coordinate Precision | Use Case |
|---|---|---|---|
| 1 | 0.0001 | 5 decimals | Maximum detail |
| 4 | 0.005 | 5 decimals | Default — good balance |
| 10 | 0.5 | 5 decimals | Maximum compression |
The formula: tolerance = 0.0001 * 10^((resolution-1) * 4/9). GHS metadata coordinates (ghsPopCenter, ghsBuiltCenters, etc.) are also rounded to 5 decimal places to match geometry precision.
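For reference, the formula can be evaluated directly, as in the sketch below (toleranceFor and roundCoord are hypothetical helper names). Taken literally, it yields about 0.0001 at resolution 1, about 0.0022 at resolution 4, and 1.0 at resolution 10, which does not match the 0.005 and 0.5 listed in the table, so the shipped pipelines may round or clamp differently.

```typescript
// Tolerance for a resolution in [1, 10], evaluated from the stated formula:
// tolerance = 0.0001 * 10^((resolution - 1) * 4 / 9)
function toleranceFor(resolution: number): number {
  if (resolution < 1 || resolution > 10) {
    throw new RangeError('resolution must be between 1 and 10');
  }
  return 0.0001 * Math.pow(10, ((resolution - 1) * 4) / 9);
}

// Round a coordinate to 5 decimal places.
function roundCoord(value: number): number {
  return Math.round(value * 1e5) / 1e5;
}
```

The exponent grows linearly with the resolution, so each step multiplies the tolerance by the same factor (10^(4/9), roughly 2.8).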
Smart Caching & Cache Resolution Order
To ensure instantaneous delivery (sub-10ms) of these polygons to your HTTP APIs:
- Pre-Caching Scripts: Run npm run boundaries -- --country=all (TypeScript) or npm run boundaries:cpp (C++). Both iterate downwards to compute and compress hierarchical layers 0 through 5 for each country. Existing files are skipped for easy resume.
- Cascading Cache Lookups: The package resolves caches in order:
  - Exact sub-region cache file: boundary_{gadmId}_{level}.json
  - Full country cache file: boundary_{countryCode}_{level}.json (prefix-filtered for sub-region queries)
  - Environment paths: process.env.GADM_CACHE, then process.cwd()/cache/gadm, then ../cache/gadm
  - Live GeoPackage query (fallback)
- Payload Compression (~25MB -> ~1MB): Boundary geometries are compressed using @turf/simplify (TS) or GEOS GEOSSimplify_r (C++) with matching tolerance, ensuring consistent output from both pipelines.
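The lookup order above can be sketched as a pure function that lists the files to probe, most specific first. This is an illustration under stated assumptions, not the package's resolver: cacheCandidates is a hypothetical helper, ../cache/gadm is taken to be relative to the working directory, and every directory is assumed to be tried for the exact file before the country-wide file is considered.

```typescript
// Hypothetical helper: list cache files to probe, most specific first.
function cacheCandidates(
  gadmId: string,
  level: number,
  env: { GADM_CACHE?: string },
  cwd: string,
): string[] {
  // "ESP.6_1" -> "ESP"; a country-level ID is its own country code.
  const country = gadmId.split('.')[0];
  const files = [`boundary_${gadmId}_${level}.json`];
  if (country !== gadmId) files.push(`boundary_${country}_${level}.json`);

  // Directory ladder: env override, then the working directory, then its parent.
  const dirs: string[] = [];
  if (env.GADM_CACHE) dirs.push(env.GADM_CACHE);
  dirs.push(`${cwd}/cache/gadm`, `${cwd}/../cache/gadm`);

  const candidates: string[] = [];
  for (const file of files) {
    for (const dir of dirs) candidates.push(`${dir}/${file}`);
  }
  return candidates;
}
```

A caller would check each candidate for existence in order and fall back to the live GeoPackage query only when none is found.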
Database Module (Low-Level)
| Function | Description |
|---|---|
| loadDatabase() | Load parquet into memory (lazy, singleton) |
| getColumns() | Return column names |
| resetCache() | Clear the in-memory row cache |
GadmRow is Record<string, string> — all values normalized to strings.
Types
All types are exported from the package entry point:
import type {
  GADMNode, GADMTree, BuildTreeOptions,                            // tree
  NamesOptions, NamesResult, GadmRow,                              // names + database
  ItemsOptions, GeoJSONFeature, GeoJSONCollection,                 // items
  SearchRegionsOptions, SearchRegionsResult, RegionNamesOptions,   // wrapper
} from '@polymech/gadm';
Data Layout
Parquet File
data/gadm_database.parquet — 356,508 rows, 6.29 MB
| Column Group | Columns | Description |
|---|---|---|
| GID | GID_0 … GID_5 | GADM identifiers per level |
| NAME | NAME_0 … NAME_5 | Display names per level |
| VARNAME | VARNAME_1 … VARNAME_5 | Alternate names / translations |
129,448 rows have VARNAME_1 values (e.g. Badakhshān, Bavière).
GADM Levels
| Level | Typical Meaning | Example (Spain) |
|---|---|---|
| 0 | Country | Spain |
| 1 | State / Region | Cataluña |
| 2 | Province / Department | Barcelona |
| 3 | District / Comarca | Baix Llobregat |
| 4 | Municipality | Castelldefels |
| 5 | Sub-municipality | (rare, not all countries) |
Note: GADM does not include neighborhood/Stadtteil-level data.
For sub-city resolution (e.g. Johannstadt in Dresden), OSM/Nominatim would be needed.
Caching
Tree Cache (cacheDir)
When cacheDir is passed to buildTree(), the full tree is saved as tree_{md5}.json.
Subsequent calls with the same name/admin return the cached tree instantly (~1ms).
Wrapper Cache (GADM_CACHE)
The wrapper module caches search results, boundaries, and region names in $GADM_CACHE/ (default ./cache/gadm).
Files are keyed by MD5 hash of the query parameters.
In-Memory Cache
loadDatabase() is a singleton — the 356K-row array is loaded once per process.
Call resetCache() to force a reload (useful in tests).
Precalculating Boundaries
To improve runtime performance (especially for large geographies which take time to dissolve), you can precalculate and cache standard admin boundaries using the included CLI script:
cd packages/gadm
# Precalculate the outer boundary for a specific country
npm run boundaries -- --country=DEU
# Precalculate inner boundaries for a specific level
npm run boundaries -- --country=DEU --level=1
# Precalculate the outer boundary for ALL countries worldwide
npm run boundaries -- --country=all
Precalculated boundaries are saved as native .json artifacts inside the configured cache directory (./cache/gadm/boundary_{CODE}_{LEVEL}.json).
C++ Native Pipeline (Recommended for Batch)
For full batch generation across all 263 countries × 6 levels, the native C++ port provides significantly faster processing using GDAL/GEOS/PROJ directly. It reads the same GeoPackage, performs geometry unions via WKB-precision GEOS, and enriches with GHS raster data — producing identical output to the TypeScript pipeline.
# Build (requires vcpkg + CMake)
npm run build:cpp # or: cmake --build cpp/build --config Release
# Run via npm scripts
npm run boundaries:cpp # all countries
npm run boundaries:cpp -- --country=DEU # single country
# Sub-region splitting (generates boundary_ESP.6_1_4.json etc.)
npm run boundaries:cpp -- --country=all --level=4 --split-levels=1
# Custom resolution (1-10, default=4)
npm run boundaries:cpp -- --country=DEU --resolution=6
Output includes GHS enrichment by default when tiff files are present in data/ghs/:
- ghsPopulation, ghsPopMaxDensity, ghsPopCenter, ghsPopCenters
- ghsBuiltWeight, ghsBuiltMax, ghsBuiltCenter, ghsBuiltCenters
See cpp/README.md for build prerequisites, full CLI reference, and architecture details.
Data Refresh
Regenerate data/gadm_database.parquet from a GADM GeoPackage source file.
Prerequisites
Download one of the core GeoPackage database files. You can point the package to your gpkg location using the GADM_GPKG_PATH environment variable, or store it in your working directory at cache/gadm/gadm_410.gpkg:
https://geodata.ucdavis.edu/gadm/gadm4.1/gadm_410-gpkg.zip → unzip → gadm_410.gpkg
https://geodata.ucdavis.edu/gadm/gadm4.1/gadm_410-raw.gpkg
Run
cd packages/gadm
npm run refresh
The script (scripts/refresh-database.ts):
- Opens the GeoPackage (SQLite) via better-sqlite3
- Auto-detects table format (per-level ADM_x tables or single flat table)
- Extracts GID, NAME, and VARNAME columns for levels 0–5
- Writes to data/gadm_database.parquet via hyparquet-writer
Dev Dependencies (refresh only)
| Package | Purpose |
|---|---|
| better-sqlite3 | Read GeoPackage (SQLite) files |
| hyparquet-writer | Write Parquet output |
These are devDependencies — not needed at runtime.
Tests
cd packages/gadm
npx vitest run # all tests
npx vitest run src/__tests__/tree.test.ts # tree tests only
Tree Tests
JSON outputs saved to tests/tree/ for inspection:
| File | Content |
|---|---|
| test-cataluna.json | Full Cataluña tree (1,000 nodes, 955 leaves) |
| test-germany-summary.json | Germany L1 summary (16 Bundesländer, 16,402 nodes) |
| test-dresden.json | Sachsen → Dresden subtree with all children |
| test-iterators.json | DFS/BFS/walkLevel/findNode verification data |
Name Tests
src/__tests__/province-names.test.ts — tests getNames() for France departments, exact matches, fuzzy suggestions.
Architecture
packages/gadm/
├── cpp/ # C++ native pipeline (GDAL/GEOS/PROJ)
│ ├── src/ # main.cpp, gpkg_reader, geo_merge, ghs_enrich
│ ├── CMakeLists.txt
│ └── vcpkg.json
├── data/
│ ├── gadm_database.parquet # 356K rows, 6.29 MB
│ ├── gadm_continent.json # Continent → ISO3 mapping
│ └── ghs/ # GHS GeoTIFF rasters (optional)
├── dist/
│ └── win-x64/ # Compiled C++ binary + DLLs
├── scripts/
│ └── refresh-database.ts # GeoPackage → Parquet converter
├── src/
│ ├── database.ts # Parquet reader (hyparquet)
│ ├── names.ts # Name/code lookup + fuzzy match
│ ├── items.ts # GeoJSON boundaries from CDN
│ ├── gpkg-reader.ts # GeoPackage boundary reader + C++ cache fallback
│ ├── enrich-ghs.ts # GHS GeoTIFF enrichment (TS)
│ ├── wrapper.ts # Server-facing API with cache
│ ├── tree.ts # Tree builder + iterators
│ ├── index.ts # Barrel exports
│ └── __tests__/
│ ├── tree.test.ts # Tree building + iterator tests
│ └── province-names.test.ts
├── tests/
│ ├── tree/ # Test output JSONs
│ └── cache/gadm/ # Tree cache files
└── package.json
Dependencies
| Package | Type | Purpose |
|---|---|---|
| hyparquet | runtime | Read Parquet files (zero native deps) |
| zod | runtime | Schema validation |
| better-sqlite3 | dev | GeoPackage reader (refresh only) |
| hyparquet-writer | dev | Parquet writer (refresh only) |
| vitest | dev | Test runner |
| typescript | dev | Build |
