# @polymech/gadm

**[Homepage](https://service.polymech.info/)** · **[Source Code](https://git.polymech.info/polymech/gadm-ts)**

Pure TypeScript interface to the [GADM](https://gadm.org) v4.1 administrative boundaries database.
Zero Python dependencies: parquet data, tree construction, iterators, and caching all run in Node.js.

## Overview

| Feature | Description |
|---------|-------------|
| **Database** | 356K rows from GADM 4.1, stored as a 6 MB Parquet file |
| **Admin Levels** | L0 (country) → L5 (municipality/commune) |
| **Tree API** | Build hierarchical trees, walk with DFS/BFS/level iterators |
| **Name Search** | Fuzzy search across all levels with Levenshtein suggestions |
| **GeoJSON** | Fetch boundaries from GADM CDN with corrected names |
| **Caching** | File-based JSON cache for trees and API results |
| **VARNAME** | Alternate names / English translations via `VARNAME_1..5` columns |

---

## Installation

```bash
npm install @polymech/gadm
```

Internal monorepo: referenced via the workspace protocol in `package.json`.

---

## Acknowledgments & PyGADM Port

This package is a direct Node.js/TypeScript port of the Python library [pygadm](https://github.com/gee-community/pygadm), which provides the core parquet-based data structure and fetching methodology.

While bringing these capabilities to the JavaScript ecosystem, we added several enhancements aimed at web applications and browser performance:

- **Aggressive Geometry Simplification:** Integrates `@turf/simplify` and `@turf/truncate` with a configurable `resolution` parameter (1 = full detail, 10 = maximum simplification, default = 4). Compresses raw, unoptimized 25 MB boundary polygons down to ~1 MB browser-friendly payloads and rounds all coordinates (geometry + GHS metadata) to 5 decimal places.
- **Unified Cascading Caches:** Cache lookups cascade across the global `process.env.GADM_CACHE`, the active `process.cwd()`, and local workspace `../cache` mounts.
- **Target-Level Subdivision Extraction:** A unified `targetLevel` API that distinguishes between extracting a single merged outer perimeter and extracting an array of inner subdivisions, derived from recursive `.merge()` operations.
- **Smart Pre-cacher Script:** Includes `boundaries.ts`, an auto-resuming build script that iterates downwards to pre-calculate, dissolve, and compress hierarchy layers 0–5, so the API can serve them from cache instead of running heavy geometry operations at runtime.

---

## Quick Start

```ts
import { buildTree, walkDFS, findNode, searchRegions, getNames } from '@polymech/gadm';

// Build a tree for Spain
const tree = await buildTree({ admin: 'ESP', cacheDir: './cache/gadm' });
console.log(tree.root.children.length); // 18 (comunidades)

// Find a specific region
const bcn = findNode(tree.root, 'Barcelona');
console.log(bcn?.gid); // ESP.6.1_1

// Walk all nodes
for (const node of walkDFS(tree.root)) {
  console.log(' '.repeat(node.level) + node.name);
}

// Search via wrapper API
const result = await searchRegions({ query: 'France', contentLevel: 2 });
console.log(result.data?.length); // ~101 departments
```

---

## API Reference

### Tree Module

#### `buildTree(opts: BuildTreeOptions): Promise<GADMTree>`

Builds a hierarchical tree from the flat parquet data. Results are cached to disk when `cacheDir` is set.

```ts
interface BuildTreeOptions {
  name?: string;     // Region name: "Spain", "Cataluña", "Bayern"
  admin?: string;    // GADM code: "ESP", "DEU.2_1", "FRA.11_1"
  cacheDir?: string; // Path for JSON cache files (optional)
}
```

Exactly one of `name` or `admin` must be set.
Throws if the region is not found in the database.

#### `GADMTree` and `GADMNode`

```ts
interface GADMTree {
  root: GADMNode;    // Root node of the tree
  maxLevel: number;  // Deepest admin level reached (0–5)
  nodeCount: number; // Total nodes across all levels
}

interface GADMNode {
  name: string;         // Display name: "Barcelona"
  gid: string;          // GADM ID: "ESP.6.1_1"
  level: number;        // Admin level 0–5
  children: GADMNode[]; // Sub-regions (sorted alphabetically)
}
```

#### Iterators

The `walk*` and `leaves` functions are generators; use `for...of` or spread them into arrays. `findNode` returns a single matching node.

| Function | Description |
|----------|-------------|
| `walkDFS(node)` | Depth-first traversal, top-down |
| `walkBFS(node)` | Breadth-first, level by level |
| `walkLevel(node, level)` | Only nodes at a specific admin level |
| `leaves(node)` | Only leaf nodes (deepest, no children) |
| `findNode(root, query)` | First DFS match by name or GID (case-insensitive) |

```ts
// Here `tree` is assumed to be a tree built for Cataluña

// Get all provinces (level 2) under Cataluña
const provinces = [...walkLevel(tree.root, 2)];
// → [{ name: 'Barcelona', ... }, { name: 'Girona', ... }, ...]

// Count municipalities
const municipios = [...leaves(tree.root)];
console.log(municipios.length); // 955

// Find by GID
const girona = findNode(tree.root, 'ESP.6.2_1');
```
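
For readers curious how such generators can be structured, here is a minimal, self-contained sketch of depth-first and breadth-first traversal over a `GADMNode`-shaped tree. The `RegionNode` interface mirrors the documented fields, but the names `dfs` and `bfs` and the sample data are illustrative assumptions, not the package's actual implementation.

```typescript
// Hypothetical node shape mirroring the documented GADMNode fields.
interface RegionNode {
  name: string;
  gid: string;
  level: number;
  children: RegionNode[];
}

// Depth-first, top-down: yield a node, then recurse into each child.
function* dfs(node: RegionNode): Generator<RegionNode> {
  yield node;
  for (const child of node.children) yield* dfs(child);
}

// Breadth-first: a FIFO queue guarantees shallower levels come first.
function* bfs(root: RegionNode): Generator<RegionNode> {
  const queue: RegionNode[] = [root];
  while (queue.length > 0) {
    const node = queue.shift()!;
    yield node;
    queue.push(...node.children);
  }
}

// Tiny illustrative tree. The level-3 GID is made up for this example.
const sample: RegionNode = {
  name: 'Cataluña', gid: 'ESP.6_1', level: 1,
  children: [
    {
      name: 'Barcelona', gid: 'ESP.6.1_1', level: 2,
      children: [{ name: 'Baix Llobregat', gid: 'ESP.6.1.1_1', level: 3, children: [] }],
    },
    { name: 'Girona', gid: 'ESP.6.2_1', level: 2, children: [] },
  ],
};

const dfsOrder = Array.from(dfs(sample)).map((n) => n.name);
const bfsOrder = Array.from(bfs(sample)).map((n) => n.name);
```

DFS visits `Baix Llobregat` before `Girona` because it finishes the `Barcelona` subtree first. BFS visits both level-2 nodes before descending.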

---

### Names Module

#### `getNames(opts: NamesOptions): Promise<NamesResult>`

Searches the parquet database for admin areas. Returns deduplicated rows with fuzzy match suggestions on miss.

```ts
interface NamesOptions {
  name?: string;         // Search by name
  admin?: string;        // Search by GADM code
  contentLevel?: number; // Target level (0–5), -1 = auto
  complete?: boolean;    // Return all columns up to contentLevel
}

interface NamesResult {
  rows: GadmRow[];   // Matched records
  level: number;     // Resolved content level
  columns: string[]; // Column names in result
}
```

On miss, throws with Levenshtein-based suggestions:

```
The requested "Franec" is not part of GADM.
The closest matches are: France, Franca, Franco, ...
```
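
As an illustration of how Levenshtein-based suggestions can be produced, the sketch below ranks candidate names by edit distance to the query. It is a generic implementation under assumed names (`levenshtein`, `suggest`) and an assumed tie-breaking rule, not the package's internal code.

```typescript
// Classic dynamic-programming Levenshtein distance
// (insertions, deletions, substitutions all cost 1).
function levenshtein(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Rank candidates by distance to the query (case-insensitive) and keep
// the closest few. Ties are broken alphabetically (an assumption).
function suggest(query: string, candidates: string[], limit = 3): string[] {
  const q = query.toLowerCase();
  return candidates
    .map((name) => ({ name, distance: levenshtein(q, name.toLowerCase()) }))
    .sort((x, y) => x.distance - y.distance || x.name.localeCompare(y.name))
    .slice(0, limit)
    .map((entry) => entry.name);
}

const suggestions = suggest('Franec', ['Germany', 'France', 'Spain', 'Franca', 'Franco']);
```

With this candidate list, the three names at distance 2 from `Franec` (`France`, `Franca`, `Franco`) are returned ahead of `Germany` and `Spain`.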

---

### Items Module

#### `getItems(opts: ItemsOptions): Promise<GeoJSONCollection>`

Fetches GeoJSON boundaries from the GADM CDN, with name correction from the local parquet database (workaround for camelCase bug in GADM GeoJSON responses).

```ts
interface ItemsOptions {
  name?: string | string[];  // Region name(s)
  admin?: string | string[]; // GADM code(s)
  contentLevel?: number;     // Target level, -1 = auto
  includeOuter?: boolean;    // Also include the containing region's external perimeter
  geojson?: boolean;         // Return geometries instead of just properties (metadata)
}
```

Supports continent expansion: `getItems({ name: ['europe'] })` fetches all European countries.
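
Continent expansion can be pictured as a lookup step that runs before the actual fetch. The sketch below is only an illustration: the `CONTINENTS` excerpt and the `expandContinents` helper are hypothetical, and the real `data/gadm_continent.json` may use a different shape and covers far more codes.

```typescript
// Hypothetical excerpt of a continent → ISO3 mapping (illustrative only).
const CONTINENTS: Record<string, string[]> = {
  europe: ['DEU', 'ESP', 'FRA'],
  oceania: ['AUS', 'NZL'],
};

// Replace continent names with their member country codes and pass every
// other entry through unchanged. A Set removes duplicates while keeping
// first-seen order.
function expandContinents(names: string[]): string[] {
  const out = new Set<string>();
  for (const name of names) {
    const members = CONTINENTS[name.toLowerCase()];
    if (members) {
      members.forEach((code) => out.add(code));
    } else {
      out.add(name);
    }
  }
  return Array.from(out);
}

const expanded = expandContinents(['Europe', 'CHE', 'ESP']);
```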

---

### Wrapper Module (Server API)

Higher-level API designed for HTTP handlers. Includes file-based caching via `GADM_CACHE` env var (default: `./cache/gadm`).

| Function | Description |
|----------|-------------|
| `searchRegions(opts)` | Search by name, returns metadata or GeoJSON |
| `getBoundary(gadmId, contentLevel?, cache?, enrichOpts?, resolution?)` | Get GeoJSON boundary for a GADM ID |
| `getRegionNames(opts)` | List sub-region names with depth control |

#### Integration Example (Server API)

Here is an example of wrapping the GADM engine inside a Hono-style HTTP handler (the same approach works in Express) to fetch boundaries at a requested level and enrich their GeoJSON metadata on the fly:

```ts
import { getBoundary } from '@polymech/gadm';
import * as turf from '@turf/turf';

async function handleGetRegionBoundary(c) {
  const id = c.req.param('id'); // e.g. "DEU" or "ESP.6_1"
  const targetLevel = c.req.query('targetLevel'); // e.g. "1" for inner states
  const enrich = c.req.query('enrich') === 'true';

  try {
    // Pass a radix so the query string is always parsed as base 10
    const parsedTargetLevel = targetLevel !== undefined ? parseInt(targetLevel, 10) : undefined;

    // Fetch the boundary FeatureCollection (already cached and compressed when precomputed)
    const result = await getBoundary(id, parsedTargetLevel);

    if ('error' in result) {
      return c.json({ error: result.error }, 404);
    }

    // On-the-fly geometry enrichment
    if (enrich && result.features) {
      for (const feature of result.features) {
        // Area in square kilometers (turf.area returns square meters)
        const areaSqkm = Math.round(turf.area(feature as any) / 1000000);
        feature.properties.areaSqkm = areaSqkm;

        // Bounding box for client camera tracking
        const bbox = turf.bbox(feature as any);
        feature.properties.bbox = bbox;
      }
    }

    return c.json(result, 200);
  } catch (error) {
    // `error` is `unknown` in strict TypeScript, so narrow before reading `.message`
    const message = error instanceof Error ? error.message : String(error);
    return c.json({ error: message }, 500);
  }
}
```

---

## Data Enrichment (Optional GeoTIFFs)

The GADM engine includes optional built-in enrichers that query **European Commission GHSL (Global Human Settlement Layer)** GeoTIFFs directly in Node.js to estimate the **population** and **built-up surface** within any requested boundary.

`getBoundary()` projects bounding boxes to Mollweide (`EPSG:54009`) and extracts spatial windows from the raw raster data, so you get density analytics at 100 m resolution on the fly without setting up heavy PostGIS/QGIS servers.

### Prerequisites (GHSL Data)

You must download the raw GeoTIFF datasets from the EU JRC Open Data portal and store them locally (e.g. in `data/ghs/`). *Warning: These files are >1GB.*

| Dataset | Metric | URL |
|---------|--------|-----|
| `GHS_POP` | Population (2030 Projections) | [GHS_POP_E2030_GLOBE_R2023A_54009_100_V1_0.tif](https://jeodpp.jrc.ec.europa.eu/ftp/jrc-opendata/GHSL/GHS_POP_GLOBE_R2023A/GHS_POP_E2030_GLOBE_R2023A_54009_100/V1-0/GHS_POP_E2030_GLOBE_R2023A_54009_100_V1_0.zip) |
| `GHS_BUILT_S` | Built-up Area / Concrete Surface | [GHS_BUILT_S_E2030_GLOBE_R2023A_54009_100_V1_0.tif](https://jeodpp.jrc.ec.europa.eu/ftp/jrc-opendata/GHSL/GHS_BUILT_S_GLOBE_R2023A/GHS_BUILT_S_E2030_GLOBE_R2023A_54009_100/V1-0/GHS_BUILT_S_E2030_GLOBE_R2023A_54009_100_V1_0.zip) |

### Option 1: Native Wrapper Option (Recommended)

Pass `{ pop: true, built: true }` into `getBoundary()`. It automatically discovers the `.tif` datasets (looking in `data/ghs`, `cache/ghs`, and environment variables), scans the density per feature, calculates the population and built-up centers of mass, and appends them to the GeoJSON `feature.properties`. The result is deep-cloned and cached.

```typescript
const result = await getBoundary(gadmId, targetLevel, undefined, {
  pop: true,
  built: true
});

// result.features[0].properties will now contain:
// {
//   ...
//   "population": 5666,                  <- the standard property, overwritten with the GHS-derived value
//   "ghsPopMaxDensity": 125,             <- highest density 100x100m block
//   "ghsPopCenter": [2.1019, 41.8130],   <- population center of mass (where residents live vs geographical center)
//   "ghsPopCenters": [                   <- up to 5 distinct population clusters [lon, lat, max_density]
//     [2.1019, 41.8130, 125]
//   ],
//   "ghsBuiltWeight": 744080,            <- built-up surface size index
//   "ghsBuiltCenter": [2.1039, 41.8130], <- built-up center of mass (industrial + urban spread)
//   "ghsBuiltCenters": [                 <- up to 5 distinct built-up clusters [lon, lat, max_density]
//     [2.1039, 41.8130, 300]
//   ]
// }
```
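
To make the "center of mass" idea concrete, the sketch below computes a value-weighted center, total, and peak value from a list of raster cells. It is a simplified illustration under stated assumptions: the `Cell` shape and `weightedCenter` name are hypothetical, and averaging longitude/latitude directly is only an approximation of what a Mollweide-projected pipeline would do.

```typescript
// One raster cell: center coordinates plus its value
// (e.g. estimated residents in that 100 m block).
interface Cell {
  lon: number;
  lat: number;
  value: number;
}

interface CenterStats {
  total: number;                   // sum of all positive cell values
  maxDensity: number;              // largest single-cell value
  center: [number, number] | null; // weighted [lon, lat], null if no data
}

// Round to 5 decimal places, matching the precision used elsewhere in this README.
const round5 = (x: number): number => Math.round(x * 1e5) / 1e5;

// Value-weighted mean of cell coordinates: heavily populated cells pull
// the center towards themselves, unlike a purely geometric centroid.
function weightedCenter(cells: Cell[]): CenterStats {
  let total = 0;
  let maxDensity = 0;
  let sumLon = 0;
  let sumLat = 0;
  for (const { lon, lat, value } of cells) {
    if (value <= 0) continue; // skip empty / no-data cells
    total += value;
    maxDensity = Math.max(maxDensity, value);
    sumLon += lon * value;
    sumLat += lat * value;
  }
  if (total === 0) return { total: 0, maxDensity: 0, center: null };
  return { total, maxDensity, center: [round5(sumLon / total), round5(sumLat / total)] };
}

const stats = weightedCenter([
  { lon: 0, lat: 0, value: 1 },
  { lon: 2, lat: 2, value: 3 },
  { lon: 9, lat: 9, value: 0 },
]);
```

In this toy input the geometric midpoint of the two populated cells is `[1, 1]`, but the weighted center lands at `[1.5, 1.5]` because the second cell carries three times the weight.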

### Option 2: Standalone Feature Module

If you already have arbitrary GeoJSON polygons, you can extract the same density metrics directly:

```typescript
import { enrichFeatureWithGHS } from '@polymech/gadm';

const myCustomPolygon = { type: 'Feature', geometry: { ... } };

const stats = await enrichFeatureWithGHS(myCustomPolygon, {
  pop: true
});

console.log(stats.ghsPopulation, stats.ghsPopCenter);
```

---

## Boundary Geometries & Caching

Fetching complex geospatial polygons (like country borders or district subdivisions) requires merging hundreds of complex geometries. Doing this at runtime for a user request is too slow, so `@polymech/gadm` relies on pre-compiled caches and aggressive size compression.

### Resolving Boundary Target Levels

When building interactive user interfaces or fetching boundaries through the top-level API (`handleGetRegionBoundary`), the granularity of the returned `FeatureCollection` is controlled through `targetLevel` (or the programmatic `contentLevel`).

- **Outer Boundary**: Set `targetLevel` equal to the region's intrinsic level (e.g., targeting `Level 0` for Spain). The engine uses `turf` to dissolve internal geometries, returning a single merged polygon that outlines the whole region.
- **Inner Subdivisions**: Provide a `targetLevel` deeper than the intrinsic level (e.g., targeting `Level 1` for Spain). The engine filters for the constituent parts and returns a `FeatureCollection` in which each subdivision (here, Spain's level-1 regions) is preserved as a separate geometry feature.
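
The outer/inner distinction can be expressed as a small decision function. The sketch below infers a region's intrinsic level from its GADM ID, assuming one dot per level below the country, as in the `ESP`, `ESP.6_1`, `ESP.6.1_1` examples in this README. The names `gidLevel` and `boundaryMode` are hypothetical, and so is the handling of a missing or shallower `targetLevel`.

```typescript
type BoundaryMode = 'outer' | 'inner';

// "ESP" → 0, "ESP.6_1" → 1, "ESP.6.1_1" → 2 (one dot per level below country).
function gidLevel(gid: string): number {
  return gid.split('.').length - 1;
}

// Decide which kind of FeatureCollection a request would produce.
function boundaryMode(gid: string, targetLevel?: number): BoundaryMode {
  const intrinsic = gidLevel(gid);
  // Assumption: no explicit target level means "just the outline".
  if (targetLevel === undefined || targetLevel === intrinsic) return 'outer';
  // Assumption: asking for a level above the region itself is an error.
  if (targetLevel < intrinsic) {
    throw new RangeError(`targetLevel ${targetLevel} is above level ${intrinsic} of ${gid}`);
  }
  return 'inner';
}

const spainOutline = boundaryMode('ESP', 0);       // single merged polygon
const spainRegions = boundaryMode('ESP', 1);       // one feature per level-1 region
const catalunaParts = boundaryMode('ESP.6_1', 2);  // provinces inside Cataluña
```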

### Geometry Simplification & Resolution

Both the TypeScript and C++ pipelines apply geometry simplification controlled by a `resolution` parameter (default: **4**):

| Resolution | Tolerance | Coordinate Precision | Use Case |
|------------|-----------|----------------------|----------|
| 1 | 0.0001 | 5 decimals | Maximum detail |
| 4 | 0.005 | 5 decimals | Default, good balance |
| 10 | 0.5 | 5 decimals | Maximum compression |

The formula: `tolerance = 0.0001 * 10^((resolution-1) * 4/9)`. GHS metadata coordinates (`ghsPopCenter`, `ghsBuiltCenters`, etc.) are also rounded to 5 decimal places to match geometry precision.
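
The formula and the 5-decimal rounding translate directly into two small helpers. This is a sketch of the arithmetic only: the names `toleranceFor` and `roundCoord` are hypothetical, and clamping `resolution` to the 1–10 range is an assumption.

```typescript
// tolerance = 0.0001 * 10^((resolution - 1) * 4 / 9), as stated above.
function toleranceFor(resolution: number): number {
  const r = Math.min(10, Math.max(1, resolution)); // assumed clamp to 1..10
  return 0.0001 * Math.pow(10, ((r - 1) * 4) / 9);
}

// Round a coordinate to a fixed number of decimals (5 by default).
function roundCoord(value: number, decimals = 5): number {
  const factor = Math.pow(10, decimals);
  return Math.round(value * factor) / factor;
}

const tolerances = [1, 4, 10].map((r) => toleranceFor(r));
const rounded = roundCoord(2.1019123);
```

Evaluated as written, the formula yields 0.0001 at resolution 1, roughly 0.0022 at resolution 4, and 1.0 at resolution 10, so the tolerance column in the table above is best read as approximate.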

### Smart Caching & Cache Resolution Order

To deliver these polygons to your HTTP APIs quickly (sub-10 ms):

1. **Pre-Caching Scripts**: Run `npm run boundaries -- --country=all` (TypeScript) or `npm run boundaries:cpp` (C++). Both iterate downwards to compute and compress hierarchical layers 0 through 5 for each country. Existing files are skipped, so an interrupted run can simply be resumed.
2. **Cascading Cache Lookups**: The package resolves caches in order:
   - Exact sub-region cache file: `boundary_{gadmId}_{level}.json`
   - Full country cache file: `boundary_{countryCode}_{level}.json` (prefix-filtered for sub-region queries)
   - Environment paths: `process.env.GADM_CACHE`, then `process.cwd()/cache/gadm`, then `../cache/gadm`
   - Live GeoPackage query (fallback)
3. **Payload Compression (~25 MB → ~1 MB)**: Boundary geometries are compressed using `@turf/simplify` (TS) or GEOS `GEOSSimplify_r` (C++) with matching tolerance, ensuring consistent output from both pipelines.
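
The lookup order above can be sketched as a pure function that builds candidate paths and returns the first one that exists. Everything here is illustrative: the helper names are hypothetical, the exact interleaving of file names and directories is an assumption, and the `exists` predicate is injected so the sketch runs without touching the file system (in Node.js you might pass `fs.existsSync`).

```typescript
// Candidate cache file names for a GADM ID, most specific first.
function boundaryCacheCandidates(gadmId: string, level: number): string[] {
  const countryCode = gadmId.split('.')[0];
  const names = [`boundary_${gadmId}_${level}.json`];
  // For sub-region queries, also try the full-country file.
  if (countryCode !== gadmId) names.push(`boundary_${countryCode}_${level}.json`);
  return names;
}

// Cache directories in the documented order: GADM_CACHE, cwd, workspace.
function cacheDirs(envCache: string | undefined, cwd: string): string[] {
  const dirs = [`${cwd}/cache/gadm`, '../cache/gadm'];
  return envCache ? [envCache, ...dirs] : dirs;
}

// Return the first existing cache path, or null to signal that the caller
// should fall back to a live GeoPackage query.
function resolveBoundaryCache(
  gadmId: string,
  level: number,
  dirs: string[],
  exists: (path: string) => boolean,
): string | null {
  for (const name of boundaryCacheCandidates(gadmId, level)) {
    for (const dir of dirs) {
      const candidate = `${dir}/${name}`;
      if (exists(candidate)) return candidate;
    }
  }
  return null;
}

// Simulated file system containing only the full-country file.
const fakeFiles = new Set(['/srv/cache/gadm/boundary_ESP_4.json']);
const hit = resolveBoundaryCache('ESP.6_1', 4, cacheDirs(undefined, '/srv'), (p) => fakeFiles.has(p));
const miss = resolveBoundaryCache('DEU', 0, cacheDirs(undefined, '/srv'), (p) => fakeFiles.has(p));
```

Here the exact sub-region file is missing, so the lookup falls through to the country-level file. A `null` result corresponds to the GeoPackage fallback.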

---

### Database Module (Low-Level)

| Function | Description |
|----------|-------------|
| `loadDatabase()` | Load parquet into memory (lazy, singleton) |
| `getColumns()` | Return column names |
| `resetCache()` | Clear the in-memory row cache |

`GadmRow` is `Record<string, string>`: all values are normalized to strings.

---

## Types

All types are exported from the package entry point:

```ts
import type {
  GADMNode, GADMTree, BuildTreeOptions,                           // tree
  NamesOptions, NamesResult, GadmRow,                             // names + database
  ItemsOptions, GeoJSONFeature, GeoJSONCollection,                // items
  SearchRegionsOptions, SearchRegionsResult, RegionNamesOptions,  // wrapper
} from '@polymech/gadm';
```

---

## Data Layout

### Parquet File

`data/gadm_database.parquet`: **356,508 rows**, **6.29 MB**

| Column Group | Columns | Description |
|--------------|---------|-------------|
| GID | `GID_0` … `GID_5` | GADM identifiers per level |
| NAME | `NAME_0` … `NAME_5` | Display names per level |
| VARNAME | `VARNAME_1` … `VARNAME_5` | Alternate names / translations |

129,448 rows have `VARNAME_1` values (e.g. `Badakhshān`, `Bavière`).

### GADM Levels

| Level | Typical Meaning | Example (Spain) |
|-------|-----------------|-----------------|
| 0 | Country | Spain |
| 1 | State / Region | Cataluña |
| 2 | Province / Department | Barcelona |
| 3 | District / Comarca | Baix Llobregat |
| 4 | Municipality | Castelldefels |
| 5 | Sub-municipality | *(rare, not all countries)* |

> **Note:** GADM does not include neighborhood/Stadtteil-level data.
> For sub-city resolution (e.g. Johannstadt in Dresden), OSM/Nominatim would be needed.

---

## Caching

### Tree Cache (`cacheDir`)

When `cacheDir` is passed to `buildTree()`, the full tree is saved as `tree_{md5}.json`.
Subsequent calls with the same `name`/`admin` return the cached tree instantly (~1ms).

### Wrapper Cache (`GADM_CACHE`)

The wrapper module caches search results, boundaries, and region names in `$GADM_CACHE/` (default `./cache/gadm`).
Files are keyed by MD5 hash of the query parameters.
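
A hash-keyed file name of this kind can be derived as in the sketch below, which uses Node's built-in `node:crypto` module. How the package actually serializes parameters before hashing is not documented here, so the sorted-key JSON serialization and the `cacheKey` / `cacheFileName` helpers are assumptions made for illustration.

```typescript
import { createHash } from 'node:crypto';

// Serialize parameters with sorted keys so that logically equal queries
// produce the same hash regardless of property order (an assumption).
function cacheKey(params: Record<string, unknown>): string {
  const entries = Object.keys(params)
    .sort()
    .map((key) => [key, params[key]]);
  return createHash('md5').update(JSON.stringify(entries)).digest('hex');
}

// Build a file name such as "tree_<32 hex chars>.json".
function cacheFileName(prefix: string, params: Record<string, unknown>): string {
  return `${prefix}_${cacheKey(params)}.json`;
}

const first = cacheFileName('tree', { admin: 'ESP', name: undefined });
const second = cacheFileName('search', { query: 'France', contentLevel: 2 });
const reordered = cacheFileName('search', { contentLevel: 2, query: 'France' });
```

Because the keys are sorted before hashing, `second` and `reordered` resolve to the same file, which is what makes repeated queries hit the cache.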

### In-Memory Cache

`loadDatabase()` is a singleton: the 356K-row array is loaded once per process.
Call `resetCache()` to force a reload (useful in tests).

### Precalculating Boundaries

To improve runtime performance (especially for large geographies, which take time to dissolve), you can precalculate and cache standard admin boundaries using the included CLI script:

```bash
cd packages/gadm

# Precalculate the outer boundary for a specific country
npm run boundaries -- --country=DEU

# Precalculate inner boundaries for a specific level
npm run boundaries -- --country=DEU --level=1

# Precalculate the outer boundary for ALL countries worldwide
npm run boundaries -- --country=all
```

Precalculated boundaries are saved as `.json` files inside the configured cache directory (`./cache/gadm/boundary_{CODE}_{LEVEL}.json`).

### C++ Native Pipeline (Recommended for Batch)

For full batch generation across all 263 countries × 6 levels, the native C++ port provides significantly faster processing using GDAL/GEOS/PROJ directly. It reads the same GeoPackage, performs geometry unions via WKB-precision GEOS, and enriches with GHS raster data, producing output identical to the TypeScript pipeline.

```bash
# Build (requires vcpkg + CMake)
npm run build:cpp          # or: cmake --build cpp/build --config Release

# Run via npm scripts
npm run boundaries:cpp                          # all countries
npm run boundaries:cpp -- --country=DEU         # single country

# Sub-region splitting (generates boundary_ESP.6_1_4.json etc.)
npm run boundaries:cpp -- --country=all --level=4 --split-levels=1

# Custom resolution (1-10, default=4)
npm run boundaries:cpp -- --country=DEU --resolution=6
```

Output includes GHS enrichment by default when TIFF files are present in `data/ghs/`:

- `ghsPopulation`, `ghsPopMaxDensity`, `ghsPopCenter`, `ghsPopCenters`
- `ghsBuiltWeight`, `ghsBuiltMax`, `ghsBuiltCenter`, `ghsBuiltCenters`

See [`cpp/README.md`](./cpp/README.md) for build prerequisites, full CLI reference, and architecture details.

---

## Data Refresh

Regenerate `data/gadm_database.parquet` from a GADM GeoPackage source file.

### Prerequisites

Download one of the core GeoPackage database files. You can point the package to your `gpkg` location using the `GADM_GPKG_PATH` environment variable, or store it in your working directory at `cache/gadm/gadm_410.gpkg`:

```bash
# Option A: zipped GeoPackage (unzip to get gadm_410.gpkg)
curl -LO https://geodata.ucdavis.edu/gadm/gadm4.1/gadm_410-gpkg.zip
unzip gadm_410-gpkg.zip

# Option B: raw GeoPackage
curl -LO https://geodata.ucdavis.edu/gadm/gadm4.1/gadm_410-raw.gpkg
```

### Run

```bash
cd packages/gadm
npm run refresh
```

The script (`scripts/refresh-database.ts`):

1. Opens the GeoPackage (SQLite) via `better-sqlite3`
2. Auto-detects table format (per-level `ADM_x` tables or single flat table)
3. Extracts GID, NAME, and VARNAME columns for levels 0–5
4. Writes to `data/gadm_database.parquet` via `hyparquet-writer`

### Dev Dependencies (refresh only)

| Package | Purpose |
|---------|---------|
| `better-sqlite3` | Read GeoPackage (SQLite) files |
| `hyparquet-writer` | Write Parquet output |

These are `devDependencies` and are not needed at runtime.

---

## Tests

```bash
cd packages/gadm
npx vitest run                              # all tests
npx vitest run src/__tests__/tree.test.ts   # tree tests only
```

### Tree Tests

JSON outputs saved to `tests/tree/` for inspection:

| File | Content |
|------|---------|
| `test-cataluna.json` | Full Cataluña tree (1,000 nodes, 955 leaves) |
| `test-germany-summary.json` | Germany L1 summary (16 Bundesländer, 16,402 nodes) |
| `test-dresden.json` | Sachsen → Dresden subtree with all children |
| `test-iterators.json` | DFS/BFS/walkLevel/findNode verification data |

### Name Tests

`src/__tests__/province-names.test.ts`: tests `getNames()` for France departments, exact matches, fuzzy suggestions.

---

## Architecture

```
packages/gadm/
├── cpp/                          # C++ native pipeline (GDAL/GEOS/PROJ)
│   ├── src/                      # main.cpp, gpkg_reader, geo_merge, ghs_enrich
│   ├── CMakeLists.txt
│   └── vcpkg.json
├── data/
│   ├── gadm_database.parquet     # 356K rows, 6.29 MB
│   ├── gadm_continent.json       # Continent → ISO3 mapping
│   └── ghs/                      # GHS GeoTIFF rasters (optional)
├── dist/
│   └── win-x64/                  # Compiled C++ binary + DLLs
├── scripts/
│   └── refresh-database.ts       # GeoPackage → Parquet converter
├── src/
│   ├── database.ts               # Parquet reader (hyparquet)
│   ├── names.ts                  # Name/code lookup + fuzzy match
│   ├── items.ts                  # GeoJSON boundaries from CDN
│   ├── gpkg-reader.ts            # GeoPackage boundary reader + C++ cache fallback
│   ├── enrich-ghs.ts             # GHS GeoTIFF enrichment (TS)
│   ├── wrapper.ts                # Server-facing API with cache
│   ├── tree.ts                   # Tree builder + iterators
│   ├── index.ts                  # Barrel exports
│   └── __tests__/
│       ├── tree.test.ts          # Tree building + iterator tests
│       └── province-names.test.ts
├── tests/
│   ├── tree/                     # Test output JSONs
│   └── cache/gadm/               # Tree cache files
└── package.json
```

## Dependencies

| Package | Type | Purpose |
|---------|------|---------|
| `hyparquet` | runtime | Read Parquet files (zero native deps) |
| `zod` | runtime | Schema validation |
| `better-sqlite3` | dev | GeoPackage reader (refresh only) |
| `hyparquet-writer` | dev | Parquet writer (refresh only) |
| `vitest` | dev | Test runner |
| `typescript` | dev | Build |