CLI Tools
TopoBathySim includes a unified command-line utility, cache_manager, for
inspecting cached data, verifying its integrity, and purging it. A separate
run_server script starts the tile server.
Understanding the Cache Hierarchy
TopoBathySim maintains five cache tiers, ordered from the cheapest to rebuild (and therefore safest to purge) to the most expensive:
| # | Tier | Paths (relative to …) | Rebuild cost |
|---|---|---|---|
| 1 | Output tiles & metadata | | Seconds: re-rendered on next tile request |
| 2 | Fused zarr | | Minutes: re-fused from provider zarr |
| 3 | Provider zarr | | Minutes–hours: re-rasterised from raw source files |
| 4 | Discovery caches | 5 JSON/GeoJSON metadata files (see below) | Seconds–minutes: re-queried from disk or network |
| 5 | Raw source files ⚠️ | | Hours: full re-download |
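The tier table can also be read as a small data structure. The sketch below is illustrative only: the CacheTier class and the tiers_to_rebuild helper are hypothetical names, not part of TopoBathySim, and the cost strings are shortened from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheTier:
    """One row of the tier table (hypothetical helper, not a TopoBathySim class)."""
    number: int
    name: str
    rebuild_cost: str


# Names and costs are copied from the tier table; paths are omitted on purpose.
CACHE_TIERS = (
    CacheTier(1, "Output tiles & metadata", "seconds"),
    CacheTier(2, "Fused zarr", "minutes"),
    CacheTier(3, "Provider zarr", "minutes to hours"),
    CacheTier(4, "Discovery caches", "seconds to minutes"),
    CacheTier(5, "Raw source files", "hours"),
)


def tiers_to_rebuild(purged: set[int]) -> list[CacheTier]:
    """Return the purged tiers in table order, so the cost of a purge is easy to read."""
    return [tier for tier in CACHE_TIERS if tier.number in purged]


for tier in tiers_to_rebuild({1, 2, 4}):
    print(f"tier {tier.number}: {tier.name} (rebuild: {tier.rebuild_cost})")
```

Listing the rebuild cost next to each purged tier makes it easier to judge a purge before running it.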
Tier 4 discovery cache files (lightweight indices, fast to rebuild):
| File (relative to …) | Stores |
|---|---|
| | bbox → BAG download-URL list (NCEI ArcGIS API) |
| | NCEI landing-page URL → |
| | tile_id → HTTPS asset URL (S3 glob lookup) |
| | tile footprint spatial index (downloaded from NOAA S3) |
| | per-tile RAT |
| | per-project tile-index shapefiles (one per lidar project) |
| | project ID → folder name (S3 index HTML) |
| | project spatial bbox index (GeoJSON) |
Discovery caches survive server restarts so tile generation is fully offline once a region is hydrated. Invalidate them when you expect newly published survey data to be available.
Note
noaa_bluetopo, noaa_topobathy, and usgs_3dep are streaming
providers — they never download raw source files. They read Cloud
Optimized GeoTIFFs directly from S3 and write straight to provider zarr.
Only ncei_bag, usgs_lidar, and ncei_cudem populate tier 5.
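The split between streaming and downloading providers described in the note can be written as a lookup. This is a minimal sketch: the provider names come from the note, while the constant and function names are hypothetical and do not mirror TopoBathySim internals.

```python
# Provider names are taken from the note; the constants and the helper
# below are illustrative assumptions, not TopoBathySim APIs.
STREAMING_PROVIDERS = frozenset({"noaa_bluetopo", "noaa_topobathy", "usgs_3dep"})
DOWNLOADING_PROVIDERS = frozenset({"ncei_bag", "usgs_lidar", "ncei_cudem"})


def populates_raw_tier(provider: str) -> bool:
    """Return True if the provider stores raw source files (tier 5)."""
    if provider in DOWNLOADING_PROVIDERS:
        return True
    if provider in STREAMING_PROVIDERS:
        return False
    raise ValueError(f"unknown provider: {provider!r}")


print(populates_raw_tier("ncei_bag"))       # downloading provider
print(populates_raw_tier("noaa_bluetopo"))  # streaming provider
```

In practice this means purging tier 5 only affects regions served by the three downloading providers.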
Cache Manager
Use cache_manager for all cache operations: status, integrity checks, and
selective or bulk purging.
Interactive Mode (default)
Run without arguments to enter the interactive menu:
```bash
python -m topobathysim.scripts.cache_manager
```
The menu displays a live status table and loops after each operation:
```text
TopoBathySim Cache Manager
==========================
Cache root : ~/.cache/topobathysim
Total size : 4.2 GB
  #    Tier                         Size      Items
────────────────────────────────────────────────────────
✓ [1]  Output tiles & metadata      23.4 MB     847
✓ [2]  Fused zarr                    0.2 MB       3
✓ [3]  Provider zarr                 3.8 GB      12
✓ [4]  Discovery caches             12.0 KB       5
✓ [5]  Raw source files ⚠            0.4 GB      38
─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
[1–5] purge tier   [c] check integrity   [i] tier info   [q] quit
```
Available interactive commands:
- 1–5: purge that tier (prompts for confirmation; tier 5 requires typing yes)
- c: run a read-only integrity audit across all providers and discovery caches
- i: print detailed per-tier descriptions with when/why to purge
- q: quit
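The stricter confirmation for tier 5 could look roughly like the sketch below. It is not the actual cache_manager code: the function name, the prompt wording, and the assumption that tiers 1–4 accept y or yes are all illustrative. Only the rule that tier 5 requires the full word yes comes from the list above.

```python
from typing import Callable


def confirm_purge(tier: int, ask: Callable[[str], str] = input) -> bool:
    """Ask the user to confirm a purge (hypothetical helper).

    Tier 5 requires the literal word "yes", as documented. Accepting "y"
    for the other tiers is an assumption made for this sketch.
    """
    if tier == 5:
        answer = ask("Tier 5 holds raw source files. Type 'yes' to purge: ")
        return answer.strip() == "yes"
    answer = ask(f"Purge tier {tier}? [y/N] ")
    return answer.strip().lower() in {"y", "yes"}


# The `ask` parameter lets the prompt be exercised without a terminal.
print(confirm_purge(5, ask=lambda _prompt: "y"))    # a bare "y" is rejected for tier 5
print(confirm_purge(5, ask=lambda _prompt: "yes"))  # the full word is accepted
```

Passing the prompt function as a parameter keeps the confirmation logic testable without a terminal.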
Non-Interactive Flags
```bash
# 1. Integrity audit (read-only)
python -m topobathysim.scripts.cache_manager --check

# 2. Integrity check + delete corrupt files and stale locks
python -m topobathysim.scripts.cache_manager --clean

# 3. Purge a specific tier
python -m topobathysim.scripts.cache_manager --purge 1

# 4. Purge multiple tiers
python -m topobathysim.scripts.cache_manager --purge 1 2 4

# 5. Purge all tiers (prompts for tier 5 confirmation unless --yes)
python -m topobathysim.scripts.cache_manager --purge all

# 6. Dry run: preview what would be deleted without deleting
python -m topobathysim.scripts.cache_manager --dry-run --purge all

# 7. Skip confirmation prompts (use with care for tier 5)
python -m topobathysim.scripts.cache_manager --yes --purge 1 2
```
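For automation (a cron job or a CI step, for example), the same invocations can be assembled from Python. The sketch below uses only the module path and flags shown above; build_command and preview_purge are hypothetical helper names. It also assumes, without confirmation from the documentation, that the CLI exits with a non-zero status on failure.

```python
import subprocess
import sys

MODULE = "topobathysim.scripts.cache_manager"  # module path as shown above


def build_command(purge, dry_run=False, yes=False):
    """Assemble the argument list for a purge run (hypothetical helper)."""
    cmd = [sys.executable, "-m", MODULE]
    if dry_run:
        cmd.append("--dry-run")
    if yes:
        cmd.append("--yes")
    cmd.append("--purge")
    cmd.extend(str(tier) for tier in purge)
    return cmd


def preview_purge(purge):
    """Run a dry run and return its output so a caller can log or inspect it.

    check=True raises CalledProcessError on a non-zero exit status; that the
    CLI signals failure this way is an assumption.
    """
    result = subprocess.run(
        build_command(purge, dry_run=True),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


# Only the command is printed here; nothing is executed or deleted.
print(" ".join(build_command([1, 2], dry_run=True)[1:]))
```

Running the dry run first and the real purge as a separate, explicit step keeps unattended scripts from deleting more than intended.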
Arguments
- --check: read-only integrity scan (corrupt files, stale locks, malformed JSON)
- --clean: same as --check, but also deletes the corrupt or stale items
- --purge TIER [TIER ...]: tier number(s) 1–5, or all
- --dry-run: show what would be deleted without deleting
- --yes: skip confirmation prompts
- --lock-timeout N: seconds before a .lock file is considered stale (default: 3600)
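To show how these flags combine, here is a minimal argparse sketch that mirrors the documented interface. It is a reconstruction from the argument list, not the real parser: build_parser, resolve_tiers, and the help strings are assumptions.

```python
import argparse

VALID_TIERS = ("1", "2", "3", "4", "5", "all")


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the documented flags (a sketch, not the real one)."""
    parser = argparse.ArgumentParser(prog="cache_manager")
    parser.add_argument("--check", action="store_true", help="read-only integrity scan")
    parser.add_argument("--clean", action="store_true", help="scan and delete corrupt/stale items")
    parser.add_argument("--purge", nargs="+", choices=VALID_TIERS, metavar="TIER",
                        help="tier number(s) 1-5, or 'all'")
    parser.add_argument("--dry-run", action="store_true", help="preview deletions only")
    parser.add_argument("--yes", action="store_true", help="skip confirmation prompts")
    parser.add_argument("--lock-timeout", type=int, default=3600, metavar="N",
                        help="seconds before a .lock file is considered stale")
    return parser


def resolve_tiers(values):
    """Expand the --purge values into a sorted list of tier numbers."""
    if not values:
        return []
    if "all" in values:
        return [1, 2, 3, 4, 5]
    return sorted({int(value) for value in values})


args = build_parser().parse_args(["--dry-run", "--purge", "1", "2", "4"])
print(args.dry_run, resolve_tiers(args.purge), args.lock_timeout)
```

With no arguments every flag is off and --purge is empty, which matches the documented fallback to the interactive menu.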
Common Scenarios
Re-render tiles after a style change
```bash
python -m topobathysim.scripts.cache_manager --purge 1 --yes
```
Force re-fusion after updating the policy YAML
```bash
# Purge fused zarr so next tile request re-runs the fusion pipeline
python -m topobathysim.scripts.cache_manager --purge 2 --yes
```
NCEI has published new multibeam surveys for your area
```bash
python -m topobathysim.scripts.cache_manager --purge 4 --yes
```
NOAA has updated BlueTopo tiles (filenames changed with a new date stamp)
```bash
# Stale tile_url_cache entries will 404; purge tier 4 to force re-discovery
python -m topobathysim.scripts.cache_manager --purge 4 --yes
```
BlueTopo sidecar RAT_Links returning 404 (stale tile scheme)
When NOAA republishes a tile with a new date stamp, the RAT_Link URLs
baked into the cached BlueTopo_Tile_Scheme.gpkg go stale. The provider
writes a .failed sentinel so downloads are not retried on every request,
but the underlying surveys can no longer be resolved to human-readable names
until the scheme is refreshed:
```bash
# Purge tier 4: clears the GPKG, sidecars, and .failed sentinels together
python -m topobathysim.scripts.cache_manager --purge 4 --yes
```
A new NOAA topobathy lidar project was added to the archive
```bash
python -m topobathysim.scripts.cache_manager --purge 4 --yes
```
Start completely fresh (keep raw source files)
```bash
# Clears tiers 1–4; raw files are retained, so the rebuild needs no re-download
python -m topobathysim.scripts.cache_manager --purge 1 2 3 4 --yes
```
Check for corrupt or stale files after a crash
```bash
python -m topobathysim.scripts.cache_manager --check
# If issues are found, run with --clean to delete them
python -m topobathysim.scripts.cache_manager --clean
```
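To give a feel for what a read-only integrity scan involves, the sketch below looks for two of the problem types named under --check: stale .lock files and malformed JSON. It is not the cache_manager implementation. The function names are hypothetical, and judging staleness by file modification time is an assumption; only the 3600-second default comes from the documentation.

```python
import json
import time
from pathlib import Path
from typing import List, Optional


def find_stale_locks(root: Path, lock_timeout: float = 3600.0,
                     now: Optional[float] = None) -> List[Path]:
    """List .lock files whose modification time is older than lock_timeout seconds.

    Using mtime as the staleness signal is an assumption of this sketch.
    """
    now = time.time() if now is None else now
    return sorted(
        path for path in root.rglob("*.lock")
        if path.is_file() and now - path.stat().st_mtime > lock_timeout
    )


def find_malformed_json(root: Path) -> List[Path]:
    """List .json files under root that fail to parse. Nothing is deleted."""
    bad = []
    for path in sorted(root.rglob("*.json")):
        try:
            json.loads(path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            bad.append(path)
    return bad
```

Both helpers only report paths. A --clean style pass would delete the reported files as a separate step, which keeps the audit safe to run at any time.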
Note
Purging a tier leaves every other tier untouched. For example, purging tier 4 (discovery caches) does not delete provider zarr (tier 3) or raw source files (tier 5). If the underlying provider data is still cached, the discovery caches are re-populated from that local data on the next tile request, and no network calls are needed for already-cached survey regions.
Server Management
To run the tile server for local development or production usage:
```bash
# Start the server on port 9595 with 8 worker processes
micromamba run -n topobathysim python service/run_server.py --host 0.0.0.0 --port 9595 --workers 8
```
Arguments:
- --host: Bind address (default: 127.0.0.1)
- --port: Port to listen on (default: 9595)
- --workers: Number of worker processes (default: 4)
- --debug: Enable debug logging (default: 0)