CLI Tools

TopoBathySim includes a unified command-line utility, cache_manager, for inspecting cached data, verifying its integrity, and purging it. A separate run_server script starts the tile server.

Understanding the Cache Hierarchy

TopoBathySim maintains five cache tiers, numbered from the safest to purge (tier 1) to the most expensive to rebuild (tier 5):

Cache Tier Inventory

| # | Tier | Paths (relative to ~/.cache/topobathysim/) | Rebuild cost |
|---|------|--------------------------------------------|--------------|
| 1 | Output tiles & metadata | tiles/visual/, tiles/data/, tiles/raw/ (PNGs, NPY/NPZ, _meta.json, _src.npz) | Seconds: re-rendered on next tile request |
| 2 | Fused zarr | fused_zarr/ (multi-provider composite grids) | Minutes: re-fused from provider zarr |
| 3 | Provider zarr | {provider}/zarr/*.zarr (rasterised per-provider grids) | Minutes–hours: re-rasterised from raw source files |
| 4 | Discovery caches | 5 JSON/GeoJSON metadata files plus the BlueTopo tile scheme, sidecars, and tile-index zips (see below) | Seconds–minutes: re-queried from disk or network |
| 5 | Raw source files ⚠️ | *.bag, *.tiff, *.tif, *.laz in provider directories | Hours: full re-download |

Tier 4 discovery cache files (lightweight indices, fast to rebuild):

| File (relative to ~/.cache/topobathysim/) | Stores |
|-------------------------------------------|--------|
| ncei_bag/discovery_cache.json | bbox → BAG download-URL list (NCEI ArcGIS API) |
| metadata/ncei_bag_redirects.json | NCEI landing-page URL → .bag file URL |
| noaa_bluetopo/tile_url_cache.json | tile_id → HTTPS asset URL (S3 glob lookup) |
| noaa_bluetopo/BlueTopo_Tile_Scheme.gpkg | tile footprint spatial index (downloaded from NOAA S3) |
| noaa_bluetopo/sidecars/ | per-tile RAT .aux.xml files mapping pixel values to survey IDs; .failed sentinels mark tiles whose RAT_Link is stale (404) |
| noaa_topobathy/*.zip | per-project tile-index shapefiles (one per lidar project) |
| metadata/noaa_coastal_lidar.json | project ID → folder name (S3 index HTML) |
| metadata/noaa_project_extents.geojson | project spatial bbox index (GeoJSON) |

Discovery caches survive server restarts, so tile generation can run fully offline once a region is hydrated. Invalidate them when you expect newly published survey data to be available.

Note

noaa_bluetopo, noaa_topobathy, and usgs_3dep are streaming providers — they never download raw source files. They read Cloud Optimized GeoTIFFs directly from S3 and write straight to provider zarr. Only ncei_bag, usgs_lidar, and ncei_cudem populate tier 5.
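The tier layout above can be summarised as a small path classifier. The following is only an illustrative sketch built from the paths listed in this section; classify_tier is a hypothetical helper, not a function exposed by cache_manager, and the real tool may apply different rules.

```python
from pathlib import PurePosixPath
from typing import Optional

# File suffixes the inventory lists for tier 5 (raw source files).
RAW_SUFFIXES = {".bag", ".tiff", ".tif", ".laz"}

# Fixed-name tier 4 files, relative to the cache root.
DISCOVERY_FILES = {
    "ncei_bag/discovery_cache.json",
    "metadata/ncei_bag_redirects.json",
    "noaa_bluetopo/tile_url_cache.json",
    "noaa_bluetopo/BlueTopo_Tile_Scheme.gpkg",
    "metadata/noaa_coastal_lidar.json",
    "metadata/noaa_project_extents.geojson",
}


def classify_tier(rel_path: str) -> Optional[int]:
    """Map a path relative to ~/.cache/topobathysim/ to a tier number.

    Hypothetical helper: returns None for paths this sketch does not recognise.
    """
    path = PurePosixPath(rel_path)
    parts = path.parts
    if not parts:
        return None
    if parts[0] == "tiles":
        return 1  # output tiles & metadata
    if parts[0] == "fused_zarr":
        return 2  # multi-provider composite grids
    # {provider}/zarr/*.zarr and anything stored inside such a store
    if len(parts) >= 3 and parts[1] == "zarr" and parts[2].endswith(".zarr"):
        return 3
    if path.as_posix() in DISCOVERY_FILES:
        return 4
    if parts[:2] == ("noaa_bluetopo", "sidecars"):
        return 4  # RAT sidecars and .failed sentinels
    if len(parts) == 2 and parts[0] == "noaa_topobathy" and path.suffix == ".zip":
        return 4  # per-project tile-index shapefiles
    if path.suffix.lower() in RAW_SUFFIXES:
        return 5  # raw source files
    return None
```

For example, classify_tier("ncei_bag/zarr/H12345.zarr/0.0") falls into tier 3, while a .bag file directly under ncei_bag/ falls into tier 5 (the survey name here is made up for illustration).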

Cache Manager

Use cache_manager for all cache operations: status, integrity checks, and selective or bulk purging.

Interactive Mode (default)

Run without arguments to enter the interactive menu:

python -m topobathysim.scripts.cache_manager

The menu displays a live status table and loops after each operation:

TopoBathySim Cache Manager
==========================
  Cache root : ~/.cache/topobathysim
  Total size : 4.2 GB

  #   Tier                            Size    Items
  ────────────────────────────────────────────────────────
✓  [1] Output tiles & metadata          23.4 MB      847
✓  [2] Fused zarr                        0.2 MB        3
✓  [3] Provider zarr                     3.8 GB       12
✓  [4] Discovery caches                 12.0 KB        5
✓  [5] Raw source files ⚠               0.4 GB       38

  ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
  [1–5] purge tier   [c] check integrity   [i] tier info   [q] quit

Available interactive commands:

  • 1–5 — purge that tier (prompts for confirmation; tier 5 requires typing yes)

  • c — run a read-only integrity audit across all providers and discovery caches

  • i — print detailed per-tier descriptions with when/why to purge

  • q — quit
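The stricter confirmation for tier 5 can be pictured with a tiny predicate. This is a sketch under assumptions: is_confirmed is a hypothetical name, and the answers accepted for tiers 1–4 ("y" or "yes") are a guess, since this page only states that tier 5 requires typing yes.

```python
def is_confirmed(tier: int, answer: str) -> bool:
    """Decide whether a typed answer confirms purging the given tier.

    Hypothetical helper. Tier 5 (raw source files) demands the full word
    "yes"; accepting "y" for tiers 1-4 is an assumption of this sketch.
    """
    normalized = answer.strip().lower()
    if tier == 5:
        return normalized == "yes"
    return normalized in {"y", "yes"}
```

Requiring the full word for the most expensive tier makes an accidental keypress much less likely to trigger an hours-long re-download.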

Non-Interactive Flags

# 1. Integrity audit (read-only)
python -m topobathysim.scripts.cache_manager --check

# 2. Integrity check + delete corrupt files and stale locks
python -m topobathysim.scripts.cache_manager --clean

# 3. Purge a specific tier
python -m topobathysim.scripts.cache_manager --purge 1

# 4. Purge multiple tiers
python -m topobathysim.scripts.cache_manager --purge 1 2 4

# 5. Purge all tiers (prompts for tier 5 confirmation unless --yes)
python -m topobathysim.scripts.cache_manager --purge all

# 6. Dry run — preview what would be deleted without deleting
python -m topobathysim.scripts.cache_manager --dry-run --purge all

# 7. Skip confirmation prompts (use with care for tier 5)
python -m topobathysim.scripts.cache_manager --yes --purge 1 2
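When cache maintenance is driven from another Python script, the same invocations can be assembled as an argument list. The sketch below uses only the flags documented on this page; build_purge_command is a hypothetical helper and is not part of TopoBathySim.

```python
import sys
from typing import Iterable, List, Union

MODULE = "topobathysim.scripts.cache_manager"
VALID_TIERS = {"1", "2", "3", "4", "5", "all"}


def build_purge_command(
    tiers: Iterable[Union[int, str]],
    dry_run: bool = False,
    yes: bool = False,
) -> List[str]:
    """Build an argv list equivalent to the shell examples above."""
    tier_args = [str(t) for t in tiers]
    if not tier_args:
        raise ValueError("at least one tier is required")
    for tier in tier_args:
        if tier not in VALID_TIERS:
            raise ValueError(f"invalid tier: {tier!r}")
    cmd = [sys.executable, "-m", MODULE]
    if dry_run:
        cmd.append("--dry-run")  # preview only, nothing is deleted
    if yes:
        cmd.append("--yes")  # skip confirmation prompts
    cmd += ["--purge", *tier_args]
    return cmd


# Typical use: preview first, then purge for real.
#   import subprocess
#   subprocess.run(build_purge_command([1, 2], dry_run=True), check=True)
#   subprocess.run(build_purge_command([1, 2], yes=True), check=True)
```

Running the dry run before the real purge mirrors examples 6 and 7 above and gives a chance to inspect the deletion list first.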

Arguments

  • --check — read-only integrity scan (corrupt files, stale locks, malformed JSON)

  • --clean — same as --check but also deletes the corrupt/stale items

  • --purge TIER [TIER ...] — tier number(s) 1–5, or all

  • --dry-run — show what would be deleted without deleting

  • --yes — skip confirmation prompts

  • --lock-timeout N — seconds before a .lock file is considered stale (default: 3600)
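To make the argument shapes concrete, here is how an interface like the one listed above could be declared with argparse. This is a reconstruction from the documented flags only; build_parser and tier_arg are hypothetical names, and the actual cache_manager source may be organised differently.

```python
import argparse


def tier_arg(value: str) -> str:
    """Accept a tier number 1-5 or the literal 'all'."""
    if value == "all" or value in {"1", "2", "3", "4", "5"}:
        return value
    raise argparse.ArgumentTypeError(f"expected 1-5 or 'all', got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    """Declare the flags documented above (illustrative, not the real parser)."""
    parser = argparse.ArgumentParser(prog="cache_manager")
    parser.add_argument("--check", action="store_true",
                        help="read-only integrity scan")
    parser.add_argument("--clean", action="store_true",
                        help="integrity scan that also deletes corrupt/stale items")
    parser.add_argument("--purge", nargs="+", type=tier_arg, metavar="TIER",
                        help="tier number(s) 1-5, or all")
    parser.add_argument("--dry-run", action="store_true",
                        help="show what would be deleted without deleting")
    parser.add_argument("--yes", action="store_true",
                        help="skip confirmation prompts")
    parser.add_argument("--lock-timeout", type=int, default=3600, metavar="N",
                        help="seconds before a .lock file is considered stale")
    return parser
```

Because --purge is declared with nargs="+", it consumes every value up to the next flag, which is why both "--purge 1 2 --yes" and "--yes --purge 1 2" parse the same way in this sketch.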

Common Scenarios

Re-render tiles after a style change

python -m topobathysim.scripts.cache_manager --purge 1 --yes

Force re-fusion after updating the policy YAML

# Purge fused zarr so next tile request re-runs the fusion pipeline
python -m topobathysim.scripts.cache_manager --purge 2 --yes

NCEI has published new multibeam surveys for your area

python -m topobathysim.scripts.cache_manager --purge 4 --yes

NOAA has updated BlueTopo tiles (filenames changed with a new date stamp)

# Stale tile_url_cache entries will 404; purge tier 4 to force re-discovery
python -m topobathysim.scripts.cache_manager --purge 4 --yes

BlueTopo sidecar RAT_Links returning 404 (stale tile scheme)

When NOAA republishes a tile with a new date stamp, the RAT_Link URLs baked into the cached BlueTopo_Tile_Scheme.gpkg go stale. The provider writes a .failed sentinel so downloads are not retried on every request, but the underlying surveys can no longer be resolved to human-readable names until the scheme is refreshed:

# Purge tier 4 — clears the GPKG, sidecars, and .failed sentinels together
python -m topobathysim.scripts.cache_manager --purge 4 --yes
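Before purging, it can help to see how many tiles are affected. The sketch below lists sentinel files under the sidecar directory named earlier on this page. It assumes sentinels are regular files whose names end in .failed; find_failed_sentinels is a hypothetical helper, not part of TopoBathySim.

```python
from pathlib import Path
from typing import List


def find_failed_sentinels(cache_root: Path) -> List[Path]:
    """Return .failed sentinel files under noaa_bluetopo/sidecars/.

    Assumes sentinels carry a '.failed' suffix, as described above.
    """
    sidecar_dir = cache_root / "noaa_bluetopo" / "sidecars"
    if not sidecar_dir.is_dir():
        return []
    return sorted(p for p in sidecar_dir.rglob("*.failed") if p.is_file())


if __name__ == "__main__":
    root = Path.home() / ".cache" / "topobathysim"
    sentinels = find_failed_sentinels(root)
    print(f"{len(sentinels)} tile(s) with stale RAT_Link sentinels")
```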

A new NOAA topobathy lidar project was added to the archive

python -m topobathysim.scripts.cache_manager --purge 4 --yes

Start completely fresh (keep raw source files)

# Clears tiers 1–4; raw files are retained, so file-based providers rebuild without re-downloading
python -m topobathysim.scripts.cache_manager --purge 1 2 3 4 --yes

Check for corrupt or stale files after a crash

python -m topobathysim.scripts.cache_manager --check

# If issues are found, run with --clean to delete them
python -m topobathysim.scripts.cache_manager --clean
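As a rough picture of what a read-only audit involves, the sketch below flags stale .lock files and JSON files that fail to parse. It covers only two of the documented checks and is an assumption-laden illustration: audit_cache is a hypothetical helper, lock age is judged by file modification time, and the real --check may use other criteria.

```python
import json
import time
from pathlib import Path
from typing import Dict, List, Optional


def audit_cache(
    cache_root: Path,
    lock_timeout: float = 3600.0,
    now: Optional[float] = None,
) -> Dict[str, List[Path]]:
    """Report stale locks and malformed JSON without modifying anything."""
    now = time.time() if now is None else now
    report: Dict[str, List[Path]] = {"stale_locks": [], "malformed_json": []}
    for path in sorted(cache_root.rglob("*")):
        if not path.is_file():
            continue
        if path.suffix == ".lock":
            # Assumption: a lock is stale once its mtime is older than the timeout.
            if now - path.stat().st_mtime > lock_timeout:
                report["stale_locks"].append(path)
        elif path.suffix in {".json", ".geojson"}:
            try:
                json.loads(path.read_text(encoding="utf-8"))
            except (ValueError, OSError):
                # JSONDecodeError and UnicodeDecodeError both derive from ValueError.
                report["malformed_json"].append(path)
    return report
```

The default of 3600 seconds matches the documented --lock-timeout default; a --clean style pass would delete the reported paths instead of only listing them.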

Note

Purging a tier leaves every other tier untouched. For example, purging tier 4 (discovery caches) does not delete provider zarr (tier 3) or raw source files (tier 5). If the underlying provider data is still cached, discovery caches are re-populated from that local data on the next tile request, so no network calls are needed for already-cached survey regions.

Server Management

To run the tile server for local development or production usage:

# Start the server on port 9595 with 8 worker processes
micromamba run -n topobathysim python service/run_server.py --host 0.0.0.0 --port 9595 --workers 8

Arguments:

  • --host: Bind address (default: 127.0.0.1)

  • --port: Port to listen on (default: 9595)

  • --workers: Number of worker processes (default: 4)

  • --debug: Enable debug logging (default: 0)
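Scripts that launch the server often need to wait until it accepts connections. The sketch below polls a TCP port using only the standard library; wait_for_port is a hypothetical helper, and nothing here relies on any TopoBathySim API or endpoint.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.25) -> bool:
    """Return True once a TCP connection to host:port succeeds, else False."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    # 9595 is the documented default port for run_server.py.
    ready = wait_for_port("127.0.0.1", 9595, timeout=10.0)
    print("tile server is accepting connections" if ready else "tile server not reachable")
```

A successful connection only shows that the port is open; it does not confirm that all worker processes have finished starting.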