Runtime Architecture
The TopoBathySim Runtime is a stateless engine that executes Fusion Policies.
Core Concept
Unlike traditional pipelines that hardcode logic (e.g., “Always put Lidar over GEBCO”), the Runtime is source-agnostic: it executes the list of steps defined in the Policy YAML.
graph TD
subgraph Policy Loading
Policy[Policy YAML] -->|Load & Validate| Schema[Pydantic Models]
Schema --> Runtime[Runtime Engine]
end
subgraph Runtime Execution
Runtime -->|1. Initialize| Canvas[Canvas: Elevation + Provenance]
Runtime -->|2. Loop Steps| Registry[Provider Registry]
Registry -->|Get Provider| Match
Runtime -->|3. Fetch & Align| Match["Fetch Layer (Lazy/Cached) + Reproject"]
Match -->|4. Check Rules| Rules{Transition Rules?}
Rules -- Yes --> TransOp["Apply Specific Operator (e.g. Feather)"]
Rules -- No --> DefOp["Apply Default Operator (e.g. Overwrite)"]
TransOp --> Blend[Blend into Canvas]
DefOp --> Blend
Blend -->|Update| Provenance[Update Source Mask]
end
subgraph Output
Provenance -->|Finalize| Dataset[Xarray Dataset]
Dataset -->|Save| Zarr[Fused Zarr Cache]
Dataset -->|Write sidecars| Sidecars["_meta.json + _src.npz"]
end
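The execution loop in the diagram can be sketched in a few lines of Python. This is a minimal illustration, not the actual topobathysim.runtime code: the policy is a plain dict, the "transitions" key, the run_steps function, and the lambda providers are all hypothetical, and mean_blend is only a stand-in for a real transition operator such as Feather.

```python
import numpy as np

def overwrite(canvas, layer):
    # Default operator: valid (non-NaN) layer pixels replace canvas pixels.
    return np.where(np.isnan(layer), canvas, layer)

def mean_blend(canvas, layer):
    # Stand-in for a transition operator: average where both inputs are valid.
    both = ~np.isnan(canvas) & ~np.isnan(layer)
    return np.where(both, (canvas + layer) / 2.0, overwrite(canvas, layer))

OPERATORS = {"overwrite": overwrite, "feather": mean_blend}

def run_steps(policy, registry, shape):
    canvas = np.full(shape, np.nan, dtype=np.float32)    # 1. initialize
    rules = policy.get("transitions", {})                # hypothetical rule table
    for step in policy["steps"]:                         # 2. loop steps
        fetch = registry[step["provider"]]               #    registry lookup
        layer = fetch(shape)                             # 3. fetch (assumed aligned)
        op_name = rules.get(step["provider"], "overwrite")  # 4. check rules
        canvas = OPERATORS[op_name](canvas, layer)       #    blend into canvas
    return canvas

# Toy providers that return constant grids instead of real data.
registry = {
    "gebco_2025": lambda shape: np.full(shape, -100.0, dtype=np.float32),
    "usgs_3dep": lambda shape: np.array(
        [[10.0, np.nan], [np.nan, np.nan]], dtype=np.float32
    ),
}
policy = {
    "steps": [{"provider": "gebco_2025"}, {"provider": "usgs_3dep"}],
    "transitions": {"usgs_3dep": "feather"},
}
fused = run_steps(policy, registry, (2, 2))
```

The engine itself contains no knowledge of which source should win; changing the order of the steps or the rule table changes the result without touching run_steps.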
Components
Runtime Engine (topobathysim.runtime)
The entry point run(policy_path, bbox) orchestrates the entire process:
Canvas Initialization: Creates a blank xarray.Dataset covering the requested BBox in the Policy’s CRS.
Step Execution: Iterates through the policy steps.
Data Fetching: Calls provider.fetch_layer().
Alignment: Reprojects the fetched layer to match the Canvas pixel grid (using rio.reproject_match).
Composition: Applies the blending operator (Overwrite/Feather) to merge the aligned layer into the canvas.
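To make the Composition step concrete, here is a hedged sketch of a weighted blend that also updates the source mask. The function name blend_feather, the per-pixel weight array, the "dominant contributor" rule for the mask, and the use of 0 as "no source" are assumptions made for illustration; the real Feather operator may compute its weights and provenance differently.

```python
import numpy as np

def blend_feather(canvas, canvas_src, layer, layer_src, weight):
    """Blend an aligned layer into the canvas.

    weight is the layer's per-pixel weight in [0, 1]; it only matters
    where both canvas and layer hold valid (non-NaN) values.
    """
    valid = ~np.isnan(layer)
    both = valid & ~np.isnan(canvas)
    # Layer fills pixels it covers; the canvas is kept elsewhere.
    out = np.where(valid, layer, canvas)
    # In the overlap, mix the two surfaces by weight.
    out = np.where(both, weight * layer + (1.0 - weight) * canvas, out)
    # Assumed rule: credit the pixel to whichever input dominates.
    take_layer = valid & (~both | (weight >= 0.5))
    out_src = np.where(take_layer, layer_src, canvas_src)
    return out.astype(np.float32), out_src.astype(np.uint32)

canvas = np.array([-8.0, -8.0, -8.0, np.nan], dtype=np.float32)
canvas_src = np.array([1, 1, 1, 0], dtype=np.uint32)   # 0 = no source yet
layer = np.array([np.nan, 0.0, 0.0, 4.0], dtype=np.float32)
layer_src = np.array([0, 2, 2, 2], dtype=np.uint32)
weight = np.array([0.0, 0.25, 0.75, 1.0], dtype=np.float32)

fused, fused_src = blend_feather(canvas, canvas_src, layer, layer_src, weight)
```

With a weight of 1 everywhere this reduces to Overwrite: every valid layer pixel replaces the canvas value and takes over its entry in the source mask.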
Providers (topobathysim.providers)
Providers are standardized adapters that fetch data from remote sources.
Lazy Loading: Providers like gebco_2025 only open network connections when a requested tile is not in the local Zarr cache.
Smart Caching: Data is cached locally as Zarr (for dense grids) or COGs/LAZ (for source files).
Aliases: Providers are registered with short names (e.g., usgs_3dep, ncei_bag) used in the Policy YAML.
No-Data Signalling: When a provider has no coverage for a requested cell, it raises ProviderNoDataError (a LookupError subclass). This is a normal operating condition: the runtime skips the provider and logs the event only at DEBUG level. Only genuine unexpected failures are logged at ERROR with a traceback.
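The no-data convention above maps naturally onto two except branches. The sketch below redefines ProviderNoDataError locally so that it runs standalone; the helper fetch_or_skip, the logger name, and the choice to continue after an unexpected failure are assumptions, not the runtime's confirmed behaviour.

```python
import logging

logger = logging.getLogger("topobathysim.runtime")  # assumed logger name

class ProviderNoDataError(LookupError):
    """Raised by a provider that has no coverage for the requested cell."""

def fetch_or_skip(name, fetch, bbox):
    """Return the fetched layer, or None if the provider should be skipped."""
    try:
        return fetch(bbox)
    except ProviderNoDataError:
        # Expected condition: record it quietly and move on.
        logger.debug("%s: no coverage for %s, skipping", name, bbox)
    except Exception:
        # Unexpected failure: logger.exception logs at ERROR with a traceback.
        logger.exception("%s: unexpected failure for %s", name, bbox)
    return None

def no_coverage(bbox):
    # Toy provider that never has data.
    raise ProviderNoDataError("outside survey extent")

result = fetch_or_skip("ncei_bag", no_coverage, (-71.0, 41.0, -70.0, 42.0))
```

Subclassing LookupError keeps the exception distinct from I/O or parsing errors, so a caller can catch it narrowly without masking real bugs.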
Provenance System
Every tile carries a full record of which source dataset contributed each pixel.
Provider → Dataset
Each fetch_layer() call returns an xr.Dataset containing two arrays:
elevation: float32 depth/height values
source_id: uint32 pixel map where each value identifies the contributing survey or sub-tile (provider-assigned IDs for BAG/BlueTopo; hash-derived IDs for other providers)
The dataset’s attrs["provenance_dict"] maps each source_id integer to
a {"name": "...", "provider": "..."} dict.
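The layout described above can be mocked up without any provider code. In this sketch a plain dict stands in for the xr.Dataset so that the example needs only NumPy; the CRC32-based hash_source_id function and the tile name are hypothetical, since the actual hashing scheme is not specified here.

```python
import zlib
import numpy as np

def hash_source_id(name):
    # Hypothetical scheme: CRC32 of the source name always fits in uint32.
    return zlib.crc32(name.encode("utf-8")) & 0xFFFFFFFF

sid = hash_source_id("example_tile_001")  # made-up source name

# Stand-in for the xr.Dataset returned by fetch_layer().
layer = {
    "elevation": np.full((2, 2), -35.5, dtype=np.float32),
    "source_id": np.full((2, 2), sid, dtype=np.uint32),
    "attrs": {
        "provenance_dict": {
            sid: {"name": "example_tile_001", "provider": "gebco_2025"},
        }
    },
}

# Resolve the contributing source for one pixel.
pixel_id = int(layer["source_id"][0, 0])
pixel_source = layer["attrs"]["provenance_dict"][pixel_id]
```

Keeping the per-pixel array as integers and the human-readable details in a small side dictionary means the raster stays compact however long the survey names are.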
Runtime → Fused Dataset
As the runtime processes policy steps, it accumulates provenance_dict entries
from all providers into a single cell_provenance_dict. The final fused
xr.Dataset carries the merged provenance across all cells.
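The accumulation step amounts to a dictionary merge keyed by source_id. The following sketch assumes a helper named merge_provenance and assumes that two different sources claiming the same ID should be treated as an error; how the runtime actually handles such collisions is not documented here.

```python
def merge_provenance(cell_provenance_dict, provider_dict):
    """Fold one provider's provenance_dict into the cell-level dict."""
    for source_id, entry in provider_dict.items():
        # Keep the first entry seen for each ID.
        existing = cell_provenance_dict.setdefault(source_id, entry)
        if existing != entry:
            # Assumed policy: an ID must never point at two different sources.
            raise ValueError(f"source_id {source_id} maps to two different sources")
    return cell_provenance_dict

cell_provenance_dict = {}
merge_provenance(
    cell_provenance_dict,
    {101: {"name": "survey A", "provider": "ncei_bag"}},
)
merge_provenance(
    cell_provenance_dict,
    {
        7: {"name": "tile 7", "provider": "gebco_2025"},
        101: {"name": "survey A", "provider": "ncei_bag"},  # repeat is harmless
    },
)
```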
Service → Tile Sidecars
When a rendered tile is written to the tile cache, two sidecar files are written alongside it:
| File | Contents |
|---|---|
| _meta.json | |
| _src.npz | Compressed uint32 |
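As a rough illustration of the sidecar idea, the sketch below writes a JSON file and a compressed NumPy archive next to a tile path. Everything specific in it is an assumption: the write_sidecars helper, the "stem + suffix" naming, the "provenance" JSON key, the "source_id" array name, and the example tile filename are all hypothetical and may not match the real sidecar schema.

```python
import json
import tempfile
from pathlib import Path

import numpy as np

def write_sidecars(tile_path, source_id, provenance_dict):
    """Write <stem>_meta.json and <stem>_src.npz beside the tile."""
    tile_path = Path(tile_path)
    stem = tile_path.with_suffix("").name
    meta_path = tile_path.parent / f"{stem}_meta.json"
    src_path = tile_path.parent / f"{stem}_src.npz"
    # JSON object keys must be strings, so the integer IDs are converted.
    meta = {"provenance": {str(k): v for k, v in provenance_dict.items()}}
    meta_path.write_text(json.dumps(meta))
    # Compressed uint32 pixel map of contributing sources.
    np.savez_compressed(src_path, source_id=source_id.astype(np.uint32))
    return meta_path, src_path

ids = np.array([[7, 7], [101, 7]], dtype=np.uint32)
prov = {
    7: {"name": "tile 7", "provider": "gebco_2025"},
    101: {"name": "survey A", "provider": "ncei_bag"},
}

with tempfile.TemporaryDirectory() as tmp:
    meta_path, src_path = write_sidecars(Path(tmp) / "tile.png", ids, prov)
    with np.load(src_path) as npz:
        restored = npz["source_id"]
    meta = json.loads(meta_path.read_text())
```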
Viewer
The Leaflet popup fetches both /metadata and /pixel for each clicked
location, displaying the pixel-level contributing survey (name + provider color
swatch) and a deduplicated, priority-sorted list of all surveys present in the
tile.
Caching Strategy
Source Cache: Original files (TIFF, BAG, LAZ) downloaded from agencies (download providers only; streaming providers write directly to Zarr).
Provider Zarr Cache: Intermediate rasterised chunks per provider, stored for fast repeated access.
Fused Zarr Cache: Multi-provider composite grids keyed by a hash of {policy, cell_bbox, resolution, crs}.
Tile Cache: Final XYZ PNG/NPY/NPZ tiles at ~/.cache/topobathysim/tiles/, with _meta.json and _src.npz sidecars.
See CLI Tools for cache tier management.
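A key for the Fused Zarr Cache tier can be derived by hashing a canonical serialisation of the four fields. This is a sketch under stated assumptions: the fused_cache_key function, the use of SHA-256 over sorted JSON, and the representation of the policy as a string identifier are illustrative choices, not the documented implementation.

```python
import hashlib
import json

def fused_cache_key(policy, cell_bbox, resolution, crs):
    """Return a stable hex digest for one fused cell."""
    payload = json.dumps(
        {
            "policy": policy,
            "cell_bbox": list(cell_bbox),
            "resolution": resolution,
            "crs": crs,
        },
        sort_keys=True,          # canonical key order
        separators=(",", ":"),   # no whitespace variation
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

key_10m = fused_cache_key("coastal_default", (-71.0, 41.0, -70.5, 41.5), 10.0, "EPSG:32619")
key_30m = fused_cache_key("coastal_default", (-71.0, 41.0, -70.5, 41.5), 30.0, "EPSG:32619")
```

Because every field participates in the digest, changing the policy, the cell, the resolution, or the CRS yields a different key, so stale composites are never served for a new configuration.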