Pipeline Overview
The RDT MODEL platform produces a complete data product from a single RTiS entity. The pipeline chains 18 modules across 6 phases. Each module reads a JSON manifest, writes a JSON result, and operates in an isolated workspace that enables safe parallel execution across entities.
This page describes the pipeline structure. See the Modules section for each module’s documentation.
Pipeline phases
Phases run sequentially. Modules within the same phase can run in parallel when they have no dependency on each other.
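The rule above (phases in sequence, modules within a phase in parallel) can be sketched in shell. This is a minimal sketch, not the platform's orchestrator: `run_phase` is a hypothetical helper, and it assumes every module in a phase accepts the same manifest file through `--manifest`.

```bash
# Hypothetical helper (not part of the RDT MODEL tooling): run every module
# of one phase in parallel and fail the phase if any module fails.
# Usage: run_phase <manifest-file> <module> [<module> ...]
run_phase() {
  manifest="$1"
  shift
  pids=""
  for module in "$@"; do
    "$module" --manifest "$manifest" &
    pids="$pids $!"
  done
  failed=0
  for pid in $pids; do
    # Waiting on each PID surfaces that module's own exit status.
    wait "$pid" || failed=1
  done
  return "$failed"
}
```

Calling `run_phase` once per phase, each call chained with `&&`, keeps the phases sequential. Waiting on each PID matters because a bare `wait` with no arguments returns 0 even when a background job has failed, so a failed module would otherwise go unnoticed.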
Data flow
Every arrow is a concrete JSON path connecting one module’s output to the next module’s input.
Module dependency matrix
| Phase | Module | Depends on | Parallel with |
|---|---|---|---|
| 1 Ingest | rdt-model-pull | (entry point) | Profile |
| 1 Ingest | rdt-model-profile (optional) | (entry point) | Pull |
| 2 Enrich | rdt-model-govern | Pull | Infer |
| 2 Enrich | rdt-model-infer (optional) | Pull | Govern |
| 3 Prepare | rdt-model-compile | Pull, Govern, Infer | — |
| 3 Prepare | rdt-model-validate | Compile | — |
| 4 Deploy | rdt-model-store | Compile, Validate | Policy, Api, Mcp, Sdk, Contract |
| 4 Deploy | rdt-model-policy | Compile, Validate | Store, Api, Mcp, Sdk, Contract |
| 4 Deploy | rdt-model-api | Compile, Validate | Store, Policy, Mcp, Sdk, Contract |
| 4 Deploy | rdt-model-mcp | Compile, Validate | Store, Policy, Api, Sdk, Contract |
| 4 Deploy | rdt-model-sdk | Compile, Validate | Store, Policy, Api, Mcp, Contract |
| 4 Deploy | rdt-model-contract | Compile, Validate | Store, Policy, Api, Mcp, Sdk |
| 5 Register | rdt-model-register | All Deploy modules | Gupri, Search |
| 5 Register | rdt-model-gupri | Compile | Register, Search |
| 5 Register | rdt-model-search | Compile | Register, Gupri |
| 6 Support | rdt-model-docs | Compile | Cidb, Event |
| 6 Support | rdt-model-cidb | All Register modules | Docs, Event |
| 6 Support | rdt-model-event | All Register modules | Docs, Cidb |
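
One way to sanity-check a matrix like this is to feed its "Depends on" column to the POSIX `tsort` utility, which prints an order that respects every dependency and reports an error if the edges form a cycle. The sketch below encodes only a few rows of the table; `pipeline_order` is a hypothetical name, and the short module names stand in for the `rdt-model-*` binaries.

```bash
# Hypothetical check: derive a valid execution order from a subset of the
# matrix. Each line is "<dependency> <module>"; tsort prints dependencies
# before the modules that need them.
pipeline_order() {
  tsort <<'EOF'
pull infer
pull compile
govern compile
infer compile
compile validate
compile store
validate store
store register
compile docs
EOF
}
```

The order `tsort` prints is one of several valid ones, so it is useful for confirming that the matrix has no cycles and that no module is scheduled before its inputs exist, not for deciding what may run in parallel.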
Data wiring
This diagram shows exactly which output field feeds which input field across module boundaries.
Workspace isolation
Every pipeline run creates an isolated workspace directory, enabling safe parallel execution.
Path convention

    {base_dir}/rdt-{entity_id}-{run_id}/

| Component | Source | Example |
|---|---|---|
| base_dir | $RDT_WORKSPACE_DIR or $TMPDIR or /tmp | /tmp |
| entity_id | Sanitised entity name | waste-tracking |
| run_id | UUIDv4 | a1b2c3d4-e5f6-7890-abcd-ef1234567890 |
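
The convention can be sketched as two small shell functions. This is an illustration under assumptions, not the platform's code: the page does not say how entity names are sanitised, so the rule in `sanitize_entity_id` (lower-case, non-alphanumeric runs become one hyphen) is a guess, and both function names are hypothetical.

```bash
# Assumed sanitisation rule (the page does not define one): lower-case the
# name and collapse every run of other characters into a single hyphen.
sanitize_entity_id() {
  printf '%s\n' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//'
}

# Build {base_dir}/rdt-{entity_id}-{run_id}/ from an entity name and a run ID.
# base_dir falls back from $RDT_WORKSPACE_DIR to $TMPDIR to /tmp.
workspace_path() {
  base_dir="${RDT_WORKSPACE_DIR:-${TMPDIR:-/tmp}}"
  printf '%s/rdt-%s-%s/\n' "${base_dir%/}" "$(sanitize_entity_id "$1")" "$2"
}
```

A caller would typically pass a fresh UUID as the second argument, for example `workspace_path "Waste Tracking" "$(uuidgen)"`. Note that `${VAR:-default}` treats an empty variable the same as an unset one.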
Workspace layout
Lifecycle
- Create: the orchestrator generates a workspace with a unique UUID.
- Populate: each module writes to its own subdirectory.
- Promote: after validation, artifacts are copied to their final repo paths via paths.rs.
- Clean up: the workspace is deleted (or retained with --keep-workspace for debugging).
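
The create and clean-up ends of this lifecycle might look like the following in an orchestrating script. Both functions are hypothetical, and the `KEEP_WORKSPACE` variable is an assumed stand-in for the `--keep-workspace` flag, whose real handling is not shown on this page.

```bash
# Create the workspace plus one subdirectory per module that will run.
# Usage: create_workspace <workspace-dir> <module> [<module> ...]
create_workspace() {
  ws="$1"
  shift
  mkdir -p "$ws" || return 1
  for module in "$@"; do
    mkdir -p "$ws/$module" || return 1
  done
}

# Delete the workspace unless KEEP_WORKSPACE=1 (assumed switch that mimics
# --keep-workspace).
cleanup_workspace() {
  if [ "${KEEP_WORKSPACE:-0}" = "1" ]; then
    echo "retaining workspace for debugging: $1" >&2
  else
    rm -rf -- "$1"
  fi
}
```

Registering the clean-up with `trap 'cleanup_workspace "$WS"' EXIT` right after `create_workspace "$WS" pull compile` would remove the workspace on failure as well as on success, so setting the keep switch becomes the way to inspect a failed run.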
Parallel safety
Multiple entities run concurrently in separate workspaces. Even two runs of the same entity are safe, because different UUIDs mean different directories.

    Entity A: /tmp/rdt-waste-tracking-{uuid-a}/    ← independent
    Entity B: /tmp/rdt-site-energy-{uuid-b}/       ← independent
    Entity C: /tmp/rdt-vendor-quality-{uuid-c}/    ← independent

Orchestration
The pipeline is composed by an external orchestrator (shell script or GitHub Actions), not by a module. Each step invokes the binary with a JSON manifest.
Shell script

    #!/usr/bin/env bash
    set -euo pipefail

    ENTITY="$1"
    RUN_ID="$(uuidgen)"
    WS="/tmp/rdt-${ENTITY}-${RUN_ID}"
    mkdir -p "$WS"

    # Phase 1: Ingest
    rdt-model-pull --manifest <(jq -n \
      --arg e "$ENTITY" --arg ws "$WS" \
      '{entity_id: $e, workspace: $ws}')

    # Phase 2: Enrich
    rdt-model-govern --manifest <(jq -n \
      --arg e "$ENTITY" --arg ws "$WS" \
      '{entity_id:$e, workspace:$ws, model_path:"pull/model.json"}')

    # Phase 3: Prepare
    rdt-model-compile --manifest <(jq -n --arg e "$ENTITY" --arg ws "$WS" \
      '{entity_id:$e, workspace:$ws}')
    rdt-model-validate --manifest <(jq -n --arg e "$ENTITY" --arg ws "$WS" \
      '{entity_id:$e, workspace:$ws}')

    # Phase 4: Deploy (parallel)
    rdt-model-store --manifest <(jq -n --arg e "$ENTITY" --arg ws "$WS" \
      '{entity_id:$e, workspace:$ws, model_path:"pull/model.json"}') &
    rdt-model-policy --manifest <(jq -n --arg e "$ENTITY" --arg ws "$WS" \
      '{entity_id:$e, workspace:$ws, model_path:"pull/model.json"}') &
    rdt-model-api --manifest <(jq -n --arg e "$ENTITY" --arg ws "$WS" \
      '{entity_id:$e, workspace:$ws, model_path:"pull/model.json"}') &
    rdt-model-mcp --manifest <(jq -n --arg e "$ENTITY" --arg ws "$WS" \
      '{entity_id:$e, workspace:$ws, model_path:"pull/model.json"}') &
    rdt-model-sdk --manifest <(jq -n --arg e "$ENTITY" --arg ws "$WS" \
      '{entity_id:$e, workspace:$ws, model_path:"pull/model.json"}') &
    rdt-model-contract --manifest <(jq -n --arg e "$ENTITY" --arg ws "$WS" \
      '{entity_id:$e, workspace:$ws, model_path:"pull/model.json"}') &
    wait

    # Phase 5: Register (parallel)
    rdt-model-register --manifest <(jq -n --arg e "$ENTITY" --arg ws "$WS" \
      '{entity_id:$e, workspace:$ws}') &
    rdt-model-gupri --manifest <(jq -n --arg e "$ENTITY" --arg ws "$WS" \
      '{entity_id:$e, workspace:$ws}') &
    rdt-model-search --manifest <(jq -n --arg e "$ENTITY" --arg ws "$WS" \
      '{entity_id:$e, workspace:$ws}') &
    wait

    # Phase 6: Support (parallel)
    rdt-model-docs --manifest <(jq -n --arg e "$ENTITY" --arg ws "$WS" \
      '{entity_id:$e, workspace:$ws}') &
    rdt-model-cidb --manifest <(jq -n --arg e "$ENTITY" --arg ws "$WS" \
      '{entity_id:$e, workspace:$ws}') &
    rdt-model-event --manifest <(jq -n --arg e "$ENTITY" --arg ws "$WS" \
      '{entity_id:$e, workspace:$ws}') &
    wait

    # Promote and clean up
    rdt-model-compile promote --workspace "$WS" --entity "$ENTITY"
    rm -rf "$WS"

GitHub Actions (multi-entity parallel)

    jobs:
      pipeline:
        strategy:
          matrix:
            entity: [waste-tracking, site-energy, vendor-quality]
        steps:
          - uses: actions/checkout@v4
          - name: Run pipeline
            run: ./scripts/pipeline.sh ${{ matrix.entity }}

Each matrix job runs on its own runner, so the entities run fully in parallel with zero contention.
Promote step
After validation passes, the promote step copies artifacts from the workspace to their final repository paths. This keeps the repo untouched if validation fails.

    rdt-model-compile promote --workspace "$WS" --entity waste-tracking

Promote reads compile/compile-result.json, maps each artifact to its destination via paths.rs, and copies it. It writes promote-result.json, which lists the files created, updated, or unchanged.
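
The created / updated / unchanged distinction can be illustrated with a small shell function. This is only a sketch: `promote_file` is a hypothetical helper, the real destination mapping lives in paths.rs and is not reproduced here, and a byte-for-byte comparison with `cmp` is an assumption about how "unchanged" is decided.

```bash
# Hypothetical helper: copy one artifact to its destination and print
# whether the destination was "created", "updated" or "unchanged".
# Usage: promote_file <workspace-artifact> <repo-destination>
promote_file() {
  src="$1"
  dest="$2"
  if [ ! -e "$dest" ]; then
    state="created"
  elif cmp -s "$src" "$dest"; then
    state="unchanged"
  else
    state="updated"
  fi
  if [ "$state" != "unchanged" ]; then
    # Create missing parent directories, then copy the artifact.
    mkdir -p "$(dirname "$dest")" && cp "$src" "$dest" || return 1
  fi
  printf '%s\n' "$state"
}
```

Skipping the copy for unchanged files leaves their timestamps alone, which keeps a re-run of promote from touching the repository when nothing differs.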
JSON Schema validation
Every JSON exchanged between modules has a corresponding JSON Schema. Validation happens at three layers:
| Layer | Where | What |
|---|---|---|
| Library | Inside every module | common::manifest::load_and_validate() validates manifests on entry and results on exit |
| CLI | rdt-model-validate schema | Standalone validation of any JSON against any schema |
| Embedded | At compile time | All schemas are embedded via include_str! — the binary is self-contained |
The jsonschema crate (already a workspace dependency) handles all validation. No external tool needed.
Further reading
- Running Multi-Entity Pipelines: practical guide to CI orchestration, workspace lifecycle, promotion, and debugging
- Pipeline Configuration: entity list, manifest schema, CLI flags, environment variables
- rdt-model-compile: orchestrator module reference
Related ADRs
- ADR 0006: Multi-binary workspace (superseded by ADR 0011)
- ADR 0007: Data product lifecycle (defines the pipeline phases)
- ADR 0008: CLI module standards (defines per-binary conventions)
- ADR 0009: Module I/O contracts (full specification)
- ADR 0011: Pipeline restructure (18-module / 6-phase inventory)