
Pipeline Overview

The RDT MODEL platform produces a complete data product from a single RTiS entity. The pipeline chains 18 modules across 6 phases. Each module reads a JSON manifest, writes a JSON result, and operates in an isolated workspace that enables safe parallel execution across entities.

This page describes the pipeline structure. See the Modules section for each module’s documentation.

Pipeline phases — 6 sequential phases with modules that can run in parallel within each phase

Phases run sequentially. Modules within the same phase can run in parallel when they have no dependency on each other.

Every arrow is a concrete JSON path connecting one module’s output to the next module’s input.

Pipeline data flow — external systems feeding into and consuming from the pipeline workspace

| Phase | Module | Depends on | Parallel with |
| --- | --- | --- | --- |
| 1 Ingest | `rdt-model-pull` | (entry point) | Profile |
| 1 Ingest | `rdt-model-profile` (optional) | (entry point) | Pull |
| 2 Enrich | `rdt-model-govern` | (entry point) | Infer |
| 2 Enrich | `rdt-model-infer` (optional) | Pull | Govern |
| 3 Prepare | `rdt-model-compile` | Pull, Govern, Infer | (none) |
| 3 Prepare | `rdt-model-validate` | Compile | (none) |
| 4 Deploy | `rdt-model-store` | Compile, Validate | Policy, Api, Mcp, Sdk, Contract |
| 4 Deploy | `rdt-model-policy` | Compile, Validate | Store, Api, Mcp, Sdk, Contract |
| 4 Deploy | `rdt-model-api` | Compile, Validate | Store, Policy, Mcp, Sdk, Contract |
| 4 Deploy | `rdt-model-mcp` | Compile, Validate | Store, Policy, Api, Sdk, Contract |
| 4 Deploy | `rdt-model-sdk` | Compile, Validate | Store, Policy, Api, Mcp, Contract |
| 4 Deploy | `rdt-model-contract` | Compile, Validate | Store, Policy, Api, Mcp, Sdk |
| 5 Register | `rdt-model-register` | All Deploy modules | Gupri, Search |
| 5 Register | `rdt-model-gupri` | Compile | Register, Search |
| 5 Register | `rdt-model-search` | Compile | Register, Gupri |
| 6 Support | `rdt-model-docs` | Compile | Cidb, Event |
| 6 Support | `rdt-model-cidb` | All Register modules | Docs, Event |
| 6 Support | `rdt-model-event` | All Register modules | Docs, Cidb |
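The "Depends on" column is the only scheduling rule an orchestrator needs: a module may start once every module it depends on has finished. A minimal sketch of that check (the `ready` function and the inline deps table are illustrative, not the platform's real scheduler):

```rust
use std::collections::{HashMap, HashSet};

/// A module is ready when all of its dependencies have finished.
/// Modules absent from the deps table are entry points and always ready.
fn ready(deps: &HashMap<&str, Vec<&str>>, done: &HashSet<&str>, module: &str) -> bool {
    deps.get(module)
        .map(|d| d.iter().all(|m| done.contains(m)))
        .unwrap_or(true)
}
```

With `deps = {compile: [pull, govern, infer], store: [compile, validate]}`, `compile` becomes ready only after all three of its upstream modules finish, while `pull` is ready immediately.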

This diagram shows exactly which output field feeds which input field across module boundaries.

Data wiring between modules — field-level connections from outputs to inputs

Every pipeline run creates an isolated workspace directory, enabling safe parallel execution.

```
{base_dir}/rdt-{entity_id}-{run_id}/
```

| Component | Source | Example |
| --- | --- | --- |
| `base_dir` | `$RDT_WORKSPACE_DIR`, then `$TMPDIR`, then `/tmp` | `/tmp` |
| `entity_id` | Sanitised entity name | `waste-tracking` |
| `run_id` | UUIDv4 | `a1b2c3d4-e5f6-7890-abcd-ef1234567890` |
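In Rust terms, that resolution might look like the following sketch. The function names `resolve_base_dir` and `workspace_dir` are hypothetical, not the platform's `paths.rs` API:

```rust
use std::env;
use std::path::{Path, PathBuf};

/// base_dir: $RDT_WORKSPACE_DIR, then $TMPDIR, then /tmp (illustrative).
fn resolve_base_dir() -> PathBuf {
    env::var("RDT_WORKSPACE_DIR")
        .or_else(|_| env::var("TMPDIR"))
        .map(PathBuf::from)
        .unwrap_or_else(|_| PathBuf::from("/tmp"))
}

/// {base_dir}/rdt-{entity_id}-{run_id}; entity_id is assumed pre-sanitised.
fn workspace_dir(base_dir: &Path, entity_id: &str, run_id: &str) -> PathBuf {
    base_dir.join(format!("rdt-{entity_id}-{run_id}"))
}
```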

Workspace layout — directory tree showing per-phase subdirectories and result files

  1. Create — orchestrator generates workspace with unique UUID.
  2. Populate — each module writes to its own subdirectory.
  3. Promote — after validation, copy artifacts to final repo paths via paths.rs.
  4. Clean up — delete workspace (or retain with --keep-workspace for debugging).
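Condensed into code, the four steps for a single run could look like this hedged sketch; `run_lifecycle` is a hypothetical name and the promote step is elided:

```rust
use std::fs;
use std::path::Path;

/// Illustrative lifecycle for one run. The real promote step maps
/// artifacts to repo paths via paths.rs; it is elided here.
fn run_lifecycle(ws: &Path, keep_workspace: bool) -> std::io::Result<()> {
    fs::create_dir_all(ws.join("compile"))?;                        // 1. Create
    fs::write(ws.join("compile/compile-result.json"), b"{}")?;      // 2. Populate
    // 3. Promote: copy validated artifacts to their final repo paths
    if !keep_workspace {
        fs::remove_dir_all(ws)?;                                    // 4. Clean up
    }
    Ok(())
}
```

Passing `keep_workspace = true` mirrors the `--keep-workspace` debugging flag: the directory survives for inspection.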

Multiple entities run concurrently in separate workspaces. Even two runs of the same entity are safe — different UUIDs mean different directories.

```
Entity A: /tmp/rdt-waste-tracking-{uuid-a}/   ← independent
Entity B: /tmp/rdt-site-energy-{uuid-b}/      ← independent
Entity C: /tmp/rdt-vendor-quality-{uuid-c}/   ← independent
```

The pipeline is composed by an external orchestrator (shell script or GitHub Actions), not by a module. Each step invokes the binary with a JSON manifest.

```bash
#!/usr/bin/env bash
set -euo pipefail

ENTITY="$1"
RUN_ID="$(uuidgen)"
WS="/tmp/rdt-${ENTITY}-${RUN_ID}"
mkdir -p "$WS"

# Phase 1: Ingest
rdt-model-pull --manifest <(jq -n \
  --arg e "$ENTITY" --arg ws "$WS" \
  '{entity_id: $e, workspace: $ws}')

# Phase 2: Enrich
rdt-model-govern --manifest <(jq -n \
  --arg e "$ENTITY" --arg ws "$WS" \
  '{entity_id: $e, workspace: $ws, model_path: "pull/model.json"}')

# Phase 3: Prepare
rdt-model-compile --manifest <(jq -n --arg e "$ENTITY" --arg ws "$WS" \
  '{entity_id: $e, workspace: $ws}')
rdt-model-validate --manifest <(jq -n --arg e "$ENTITY" --arg ws "$WS" \
  '{entity_id: $e, workspace: $ws}')

# Phase 4: Deploy (parallel)
rdt-model-store --manifest <(jq -n --arg e "$ENTITY" --arg ws "$WS" \
  '{entity_id: $e, workspace: $ws, model_path: "pull/model.json"}') &
rdt-model-policy --manifest <(jq -n --arg e "$ENTITY" --arg ws "$WS" \
  '{entity_id: $e, workspace: $ws, model_path: "pull/model.json"}') &
rdt-model-api --manifest <(jq -n --arg e "$ENTITY" --arg ws "$WS" \
  '{entity_id: $e, workspace: $ws, model_path: "pull/model.json"}') &
rdt-model-mcp --manifest <(jq -n --arg e "$ENTITY" --arg ws "$WS" \
  '{entity_id: $e, workspace: $ws, model_path: "pull/model.json"}') &
rdt-model-sdk --manifest <(jq -n --arg e "$ENTITY" --arg ws "$WS" \
  '{entity_id: $e, workspace: $ws, model_path: "pull/model.json"}') &
rdt-model-contract --manifest <(jq -n --arg e "$ENTITY" --arg ws "$WS" \
  '{entity_id: $e, workspace: $ws, model_path: "pull/model.json"}') &
wait

# Phase 5: Register (parallel)
rdt-model-register --manifest <(jq -n --arg e "$ENTITY" --arg ws "$WS" \
  '{entity_id: $e, workspace: $ws}') &
rdt-model-gupri --manifest <(jq -n --arg e "$ENTITY" --arg ws "$WS" \
  '{entity_id: $e, workspace: $ws}') &
rdt-model-search --manifest <(jq -n --arg e "$ENTITY" --arg ws "$WS" \
  '{entity_id: $e, workspace: $ws}') &
wait

# Phase 6: Support (parallel)
rdt-model-docs --manifest <(jq -n --arg e "$ENTITY" --arg ws "$WS" \
  '{entity_id: $e, workspace: $ws}') &
rdt-model-cidb --manifest <(jq -n --arg e "$ENTITY" --arg ws "$WS" \
  '{entity_id: $e, workspace: $ws}') &
rdt-model-event --manifest <(jq -n --arg e "$ENTITY" --arg ws "$WS" \
  '{entity_id: $e, workspace: $ws}') &
wait

# Promote and clean up
rdt-model-compile promote --workspace "$WS" --entity "$ENTITY"
rm -rf "$WS"
```
The same script scales out as a GitHub Actions matrix, one job per entity:

```yaml
jobs:
  pipeline:
    runs-on: ubuntu-latest  # assumed; the original snippet omits the runner
    strategy:
      matrix:
        entity: [waste-tracking, site-energy, vendor-quality]
    steps:
      - uses: actions/checkout@v4
      - name: Run pipeline
        run: ./scripts/pipeline.sh ${{ matrix.entity }}
```

Each matrix job runs in its own runner — full parallelism, zero contention.

After validation passes, the promote step copies artifacts from the workspace to their final repository paths. This keeps the repo untouched if validation fails.

```bash
rdt-model-compile promote --workspace "$WS" --entity waste-tracking
```

Promote reads `compile/compile-result.json`, maps each artifact to its destination via `paths.rs`, and copies it into place. It then writes `promote-result.json` listing the files created, updated, or unchanged.
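The created/updated/unchanged classification can be sketched per artifact as follows. `promote_one` and `Outcome` are illustrative names; the real promote step also consults `compile-result.json` and `paths.rs`:

```rust
use std::fs;
use std::path::Path;

#[derive(Debug, PartialEq)]
enum Outcome { Created, Updated, Unchanged }

/// Compare the artifact with its destination, then copy only if needed.
fn promote_one(src: &Path, dest: &Path) -> std::io::Result<Outcome> {
    let new = fs::read(src)?;
    let outcome = match fs::read(dest) {
        Ok(old) if old == new => Outcome::Unchanged, // identical bytes
        Ok(_) => Outcome::Updated,                   // exists but differs
        Err(_) => Outcome::Created,                  // no destination yet
    };
    if outcome != Outcome::Unchanged {
        if let Some(parent) = dest.parent() {
            fs::create_dir_all(parent)?;
        }
        fs::copy(src, dest)?;
    }
    Ok(outcome)
}
```

Skipping the copy for unchanged files keeps promote idempotent: re-running it against an already-promoted repo is a no-op.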

Every JSON exchanged between modules has a corresponding JSON Schema. Validation happens at three layers:

| Layer | Where | What |
| --- | --- | --- |
| Library | Inside every module | `common::manifest::load_and_validate()` validates manifests on entry and results on exit |
| CLI | `rdt-model-validate schema` | Standalone validation of any JSON against any schema |
| Embedded | At compile time | All schemas are embedded via `include_str!`; the binary is self-contained |

The jsonschema crate (already a workspace dependency) handles all validation. No external tool needed.
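The library layer can be pictured as a wrapper that refuses to run on an invalid manifest and refuses to emit an invalid result. This is a generic sketch with hypothetical names; the real `common::manifest::load_and_validate()` checks against the embedded JSON Schemas via the `jsonschema` crate:

```rust
/// Validate on entry, run the module body, validate on exit.
/// The validator closures stand in for JSON Schema checks.
fn run_module<M, R>(
    manifest: M,
    validate_manifest: impl Fn(&M) -> Result<(), String>,
    body: impl Fn(M) -> R,
    validate_result: impl Fn(&R) -> Result<(), String>,
) -> Result<R, String> {
    validate_manifest(&manifest)?; // reject bad input before doing any work
    let result = body(manifest);
    validate_result(&result)?;     // never write an invalid result
    Ok(result)
}
```

The same shape applies to every module, which is why the check lives in the shared `common` library rather than in each binary.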

  • ADR 0006 — Multi-binary workspace (superseded by ADR 0011)
  • ADR 0007 — Data product lifecycle (defines the pipeline phases)
  • ADR 0008 — CLI module standards (defines per-binary conventions)
  • ADR 0009 — Module I/O contracts (full specification)
  • ADR 0011 — Pipeline restructure (18-module / 6-phase inventory)