
Pipeline Configuration

Reference for all configuration that controls pipeline orchestration. For the practical how-to, see Running Multi-Entity Pipelines.

The entity list is defined in roche-data.toml under [entities]. It is the authoritative list for multi-entity pipeline runs and for the GitHub Actions matrix.

[entities]
list = ["waste-tracking", "organization-site"]

Each entry must:

  • Be a valid entity name (lowercase, alphanumeric, hyphens, underscores)
  • Have a corresponding [rtis.entities.<name>] section with RTiS IDs
  • Produce a models/<name>/model.json after the first rdt-model-pull run
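
The three requirements above can be checked mechanically. The following is a minimal sketch, not the project's own validation: `check_entities` is a hypothetical helper, it takes the already-parsed contents of roche-data.toml as a dictionary (for example from `tomllib.load` on Python 3.11+), and it assumes the name rule is the same `^[a-z0-9_-]+$` pattern used for `entity_id` in the manifest schema. It only checks that an [rtis.entities.<name>] section exists, not what the section contains.

```python
import re
from pathlib import Path

# Assumption: same pattern as the manifest's entity_id field.
ENTITY_NAME = re.compile(r"^[a-z0-9_-]+$")


def check_entities(config: dict, repo_root: Path) -> dict:
    """Return {entity name: [problems]} for every entry in [entities].list."""
    rtis_entities = config.get("rtis", {}).get("entities", {})
    report = {}
    for name in config.get("entities", {}).get("list", []):
        problems = []
        if not ENTITY_NAME.fullmatch(name):
            problems.append("invalid name")
        if name not in rtis_entities:
            problems.append(f"missing [rtis.entities.{name}] section")
        # model.json only exists after the first rdt-model-pull run.
        if not (repo_root / "models" / name / "model.json").is_file():
            problems.append("models/<name>/model.json not found")
        report[name] = problems
    return report
```

An empty problem list for every entity means the entry satisfies all three rules.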

| Variable | Description | Example |
|---|---|---|
| RDT_TARGET | Target environment | dev, test, prod |

| Variable | Description | Default |
|---|---|---|
| RDT_WORKSPACE_DIR | Base directory for pipeline workspaces | $TMPDIR or /tmp |
| RDT_RUN_ID | Correlation ID for the pipeline run (UUIDv7) | Auto-generated |

| Variable | Set by | Description |
|---|---|---|
| VAULT_ADDR | GitHub Environment | Vault URL for secret loading |
| VAULT_NAMESPACE | GitHub Environment | Vault namespace (rdt-model-prd) |
| All SNOWFLAKE_* | Vault action | Snowflake credentials per environment |
| All RTIS_* | Vault action | RTiS API credentials |
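
As an illustration of the defaults listed for these variables, here is a minimal sketch of how a wrapper script might resolve them. `resolve_settings` and `new_uuid7` are hypothetical names, treating RDT_TARGET as mandatory is an assumption, and the UUIDv7 is assembled by hand following the RFC 9562 bit layout because `uuid.uuid7` is only available in recent Python versions. The actual tooling may resolve these values differently.

```python
import secrets
import time
import uuid

VALID_TARGETS = ("dev", "test", "prod")


def new_uuid7() -> str:
    """Build a UUIDv7: 48-bit millisecond timestamp followed by random bits."""
    ts_ms = time.time_ns() // 1_000_000
    value = (ts_ms & ((1 << 48) - 1)) << 80  # unix_ts_ms
    value |= 0x7 << 76                       # version 7
    value |= secrets.randbits(12) << 64      # rand_a
    value |= 0b10 << 62                      # RFC variant
    value |= secrets.randbits(62)            # rand_b
    return str(uuid.UUID(int=value))


def resolve_settings(env: dict) -> dict:
    """Apply the documented defaults to a mapping such as os.environ."""
    target = env.get("RDT_TARGET")
    if target not in VALID_TARGETS:
        raise ValueError(f"RDT_TARGET must be one of {VALID_TARGETS}, got {target!r}")
    return {
        "target": target,
        # Documented default: $TMPDIR, falling back to /tmp.
        "workspace_dir": env.get("RDT_WORKSPACE_DIR") or env.get("TMPDIR") or "/tmp",
        # Documented default: auto-generated UUIDv7.
        "run_id": env.get("RDT_RUN_ID") or new_uuid7(),
    }
```

Calling `resolve_settings(os.environ)` would return the effective target, workspace base directory, and run ID for the current shell.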

| Flag | Short | Description |
|---|---|---|
| --target <env> | -t | Required. Target environment: dev, test, prod |
| --entity <name> | -e | Required. Entity identifier |
| --workspace <path> | | Enable workspace isolation mode |
| --phase <name> | | Run only a single phase |
| --dry-run | -n | Skip all side-effects |
| --json | | Emit PipelineResult to stdout |
| --run-id <id> | | Correlation ID (generated if omitted) |
| --quiet | -q | Suppress tracing output |
| --verbose | -v | Increase verbosity (-v = DEBUG, -vv = TRACE) |

| Flag | Short | Description |
|---|---|---|
| --target <env> | -t | Required. Target environment |
| --entity <name> | -e | Required. Entity to promote |
| --workspace <path> | | Required. Workspace directory to promote from |
| --dry-run | -n | Show what would be promoted without copying |
| --json | | Emit PromoteResult to stdout |

Every rdt-model-* binary accepts --manifest <path> as an alternative to --target + --entity. When provided, the module reads all invocation context from the manifest JSON.

rdt-model-store --manifest /tmp/rdt-waste-tracking-abc/store-manifest.json --json generate

--manifest conflicts with --entity; the two cannot be specified together.

JSON Schema: cli/common/schemas/modules/pipeline-manifest.schema.json

| Field | Type | Required | Description |
|---|---|---|---|
| entity_id | string | yes | Entity identifier (^[a-z0-9_-]+$) |
| workspace | string | yes | Absolute path to workspace root |
| target | string (enum) | yes | dev, test, or prod |
| run_id | string | yes | UUIDv7 correlation ID |
| dry_run | boolean | no | Default: false |
| inputs | object | no | Named input artifact paths (workspace-relative) |
| output_dir | string | yes | Phase subdirectory for outputs (workspace-relative) |

{
  "entity_id": "waste-tracking",
  "workspace": "/tmp/rdt-waste-tracking-019600ab-cdef",
  "target": "dev",
  "run_id": "019600ab-cdef-7000-89ab-0123456789ab",
  "dry_run": false,
  "inputs": {
    "model": "pull/model.json",
    "governance": "govern/governance.json"
  },
  "output_dir": "deploy"
}
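
For quick local checks, the field rules for the manifest can be approximated in a few lines. This is a hand-written sketch of a subset of the constraints, not a replacement for the JSON Schema, which remains authoritative. `manifest_errors` is a hypothetical helper; it does not verify that `run_id` is a well-formed UUIDv7.

```python
import re
from pathlib import PurePosixPath

ENTITY_ID = re.compile(r"^[a-z0-9_-]+$")
TARGETS = {"dev", "test", "prod"}
REQUIRED_STRINGS = ("entity_id", "workspace", "target", "run_id", "output_dir")


def manifest_errors(manifest: dict) -> list:
    """Return human-readable problems; an empty list means the checks passed."""
    errors = [
        f"{key} is required and must be a string"
        for key in REQUIRED_STRINGS
        if not isinstance(manifest.get(key), str)
    ]
    if errors:
        return errors
    if not ENTITY_ID.fullmatch(manifest["entity_id"]):
        errors.append("entity_id does not match ^[a-z0-9_-]+$")
    if not PurePosixPath(manifest["workspace"]).is_absolute():
        errors.append("workspace must be an absolute path")
    if manifest["target"] not in TARGETS:
        errors.append("target must be dev, test, or prod")
    if not isinstance(manifest.get("dry_run", False), bool):
        errors.append("dry_run must be a boolean")
    # output_dir and every inputs value are documented as workspace-relative.
    for path in [manifest["output_dir"], *manifest.get("inputs", {}).values()]:
        if PurePosixPath(path).is_absolute():
            errors.append(f"{path} must be workspace-relative")
    return errors
```

Passing the result of `json.load` on a manifest file to `manifest_errors` gives a list that is empty when all of these checks pass.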

The orchestrator populates the inputs map based on outputs from previous phases:

| Phase | Module | Available input keys |
|---|---|---|
| 1 Ingest | pull, profile | (none; entry point) |
| 2 Enrich | govern, infer | model |
| 3 Prepare | validate | model, governance |
| 4 Deploy | store, policy, api, mcp, sdk, contract | model, governance, suggestions |
| 5 Register | register, gupri, search | All Phase 4 outputs |
| 6 Support | docs, cidb, event | All Phase 5 outputs |
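
The phase table can be read as a lookup from phase to the input keys a manifest may carry. The sketch below illustrates that idea for phases 1 to 4 only, since the keys for phases 5 and 6 are not enumerated here. `build_inputs` is a hypothetical function, and the artifact paths are taken from the examples in this reference (pull/model.json, govern/governance.json, infer/suggestions.json); the orchestrator's real logic may differ.

```python
# Input keys available to each phase, per the table above (phases 1-4 only).
AVAILABLE_KEYS = {
    "ingest": [],
    "enrich": ["model"],
    "prepare": ["model", "governance"],
    "deploy": ["model", "governance", "suggestions"],
}

# Workspace-relative paths, as they appear in this document's examples.
ARTIFACT_PATHS = {
    "model": "pull/model.json",
    "governance": "govern/governance.json",
    "suggestions": "infer/suggestions.json",
}


def build_inputs(phase: str, produced: set) -> dict:
    """Build the manifest `inputs` map from artifacts earlier phases produced."""
    return {
        key: ARTIFACT_PATHS[key]
        for key in AVAILABLE_KEYS[phase]
        if ARTIFACT_PATHS[key] in produced
    }
```

With only the model and governance artifacts present, a deploy manifest built this way would carry `model` and `governance` but omit `suggestions`.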

PipelineResult, emitted by rdt-model-compile run --json:

{
  "entity_id": "waste-tracking",
  "status": "ok",
  "timestamp": "2026-05-12T10:30:00Z",
  "duration_ms": 45000,
  "phases": [
    {
      "phase": "ingest",
      "status": "ok",
      "duration_ms": 5000,
      "modules": [ /* ModuleResult[] */ ]
    }
  ]
}

ModuleResult, emitted by every rdt-model-* binary with --json:

{
  "module": "rdt-model-store",
  "version": "0.1.0",
  "status": "ok",
  "entity_id": "waste-tracking",
  "timestamp": "2026-05-12T10:30:05Z",
  "duration_ms": 1200,
  "outputs": {
    "snowflake/ddl/waste-tracking.sql": "wrote",
    "dbt/models/bronze/waste-tracking.sql": "updated"
  },
  "errors": []
}

PromoteResult, emitted by rdt-model-compile promote --json:

{
  "entity_id": "waste-tracking",
  "promoted": [
    {"source": "pull/model.json", "destination": "models/waste-tracking/model.json", "action": "wrote"}
  ],
  "skipped": ["infer/suggestions.json"]
}
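
The `action` value recorded for each promoted file follows the wrote / updated / skipped meanings documented in this reference. As a rough sketch of that decision, assuming a byte-for-byte comparison decides whether content is identical, a hypothetical `promote_file` helper could look like this. It is not the promote command's actual implementation.

```python
import filecmp
import shutil
from pathlib import Path


def promote_file(source: Path, destination: Path, dry_run: bool = False) -> str:
    """Copy source to destination and return the resulting action string."""
    if dry_run:
        return "skipped"  # dry-run never touches the destination
    if destination.exists():
        # shallow=False forces a content comparison instead of a stat check.
        if filecmp.cmp(source, destination, shallow=False):
            return "skipped"  # identical content already in place
        action = "updated"
    else:
        action = "wrote"
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, destination)
    return action
```

Promoting the same unchanged file twice would therefore report `wrote` the first time and `skipped` the second.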

| Status | Meaning |
|---|---|
| ok | All required modules succeeded, at least one artifact produced |
| warning | All required modules succeeded, but all outputs were skipped (unchanged or dry-run) |
| partial | Some required modules failed, but some outputs were produced |
| error | Complete failure: no outputs (or all required modules failed) |

| Action | Meaning |
|---|---|
| wrote | New file created |
| updated | Existing file overwritten with different content |
| skipped | Dry-run, or file already exists with identical content |
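
The status meanings can be read as a small decision procedure over module outcomes and their output actions. The sketch below is one possible reading under stated assumptions: `ModuleOutcome` and its `required` flag are hypothetical (the ModuleResult JSON shown in this reference has no such field), and an output counts as produced when its action is `wrote` or `updated`.

```python
from dataclasses import dataclass, field


@dataclass
class ModuleOutcome:
    required: bool   # hypothetical: whether the module is required
    succeeded: bool
    actions: list = field(default_factory=list)  # "wrote", "updated", "skipped"


def aggregate_status(modules: list) -> str:
    """Map module outcomes to ok / warning / partial / error."""
    required = [m for m in modules if m.required]
    failed = [m for m in required if not m.succeeded]
    produced = any(
        action in ("wrote", "updated") for m in modules for action in m.actions
    )
    if required and len(failed) == len(required):
        return "error"  # all required modules failed
    if failed:
        return "partial" if produced else "error"
    return "ok" if produced else "warning"
```

Under this reading, a dry run in which every required module succeeds yields `warning`, because every output action is `skipped`.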

The workspace is created by scripts/pipeline.sh or WorkspaceConfig::create():

/tmp/rdt-{entity}-{run_id}/
├── pull/
├── govern/
├── infer/
├── compile/
│   └── artifacts/
├── validate/
├── deploy/
├── register/
├── support/
├── pipeline-result.json
└── pipeline-stderr.log
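
To make the layout concrete, here is a minimal sketch that creates the same directory tree. `create_workspace` is a hypothetical stand-in, not the code behind scripts/pipeline.sh or WorkspaceConfig::create(), and it assumes pipeline-result.json and pipeline-stderr.log are written later during the run rather than at creation time. `tempfile.gettempdir()` is used as an approximation of the "$TMPDIR or /tmp" default.

```python
import os
import tempfile
from pathlib import Path

# Subdirectories shown in the workspace tree above.
PHASE_DIRS = [
    "pull", "govern", "infer", "compile/artifacts",
    "validate", "deploy", "register", "support",
]


def create_workspace(entity: str, run_id: str, base=None) -> Path:
    """Create <base>/rdt-{entity}-{run_id}/ with one directory per phase."""
    if base is None:
        base = os.environ.get("RDT_WORKSPACE_DIR") or tempfile.gettempdir()
    root = Path(base) / f"rdt-{entity}-{run_id}"
    for sub in PHASE_DIRS:
        (root / sub).mkdir(parents=True, exist_ok=True)
    return root
```

The returned path is what the `workspace` field of a manifest would point at.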

scripts/pipeline.sh <entity> [options]

| Argument | Description |
|---|---|
| <entity> | Required. Entity identifier |
| --target <env> | Target environment (default: $RDT_TARGET or dev) |
| --keep-workspace | Do not delete workspace after completion |

entity-pipeline.yml supports workflow_dispatch with:

| Input | Required | Default | Description |
|---|---|---|---|
| entities | no | (all from roche-data.toml) | Comma-separated entity list |
| target | yes | dev | Target environment (choice: dev/test/prod) |
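
The `entities` input falls back to the full list from roche-data.toml when left empty. A minimal sketch of that resolution follows, together with one way the result could be serialized for a job matrix. `resolve_entities` and `matrix_json` are hypothetical helpers, the matrix shape is an assumption rather than the layout entity-pipeline.yml actually uses, and rejecting unknown names is a design choice of this sketch.

```python
import json


def resolve_entities(entities_input: str, configured: list) -> list:
    """Split the comma-separated input; an empty input selects every entity."""
    requested = [e.strip() for e in entities_input.split(",") if e.strip()]
    if not requested:
        return list(configured)
    unknown = [e for e in requested if e not in configured]
    if unknown:
        raise ValueError(f"not listed in roche-data.toml: {', '.join(unknown)}")
    return requested


def matrix_json(entities: list, target: str) -> str:
    """Serialize a matrix with one job per entity (hypothetical shape)."""
    return json.dumps({"entity": entities, "target": [target]})
```

Failing early on an unknown entity name keeps a typo in the dispatch form from silently producing an empty or partial run.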