A data infrastructure compiler that turns Roche's canonical data models into production-ready pipelines, governed APIs, and conversational AI — fully automated, fully auditable, fully compliant.
RDT MODEL is a data infrastructure compiler for Roche Global IT. A single command — rdt-model-compile run --entity <name> — takes an RTiS ontology model, assembles metadata from RTiS (schema), Collibra (governance), and platform config (infrastructure), then orchestrates 18 specialized CLI modules across 6 pipeline phases to produce a complete data product: Snowflake layers, data contracts, OPA policies, SDKs, MCP tools, API specs, documentation, and audit trail. All artifacts are committed to git and deployed through GitHub Actions.
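The metadata assembly step described above can be pictured roughly as follows. This is a minimal sketch, not RDT MODEL's actual code: the EntityMetadata class, the assemble_metadata function, and the plain dictionaries standing in for RTiS, Collibra, and platform config are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class EntityMetadata:
    """Hypothetical container for the three metadata sources named above."""
    entity: str
    schema: Dict[str, Any]          # would come from RTiS
    governance: Dict[str, Any]      # would come from Collibra
    infrastructure: Dict[str, Any]  # would come from platform config


def assemble_metadata(
    entity: str,
    rtis: Dict[str, Dict[str, Any]],
    collibra: Dict[str, Dict[str, Any]],
    platform: Dict[str, Dict[str, Any]],
) -> EntityMetadata:
    """Combine the three sources for one entity, failing early if any is missing."""
    sources = (("RTiS", rtis), ("Collibra", collibra), ("platform config", platform))
    missing = [name for name, source in sources if entity not in source]
    if missing:
        raise KeyError(f"no metadata for {entity!r} in: {', '.join(missing)}")
    return EntityMetadata(entity, rtis[entity], collibra[entity], platform[entity])
```

Failing before any module runs, rather than halfway through the pipeline, is one plausible design choice for a compiler-style tool; the source does not say how RDT MODEL handles a missing source.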
RTiS is the grammar. RDT MODEL is the compiler. Git is the object store. dbt is the runtime.
Roche operates hundreds of isolated data products across global sites — each with its own schema definitions, quality rules, and access patterns. Adding a new business question today takes 6–8 weeks and requires specialist intervention at every layer.
RDT MODEL changes this equation. One command. Minutes, not weeks.
Automated Pipeline Generation
A single rdt-model-compile run command reads a canonical data model from RTiS and generates every downstream artifact: data contracts, Snowflake schemas, dbt models across Bronze/Silver/Gold layers, semantic definitions, OpenAPI specs, and AI tool definitions. All committed to git. All deployed through CI/CD.
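For teams that want to trigger the compile step from a script, the command line can be built programmatically. The sketch below uses only the invocation quoted in this document (rdt-model-compile run --entity <name>); the build_compile_command helper and the entity-name pattern it enforces are assumptions for illustration, not part of the tool.

```python
import re
import shlex
from typing import List

# Assumed naming rule for entities; the real CLI may accept other characters.
_ENTITY_PATTERN = re.compile(r"[A-Za-z0-9_.-]+")


def build_compile_command(entity: str) -> List[str]:
    """Return the argv list for compiling one entity (hypothetical helper)."""
    if not _ENTITY_PATTERN.fullmatch(entity):
        raise ValueError(f"unexpected entity name: {entity!r}")
    return ["rdt-model-compile", "run", "--entity", entity]


def format_command(entity: str) -> str:
    """Render the command as a shell-safe string, e.g. for logging."""
    return shlex.join(build_compile_command(entity))
```

The argv list can be passed to subprocess.run directly, which avoids shell quoting issues.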
Four-Gate Data Quality
Every data record passes through four progressive quality gates: technical completeness (G1), business validity (G2), domain-specific rules (G3), and AI-readiness certification (G4). Each gate a record passes raises its trust level, and no record reaches a dashboard or AI model until it has earned the level that consumer requires.
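The progressive nature of the gates can be sketched as a chain of checks where a record's trust level is the last gate it passed before its first failure. Only the gate names G1 to G4 come from this document. The field names, the individual checks, and the site codes below are invented placeholders, not RDT MODEL's real rules.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Record = Dict[str, Any]

# Placeholder checks, one per gate, evaluated strictly in order.
GATES: List[Tuple[str, Callable[[Record], bool]]] = [
    # G1 technical completeness: required fields are present and non-null.
    ("G1", lambda r: all(r.get(k) is not None for k in ("id", "site", "quantity"))),
    # G2 business validity: quantity must not be negative.
    ("G2", lambda r: r["quantity"] >= 0),
    # G3 domain-specific rule: site must be a known (hypothetical) site code.
    ("G3", lambda r: r["site"] in {"SITE_A", "SITE_B"}),
    # G4 AI-readiness: a non-empty description exists for semantic tooling.
    ("G4", lambda r: bool(r.get("description"))),
]


def trust_level(record: Record) -> Optional[str]:
    """Return the highest gate passed in sequence, or None if G1 fails."""
    highest = None
    for name, check in GATES:
        if not check(record):
            break  # later gates are not evaluated once one fails
        highest = name
    return highest
```

Because evaluation stops at the first failure, a later gate can safely assume everything an earlier gate verified; here G2 reads the quantity field without a null check because G1 already guaranteed it.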
Enterprise Policy Enforcement
A unified policy engine powered by Open Policy Agent (OPA) governs six domains from a single YAML definition: validation rules, row/column access control, workflow state machines, API authorization, audit compliance triggers, and deployment gates.
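One consequence of a single definition spanning six domains is that completeness can be checked mechanically before anything is handed to OPA. The sketch below assumes the YAML has already been parsed into a dictionary. The six top-level key names are hypothetical stand-ins for the domains listed above, and missing_domains is an illustrative helper rather than part of RDT MODEL or OPA.

```python
from typing import Any, Dict, List

# Hypothetical top-level keys, one per policy domain named in the text.
REQUIRED_DOMAINS = frozenset({
    "validation",         # validation rules
    "access_control",     # row/column access control
    "workflow",           # workflow state machines
    "api_authorization",  # API authorization
    "audit",              # audit compliance triggers
    "deployment_gates",   # deployment gates
})


def missing_domains(policy: Dict[str, Any]) -> List[str]:
    """Return the policy domains absent from a parsed definition, sorted by name."""
    return sorted(REQUIRED_DOMAINS - set(policy))
```

An empty result means every domain has at least a stub entry; it says nothing about whether the rules inside each domain are valid.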
AI-Ready from Day One
Every generated artifact includes semantic definitions that Snowflake Cortex Analyst and MCP-compatible AI tools understand natively. The moment data passes all four quality gates, it’s available for natural-language queries.