
Systems & Integrations

roche-data does not replace existing enterprise systems — it orchestrates them. Each system retains its role as a specialist, while the compiler weaves them into a coherent, automated data pipeline.

The diagram below shows how data flows from the canonical model through generation, quality assurance, and governance to reach business users and AI tools.

[Diagram] Platform landscape — data flow from RTiS through the compiler to deployment targets and governance systems


RTiS — Source of Truth for Data Models

Role: The single source of truth for all data models. Ontologies, terminologies, synonyms, and entity relationships are maintained here by domain experts.
How roche-data uses it: Pulls entity definitions via REST/GraphQL. Every generated artifact traces back to an RTiS model version.
Business value: Domain experts work in familiar tooling. The platform reads their work and automates everything downstream.
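A minimal sketch of what pulling an entity definition can look like during generation. The class, field names, and version string below are illustrative fixtures, not the real RTiS schema; the real client would call the REST/GraphQL API instead.

```python
from dataclasses import dataclass, field


@dataclass
class EntityDefinition:
    """Illustrative shape of an entity pulled from RTiS."""
    name: str
    version: str                      # recorded so artifacts trace back to a model version
    attributes: dict = field(default_factory=dict)


class StubRTiSClient:
    """Returns fixture data in place of the real RTiS REST/GraphQL API."""

    def get_entity(self, name: str) -> EntityDefinition:
        return EntityDefinition(
            name=name,
            version="1.4.0",
            attributes={"site_id": "string", "material_code": "string"},
        )


client = StubRTiSClient()
entity = client.get_entity("ManufacturingSite")
# Downstream generators stamp entity.version into every artifact they emit.
```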

GUPRI — Globally Unique, Resolvable Identifiers

Role: Every artifact — every entity, every contract, every API — receives a globally unique, resolvable URI through GUPRI.
How roche-data uses it: Registers new artifacts and resolves existing identifiers during generation.
Business value: Unambiguous cross-system references. When Collibra, Snowflake, and an API all reference the same entity, they share the same GUPRI identifier.
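The resolve-or-register behavior can be sketched with an in-memory stand-in. The class name, method name, and URI format are assumptions for illustration; the real GUPRI service is a remote registry.

```python
import uuid


class StubGupriRegistry:
    """In-memory stand-in for the GUPRI identifier service (names are illustrative)."""

    def __init__(self) -> None:
        self._uris: dict[str, str] = {}

    def resolve_or_register(self, entity: str) -> str:
        # One identifier per entity, shared by every system that references it.
        if entity not in self._uris:
            self._uris[entity] = f"https://gupri.example/{uuid.uuid4()}"
        return self._uris[entity]


registry = StubGupriRegistry()
snowflake_ref = registry.resolve_or_register("ManufacturingSite")
collibra_ref = registry.resolve_or_register("ManufacturingSite")
# Both callers receive the same URI, so cross-system references stay unambiguous.
```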

Snowflake — Data Platform

Role: Houses all data: Bronze tables (physical, append-only), Silver views (validated), Gold views (business-ready). Hosts the Semantic Layer and Cortex Analyst.
How roche-data uses it: Generates DDL, dbt models, and Semantic YAML. Quality gates are embedded as view predicates.
Business value: One platform for storage, transformation, quality, and AI. No data movement between systems — views query the same physical data.
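"Quality gates embedded as view predicates" can be illustrated with a small DDL renderer. The schema names, view naming convention, and gate expressions are assumptions; the generated Silver view simply filters the Bronze table with the gate predicates in its WHERE clause.

```python
def silver_view_ddl(entity: str, gates: list[str]) -> str:
    """Render a Silver view over the Bronze table; each quality gate
    becomes a WHERE predicate, so invalid rows never reach consumers."""
    predicate = "\n  AND ".join(gates)
    return (
        f"CREATE OR REPLACE VIEW SILVER.{entity}_V AS\n"
        f"SELECT * FROM BRONZE.{entity}\n"
        f"WHERE {predicate};"
    )


ddl = silver_view_ddl(
    "SITES",
    ["site_id IS NOT NULL", "status IN ('ACTIVE','INACTIVE')"],
)
print(ddl)
```

Because the view queries the same physical Bronze data, tightening a gate is a view redeployment, not a data migration.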

MRHub — Master Data Reference

Role: Master data for business identity validation. Provides reference data for the G2 validity gate (e.g., “Is this site ID a real Roche facility?”).
How roche-data uses it: Validates records against master data during Silver view queries. Listens for reference data changes via Solace events.
Business value: Business identity checks happen automatically. Orphan records (references to non-existent sites, unknown material codes) are filtered before reaching analysts.
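The G2 validity gate reduces to a membership check against master data. The client class, site IDs, and record shape below are fixtures for illustration, following the stub-client pattern the platform uses in dry-run mode.

```python
class StubMRHubClient:
    """Fixture master data standing in for MRHub (site IDs are invented)."""

    KNOWN_SITES = {"CH-BSL-01", "US-SSF-02"}

    def is_valid_site(self, site_id: str) -> bool:
        return site_id in self.KNOWN_SITES


def g2_validity_gate(records: list[dict], mrhub: StubMRHubClient) -> list[dict]:
    """Drop orphan records whose site reference is unknown to master data."""
    return [r for r in records if mrhub.is_valid_site(r["site_id"])]


records = [{"site_id": "CH-BSL-01"}, {"site_id": "XX-NOPE-99"}]
valid = g2_validity_gate(records, StubMRHubClient())
# The orphan record referencing XX-NOPE-99 is filtered before analysts see it.
```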

Collibra — Governance Catalog

Role: Receives generated metadata on deployment. Business glossary, data lineage, ownership, and quality scores are synchronized automatically.
How roche-data uses it: Pushes contract metadata, quality gate results, and lineage information after successful deployment.
Business value: The governance catalog stays current without manual curation. Data stewards review and enrich — they don’t have to build from scratch.
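Assembling the post-deployment push might look like the sketch below. The payload shape is entirely illustrative — the real Collibra import API defines its own schema — but it shows the three pieces the platform synchronizes: contract metadata, quality results, and lineage.

```python
def collibra_sync_payload(contract: dict, gate_results: dict, lineage: list[dict]) -> dict:
    """Assemble metadata pushed to Collibra after a successful deployment.
    The field names here are assumptions, not the real Collibra schema."""
    return {
        "assets": [
            {
                "name": contract["entity"],
                "domain": contract["domain"],
                "attributes": {
                    "owner": contract["owner"],
                    "qualityScore": gate_results["score"],
                },
            }
        ],
        "lineage": lineage,
    }


payload = collibra_sync_payload(
    contract={"entity": "ManufacturingSite", "domain": "Manufacturing", "owner": "data-office"},
    gate_results={"score": 0.97},
    lineage=[{"from": "BRONZE.SITES", "to": "SILVER.SITES_V"}],
)
```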

OPA on CaaS — Policy Enforcement Runtime

Role: Runs compiled policies as live REST services on Roche’s managed Kubernetes platform. Covers validation, access control, workflow, API authorization, audit, and deployment gates.
How roche-data uses it: Compiles YAML policy definitions to Rego, generates Kubernetes manifests, deploys one OPA instance per entity.
Business value: Policies are code — versioned, reviewed, tested, and deployed like any other artifact. No more spreadsheet-based access matrices or manual deployment checklists.
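With one OPA instance per entity, callers hit OPA's standard data API (POST /v1/data/&lt;policy path&gt; with an "input" document). The sketch below only builds the request; the host name and policy path are assumptions for illustration.

```python
import json


def opa_query(entity: str, input_doc: dict) -> tuple[str, str]:
    """Build the REST call for a per-entity OPA instance.
    OPA's data API is POST /v1/data/<policy path> with {"input": ...};
    the host name and 'allow' rule path below are illustrative."""
    url = f"https://opa-{entity}.caas.example/v1/data/{entity}/allow"
    body = json.dumps({"input": input_doc})
    return url, body


url, body = opa_query("sites", {"user": "jdoe", "action": "read"})
# A real caller would POST `body` to `url` and read result["result"].
```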

ServiceNow — Change Management

Role: Automated change request creation for production deployments. Ensures every data platform change follows Roche’s ITIL change management process.
How roche-data uses it: Creates change requests via the Table API before production deployments. Links the change to the git commit and PR.
Business value: Change management compliance without manual ticket creation. Every deployment has an auditable change record.
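Creating the change record via the Table API means a POST to the change_request table. The sketch below only builds the request body; the description format and type value are assumptions, and no HTTP call is made.

```python
def change_request_payload(commit_sha: str, pr_url: str) -> dict:
    """Body for POST /api/now/table/change_request (ServiceNow Table API).
    Linking the commit and PR makes the change record auditable end to end;
    the exact wording and 'type' value here are illustrative."""
    return {
        "short_description": f"roche-data deployment {commit_sha[:7]}",
        "description": f"Automated deployment.\nCommit: {commit_sha}\nPR: {pr_url}",
        "type": "standard",
    }


payload = change_request_payload(
    "9f8e7d6c5b4a39281716", "https://github.com/example/repo/pull/42"
)
```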

GitHub Actions — Automation Backbone

Role: Validates artifacts on pull request, deploys on merge, builds documentation. The automation backbone.
How roche-data uses it: Three workflows: validate.yml (PR checks), deploy.yml (merge deployment), docs.yml (documentation rebuild).
Business value: Every change is validated before it reaches production. Peer review through pull requests. Full audit trail in git history.
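A sketch of what the PR-check workflow could look like. Only the file name (validate.yml) and its trigger come from the source; the job layout and the roche-data CLI invocation are assumptions.

```yaml
# validate.yml — sketch of the PR-check workflow (job and CLI names are illustrative)
name: validate
on: pull_request
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Validate contracts and policies against the canonical model
        run: roche-data validate --dry-run
```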

Mulesoft — API Management

Role: Publishes generated OpenAPI specifications as managed API proxies with authentication, rate limiting, and monitoring.
How roche-data uses it: Pushes generated OpenAPI specs to Anypoint Platform, creating or updating managed API proxies.
Business value: Data consumers get standard REST APIs with enterprise-grade security, monitoring, and SLA management — without any custom development.
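What gets pushed to Anypoint is a standard OpenAPI document. A minimal generated spec might look like this; the path convention and operation shape are assumptions about what the generator emits.

```python
def openapi_spec(entity: str, version: str) -> dict:
    """Minimal OpenAPI 3 document for a generated read API.
    The /<entity>s path convention is illustrative."""
    return {
        "openapi": "3.0.3",
        "info": {"title": f"{entity} API", "version": version},
        "paths": {
            f"/{entity.lower()}s": {
                "get": {
                    "summary": f"List {entity} records",
                    "responses": {"200": {"description": "OK"}},
                }
            }
        },
    }


spec = openapi_spec("Site", "1.0.0")
# This document is what gets pushed to Anypoint to create or update the proxy.
```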

LeanIX — Enterprise Architecture Catalog

Role: Enterprise architecture catalog. Generated artifacts are registered as IT components in the EA landscape.
How roche-data uses it: Registers data products, APIs, and platform components in the LeanIX catalog via REST API.
Business value: Enterprise architects have real-time visibility into the data platform’s footprint without manual documentation.

Data Marketplace — Self-Service Discovery

Role: Data product index for self-service discovery across Roche. Business users browse and subscribe to data products.
How roche-data uses it: Publishes data product metadata (contract, quality scores, access instructions) to the marketplace.
Business value: Business users find and request access to data products through a self-service portal — no tickets, no waiting for someone to point them to the right table.

| System | Access status | Current mode |
| --- | --- | --- |
| RTiS | Pending (A01) | Stub client with fixture data |
| GUPRI | Pending (A02) | Stub client with fixture data |
| MRHub | Pending (A03, A04) | Stub client with fixture data |
| Snowflake | Pending (A05, A06) | Stub client with fixture data |
| Collibra | Pending (A07, A08) | Stub client with fixture data |
| Mulesoft | Pending (A09) | Stub client with fixture data |
| ServiceNow | Pending (A12) | Stub client with fixture data |
| CaaS (Kubernetes) | Pending (A13) | Stub client with fixture data |
| GitHub Actions | Active | CI/CD operational |
| LeanIX | Pending (A14) | Stretch goal |
| Data Marketplace | Pending (A15) | Stretch goal |

Design principle: The entire pipeline runs in --dry-run mode using stub clients that return realistic fixture data. This means development and testing proceed at full speed while platform access requests are being processed. When access is granted, swapping StubClient for HttpClient is a configuration change — not a code change.
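The stub-to-HTTP swap as a configuration change can be sketched with a small factory. The protocol, class names beyond StubClient/HttpClient, and the config key are assumptions about how the wiring might look.

```python
from typing import Protocol


class RTiSClient(Protocol):
    """Shared interface both client flavors satisfy (shape is illustrative)."""

    def get_entity(self, name: str) -> dict: ...


class StubRTiSClient:
    """Fixture data for --dry-run mode."""

    def get_entity(self, name: str) -> dict:
        return {"name": name, "version": "fixture"}


class HttpRTiSClient:
    """Real client, usable once the platform access request is granted."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url

    def get_entity(self, name: str) -> dict:
        raise NotImplementedError("requires platform access")


def make_client(config: dict) -> RTiSClient:
    # Swapping stub for HTTP is a configuration change, not a code change:
    # provide a URL and the factory returns the real client instead.
    if config.get("rtis_url"):
        return HttpRTiSClient(config["rtis_url"])
    return StubRTiSClient()


client = make_client({})  # no URL configured, so the pipeline runs on fixtures
```

Because every caller depends only on the protocol, nothing downstream changes when the real endpoint appears in configuration.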