
Systems & Integrations

roche-data does not replace existing enterprise systems — it orchestrates them. Each system retains its role as a specialist, while the compiler weaves them into a coherent, automated data pipeline.

The diagram below shows how data flows from the canonical model through generation, quality assurance, and governance to reach business users and AI tools.

[Diagram] Platform landscape — data flow from RTiS through the compiler to deployment targets and governance systems


RTiS — Source of Truth for Data Models

Role: The single source of truth for all data models. Ontologies, terminologies, synonyms, and entity relationships are maintained here by domain experts.
How roche-data uses it: Pulls entity definitions via REST/GraphQL. Every generated artifact traces back to an RTiS model version.
Business value: Domain experts work in familiar tooling. The platform reads their work and automates everything downstream.
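A minimal sketch of what pulling an entity definition can look like during generation. The class, field names, and version string below are illustrative fixtures, not the real RTiS schema; the real client would call the REST/GraphQL API instead.

```python
from dataclasses import dataclass, field


@dataclass
class EntityDefinition:
    """Illustrative shape of an entity pulled from RTiS."""
    name: str
    version: str                      # recorded so artifacts trace back to a model version
    attributes: dict = field(default_factory=dict)


class StubRTiSClient:
    """Returns fixture data in place of the real RTiS REST/GraphQL API."""

    def get_entity(self, name: str) -> EntityDefinition:
        return EntityDefinition(
            name=name,
            version="1.4.0",
            attributes={"site_id": "string", "material_code": "string"},
        )


client = StubRTiSClient()
entity = client.get_entity("ManufacturingSite")
# Downstream generators stamp entity.version into every artifact they emit.
```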

GUPRI — Globally Unique, Resolvable Identifiers

Role: Every artifact — every entity, every contract, every API — receives a globally unique, resolvable URI through GUPRI.
How roche-data uses it: Registers new artifacts and resolves existing identifiers during generation.
Business value: Unambiguous cross-system references. When Collibra, Snowflake, and an API all reference the same entity, they share the same GUPRI identifier.
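The resolve-or-register behavior can be sketched with an in-memory stand-in. The class name, method name, and URI format are assumptions for illustration; the real GUPRI service is a remote registry.

```python
import uuid


class StubGupriRegistry:
    """In-memory stand-in for the GUPRI identifier service (names are illustrative)."""

    def __init__(self) -> None:
        self._uris: dict[str, str] = {}

    def resolve_or_register(self, entity: str) -> str:
        # One identifier per entity, shared by every system that references it.
        if entity not in self._uris:
            self._uris[entity] = f"https://gupri.example/{uuid.uuid4()}"
        return self._uris[entity]


registry = StubGupriRegistry()
snowflake_ref = registry.resolve_or_register("ManufacturingSite")
collibra_ref = registry.resolve_or_register("ManufacturingSite")
# Both callers receive the same URI, so cross-system references stay unambiguous.
```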

Snowflake — Data Platform

Role: Houses all data: Bronze tables (physical, append-only), Silver views (validated), Gold views (business-ready). Hosts the Semantic Layer and Cortex Analyst.
How roche-data uses it: Generates DDL, dbt models, and Semantic YAML. Quality gates are embedded as view predicates.
Business value: One platform for storage, transformation, quality, and AI. No data movement between systems — views query the same physical data.
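"Quality gates embedded as view predicates" can be illustrated with a small DDL renderer. The schema names, view naming convention, and gate expressions are assumptions; the generated Silver view simply filters the Bronze table with the gate predicates in its WHERE clause.

```python
def silver_view_ddl(entity: str, gates: list[str]) -> str:
    """Render a Silver view over the Bronze table; each quality gate
    becomes a WHERE predicate, so invalid rows never reach consumers."""
    predicate = "\n  AND ".join(gates)
    return (
        f"CREATE OR REPLACE VIEW SILVER.{entity}_V AS\n"
        f"SELECT * FROM BRONZE.{entity}\n"
        f"WHERE {predicate};"
    )


ddl = silver_view_ddl(
    "SITES",
    ["site_id IS NOT NULL", "status IN ('ACTIVE','INACTIVE')"],
)
print(ddl)
```

Because the view queries the same physical Bronze data, tightening a gate is a view redeployment, not a data migration.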

MRHub — Master Data Reference

Role: Master data for business identity validation. Provides reference data for the G2 validity gate (e.g., “Is this site ID a real Roche facility?”).
How roche-data uses it: Validates records against master data during Silver view queries. Listens for reference data changes via Solace events.
Business value: Business identity checks happen automatically. Orphan records (references to non-existent sites, unknown material codes) are filtered before reaching analysts.
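The G2 validity gate reduces to a membership check against master data. The client class, site IDs, and record shape below are fixtures for illustration, following the stub-client pattern the platform uses in dry-run mode.

```python
class StubMRHubClient:
    """Fixture master data standing in for MRHub (site IDs are invented)."""

    KNOWN_SITES = {"CH-BSL-01", "US-SSF-02"}

    def is_valid_site(self, site_id: str) -> bool:
        return site_id in self.KNOWN_SITES


def g2_validity_gate(records: list[dict], mrhub: StubMRHubClient) -> list[dict]:
    """Drop orphan records whose site reference is unknown to master data."""
    return [r for r in records if mrhub.is_valid_site(r["site_id"])]


records = [{"site_id": "CH-BSL-01"}, {"site_id": "XX-NOPE-99"}]
valid = g2_validity_gate(records, StubMRHubClient())
# The orphan record referencing XX-NOPE-99 is filtered before analysts see it.
```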

Collibra — Governance Catalog

Role: Receives generated metadata on deployment. Business glossary, data lineage, ownership, and quality scores are synchronized automatically.
How roche-data uses it: Pushes contract metadata, quality gate results, and lineage information after successful deployment.
Business value: The governance catalog stays current without manual curation. Data stewards review and enrich — they don’t have to build from scratch.
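Assembling the post-deployment push might look like the sketch below. The payload shape is entirely illustrative — the real Collibra import API defines its own schema — but it shows the three pieces the platform synchronizes: contract metadata, quality results, and lineage.

```python
def collibra_sync_payload(contract: dict, gate_results: dict, lineage: list[dict]) -> dict:
    """Assemble metadata pushed to Collibra after a successful deployment.
    The field names here are assumptions, not the real Collibra schema."""
    return {
        "assets": [
            {
                "name": contract["entity"],
                "domain": contract["domain"],
                "attributes": {
                    "owner": contract["owner"],
                    "qualityScore": gate_results["score"],
                },
            }
        ],
        "lineage": lineage,
    }


payload = collibra_sync_payload(
    contract={"entity": "ManufacturingSite", "domain": "Manufacturing", "owner": "data-office"},
    gate_results={"score": 0.97},
    lineage=[{"from": "BRONZE.SITES", "to": "SILVER.SITES_V"}],
)
```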

OPA on CaaS — Policy Enforcement Runtime

Role: Runs compiled policies as live REST services on Roche’s managed Kubernetes platform. Covers validation, access control, workflow, API authorization, audit, and deployment gates.
How roche-data uses it: Compiles YAML policy definitions to Rego, generates Kubernetes manifests, deploys one OPA instance per entity.
Business value: Policies are code — versioned, reviewed, tested, and deployed like any other artifact. No more spreadsheet-based access matrices or manual deployment checklists.
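With one OPA instance per entity, callers hit OPA's standard data API (POST /v1/data/&lt;policy path&gt; with an "input" document). The sketch below only builds the request; the host name and policy path are assumptions for illustration.

```python
import json


def opa_query(entity: str, input_doc: dict) -> tuple[str, str]:
    """Build the REST call for a per-entity OPA instance.
    OPA's data API is POST /v1/data/<policy path> with {"input": ...};
    the host name and 'allow' rule path below are illustrative."""
    url = f"https://opa-{entity}.caas.example/v1/data/{entity}/allow"
    body = json.dumps({"input": input_doc})
    return url, body


url, body = opa_query("sites", {"user": "jdoe", "action": "read"})
# A real caller would POST `body` to `url` and read result["result"].
```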

ServiceNow — Change Management

Role: Automated change request creation for production deployments. Ensures every data platform change follows Roche’s ITIL change management process.
How roche-data uses it: Creates change requests via the Table API before production deployments. Links the change to the git commit and PR.
Business value: Change management compliance without manual ticket creation. Every deployment has an auditable change record.
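Creating the change record via the Table API means a POST to the change_request table. The sketch below only builds the request body; the description format and type value are assumptions, and no HTTP call is made.

```python
def change_request_payload(commit_sha: str, pr_url: str) -> dict:
    """Body for POST /api/now/table/change_request (ServiceNow Table API).
    Linking the commit and PR makes the change record auditable end to end;
    the exact wording and 'type' value here are illustrative."""
    return {
        "short_description": f"roche-data deployment {commit_sha[:7]}",
        "description": f"Automated deployment.\nCommit: {commit_sha}\nPR: {pr_url}",
        "type": "standard",
    }


payload = change_request_payload(
    "9f8e7d6c5b4a39281716", "https://github.com/example/repo/pull/42"
)
```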

GitHub Actions — Automation Backbone

Role: Validates artifacts on pull request, deploys on merge, builds documentation. The automation backbone.
How roche-data uses it: Three workflows: validate.yml (PR checks), deploy.yml (merge deployment), docs.yml (documentation rebuild).
Business value: Every change is validated before it reaches production. Peer review through pull requests. Full audit trail in git history.
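A sketch of what the PR-check workflow could look like. Only the file name (validate.yml) and its trigger come from the source; the job layout and the roche-data CLI invocation are assumptions.

```yaml
# validate.yml — sketch of the PR-check workflow (job and CLI names are illustrative)
name: validate
on: pull_request
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Validate contracts and policies against the canonical model
        run: roche-data validate --dry-run
```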

Mulesoft — API Management

Role: Publishes generated OpenAPI specifications as managed API proxies with authentication, rate limiting, and monitoring.
How roche-data uses it: Pushes generated OpenAPI specs to Anypoint Platform, creating or updating managed API proxies.
Business value: Data consumers get standard REST APIs with enterprise-grade security, monitoring, and SLA management — without any custom development.
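What gets pushed to Anypoint is a standard OpenAPI document. A minimal generated spec might look like this; the path convention and operation shape are assumptions about what the generator emits.

```python
def openapi_spec(entity: str, version: str) -> dict:
    """Minimal OpenAPI 3 document for a generated read API.
    The /<entity>s path convention is illustrative."""
    return {
        "openapi": "3.0.3",
        "info": {"title": f"{entity} API", "version": version},
        "paths": {
            f"/{entity.lower()}s": {
                "get": {
                    "summary": f"List {entity} records",
                    "responses": {"200": {"description": "OK"}},
                }
            }
        },
    }


spec = openapi_spec("Site", "1.0.0")
# This document is what gets pushed to Anypoint to create or update the proxy.
```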

LeanIX — Enterprise Architecture Catalog

Role: Enterprise architecture catalog. Generated artifacts are registered as IT components in the EA landscape.
How roche-data uses it: Registers data products, APIs, and platform components in the LeanIX catalog via REST API.
Business value: Enterprise architects have real-time visibility into the data platform’s footprint without manual documentation.

Data Marketplace — Self-Service Discovery

Role: Data product index for self-service discovery across Roche. Business users browse and subscribe to data products.
How roche-data uses it: Publishes data product metadata (contract, quality scores, access instructions) to the marketplace.
Business value: Business users find and request access to data products through a self-service portal — no tickets, no waiting for someone to point them to the right table.

| System | Access status | Current mode |
| --- | --- | --- |
| RTiS | Pending (A01) | Stub client with fixture data |
| GUPRI | Pending (A02) | Stub client with fixture data |
| MRHub | Pending (A03, A04) | Stub client with fixture data |
| Snowflake | Pending (A05, A06) | Stub client with fixture data |
| Collibra | Pending (A07, A08) | Stub client with fixture data |
| Mulesoft | Pending (A09) | Stub client with fixture data |
| ServiceNow | Pending (A12) | Stub client with fixture data |
| CaaS (Kubernetes) | Pending (A13) | Stub client with fixture data |
| GitHub Actions | Active | CI/CD operational |
| LeanIX | Pending (A14) | Stretch goal |
| Data Marketplace | Pending (A15) | Stretch goal |

Design principle: The entire pipeline runs in --dry-run mode using stub clients that return realistic fixture data. This means development and testing proceed at full speed while platform access requests are being processed. When access is granted, swapping StubClient for HttpClient is a configuration change — not a code change.
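The stub-to-HTTP swap as a configuration change can be sketched with a small factory. The protocol, class names beyond StubClient/HttpClient, and the config key are assumptions about how the wiring might look.

```python
from typing import Protocol


class RTiSClient(Protocol):
    """Shared interface both client flavors satisfy (shape is illustrative)."""

    def get_entity(self, name: str) -> dict: ...


class StubRTiSClient:
    """Fixture data for --dry-run mode."""

    def get_entity(self, name: str) -> dict:
        return {"name": name, "version": "fixture"}


class HttpRTiSClient:
    """Real client, usable once the platform access request is granted."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url

    def get_entity(self, name: str) -> dict:
        raise NotImplementedError("requires platform access")


def make_client(config: dict) -> RTiSClient:
    # Swapping stub for HTTP is a configuration change, not a code change:
    # provide a URL and the factory returns the real client instead.
    if config.get("rtis_url"):
        return HttpRTiSClient(config["rtis_url"])
    return StubRTiSClient()


client = make_client({})  # no URL configured, so the pipeline runs on fixtures
```

Because every caller depends only on the protocol, nothing downstream changes when the real endpoint appears in configuration.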