Collibra
Collibra is Roche’s enterprise data governance platform. The roche-data pipeline integrates with it in both directions: the govern module pulls stewardship metadata (ownership, SLAs, classification, PII flags) at generation time, and the register module pushes lineage information back after deployment.
Access to Collibra is proxied through Mulesoft — there is no direct API access.
Connection Details
| Property | Value |
|---|---|
| URL | https://roche.collibra.com (via Mulesoft proxy) |
| Auth method | Client credentials (Mulesoft proxy headers) |
| Network | Via Mulesoft API proxy (no direct Collibra access) |
| Access task (read) | A07 |
| Access task (write) | A08 |
| GitHub issue (read) | #25 (closed — resolved) |
| GitHub issue (write) | #76 (open — blocked) |
Environment Variables
| Variable | Source | Description |
|---|---|---|
| `COLLIBRA_BASE_URL` | Vault `common/collibra` | Base URL for the Collibra API (via Mulesoft proxy) |
| `MULESOFT_MODEL_CLIENT_ID` | Vault `{env}/collibra` | OAuth client ID for Mulesoft proxy authentication |
| `MULESOFT_MODEL_CLIENT_SECRET` | Vault `{env}/collibra` | OAuth client secret for Mulesoft proxy authentication |
| `X-META-BRIDGE-KEY` | Vault `{env}/collibra` | Value of the additional `X-META-BRIDGE-KEY` header for the Mulesoft bridge (referenced as `$X_META_BRIDGE_KEY` in the curl example) |
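A minimal sketch of how a module might load these settings at startup, assuming the Vault values have already been exported into the process environment. The `load_collibra_settings` helper and the `CollibraSettings` type are hypothetical names, and the bridge key is read from `X_META_BRIDGE_KEY`, the underscore spelling used in the curl example on this page.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Names taken from the table above. X_META_BRIDGE_KEY is the underscore
# spelling used by the curl example (hyphens are not valid in shell variable names).
REQUIRED_VARS = (
    "COLLIBRA_BASE_URL",
    "MULESOFT_MODEL_CLIENT_ID",
    "MULESOFT_MODEL_CLIENT_SECRET",
    "X_META_BRIDGE_KEY",
)


@dataclass(frozen=True)
class CollibraSettings:  # hypothetical container, not part of the tool
    base_url: str
    client_id: str
    client_secret: str
    bridge_key: str


def load_collibra_settings(env: Mapping[str, str] = os.environ) -> CollibraSettings:
    """Read the four variables and fail early, listing everything that is missing."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing Collibra settings: " + ", ".join(missing))
    return CollibraSettings(
        base_url=env["COLLIBRA_BASE_URL"].rstrip("/"),  # avoid double slashes later
        client_id=env["MULESOFT_MODEL_CLIENT_ID"],
        client_secret=env["MULESOFT_MODEL_CLIENT_SECRET"],
        bridge_key=env["X_META_BRIDGE_KEY"],
    )
```

Collecting all missing names before raising means a misconfigured environment is reported in one pass rather than one variable at a time.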
CLI Modules
| Module | Direction | Usage |
|---|---|---|
| `rdt-model-govern` | Pull (read) | Fetches governance metadata (ownership, SLAs, classification, PII flags) and writes `governance.json` |
| `rdt-model-register` | Push (write) | Pushes lineage records and quality gate results after deployment |
Data Flow
Access Verification
Script: scripts/access/check-collibra.sh
Required tools: curl, jq
Required env vars: MULESOFT_MODEL_CLIENT_ID, MULESOFT_MODEL_CLIENT_SECRET
Checks performed:
- Mulesoft proxy reachability (`GET /rest/2.0/communities`)
- Client credential validation
- Community data retrieval (community count)
- Asset search (`GET /rest/2.0/assets`) for governance metadata read
- Write access status (currently reports blocked; A08 pending)
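The check sequence above can be outlined in Python as follows. This is not the contents of `scripts/access/check-collibra.sh`; it is an illustrative sketch in which `run_checks`, the `fetch` callable, and the result labels are hypothetical. It assumes the communities endpoint returns a JSON object with a `total` count, treats a status of `0` as "no HTTP response", and only reports the write path as blocked rather than probing it.

```python
from typing import Callable, Dict, List, Tuple

# fetch(path) -> (HTTP status, decoded JSON body). Hypothetical signature so the
# sketch can run against a fake instead of the real Mulesoft proxy.
Fetch = Callable[[str], Tuple[int, Dict]]


def run_checks(fetch: Fetch) -> List[Tuple[str, str]]:
    """Run the read checks in order and return (check name, outcome) pairs."""
    results: List[Tuple[str, str]] = []

    status, body = fetch("/rest/2.0/communities")
    reachable = status != 0  # any HTTP answer means the proxy responded
    results.append(("proxy reachability", "ok" if reachable else "fail"))

    # 401/403 indicate that the client credentials were rejected.
    creds_ok = reachable and status not in (401, 403)
    results.append(("client credentials", "ok" if creds_ok else "fail"))

    # Assumes a paged response carrying a "total" field.
    count = body.get("total") if status == 200 else None
    results.append(("community count", str(count) if count is not None else "fail"))

    status, _ = fetch("/rest/2.0/assets")
    results.append(("asset search", "ok" if status == 200 else "fail"))

    # Write access is not probed; it is reported as blocked while A08 is pending.
    results.append(("write access", "blocked (A08 pending)"))
    return results
```

Injecting `fetch` keeps the ordering and reporting logic testable without network access to the proxy.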
Authentication
All Collibra API calls go through the Mulesoft proxy with client credentials:
```bash
curl -s "https://roche.collibra.com/rest/2.0/communities" \
  -H "client_id: $MULESOFT_MODEL_CLIENT_ID" \
  -H "client_secret: $MULESOFT_MODEL_CLIENT_SECRET" \
  -H "X-META-BRIDGE-KEY: $X_META_BRIDGE_KEY"
```
API Endpoints Used
| Method | Path | Module | Purpose |
|---|---|---|---|
| `GET` | `/rest/2.0/communities` | govern | List available governance communities |
| `GET` | `/rest/2.0/assets` | govern | Search for entity assets by name/type |
| `GET` | `/rest/2.0/assets/{id}/attributes` | govern | Get governance attributes (SLA, PII, etc.) |
| `POST` | `/rest/2.0/assets` | register | Create lineage assets (blocked pending A08) |
| `POST` | `/rest/2.0/relations` | register | Create lineage relationships (blocked pending A08) |
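As an illustration of how the read endpoints combine with the proxy headers from the Authentication section, the sketch below builds (but does not send) the requests with Python's standard library. `proxy_headers`, `attributes_path`, and `build_request` are hypothetical helpers, the `name` query parameter on the asset search is an assumption about the Collibra REST API, and the credentials shown are placeholders.

```python
from typing import Dict, Mapping, Optional
from urllib.parse import quote, urlencode
from urllib.request import Request


def proxy_headers(client_id: str, client_secret: str, bridge_key: str) -> Dict[str, str]:
    """Headers sent on every call through the Mulesoft proxy."""
    return {
        "client_id": client_id,
        "client_secret": client_secret,
        "X-META-BRIDGE-KEY": bridge_key,
    }


def attributes_path(asset_id: str) -> str:
    """Fill the {id} placeholder of the attributes endpoint, URL-encoding the id."""
    return f"/rest/2.0/assets/{quote(asset_id, safe='')}/attributes"


def build_request(
    base_url: str,
    path: str,
    headers: Mapping[str, str],
    params: Optional[Mapping[str, str]] = None,
) -> Request:
    """Build a GET request for one of the read endpoints without sending it."""
    url = base_url.rstrip("/") + path
    if params:
        url += "?" + urlencode(params)
    return Request(url, headers=dict(headers), method="GET")


headers = proxy_headers("placeholder-id", "placeholder-secret", "placeholder-key")
search = build_request(
    "https://roche.collibra.com", "/rest/2.0/assets", headers, {"name": "customer"}
)
```

Passing `search` to `urllib.request.urlopen` would send it; that step is left out here because it requires network access to the proxy.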
Governance Metadata Retrieved
The govern module pulls the following fields and writes them to `models/{entity}/governance.json`:
| Field | Collibra source | Used by |
|---|---|---|
| Data steward | Asset responsibility | Contract, docs |
| Data owner | Asset responsibility | Contract, docs |
| SLA | Custom attribute | Contract, policy |
| Classification | Custom attribute | Policy (access domain) |
| PII flags | Custom attribute | Policy (access domain), Silver enrichment |
| Quality score | Custom attribute | Gold view predicates |
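To make the mapping concrete, here is one possible shape for the file. The JSON schema is not documented on this page, so every key name and sample value below is an assumption chosen to mirror the table, and `write_governance` is a hypothetical helper rather than the govern module's actual code.

```python
import json
from pathlib import Path
from typing import Any, Dict


def write_governance(root: Path, entity: str, metadata: Dict[str, Any]) -> Path:
    """Write metadata to <root>/models/<entity>/governance.json and return the path."""
    target = root / "models" / entity / "governance.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    # Sorted keys keep the file stable across runs, which helps code review diffs.
    target.write_text(
        json.dumps(metadata, indent=2, sort_keys=True) + "\n", encoding="utf-8"
    )
    return target


# Illustrative values only; real values would come from Collibra.
example = {
    "data_steward": "steward@example.com",
    "data_owner": "owner@example.com",
    "sla": "daily refresh",
    "classification": "internal",
    "pii_flags": {"email": True, "country": False},
    "quality_score": 0.97,
}
```

Calling `write_governance(Path("."), "customer", example)` would create `models/customer/governance.json` under the current directory.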
Current Status
- Read path: Fully implemented (`HttpCollibraClient` via Mulesoft proxy, PR #114)
- Write path: Blocked pending write-access service account (A08)
- Workaround: `StubCollibraClient` returns fixture governance data for all entities
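The stub workaround can be pictured as an alternative implementation of the interface that the HTTP client also satisfies. Only the class names `HttpCollibraClient` and `StubCollibraClient` come from this page; the `CollibraClient` protocol, the `get_governance` method, the fixture contents, and the `describe` function are assumptions made for illustration.

```python
from typing import Any, Dict, Protocol


class CollibraClient(Protocol):  # hypothetical shared interface
    def get_governance(self, entity: str) -> Dict[str, Any]: ...


class StubCollibraClient:
    """Returns the same fixture for every entity, without any network access."""

    def __init__(self, fixture: Dict[str, Any]) -> None:
        self._fixture = fixture

    def get_governance(self, entity: str) -> Dict[str, Any]:
        # Build a new top-level dict per call, tagged with the requested entity.
        return {"entity": entity, **self._fixture}


def describe(client: CollibraClient, entity: str) -> str:
    """Code written against the protocol works with a stub or an HTTP client."""
    meta = client.get_governance(entity)
    return f"{meta['entity']}: classification={meta['classification']}"


stub = StubCollibraClient({"classification": "internal", "pii_flags": {}})
summary = describe(stub, "orders")
```

Because callers depend only on the protocol in this sketch, swapping the stub for a real client would not require changes to downstream code.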