rdt-model-infer
Purpose
rdt-model-infer uses a large language model to suggest enrichments when RTiS data is lacking. Given the table DDL and sample data, it can generate:
- Term mappings — Suggested RTiS terminology associations for unmapped columns
- Descriptions — Human-readable descriptions for fields lacking documentation
- DQ rules — Data quality rule suggestions based on observed data patterns
Suggestions are never auto-applied — they are written to a review file for human approval.
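The review step can be pictured as a gate that lets only human-approved suggestions through. The following is a minimal sketch under stated assumptions, not the module's actual data model: the `Suggestion` struct, its fields, and the `ReviewStatus` enum are hypothetical names introduced for illustration, and the real review file may record approval differently.

```rust
/// The three kinds of enrichment listed above (variant names are illustrative).
#[derive(Debug, Clone, PartialEq)]
enum SuggestionKind {
    TermMapping,
    Description,
    DqRule,
}

/// Hypothetical review state; the real review file may not track this field.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ReviewStatus {
    Pending,
    Approved,
    Rejected,
}

/// Hypothetical shape of one LLM suggestion awaiting review.
#[derive(Debug, Clone, PartialEq)]
struct Suggestion {
    kind: SuggestionKind,
    column: String,
    value: String,
    status: ReviewStatus,
}

/// Return only the suggestions a human reviewer has explicitly approved.
/// Pending and rejected entries are never passed on.
fn approved(suggestions: &[Suggestion]) -> Vec<&Suggestion> {
    suggestions
        .iter()
        .filter(|s| s.status == ReviewStatus::Approved)
        .collect()
}
```

With this shape, anything the model proposes starts as `Pending`, so nothing reaches the data product until a reviewer changes its status.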
Pipeline phase
Phase 2: Enrich (optional)
The pipeline produces a complete data product without this module. It adds enrichment where the source model has gaps.
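One way to read "gaps" is as columns that lack a term mapping or a description. The sketch below illustrates that reading only. The `Column` and `Gap` types and the `find_gaps` function are hypothetical and do not reflect the actual layout of the source model.

```rust
/// Hypothetical view of one column in the source model.
#[derive(Debug, Clone, PartialEq)]
struct Column {
    name: String,
    term: Option<String>,        // RTiS term mapping, if one exists
    description: Option<String>, // field documentation, if any
}

/// What is missing for a given column.
#[derive(Debug, Clone, PartialEq)]
struct Gap {
    column: String,
    needs_term: bool,
    needs_description: bool,
}

/// Treat an absent or whitespace-only value as missing.
fn is_blank(field: &Option<String>) -> bool {
    field.as_deref().map_or(true, |s| s.trim().is_empty())
}

/// List the columns that would benefit from enrichment.
/// Fully documented columns are left out of the result.
fn find_gaps(columns: &[Column]) -> Vec<Gap> {
    columns
        .iter()
        .map(|c| Gap {
            column: c.name.clone(),
            needs_term: is_blank(&c.term),
            needs_description: is_blank(&c.description),
        })
        .filter(|g| g.needs_term || g.needs_description)
        .collect()
}
```

An empty result corresponds to the case where the module has nothing to add and the pipeline output is already complete.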
    # Generate LLM suggestions
    cargo run -p rdt-model-infer -- --target dev infer --entity waste-tracking

    # Infer only specific targets
    cargo run -p rdt-model-infer -- --target dev infer --entity waste-tracking --targets terminology,descriptions
Configuration
| Key | Source | Description |
|---|---|---|
| llm.provider | roche-data.toml | LLM provider (anthropic, openai) |
| LLM_API_KEY | Environment variable | LLM API authentication key |
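Both settings can be checked before any request is sent. The following sketch assumes the two provider values in the table are the only valid ones and that a blank key should count as missing. The function names and the `Provider` enum are hypothetical, and the module's real validation may differ.

```rust
/// Providers named in the configuration table.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Provider {
    Anthropic,
    OpenAi,
}

/// Map the `llm.provider` string to a known provider.
/// Assumption: matching is exact and lowercase, as written in the table.
fn parse_provider(value: &str) -> Result<Provider, String> {
    match value {
        "anthropic" => Ok(Provider::Anthropic),
        "openai" => Ok(Provider::OpenAi),
        other => Err(format!("unsupported llm.provider: {}", other)),
    }
}

/// Reject a missing or whitespace-only key.
fn validate_api_key(raw: Option<String>) -> Result<String, String> {
    match raw {
        Some(key) if !key.trim().is_empty() => Ok(key),
        Some(_) => Err("LLM_API_KEY is set but empty".to_string()),
        None => Err("LLM_API_KEY is not set".to_string()),
    }
}

/// Read the key from the environment and validate it.
fn api_key_from_env() -> Result<String, String> {
    validate_api_key(std::env::var("LLM_API_KEY").ok())
}
```

Keeping `validate_api_key` separate from the environment lookup lets the check be exercised without touching process-wide state.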
Dependencies
rdt-model-pull (model.json)
Output
| File | Format | Description |
|---|---|---|
| infer/suggestions.json | JSON | LLM suggestions; always human-reviewable, never auto-applied |