rdt-model-infer

Purpose

rdt-model-infer uses a large language model (LLM) to suggest enrichments where RTiS data is missing. Given a table's DDL and sample data, it can generate:

Term mappings — Suggested RTiS terminology associations for unmapped columns
Descriptions — Human-readable descriptions for fields lacking documentation
DQ rules — Data quality rule suggestions based on observed data patterns

Suggestions are never auto-applied — they are written to a review file for human approval.
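The review-before-apply rule can be pictured with a small Rust sketch. Everything in it is an assumption made for illustration: the type names, the `status` field, and the sample column names are hypothetical and do not come from the crate. The only point it makes is that a suggestion starts out unreviewed and takes effect only after an explicit approval.

```rust
/// Kind of enrichment a suggestion carries (mirrors the three categories above).
#[derive(Debug, Clone, PartialEq)]
pub enum SuggestionKind {
    TermMapping,
    Description,
    DqRule,
}

/// Review state. Hypothetical: the real review file may track approval differently.
#[derive(Debug, Clone, PartialEq)]
pub enum ReviewStatus {
    Pending,
    Approved,
    Rejected,
}

/// One LLM suggestion for one column (illustrative shape only).
#[derive(Debug, Clone, PartialEq)]
pub struct Suggestion {
    pub column: String,
    pub kind: SuggestionKind,
    pub value: String,
    pub status: ReviewStatus,
}

impl Suggestion {
    /// New suggestions always start as Pending, so nothing is applied by default.
    pub fn new(column: &str, kind: SuggestionKind, value: &str) -> Self {
        Suggestion {
            column: column.to_string(),
            kind,
            value: value.to_string(),
            status: ReviewStatus::Pending,
        }
    }
}

/// Return only the suggestions a reviewer has explicitly approved.
pub fn approved(suggestions: &[Suggestion]) -> Vec<&Suggestion> {
    suggestions
        .iter()
        .filter(|s| s.status == ReviewStatus::Approved)
        .collect()
}
```

In this sketch a downstream step would call `approved(...)` and ignore anything still `Pending` or `Rejected`, which keeps the human decision as the only path to applying a suggestion.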

Pipeline phase

Phase 2 — Enrich (optional)

This module is optional: the pipeline produces a complete data product without it. When run, it adds enrichment where the source model has gaps.

Usage

# Generate LLM suggestions
cargo run -p rdt-model-infer -- --target dev infer --entity waste-tracking

# Restrict inference to specific suggestion types with --targets
cargo run -p rdt-model-infer -- --target dev infer --entity waste-tracking --targets terminology,descriptions
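As a rough illustration of how a comma-separated `--targets` value might be turned into typed values, here is a minimal Rust sketch. The `InferTarget` enum and `parse_targets` function are hypothetical names, not the crate's actual code. Only the two values shown in the command above (`terminology`, `descriptions`) are handled, because the spelling of a DQ-rules value is not given here.

```rust
use std::str::FromStr;

/// Suggestion types that can be requested (hypothetical enum).
/// Only the two values shown in the usage example are modelled.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InferTarget {
    Terminology,
    Descriptions,
}

impl FromStr for InferTarget {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim() {
            "terminology" => Ok(InferTarget::Terminology),
            "descriptions" => Ok(InferTarget::Descriptions),
            other => Err(format!("unknown target: {}", other)),
        }
    }
}

/// Split a comma-separated list, skip empty entries, drop duplicates,
/// and fail on the first unknown value.
pub fn parse_targets(raw: &str) -> Result<Vec<InferTarget>, String> {
    let mut out = Vec::new();
    for part in raw.split(',').filter(|p| !p.trim().is_empty()) {
        let target: InferTarget = part.parse()?;
        if !out.contains(&target) {
            out.push(target);
        }
    }
    Ok(out)
}
```

Rejecting unknown values, rather than ignoring them, means a typo in `--targets` fails loudly instead of silently skipping a suggestion type. Whether the real CLI behaves this way is not stated here.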

Configuration

| Key | Source | Description |
| --- | --- | --- |
| `llm.provider` | `roche-data.toml` | LLM provider (anthropic, openai) |
| `LLM_API_KEY` | Environment variable | LLM API authentication key |
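The two settings above could be validated together before any request is made. The following Rust sketch assumes the caller has already read `llm.provider` from `roche-data.toml` and looked up `LLM_API_KEY` in the environment. `Provider`, `LlmConfig`, and `resolve_config` are hypothetical names used only for illustration.

```rust
/// Providers listed in the configuration table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Provider {
    Anthropic,
    OpenAi,
}

/// Map the `llm.provider` string onto a known provider.
pub fn parse_provider(name: &str) -> Result<Provider, String> {
    match name.trim() {
        "anthropic" => Ok(Provider::Anthropic),
        "openai" => Ok(Provider::OpenAi),
        other => Err(format!("unsupported llm.provider: {}", other)),
    }
}

/// Resolved settings (hypothetical struct, not the crate's own type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct LlmConfig {
    pub provider: Provider,
    pub api_key: String,
}

/// Combine the provider name with the API key lookup result.
/// A missing or blank key is treated as an error.
pub fn resolve_config(provider_name: &str, api_key: Option<String>) -> Result<LlmConfig, String> {
    let provider = parse_provider(provider_name)?;
    let api_key = api_key
        .filter(|k| !k.trim().is_empty())
        .ok_or_else(|| "LLM_API_KEY is not set".to_string())?;
    Ok(LlmConfig { provider, api_key })
}
```

A caller might pass `std::env::var("LLM_API_KEY").ok()` as the second argument. Taking the key as a parameter instead of reading the environment inside the function keeps the check easy to test.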

Dependencies

rdt-model-pull (provides model.json)

Output

| File | Format | Description |
| --- | --- | --- |
| `infer/suggestions.json` | JSON | LLM suggestions; always human-reviewable, never auto-applied |