DBSprout’s --engine spec sends column names and types to an LLM to produce a DataSpec. By default it uses the embedded Qwen2.5-1.5B model, which runs entirely on your machine. This recipe shows how to pin DBSprout explicitly to a local model, either the embedded one or your own Ollama instance, and explains the privacy guarantees each mode provides.
Problem: you want the semantic accuracy of --engine spec, but your organisation's policy prohibits sending schema metadata to any external API, or you simply have no internet connection.
Understanding the privacy gradient
DBSprout has three privacy tiers:
| Tier | What can leave the machine | Requires |
|---|---|---|
| local | Nothing | Default (no config needed) |
| redacted | Column names and types only (no values) | Cloud LLM + API key |
| cloud | Column names, types, and sample values | Cloud LLM + API key + explicit opt-in |
Running with --privacy local (the default) locks the tier to local and refuses to use any provider that would send data off the machine: you get a hard error rather than a silent leak.
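The hard-error behaviour can be pictured as a simple gate in front of provider selection. The following is a minimal sketch for illustration only, not DBSprout's actual implementation: `PrivacyError`, `check_provider`, the `ON_MACHINE_PROVIDERS` set, and the placeholder provider name are all hypothetical.

```python
# Hypothetical sketch of a privacy-tier gate; not DBSprout's actual code.

# Assumption: these are the providers that keep all data on the machine.
ON_MACHINE_PROVIDERS = {"embedded", "ollama"}


class PrivacyError(RuntimeError):
    """Raised when a provider would violate the configured privacy tier."""


def check_provider(tier: str, provider: str) -> str:
    """Return the provider if the tier allows it, otherwise raise."""
    if tier == "local" and provider not in ON_MACHINE_PROVIDERS:
        raise PrivacyError(
            f"provider {provider!r} is not allowed under privacy tier 'local'"
        )
    return provider


print(check_provider("local", "embedded"))  # allowed under the local tier

try:
    check_provider("local", "some-cloud-provider")  # placeholder name
except PrivacyError as exc:
    print(f"refused: {exc}")
```

The point of failing loudly is that a misconfigured provider stops the run instead of quietly falling back to a networked one.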
Prerequisites
- dbsprout installed
- For the embedded model: nothing extra (it ships inside the package)
- For Ollama: Ollama installed and running, with at least one model pulled (e.g. ollama pull mistral)
Steps
Option A: Embedded Qwen2.5-1.5B (zero config)
1. Generate — the embedded model is the default
dbsprout generate --engine spec --privacy local
The --privacy local flag is technically redundant (it is already the default), but it makes your intent explicit and keeps the command safe if the default ever changes.
2. Confirm which model ran
dbsprout generate --engine spec --privacy local --verbose
Output will include:
LLM provider : embedded (qwen2.5-1.5b-instruct-q4_k_m.gguf)
Privacy tier : local
Network calls : 0
Spec cached : .dbsprout/cache/spec_a3f9c12.json
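If you want to assert on this summary in a script or CI job, the lines can be parsed as key/value pairs. This is a rough sketch that assumes the --verbose summary keeps the exact "key : value" layout shown above; `parse_summary` and `is_fully_local` are hypothetical helper names, not part of DBSprout.

```python
# Sketch: parse the "key : value" summary lines shown above.
# Assumption: the --verbose summary keeps this layout between releases.

SAMPLE_OUTPUT = """\
LLM provider : embedded (qwen2.5-1.5b-instruct-q4_k_m.gguf)
Privacy tier : local
Network calls : 0
Spec cached : .dbsprout/cache/spec_a3f9c12.json
"""


def parse_summary(text: str) -> dict[str, str]:
    """Split each line on the first ' : ' into a key/value pair."""
    summary = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:  # skip lines that are not part of the summary
            summary[key.strip()] = value.strip()
    return summary


def is_fully_local(summary: dict[str, str]) -> bool:
    """True when the tier is local and no network calls were reported."""
    return (
        summary.get("Privacy tier") == "local"
        and summary.get("Network calls") == "0"
    )


summary = parse_summary(SAMPLE_OUTPUT)
print(summary["LLM provider"])
print(is_fully_local(summary))
```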
Option B: Your own Ollama instance
1. Start Ollama and pull a model
ollama serve # if not already running as a service
ollama pull mistral # or any model you prefer
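Before pointing DBSprout at Ollama, it can help to confirm that the server is up and the model is actually pulled. The sketch below queries Ollama's /api/tags endpoint, which lists locally available models, and assumes the default address http://localhost:11434. The helper names `has_model` and `fetch_tags` are illustrative and not part of DBSprout or Ollama.

```python
import json
import urllib.request

# Ollama's default local address; adjust if your instance listens elsewhere.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"


def has_model(tags_payload: dict, wanted: str) -> bool:
    """Check whether `wanted` appears in an /api/tags response."""
    # Ollama resolves a name without a tag to "<name>:latest".
    target = wanted if ":" in wanted else f"{wanted}:latest"
    return any(
        entry.get("name") == target
        for entry in tags_payload.get("models", [])
    )


def fetch_tags(url: str = OLLAMA_TAGS_URL, timeout: float = 2.0) -> dict:
    """Fetch the model list from a running Ollama server."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Offline demonstration with a hand-written payload in the /api/tags shape.
example = {"models": [{"name": "mistral:latest"}]}
print(has_model(example, "mistral"))
```

With Ollama running, `has_model(fetch_tags(), "mistral")` performs the same check against the live server; `fetch_tags` raises an `OSError` subclass if nothing is listening on that port.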
2. Configure DBSprout to use Ollama
Either pass flags:
dbsprout generate --engine spec \
--llm-provider ollama \
--llm-model mistral \
--privacy local
Or persist in dbsprout.toml:
[llm]
provider = "ollama"
model = "mistral"
base_url = "http://localhost:11434"
[privacy]
tier = "local"
Then just run:
dbsprout generate --engine spec
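Because Ollama is reached over HTTP, the local tier only means what you expect if base_url points at your own machine. As an independent sanity check, here is a small standard-library sketch that tests whether a URL resolves to a loopback host. `is_loopback_url` is a hypothetical helper, and the sketch makes no claim about whether DBSprout performs a similar check itself.

```python
import ipaddress
from urllib.parse import urlparse


def is_loopback_url(url: str) -> bool:
    """Return True if the URL's host is localhost or a loopback IP."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # A hostname other than "localhost": treat it as not local.
        return False


# Value copied from the [llm] section of the dbsprout.toml shown above.
base_url = "http://localhost:11434"
print(base_url, "->", "loopback" if is_loopback_url(base_url) else "NOT loopback")
```

The check is deliberately conservative: any hostname other than localhost is reported as non-local, even if it happens to resolve to this machine.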
3. Verify no external calls
dbsprout generate --engine spec --dry-run
--dry-run produces the spec without generating rows and prints the provider summary. If you see any provider other than embedded or ollama, double-check your dbsprout.toml.
Invalidating the cache after a model change
The spec is keyed on schema hash + provider. Switching from embedded to Ollama (or vice versa) generates a fresh spec automatically — no manual cache clear needed.
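To see why a provider switch misses the cache, consider a key derived from both inputs. The sketch below is purely illustrative: the hash function, the way the two parts are combined, and the helper names are assumptions. Only the spec_<hex>.json file name pattern is taken from the sample output earlier in this recipe.

```python
import hashlib
import json


def spec_cache_key(schema: dict, provider: str) -> str:
    """Derive a key from the schema hash plus the provider name."""
    # Canonical JSON so that dict key order does not change the hash.
    canonical = json.dumps(schema, sort_keys=True).encode("utf-8")
    schema_hash = hashlib.sha256(canonical).hexdigest()
    combined = f"{schema_hash}:{provider}".encode("utf-8")
    return hashlib.sha256(combined).hexdigest()


def spec_cache_name(schema: dict, provider: str) -> str:
    """Shorten the key into a file name like the one in the sample output."""
    return f"spec_{spec_cache_key(schema, provider)[:7]}.json"


# A made-up schema, used only to exercise the functions.
schema = {"users": {"id": "INTEGER", "email": "TEXT"}}

print(spec_cache_name(schema, "embedded"))
print(spec_cache_name(schema, "ollama"))
```

Under this scheme, the same schema with a different provider yields a different key, so the old spec is never reused and no manual clearing is needed.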
To force regeneration without changing the provider:
dbsprout generate --engine spec --refresh-spec
Result
Column names and types stay on your machine — the LLM sees only schema metadata, never row values. The spec is generated once, cached, and reused on every subsequent run. You get semantically rich values for domain-specific columns with zero network exposure.
For the full privacy model, see /docs/core-concepts. For provider configuration details, see /docs/guides/llm-configuration.