Problem: every CI run needs the same realistic database, fast, with no network access.
Solution:
pip install dbsprout
dbsprout init --file schema.sql
dbsprout generate --seed 42 --rows 200
dbsprout validate --format json --output validate.json
Commit dbsprout.toml and the .dbsprout/ snapshot so every machine
regenerates byte-identical fixtures. Example GitHub Actions step:
- run: |
pip install dbsprout
dbsprout generate --seed 42 --rows 200
dbsprout validate --format json --output validate.json
Why it works: the default heuristic engine needs no model or network
and runs at 100K+ rows/sec. A fixed --seed makes output identical across
runs and machines; validate --format json gives CI a machine-readable pass/
fail.