CLI Reference

VoxKitchen provides the vkit command-line tool.

Commands

`vkit init`

Scaffold a new pipeline project.

vkit init my-project                        # Empty template
vkit init my-project --template tts         # TTS data preparation
vkit init my-project --template asr         # ASR training data
vkit init my-project --template cleaning    # Data cleaning
vkit init my-project --template speaker     # Speaker analysis
vkit init --list-templates                  # Show available templates

Flag	Meaning
`--template`, `-t`	Project template (`tts`, `asr`, `cleaning`, `speaker`).
`--list-templates`	Print templates and exit.

`vkit run` (container entrypoint)

Execute a pipeline in the current Python environment. This is the container entrypoint used by VoxKitchen images. Most host users should use vkit docker run <yaml>, which supplies the runtime image and Docker mounts.

# Host usage; these flags are forwarded to the image entrypoint:
vkit docker run pipeline.yaml                           # Run full pipeline
vkit docker run pipeline.yaml --dry-run                 # Validate only
vkit docker run pipeline.yaml --resume-from vad         # Resume from a stage
vkit docker run pipeline.yaml --stop-at asr             # Stop after a stage
vkit docker run pipeline.yaml --keep-intermediates      # Keep derived audio
vkit docker run pipeline.yaml --num-gpus 2              # Override GPU count
vkit docker run pipeline.yaml --num-workers 8           # Override CPU workers
vkit docker run pipeline.yaml --work-dir ./work/run1    # Override work_dir

Flag	Meaning
`--dry-run`	Parse + validate the pipeline, resolve the stage plan, exit without executing.
`--resume-from STAGE`	Force-resume from `STAGE` regardless of existing checkpoints.
`--stop-at STAGE`	Stop after `STAGE` completes.
`--keep-intermediates`	Disable GC; keep every stage's derived audio on disk.
`--num-gpus N`	Override the pipeline YAML's `num_gpus`.
`--num-workers N`	Override `num_cpu_workers`.
`--work-dir PATH`	Override the pipeline YAML's `work_dir`.

`vkit validate`

Check YAML syntax, operator references, per-operator arg schemas, and the recommended Docker image without executing.

vkit validate pipeline.yaml

`vkit download` (current-env helper)

Download a dataset using its recipe in the current environment. For the Docker-first user path, use vkit docker download so recipe-specific dependencies come from the image.

vkit docker download --tag slim librispeech --root ./data/librispeech --subsets dev-clean,test-clean
vkit docker download --tag slim aishell --root ./data/aishell
vkit docker download --tag slim fleurs --root ./data/fleurs --subsets en_us,zh_cn

Flag	Meaning
`--root PATH`	Directory to download into (required).
`--subsets LIST`	Comma-separated subset names. Recipe-specific; see Recipes & Download.

`vkit ingest`

Build a CutSet manifest from a data source — standalone, outside a pipeline. Most users let vkit docker run do this through the pipeline ingest block; vkit ingest is useful for one-off manifest prep.

vkit ingest --source dir      --root ./data/audio       --out cuts.jsonl.gz
vkit ingest --source recipe   --recipe librispeech      --root ./data/librispeech --out cuts.jsonl.gz
vkit ingest --source manifest --path input.jsonl.gz --out merged.jsonl.gz
vkit ingest --source dir      --root ./data/audio       --out cuts.jsonl.gz --no-recursive

Flag	Source	Meaning
`--source`	all	`dir`, `manifest`, or `recipe`.
`--out`	all	Output `cuts.jsonl.gz` path (required).
`--root`	`dir`, `recipe`	Dataset root directory.
`--path`	`manifest`	Path to an input `cuts.jsonl.gz`.
`--recipe`	`recipe`	Recipe name (`librispeech`, `aishell`, `commonvoice`, `fleurs`).
`--subsets`	`recipe`	Comma-separated subset names.
`--recursive / --no-recursive`	`dir`	Recurse into subdirectories (default: recurse).

`vkit inspect`

Inspect pipeline results and cut data. Four subcommands:

vkit inspect cuts   work/01_pack/cuts.jsonl.gz    # CutSet statistics
vkit inspect run    work/                          # Per-stage summary + timing
vkit inspect trace  utt-001 --in work/             # Provenance chain for one cut
vkit inspect errors work/                          # Per-stage error entries

Subcommand	Argument	Purpose
`cuts <path>`	A `cuts.jsonl.gz`	Print CutSet-level stats (duration, sample rate, metric histograms).
`run <work_dir>`	Pipeline work dir	Table of stage name, cut count, duration, success marker.
`trace <cut_id> --in <work_dir>`	Cut id + work dir	Walk `Provenance` parent links to show where a cut came from.
`errors <work_dir>`	Pipeline work dir	Dump per-stage `_errors.jsonl` entries (cuts that failed).

`vkit operators`

List and inspect operators.

vkit operators                       # List all operators (grouped by category)
vkit operators --category quality    # List only operators in one category
vkit operators search noise          # Find operators whose name or description matches "noise"
vkit operators show silero_vad       # Show config fields + YAML example for an operator

search matches case-insensitively against the operator name and the first line of its docstring. It exits with code 1 when nothing matches, so shell scripts can branch on no-result.

Valid --category values are: basic, segment, augment, annotate, quality, synthesize, pack, noop.

`vkit schema`

Generate JSON Schemas for YAML editor integration.

vkit schema export                                  # → ./pipeline.schema.json
vkit schema export --out docs/schemas/pipeline.schema.json   # custom path

The output is consumed by YAML language servers (VS Code, Neovim, JetBrains) so users get autocompletion and inline validation while editing pipeline.yaml. vkit init already writes the right # yaml-language-server: $schema=… directive at the top of every scaffolded pipeline. See Pipeline JSON Schema for editor setup.

`vkit recipes`

List dataset recipes (the entities behind vkit docker download and ingest: source=recipe).

vkit recipes

Output is a table with name, download mechanism (openslr, keithito, HuggingFace, or manual), compressed download size, and a one-line description. The Size column shows a single value for single-archive datasets and a range for multi-subset datasets (299 MB - 28.5 GB) so you can compare before downloading. Manual / HuggingFace recipes render as a dash.

To actually download, use vkit docker download --tag slim <name> --root ./data/<name>; to reference inside a pipeline, use ingest: { source: recipe, recipe: <name>, args: { root: <dir> } }. Recipe-specific subset names are listed in Recipes & Download.

`vkit datasets`

Browse the dataset catalog from the terminal — same data that's shown on the docs site dataset catalog, without leaving your shell. Three modes:

vkit datasets                                # all 60 entries as a table
vkit datasets --task asr --language zh       # filter
vkit datasets --recipe-only                  # only downloadable-via-VoxKitchen
vkit datasets --query libri                  # substring across id/name/summary
vkit datasets show librispeech               # full record (one entry)
vkit datasets search 'code-switch'           # substring search across all fields

Flag	Meaning
`--task`, `-t`	Filter by task tag (`asr` / `tts` / `speaker` / `multilingual` / `emotion` / `augmentation`).
`--language`, `-l`	Filter by ISO language code (`en` / `zh` / `multi` / `ja` / ...).
`--recipe-only`	Only show entries downloadable via `vkit docker download`.
`--query`, `-q`	Substring match (case-insensitive) across id / name / summary.

Filters compose with AND. The show subcommand prints a Rich panel with all fields (license, homepage, paper, recipe hint, recommended pipeline, summary, recommendation, notes); when the entry is recipe-backed the panel surfaces the exact vkit docker download invocation.

`vkit doctor`

Report per-env operator availability and warmup-model status.

vkit doctor                          # Single-env report (dev install or :slim image)
vkit doctor --expect core            # Assert core-env expected operators are importable
vkit doctor --expect asr             # Asserts for :asr image
vkit doctor --json                   # Machine-readable output on stdout

Flag	Meaning
`--expect ENV`	Image env to validate against (`core`, `asr`, `diarize`, `tts`, `fish-speech`). Exits non-zero if any expected operator fails to import. Used by the Dockerfile's per-stage smoke test.
`--json`	Emit a JSON report on stdout (rich table still goes to stderr).

Inside the voxkitchen:latest multi-env image, vkit doctor with no --expect aggregates a table across every env under /opt/voxkitchen/envs/, re-invoking each env's own vkit doctor --expect <env>.

`vkit docker`

Run any of the commands above inside a published Docker image instead of the local Python env. Also has three image-management helpers (build, pull, shell).

Image selection flags (run, download, doctor, pull, shell):

Flag	Default	Meaning
`--tag NAME`	`latest` (`download`: `slim`)	Image tag. Resolves to `ghcr.io/xqfeng-josie/voxkitchen:NAME`.
`--image REF`	—	Full image reference; overrides `--tag`.

`vkit docker run <yaml>`

Execute a pipeline inside the container.

vkit docker run pipeline.yaml                          # :latest
vkit docker run pipeline.yaml --tag asr                # :asr
vkit docker run pipeline.yaml --gpus none              # CPU-only
vkit docker run pipeline.yaml --dry-run                # Validate inside image
vkit docker run pipeline.yaml --env-file /tmp/.env     # Alternate env file
vkit docker run pipeline.yaml --mount /data/raw        # Extra read-only bind mount

Flag	Default	Meaning
`--gpus MODE`	`auto`	`auto` (attach all GPUs if `nvidia-smi` is on PATH), `all`, or `none`.
`--env-file PATH`	`./.env` if present	`docker --env-file` path (used for `HF_TOKEN`).
`--mount PATH`, `-m`	—	Extra host path to bind read-only. Repeatable.
`--dry-run`, `--resume-from`, `--stop-at`, `--num-gpus`, `--num-workers`, `--work-dir`, `--keep-intermediates`	—	Pipeline options forwarded to the image entrypoint.

The wrapper automatically:

Sets --user $(id -u):$(id -g) and -e HOME=/tmp so files in ./work are owned by the host user.
Sets NUMBA_CACHE_DIR=/app/work/.numba-cache so librosa/numba operators can cache under the mounted work directory.
Binds ./work → /app/work and ./output → /app/output; if ./data exists, binds it to both /app/data for template-relative YAML and /data for absolute data roots.
Binds the pipeline YAML at its absolute path when it points to a host file.

`vkit docker doctor`

Run vkit doctor inside the container.

vkit docker doctor                                    # :latest, multi-env aggregate
vkit docker doctor --tag slim                         # slim image, single-env
vkit docker doctor --tag asr --expect asr --json      # smoke test + JSON

Accepts --expect and --json (same semantics as local vkit doctor). Default --gpus is none (doctor doesn't need GPU).

`vkit docker download <recipe>`

Download a dataset inside the container. The wrapper creates and mounts ./data, so roots under ./data/... are written back to the host.

vkit docker download --tag slim librispeech --root ./data/librispeech --subsets dev-clean
vkit docker download --tag slim fleurs --root ./data/fleurs --subsets en_us,zh_cn

`vkit docker build [target]`

Build a local Docker image from docker/Dockerfile (wraps docker build).

vkit docker build                 # Default target: latest
vkit docker build slim
vkit docker build asr
vkit docker build latest --tag voxkitchen:dev
vkit docker build latest --no-hf-token                # Skip baking pyannote

Argument/Flag	Default	Meaning
`target`	`latest`	Dockerfile target: `slim`, `asr`, `diarize`, `tts`, `fish-speech`, `latest`.
`--tag NAME`	`voxkitchen:<target>`	Image tag to apply.
`--hf-token / --no-hf-token`	`--hf-token`	Pass `HF_TOKEN` from `./.env` as a build arg so pyannote is baked into the image.

Pass extra docker build flags after --:

vkit docker build latest -- --no-cache --progress=plain

By default the wrapper keeps Docker client temp/config/cache files under ./.docker (DOCKER_CONFIG, TMPDIR, BUILDX_CONFIG, XDG_CACHE_HOME). Set VKIT_DOCKER_WORK_DIR=/path/to/.docker to choose a different base directory. Docker image layers still live under the Docker daemon's data-root (often /var/lib/docker); move that daemon setting separately if / is full.

`vkit docker pull`

Pull a published image from GHCR.

vkit docker pull                      # :latest
vkit docker pull --tag slim
vkit docker pull --image my-registry/vox:custom

`vkit docker shell`

Drop into an interactive bash inside the image, useful for debugging.

vkit docker shell --tag slim
vkit docker shell --tag latest --gpus all

`vkit viz`

Launch an interactive Gradio panel to explore a CutSet.

vkit viz work/01_pack/cuts.jsonl.gz --port 7860

vkit viz is an optional local developer UI; it is separate from the Docker-first pipeline execution path.

`vkit card`

Generate a standalone, shareable HTML dataset card from a processed CutSet manifest. The card includes quality distributions, language/gender breakdowns, a metrics summary, and sample utterances — useful for dataset documentation and sharing.

vkit card work/01_pack/cuts.jsonl.gz
vkit card work/01_pack/cuts.jsonl.gz --out my_dataset_card.html
vkit card work/01_pack/cuts.jsonl.gz --out card.html --title "My Dataset" --description "ASR training set"
# Auto-fill from the dataset catalog (license/homepage/recommendation):
vkit card work/01_pack/cuts.jsonl.gz --catalog-id librispeech

Flag	Default	Meaning
`--out`, `-o`	`dataset_card.html`	Output HTML file path.
`--title`	`""`	Card title (shown at the top of the HTML).
`--description`	`""`	Short dataset description.
`--catalog-id`	(none)	Pre-fill title/description and a Source section (license, homepage, paper, recommendation) from the matching entry in `voxkitchen/datasets/catalog.yaml`. Explicit `--title`/`--description` still override.

Requires the viz extra. If Jinja2 is not installed, the command exits with a friendly message:

error: the dataset card needs the 'viz' extra. Install it with `pip install voxkitchen[viz]`.

CLI Reference

Commands

vkit init

vkit run (container entrypoint)

vkit validate

vkit download (current-env helper)

vkit ingest

vkit inspect

vkit operators

vkit schema

vkit recipes

vkit datasets

vkit doctor

vkit docker

vkit docker run <yaml>

vkit docker doctor

vkit docker download <recipe>

vkit docker build [target]

vkit docker pull

vkit docker shell

vkit viz

vkit card