How-To Guides
Task-oriented. Each page assumes you already know what you want to do — if you're looking for orientation, start with Tutorials.
TTS
- Switch TTS Backend — Fish ↔ Kokoro
- Clone a Voice — upload a reference clip and speak in that voice
- Run Without the Fish Sidecar — Kokoro-only, single container
LLM
- Use an External LLM — point at a gateway, a remote vLLM, or OpenAI
All-API setup
- Use LocalAI (all-API) — swap STT + TTS + LLM to OpenAI-compat endpoints; run protoVoice on a CPU box
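The all-API setup works because each component speaks the OpenAI-compatible HTTP API, so swapping backends is just a matter of changing a base URL. A minimal stdlib-only sketch of what such a request looks like — the endpoint URL and model name below are placeholders, not protoVoice defaults:

```python
import json
import urllib.request


def chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a chat-completion request for any OpenAI-compatible server.

    Repointing at LocalAI, a remote vLLM, or OpenAI itself is just a
    different base_url; the payload shape stays the same.
    """
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder endpoint: an OpenAI-compatible server on the same box.
req = chat_request("http://localhost:8080", "gpt-4o-mini", "ping")
print(req.full_url)
```

The same pattern covers the STT and TTS endpoints, each behind its own path on the gateway.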
Voice agent behaviour
- Configure Verbosity — tune filler chattiness (silent / brief / narrated / chatty)
- Backchannels — listener-acks ("mm-hmm") during long user turns
- Delivery Policies — `now` / `next_silence` / `when_asked` for async tool results
- Personas & Skills — swap voice + system prompt per skill YAML
- Users & API Keys — configure the auth roster (Infisical or `config/users.yaml`)
- Audio Handling — echo guard, half-duplex, noise filter, smart-turn
Extending the agent
- Build a Tool — sync vs async patterns, latency tiers, the result_callback gotcha
Fleet integration
- A2A Integration — inbound JSON-RPC + callback webhook + outbound dispatch
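Inbound A2A traffic is plain JSON-RPC, so the envelope shape is fixed by the JSON-RPC 2.0 spec even though the method name used below is hypothetical, not one of protoVoice's actual methods. A sketch of building a request and its matching response:

```python
import json
from itertools import count

_ids = count(1)  # monotonically increasing request ids


def jsonrpc_request(method: str, params: dict) -> str:
    """Serialise a JSON-RPC 2.0 request (spec fields: jsonrpc/method/params/id)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "method": method,
        "params": params,
        "id": next(_ids),
    })


def jsonrpc_result(request: str, result) -> str:
    """Build the matching success response; it must echo the request id."""
    req_id = json.loads(request)["id"]
    return json.dumps({"jsonrpc": "2.0", "result": result, "id": req_id})


# Hypothetical method name, for illustration only.
req = jsonrpc_request("agent.dispatch", {"task": "summarise"})
print(jsonrpc_result(req, {"status": "accepted"}))
```

Async results arriving later over the callback webhook follow the same id-matching rule, which is what lets the caller correlate them with the original dispatch.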
Ops
- Benchmarking — measure LLM / TTS / STT / A2A latency with `scripts/bench.py`
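`scripts/bench.py` is the project's own tool; as a generic illustration of the kind of wall-clock measurement involved, here is a stdlib-only sketch (the stage function is a stand-in, not a real protoVoice call):

```python
import statistics
import time


def bench(fn, *, runs: int = 5):
    """Time fn over several runs; return (median_ms, worst_ms).

    Median resists one-off outliers (cold caches, GC pauses); the worst
    case matters for interactive latency budgets.
    """
    samples = []
    for _ in range(runs):
        t0 = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - t0) * 1000.0)
    return statistics.median(samples), max(samples)


# Stand-in for one stage round trip (e.g. a single STT request).
def fake_stage():
    time.sleep(0.01)


median_ms, worst_ms = bench(fake_stage)
print(f"median={median_ms:.1f}ms worst={worst_ms:.1f}ms")
```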