
Environment Variables

All values have sensible defaults. Set via (in order of precedence):

  1. Shell env / docker compose environment: block — highest precedence
  2. A local .env file at the repo root (auto-loaded via python-dotenv at startup — gitignored; .env.example in the repo is the template)
  3. Built-in defaults in the code
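A minimal sketch of how the layers compose (variable values are illustrative): a value already exported in the shell wins, and the built-in default only applies when nothing else set the variable.

```shell
# Illustrative only: shell env (layer 1) beats the .env / built-in default (layers 2-3).
export PORT=9000          # came from the shell or compose environment: block
: "${PORT:=7866}"         # built-in default applies only if PORT was still unset
: "${VERBOSITY:=brief}"   # VERBOSITY was never set, so the default sticks
echo "$PORT $VERBOSITY"   # -> 9000 brief
```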

For deployed boxes, inject secrets via your secrets manager (Infisical, Vault, SOPS, k8s Secret + envFrom, etc.). The app reads os.environ — it doesn't care where values came from.

Server

| Variable | Default | Purpose |
| --- | --- | --- |
| PORT | 7866 | HTTP port the FastAPI/Pipecat server listens on |
| HF_HOME | /models | HuggingFace cache dir (inside the container) |
| MODEL_DIR | /models | Alias for HF_HOME — set one, both resolve |
| SYSTEM_PROMPT | (built-in) | Overrides the default voice-assistant system prompt |
| VERBOSITY | brief | Default filler verbosity: silent / brief / narrated / chatty |
| TZ | America/New_York | Timezone for the get_datetime tool |
| DELEGATES_YAML | config/delegates.yaml | Path to the delegate registry (A2A + OpenAI) |

LLM

| Variable | Default | Purpose |
| --- | --- | --- |
| START_VLLM | 1 | Set 0 to use an external endpoint |
| VLLM_PORT | 8100 | Port the built-in vLLM subprocess listens on (loopback only) |
| LLM_MODEL | Qwen/Qwen3.5-4B | Model to serve if START_VLLM=1 |
| LLM_URL | http://localhost:8100/v1 | OpenAI-compat endpoint |
| LLM_SERVED_NAME | local | Model name as served at LLM_URL |
| LLM_API_KEY | not-needed | Bearer token for LLM_URL |
| LLM_MAX_TOKENS | 150 | Token cap per response |
| LLM_TEMPERATURE | 0.7 | Sampling temperature |
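For example, a .env fragment that disables the built-in vLLM and points at an external OpenAI-compatible endpoint — host, served model name, and key below are placeholders, not real values:

```shell
# Placeholder values - adjust for your own endpoint.
START_VLLM=0
LLM_URL=http://my-llm-host:8000/v1
LLM_SERVED_NAME=my-served-model
LLM_API_KEY=example-key
```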

STT

| Variable | Default | Purpose |
| --- | --- | --- |
| STT_BACKEND | local | local (HF Whisper, in-process) or openai (any compat /v1/audio/transcriptions) |
| WHISPER_MODEL | openai/whisper-large-v3-turbo | HF model id when STT_BACKEND=local |
| STT_URL | https://api.openai.com/v1 | Base URL when STT_BACKEND=openai |
| STT_MODEL | whisper-1 | Model name when STT_BACKEND=openai |
| STT_API_KEY | not-needed | Bearer token when STT_BACKEND=openai |
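As an illustration, switching transcription to a hosted backend takes three or four lines — the key is a placeholder, and any service exposing a compatible /v1/audio/transcriptions route should work the same way:

```shell
# Placeholder key - any /v1/audio/transcriptions-compatible service works.
STT_BACKEND=openai
STT_URL=https://api.openai.com/v1
STT_MODEL=whisper-1
STT_API_KEY=example-key
```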

TTS

| Variable | Default | Purpose |
| --- | --- | --- |
| TTS_BACKEND | fish | fish (sidecar with cloning), kokoro (in-process), or openai (any compat /v1/audio/speech) |
| FISH_URL | http://fish-speech:8092 | Fish sidecar endpoint |
| FISH_REFERENCE_ID | (unset) | Default saved voice reference |
| FISH_SAMPLE_RATE | 44100 | Fish's native output sample rate |
| FISH_TIMEOUT | 180 | Per-call timeout (seconds); covers cold compile |
| KOKORO_VOICE | af_heart | Kokoro preset voice |
| KOKORO_LANG | a | Kokoro language code (a = American English, b = British, j = Japanese, …) |
| TTS_OPENAI_URL | https://api.openai.com/v1 | Base URL when TTS_BACKEND=openai |
| TTS_OPENAI_MODEL | tts-1 | Model name when TTS_BACKEND=openai |
| TTS_OPENAI_VOICE | alloy | Voice id when TTS_BACKEND=openai |
| TTS_OPENAI_API_KEY | not-needed | Bearer token when TTS_BACKEND=openai |
| TTS_OPENAI_SAMPLE_RATE | 24000 | Output sample rate claimed when TTS_BACKEND=openai |
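For instance, a sketch of dropping the Fish sidecar in favour of the in-process Kokoro backend (voice and language here are just the documented defaults, spelled out explicitly):

```shell
# Illustrative: run TTS in-process instead of via the fish-speech sidecar.
TTS_BACKEND=kokoro
KOKORO_VOICE=af_heart   # preset voice
KOKORO_LANG=a           # a = American English
```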

GPU / compose

| Variable | Default | Purpose |
| --- | --- | --- |
| PROTOVOICE_GPU | 0 | GPU index for the protovoice container |
| FISH_GPU | 1 | GPU index for the fish-speech container |
| NVIDIA_VISIBLE_DEVICES | 0 | Inside-container GPU visibility |
| FISH_REFERENCES_DIR | /mnt/data/fish-references | Host path for saved voice references |

Tool behaviour

| Variable | Default | Purpose |
| --- | --- | --- |
| FAKE_RESEARCH_SECS | 4 | Synthetic fallback sleep for deep_research when no ava delegate is configured |
| SLOW_RESEARCH_SECS | 20 | Synthetic slow_research sleep length (async-delivery validation) |

Memory

| Variable | Default | Purpose |
| --- | --- | --- |
| MEMORY_SUMMARIZE | 1 | Master switch for pipecat's built-in auto-summarizer. Set 0 to disable. |
| MEMORY_MAX_CONTEXT_TOKENS | 8000 | Token-based trigger for summarization (~4 chars/token) |
| MEMORY_MAX_MESSAGES | 20 | Message-count trigger (user + assistant + tool) |
| MEMORY_TARGET_CONTEXT_TOKENS | MEMORY_MAX_CONTEXT_TOKENS / 2 | Compression target size |
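As an illustration of how the triggers relate (numbers are made up; at the documented ~4 chars/token heuristic, 4000 tokens is roughly 16k characters of context):

```shell
# Illustrative tighter memory budget.
MEMORY_SUMMARIZE=1
MEMORY_MAX_CONTEXT_TOKENS=4000     # summarize once context reaches ~16k chars
MEMORY_TARGET_CONTEXT_TOKENS=2000  # compress down to half the max (the default ratio)
MEMORY_MAX_MESSAGES=12             # independent message-count trigger
```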

Tracing (Langfuse)

When all three are set, each user turn becomes a trace; if any is unset, tracing is a no-op. See Tracing.

| Variable | Default | Purpose |
| --- | --- | --- |
| LANGFUSE_HOST | (unset) | Langfuse base URL (e.g. http://ava:3000 on the protoLabs tailnet) |
| LANGFUSE_PUBLIC_KEY | (unset) | pk-lf-… from the Langfuse project |
| LANGFUSE_SECRET_KEY | (unset) | sk-lf-… from the Langfuse project |
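A hedged .env fragment for turning tracing on — host and keys below are placeholders; the real keys come from your Langfuse project settings:

```shell
# Placeholder keys - all three must be set or tracing stays a no-op.
LANGFUSE_HOST=http://ava:3000
LANGFUSE_PUBLIC_KEY=pk-lf-example
LANGFUSE_SECRET_KEY=sk-lf-example
```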

Config paths

| Variable | Default | Purpose |
| --- | --- | --- |
| CONFIG_DIR | config | Where SOUL.md + skills/ + agents.yaml live |
| SESSION_STORE_DIR | /tmp/protovoice_sessions | Per-user session summaries + pending deliveries ({user_id}/{skill_slug}.txt) |

Auth (API-key users)

Every /api/* route requires X-API-Key: <key> or Authorization: Bearer <key>. Keys resolve to users via the roster loaded from Infisical (primary) or a local YAML file (fallback). Empty roster = single-user fallback.
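A minimal sketch of the two accepted header shapes — the key is a placeholder (a real one resolves via your roster), and the route is elided:

```shell
# Placeholder key; either header form authenticates a request.
KEY=example-key
H1="X-API-Key: $KEY"
H2="Authorization: Bearer $KEY"
# e.g.  curl -H "$H1" http://localhost:7866/api/...
echo "$H1"   # -> X-API-Key: example-key
```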

| Variable | Default | Purpose |
| --- | --- | --- |
| INFISICAL_API_URL | https://app.infisical.com | Infisical API base (override for self-hosted pve01, etc.) |
| INFISICAL_CLIENT_ID | (unset) | Machine-identity id — presence of this + SECRET + PROJECT_ID enables Infisical mode |
| INFISICAL_CLIENT_SECRET | (unset) | Machine-identity secret |
| INFISICAL_PROJECT_ID | (unset) | Infisical workspace/project id |
| INFISICAL_ENVIRONMENT | prod | Env slug within the project |
| INFISICAL_SECRET_PATH | /protovoice | Folder path for the roster secret |
| INFISICAL_USERS_SECRET_NAME | USERS_YAML | Secret name — the full users.yaml content |
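For example, a placeholder fragment that flips the roster source to Infisical — setting the first three together is what enables Infisical mode:

```shell
# Placeholder ids/secrets - the first three together enable Infisical mode.
INFISICAL_CLIENT_ID=machine-identity-id
INFISICAL_CLIENT_SECRET=machine-identity-secret
INFISICAL_PROJECT_ID=project-id
INFISICAL_ENVIRONMENT=prod
INFISICAL_SECRET_PATH=/protovoice
```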

When Infisical isn't configured, protoVoice falls back to {CONFIG_DIR}/users.yaml. See Users guide for the roster shape.

Delegate authentication

Referenced from config/delegates.yaml via credentialsEnv (a2a) or api_key_env (openai). Common values:

| Variable | Purpose |
| --- | --- |
| AVA_API_KEY | Ava orchestrator auth (when type: a2a) |
| AVA_URL | Override ava's URL without editing YAML (${AVA_URL:-...} expansion) |
| LITELLM_MASTER_KEY | Bearer token for a LiteLLM-fronted openai delegate |
| OPENAI_API_KEY | If a delegate points directly at OpenAI |

Add more as you extend the registry.
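The ${AVA_URL:-...} expansion used in the registry is plain shell parameter expansion: the variable's value if set, otherwise the literal fallback. A quick illustration (URLs are placeholders):

```shell
unset AVA_URL
echo "${AVA_URL:-http://fallback:9999}"   # -> http://fallback:9999 (fallback used)
AVA_URL=http://ava-host:10002
echo "${AVA_URL:-http://fallback:9999}"   # -> http://ava-host:10002 (env wins)
```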

A2A inbound (our own server)

| Variable | Default | Purpose |
| --- | --- | --- |
| A2A_AUTH_TOKEN | (unset) | Shared secret required on inbound /a2a. When set, requests must carry X-API-Key: <token> or Authorization: Bearer <token>. Unset = anonymous inbound. |
| A2A_USER_ID | default | Which protoVoice user inbound A2A turns attribute to (skill / memory / verbosity / stashed deliveries all resolve under this id). True per-caller A2A auth is future work. |
| AGENT_NAME | protovoice | Advertised name in the agent card |
| AGENT_VERSION | 0.2.0 | Advertised version in the agent card |
| A2A_MAX_TURNS | 10 | Per-contextId history cap for inbound text turns |
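As a sketch, locking down inbound A2A takes two placeholder lines — a shared secret plus the user id that all inbound turns resolve under:

```shell
# Placeholder values: require auth on inbound /a2a and pin turns to one user.
A2A_AUTH_TOKEN=shared-secret-example
A2A_USER_ID=alice
```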

Backchannels

| Variable | Default | Purpose |
| --- | --- | --- |
| BACKCHANNEL_FIRST_SECS | 5.0 | Seconds into a user turn before the first backchannel fires |
| BACKCHANNEL_INTERVAL_SECS | 6.0 | Interval between subsequent backchannels |

Audio handling (echo / feedback / turn)

| Variable | Default | Purpose |
| --- | --- | --- |
| ECHO_GUARD_MS | 300 | Drop mic audio for this many ms after the bot stops speaking. 0 = disable. |
| HALF_DUPLEX | 0 | 1 = mute mic entirely while bot speaks (loses barge-in, kills echo loops). |
| NOISE_FILTER | off | rnnoise enables RNNoise filter on the mic stream. Requires pip install -e .[rnnoise]. |
| SMART_TURN | off | local enables LocalSmartTurnAnalyzerV3 — learned end-of-turn detection. Requires pip install -e .[smart-turn]. |
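For instance, an illustrative tuning for an echo-prone speakerphone setup (values are a starting point, not a recommendation):

```shell
# Illustrative speakerphone tuning.
ECHO_GUARD_MS=500        # longer guard window after bot speech
NOISE_FILTER=rnnoise     # needs: pip install -e .[rnnoise]
HALF_DUPLEX=0            # keep barge-in; flip to 1 only if echo loops persist
```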

See Audio Handling guide for when to use which.

Part of the protoLabs autonomous development studio.