voxhelm_deploy
Deploy Voxhelm on macOS using uv, uvicorn, and a launchd LaunchDaemon.
Description
This role deploys the Voxhelm service to the studio Mac Studio. It syncs the
local source tree, installs dependencies with uv sync --frozen --no-dev
plus configured optional extras, renders a shell environment file, applies
Django migrations, creates launcher scripts for the HTTP API and the Django
Tasks worker, installs launchd plists, and verifies both the HTTP health
endpoint and worker launchd state locally on the target host.
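The final verification step could be sketched roughly as the tasks below. This is an illustration, not the role's actual task file: the /health path and the launchd label are hypothetical placeholders, and only voxhelm_app_port comes from the role defaults.

```yaml
# Sketch only: the health path and the launchd label are hypothetical.
- name: Wait for the Voxhelm HTTP API to answer locally
  ansible.builtin.uri:
    url: "http://127.0.0.1:{{ voxhelm_app_port }}/health"  # hypothetical path
    status_code: 200
  register: voxhelm_health
  retries: 10
  delay: 3
  until: voxhelm_health.status == 200

- name: Check that the worker launchd job is loaded
  ansible.builtin.command:
    cmd: launchctl print system/com.example.voxhelm.worker  # hypothetical label
  changed_when: false
```

Running both checks on the target host itself avoids depending on external routing or DNS to decide whether the deploy succeeded.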
Current default runtime:
one uvicorn HTTP API process
one Django Tasks db_worker process
one Wyoming STT/TTS sidecar process on port 10300
optional WhisperKit local-server sidecar on port 50060 when explicitly enabled
whisper.cpp as the default STT backend on studio, with mlx-whisper as fallback
mlx-whisper as the default Wyoming STT backend for short interactive speech
piper as the default TTS backend on studio
host-wide lane scheduling enabled by default for local inference on studio
speaker diarization disabled by default, with optional pyannote support
bearer-token authentication via environment variables
filesystem artifact storage by default, with S3/MinIO-compatible env vars available
no Traefik dependency; the service binds directly on the configured port
Requirements
macOS target host
uv installed on the target host
Hugging Face token and accepted pyannote model access when voxhelm_diarization_backend: "pyannote"
Ansible collection: ansible.posix
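One common way to satisfy the collection requirement is a requirements file for ansible-galaxy. The fragment below is a minimal sketch; the file name requirements.yml is the usual convention, not something this role mandates.

```yaml
# requirements.yml (conventional file name, assumed here)
collections:
  - name: ansible.posix
```

It can then be installed with ansible-galaxy collection install -r requirements.yml.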
Required Variables
voxhelm_source_path: "/Users/jochen/projects/voxhelm"
voxhelm_django_secret_key: "replace-me"
voxhelm_bearer_tokens_env: "archive=replace-me"
voxhelm_bootstrap_operator_username: "jochen"
voxhelm_bootstrap_operator_password: "replace-me"
Optional Variables
voxhelm_bootstrap_operator_email is optional and may be empty. The bootstrap
username and password are required at deploy time even though they also appear
in defaults/main.yml with sentinel placeholders.
voxhelm_app_port: 8787
voxhelm_bind_host: "0.0.0.0"
voxhelm_stt_backend: "whispercpp"
voxhelm_stt_fallback_backend: "mlx"
voxhelm_tts_backend: "piper"
voxhelm_tts_max_input_chars: 5000
voxhelm_mlx_model: "mlx-community/whisper-large-v3-mlx"
voxhelm_whispercpp_model: "ggml-large-v3.bin"
voxhelm_whispercpp_bin: "/opt/homebrew/bin/whisper-cli"
voxhelm_whispercpp_processors: 4
voxhelm_whisperkit_enabled: false
voxhelm_whisperkit_cli_bin: "/opt/homebrew/bin/whisperkit-cli"
voxhelm_whisperkit_host: "127.0.0.1"
voxhelm_whisperkit_port: 50060
voxhelm_whisperkit_base_url: "http://127.0.0.1:50060/v1"
voxhelm_whisperkit_model: "large-v3-v20240930"
voxhelm_whisperkit_audio_encoder_compute_units: "cpuAndGPU"
voxhelm_whisperkit_text_decoder_compute_units: "cpuAndGPU"
voxhelm_whisperkit_concurrent_worker_count: 8
voxhelm_whisperkit_chunking_strategy: "vad"
voxhelm_whisperkit_timeout_seconds: 900
voxhelm_stt_debug_logging: false
voxhelm_python_version: "3.14.5"
voxhelm_diarization_backend: "none"
voxhelm_pyannote_model: "pyannote/speaker-diarization-3.1"
voxhelm_pyannote_device: "auto"
voxhelm_huggingface_token: ""
voxhelm_uv_extras: []
voxhelm_model_cache_dir: "/opt/apps/voxhelm/site/var/models"
voxhelm_piper_voice_dir: "/opt/apps/voxhelm/site/var/piper"
voxhelm_piper_voices:
- "en_US-lessac-medium"
- "de_DE-thorsten-high"
voxhelm_piper_default_voice: "en_US-lessac-medium"
voxhelm_piper_language_voices:
  en: "en_US-lessac-medium"
  en_US: "en_US-lessac-medium"
  de: "de_DE-thorsten-high"
  de_DE: "de_DE-thorsten-high"
voxhelm_wyoming_stt_enabled: true
voxhelm_wyoming_stt_host: "0.0.0.0"
voxhelm_wyoming_stt_port: 10300
voxhelm_wyoming_stt_backend: "mlx"
voxhelm_wyoming_stt_model: ""
voxhelm_wyoming_stt_language: ""
voxhelm_wyoming_stt_languages:
- "de"
- "en"
voxhelm_wyoming_stt_prompt: ""
voxhelm_wyoming_stt_normalize_transcript: true
voxhelm_lane_scheduler_enabled: true
voxhelm_lane_scheduler_dir: "/opt/apps/voxhelm/site/var/lane-scheduler"
voxhelm_lane_scheduler_stale_seconds: 1800
voxhelm_bootstrap_operator_username: "CHANGEME"
voxhelm_bootstrap_operator_email: ""
voxhelm_bootstrap_operator_password: "CHANGEME"
voxhelm_allowed_hosts:
- "studio.tailde2ec.ts.net"
- "studio"
- "localhost"
- "127.0.0.1"
voxhelm_allowed_url_hosts: []
voxhelm_trusted_http_hosts: []
voxhelm_uvicorn_log_level: "info"
For the full list, see defaults/main.yml.
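Because the bootstrap username and password default to the CHANGEME sentinel, a guard task can catch a deploy that forgot to override them. The task below is a hypothetical sketch and not part of the role; it assumes the variables are defined somewhere the task can see them, for example in group or host vars.

```yaml
# Hypothetical guard task; not shipped with the role.
- name: Refuse to deploy with placeholder bootstrap credentials
  ansible.builtin.assert:
    that:
      - voxhelm_bootstrap_operator_username != "CHANGEME"
      - voxhelm_bootstrap_operator_password != "CHANGEME"
      - voxhelm_bootstrap_operator_password | length > 0
    fail_msg: "Set the bootstrap operator username and password before deploying."
```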
Example Playbook
- name: Deploy Voxhelm
  hosts: macstudio
  gather_facts: true
  roles:
    - role: local.ops_library.uv_install
    - role: local.ops_library.voxhelm_deploy
      vars:
        voxhelm_source_path: "/Users/jochen/projects/voxhelm"
        voxhelm_django_secret_key: "{{ service_secrets.django_secret_key }}"
        voxhelm_bearer_tokens_env: "archive={{ service_secrets.api_token_archive }}"
        voxhelm_bootstrap_operator_username: "{{ service_secrets.bootstrap_operator_username }}"
        voxhelm_bootstrap_operator_email: "{{ service_secrets.bootstrap_operator_email }}"
        voxhelm_bootstrap_operator_password: "{{ service_secrets.bootstrap_operator_password }}"
        voxhelm_diarization_backend: "pyannote"
        voxhelm_pyannote_model: "pyannote/speaker-diarization-3.1"
        voxhelm_huggingface_token: "{{ service_secrets.huggingface_token }}"
Bootstrap Operator
The role runs python manage.py bootstrap_operator after migrations on every deploy.
Bootstrap credentials should come from the private control repo, typically ops-control/secrets/prod/voxhelm.yml.
The in-app command is idempotent: first deploy creates the operator, later deploys update the matching account’s password, email, is_staff, and is_active fields.
The role passes credentials as task-scoped environment variables for that one-shot command and does not persist the operator password in voxhelm.env.
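Passing credentials as task-scoped environment variables might look like the sketch below. The environment variable names and the install path are hypothetical (the path is guessed from the default cache directories); the role's real task may differ.

```yaml
# Sketch only: env var names and the install path are hypothetical.
- name: Bootstrap the operator account
  ansible.builtin.command:
    cmd: /opt/apps/voxhelm/site/.venv/bin/python manage.py bootstrap_operator
    chdir: /opt/apps/voxhelm/site  # assumed install directory
  environment:
    VOXHELM_BOOTSTRAP_USERNAME: "{{ voxhelm_bootstrap_operator_username }}"  # hypothetical name
    VOXHELM_BOOTSTRAP_EMAIL: "{{ voxhelm_bootstrap_operator_email }}"        # hypothetical name
    VOXHELM_BOOTSTRAP_PASSWORD: "{{ voxhelm_bootstrap_operator_password }}"  # hypothetical name
  no_log: true
```

The environment keyword scopes the values to this single task, so they never need to be written to a file on the host, and no_log keeps them out of Ansible's output.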
Speaker Diarization Notes
voxhelm_diarization_backend defaults to none; requested diarization jobs fail clearly unless a backend is configured.
The role creates .venv with a uv-managed voxhelm_python_version interpreter instead of the host’s Homebrew Python. If an existing virtualenv points at a different base executable, the role recreates it before running uv sync.
Setting voxhelm_diarization_backend: "pyannote" makes the role install the Voxhelm optional dependency extra with uv sync --frozen --no-dev --extra diarization.
The role renders VOXHELM_DIARIZATION_BACKEND, VOXHELM_PYANNOTE_MODEL, VOXHELM_PYANNOTE_DEVICE, VOXHELM_HUGGINGFACE_TOKEN, and HF_TOKEN into /etc/voxhelm/voxhelm.env. The env file remains root:wheel and 0640, and the template task uses no_log: true.
The role also renders the transcription execution mode. The default voxhelm_transcription_execution_mode: django_tasks preserves local studio execution. Setting voxhelm_transcription_execution_mode: remote_pull requires voxhelm_worker_tokens_env and complete S3 artifact settings so remote workers can claim jobs and upload attempt-scoped artifacts.
The HTTP API, Django Tasks worker, and Wyoming sidecar all source the same env file through their launcher scripts. The Hugging Face token is not written directly into launchd plist files.
The token should come from encrypted private control-repo secrets, not from this public collection.
Accept Hugging Face access for pyannote/speaker-diarization-3.1, pyannote/speaker-diarization-community-1, and any gated dependency reported by pyannote before first production use.
The first diarization run downloads model weights and can take time. Long podcast episodes are expensive; use batch jobs and inspect the worker logs.
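The ownership, mode, and logging behaviour described for the env file can be sketched as a template task like the following. The template file name is a hypothetical placeholder; the destination, owner, group, and mode are the ones stated above.

```yaml
# Sketch only: the template file name is hypothetical.
- name: Render the Voxhelm environment file
  ansible.builtin.template:
    src: voxhelm.env.j2  # hypothetical template name
    dest: /etc/voxhelm/voxhelm.env
    owner: root
    group: wheel
    mode: "0640"
  become: true
  no_log: true  # keeps tokens out of task output and diffs
```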
Wyoming STT Notes
The Wyoming listener on 10300 now exposes both STT and TTS backed by Voxhelm.
The launchd label and helper script retain the legacy -stt suffix for continuity, but the runtime serves both speech directions.
voxhelm_wyoming_stt_backend defaults to mlx because it performed materially better than the current whisper.cpp setup on short Home Assistant commands with trailing silence.
voxhelm_wyoming_stt_model, voxhelm_wyoming_stt_language, and voxhelm_wyoming_stt_prompt can be used to pin the interactive listener to a specific model, language, or prompt without changing the main HTTP/batch lane.
voxhelm_whisperkit_enabled installs whisperkit-cli, renders a dedicated launchd unit, and exposes the backend to Voxhelm’s accepted-model surface. Leave it false unless you explicitly want the experimental backend on studio.
voxhelm_whisperkit_model, compute-unit knobs, worker count, and chunking strategy map directly to whisperkit-cli serve so the tuned studio configuration can be preserved in deploy config rather than in ad-hoc shell history.
WhisperKit remains non-default on purpose. The benchmark re-evaluation showed it is competitive on studio, but the tuned long-form run still logged a Metal GPU recovery error.
voxhelm_wyoming_stt_normalize_transcript trims a small set of leading filler words such as okay/und from Wyoming transcripts before they are returned to Home Assistant. This is enabled by default because the built-in German Assist parser is materially less tolerant of those prefixes than the English one.
voxhelm_lane_scheduler_enabled enables the first C13 slice: one host-wide admission gate shared by the HTTP API, Django Tasks worker, and Wyoming sidecar.
voxhelm_lane_scheduler_dir stores the shared scheduler state on local disk.
voxhelm_lane_scheduler_stale_seconds defaults to 1800 so a crashed holder can be reclaimed without risking false expiry during long-running local inference. Lower this only if the runtime also refreshes the lease while work is active.
voxhelm_stt_debug_logging enables one structured log line per transcription containing the input audio shape, requested and resolved backend/model/language, and transcript preview. When normalization changes the transcript, the debug log also includes the raw transcript for comparison. Leave it off unless you are actively tuning or debugging.
Piper voice files are downloaded during deploy into voxhelm_piper_voice_dir.
voxhelm_piper_language_voices maps requested language codes such as en/de to installed Piper voice IDs for both Wyoming TTS and HTTP or batch synthesis.
The first C13 slice is cooperative, not preemptive. A running HTTP or batch inference can still delay a later Wyoming turn, but new non-interactive work will not be admitted ahead of a waiting Wyoming request while the scheduler is enabled.
The role verifies the sidecar by checking the launchd unit state and waiting for the configured TCP port to listen locally on the target host.
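A sidecar check of this kind could be sketched with the tasks below. The launchd label is a hypothetical placeholder (only the -stt suffix is stated above), and the timeout is an arbitrary illustrative value; voxhelm_wyoming_stt_port comes from the role defaults.

```yaml
# Sketch only: the launchd label and timeout are hypothetical.
- name: Check that the Wyoming sidecar launchd job is loaded
  ansible.builtin.command:
    cmd: launchctl print system/com.example.voxhelm.wyoming-stt  # hypothetical label
  changed_when: false

- name: Wait for the Wyoming sidecar to listen locally
  ansible.builtin.wait_for:
    host: 127.0.0.1
    port: "{{ voxhelm_wyoming_stt_port }}"
    state: started
    timeout: 60
```

Checking the port in addition to the launchd state catches the case where the job is loaded but the process exits before binding.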
License
MIT