Minimalist push-to-talk dictation for Hyprland: hold a hotkey to talk, release to drop the transcribed text into your clipboard.
A small bash script pipes your mic to a local Speaches (Whisper) server: fully offline, with the language auto-picked from your keyboard layout.
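As a rough illustration of that pipeline, the sketch below records a WAV file with ffmpeg and posts it to the server. It is not the project's actual script. The endpoint URL and port, the PulseAudio/PipeWire input (`-f pulse`), the function names `stt_record` and `stt_transcribe`, and the file path are all assumptions made for the example; it relies on Speaches exposing an OpenAI-compatible transcription endpoint that returns JSON with a `text` field.

```shell
# Assumed default: Speaches listening on localhost:8000 (override via STT_URL).
STT_URL="${STT_URL:-http://localhost:8000/v1/audio/transcriptions}"

# Record mono 16 kHz audio from the default source into the file given as $1.
# ffmpeg keeps recording until it is interrupted (Ctrl-C or SIGINT).
stt_record() {
  ffmpeg -loglevel error -y -f pulse -i default -ac 1 -ar 16000 "$1"
}

# Upload the recording ($1) using the model or alias ($2) and print the text.
stt_transcribe() {
  curl -fsS "$STT_URL" -F "file=@$1" -F "model=$2" | jq -r '.text'
}

# Usage (hypothetical path and alias):
#   stt_record /tmp/stt.wav
#   stt_transcribe /tmp/stt.wav multi | wl-copy
```

In a real push-to-talk setup the recorder would be started on key press and interrupted on key release, so the two halves run as separate invocations.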
Run from a terminal:
curl -fsSL https://stt.demo.land/setup.sh | bash
Automatic installation steps:
- Dependencies (docker, ffmpeg, wl-clipboard, libnotify, jq, hyprland).
- ~/.config/stt/config.json, created from the default configuration (if not already present).
- model_aliases.json, copied to ~/.config/stt/aliases.json.
- The multilingual fallback (multi) model.
- The keybinding (SUPER + `).

Re-running is safe. Linux/Hyprland only.
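If you want to check the dependencies by hand before running the installer, a small helper like the one below can report what is missing. This is a sketch, not part of the setup script. The function name `missing_deps` is hypothetical, and the command names (`wl-copy` for wl-clipboard, `notify-send` for libnotify, `hyprctl` for hyprland) are assumptions based on how these packages are commonly shipped.

```shell
# Print every command from the argument list that is not found on PATH.
missing_deps() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Usage: prints nothing when everything is installed.
#   missing_deps docker ffmpeg wl-copy notify-send jq hyprctl
```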
The first transcription after switching to a new language will pause while the model downloads; it's cached after that.
~/.config/stt/config.json is generated on first install using the default configuration. Use it as a starting point and customise it to suit your workflow.
Language-to-model mapping lives in config/model_aliases.json and is served directly to the Speaches server. Each key is an ISO 639-1 language code (or multi for the multilingual fallback); the value is a Hugging Face model ID.
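To make the lookup concrete, here is a minimal sketch of resolving a language code against such a file with jq, falling back to the multi entry when the language has no key of its own. The function name `model_for_lang` is hypothetical, and the model IDs in the comment are placeholders rather than the project's shipped aliases.

```shell
# Example alias file (placeholder IDs, shown for shape only):
#   {
#     "en": "example-org/whisper-en",
#     "de": "example-org/whisper-de",
#     "multi": "Systran/faster-whisper-small"
#   }

# Print the model ID for language $2 from alias file $1.
# Falls back to the "multi" entry; prints nothing if neither key exists.
model_for_lang() {
  jq -r --arg lang "$2" '.[$lang] // .multi // empty' "$1"
}

# Usage:
#   model_for_lang ~/.config/stt/aliases.json de
```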
The active keyboard layout (detected via hyprctl) automatically picks the matching alias, so no extra config is needed. Models are downloaded on first use, so only the models for languages you actually speak are ever fetched. Re-run setup after editing model_aliases.json to push the updated aliases to ~/.config/stt/aliases.json.
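Layout detection along these lines could look roughly like the sketch below. It reads the active keymap of the main keyboard from `hyprctl devices -j` and maps the keymap name to a language code. The function names and the mapping table are assumptions for illustration; the real script may cover different layouts or derive the code another way.

```shell
# Print the active keymap name of the main keyboard, e.g. "English (US)".
active_keymap() {
  hyprctl devices -j | jq -r '.keyboards[] | select(.main) | .active_keymap'
}

# Map a keymap name ($1) to an ISO 639-1 code.
# Layouts not listed here fall back to "multi".
keymap_to_lang() {
  case "$1" in
    English*) echo en ;;
    German*)  echo de ;;
    French*)  echo fr ;;
    Russian*) echo ru ;;
    *)        echo multi ;;
  esac
}

# Usage:
#   keymap_to_lang "$(active_keymap)"
```

Falling back to multi for unknown layouts keeps dictation working even when a layout has no dedicated alias.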
Upgrading from v0.3.0? Config files moved to ~/.config/stt/. Remove the old files before re-running setup:
rm -f ~/.config/stt.json ~/.config/stt-aliases.json
To uninstall, run from a terminal:
curl -fsSL https://stt.demo.land/setup.sh | bash -s -- --uninstall
Removes the scripts, the config, the keybinding, the server, and the downloaded models.