Calliope
A locally running voice transcription app built specifically for macOS. Fully MLX accelerated, fully free.
Calliope lives in your menu bar and turns speech into text in any application. Press a hotkey, speak, and your words appear wherever your cursor is. No cloud services, no API keys, no subscriptions — everything runs on-device using Apple Silicon acceleration.
Features
- Menu bar native — Runs quietly in the macOS menu bar, always one hotkey away
- Universal text input — Types transcribed text directly into any focused application via Quartz events or clipboard paste
- On-device transcription — Powered by OpenAI Whisper models via Hugging Face Transformers, accelerated with MPS on Apple Silicon
- LLM post-processing — Optional grammar and punctuation correction using local MLX language models
- Live waveform overlay — Floating visual feedback showing audio levels during recording and a pulsing indicator during transcription
- Dual hotkey modes — Push-to-talk (hold to record) and toggle (tap to start/stop), both fully configurable
- Multi-language support — Transcribe in English, Spanish, French, German, Japanese, Chinese, Korean, Portuguese, Italian, Dutch, Russian, or auto-detect
- Context prompting — Provide domain-specific vocabulary to improve transcription accuracy for technical or specialized content
- Interactive setup wizard — Rich terminal UI that walks through microphone selection, hotkey configuration, model download, and permission checks on first run
- Configurable models — Choose from multiple Whisper model sizes to balance speed and accuracy, from
whisper-base to whisper-large-v3
Installation
git clone https://github.com/yourname/calliope.git
cd calliope
pip install -e .
Requirements
- macOS on Apple Silicon (M1 or later)
- Python 3.10+
- Accessibility permission (for typing into other apps)
- Microphone permission (for audio capture)
Usage
calliope # Launch (runs setup wizard on first run)
calliope setup # Re-run the setup wizard
calliope --debug # Launch with verbose logging
calliope --device 2 --model openai/whisper-large-v3 # Override config for this session
calliope --version # Print version
Hotkeys
| Mode | Default | Behavior |
|---|---|---|
| Push-to-talk | Ctrl+Shift (hold) | Records while held, transcribes on release |
| Toggle | Ctrl+Space | Tap to start recording, tap again to stop and transcribe |
Hotkeys are fully configurable through the setup wizard or by editing the config file directly.
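For anyone editing the config file by hand, it can help to see how a hotkey string such as ctrl+shift or ctrl+space might be broken down. The sketch below is illustrative only: parse_hotkey and the MODIFIERS set are hypothetical names, not taken from Calliope's source, and the real parser may accept different key names.

```python
from __future__ import annotations

# Hypothetical helper, not Calliope's actual parser.
# Assumed set of modifier names; the real app may recognize others.
MODIFIERS = {"ctrl", "shift", "alt", "cmd"}


def parse_hotkey(spec: str) -> tuple[frozenset[str], str | None]:
    """Split a spec such as "ctrl+space" into (modifiers, key).

    A modifier-only combination such as "ctrl+shift" yields key=None.
    """
    parts = [p.strip().lower() for p in spec.split("+")]
    if not all(parts):
        raise ValueError(f"empty key name in hotkey spec: {spec!r}")
    modifiers = frozenset(p for p in parts if p in MODIFIERS)
    keys = [p for p in parts if p not in MODIFIERS]
    if len(keys) > 1:
        raise ValueError(f"more than one non-modifier key in: {spec!r}")
    return modifiers, (keys[0] if keys else None)


if __name__ == "__main__":
    for spec in ("ctrl+shift", "ctrl+space"):
        mods, key = parse_hotkey(spec)
        print(spec, "->", sorted(mods), key)
```

Under these assumptions, the default push-to-talk binding is a pure modifier chord, while the default toggle binding is one modifier plus a regular key.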
Configuration
All settings are stored at ~/.config/calliope/config.yaml:
device: null # Microphone index (null = system default)
model: distil-whisper/distil-large-v3
language: auto # Language code or "auto" for detection
hotkeys:
  ptt: ctrl+shift
  toggle: ctrl+space
context: "" # Domain-specific terms to improve accuracy
typing_mode: char # "char" (keystroke simulation) or "clipboard" (Cmd+V paste)
typing_delay: 0.005 # Seconds between keystrokes in char mode
max_recording_seconds: 300 # Maximum recording duration
silence_threshold: 0.005 # RMS energy below which audio is considered silence
notifications: true # macOS notification banners
postprocessing:
  enabled: false # LLM grammar/punctuation correction
  model: null # Active MLX model
  system_prompt: "..." # Custom post-processing instructions
debug: false
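To make the silence_threshold setting concrete, here is a minimal sketch of an RMS energy check. It assumes audio samples are floats normalized to the range [-1.0, 1.0]; the function names rms and is_silence are hypothetical and may not match how Calliope measures audio levels internally.

```python
import math
from typing import Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square energy of a block of samples (0.0 for an empty block)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_silence(samples: Sequence[float], threshold: float = 0.005) -> bool:
    """True when the block's RMS energy falls below the threshold.

    Assumes samples are normalized floats; 0.005 mirrors the
    silence_threshold default shown in the config above.
    """
    return rms(samples) < threshold


if __name__ == "__main__":
    quiet = [0.001, -0.001] * 50   # very low amplitude
    loud = [0.1, -0.1] * 50        # clearly audible
    print(is_silence(quiet), is_silence(loud))
```

Raising the threshold treats more low-level background noise as silence; lowering it makes quiet speech less likely to be discarded.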
CLI flags override config values for that session.
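One way such a session-only override could work is sketched below with argparse. The flags --device, --model, and --debug come from the usage section; everything else (effective_config, build_parser, the merge strategy) is an assumption for illustration and not necessarily how Calliope implements it.

```python
import argparse

# Subset of config values, using the defaults documented above.
FILE_CONFIG = {
    "device": None,
    "model": "distil-whisper/distil-large-v3",
    "debug": False,
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="calliope")
    # default=None lets us tell "flag not given" apart from an explicit value.
    parser.add_argument("--device", type=int, default=None)
    parser.add_argument("--model", default=None)
    parser.add_argument("--debug", action="store_true", default=None)
    return parser


def effective_config(file_config: dict, argv: list[str]) -> dict:
    """Return a new dict where any flag that was passed wins over the file value."""
    args = vars(build_parser().parse_args(argv))
    overrides = {key: value for key, value in args.items() if value is not None}
    return {**file_config, **overrides}


if __name__ == "__main__":
    argv = ["--device", "2", "--model", "openai/whisper-large-v3"]
    print(effective_config(FILE_CONFIG, argv))
```

Because the merge returns a new dictionary and never writes it back, the values in config.yaml stay unchanged for the next launch.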
Available Models
| Model | Size | Speed | Accuracy |
|---|---|---|---|
| openai/whisper-base | ~150 MB | Fastest | Basic |
| openai/whisper-small | ~500 MB | Fast | Good |
| openai/whisper-medium | ~1.5 GB | Moderate | Better |
| distil-whisper/distil-large-v3 | ~1.5 GB | Fast | High (default) |
| openai/whisper-large-v3 | ~3 GB | Slower | Highest |
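As a rough aid for choosing among these, the sketch below picks the most accurate model whose approximate download size fits a given budget. The sizes are the rounded figures from the table, the ordering follows the table's accuracy column, and pick_model is a hypothetical helper that is not part of Calliope.

```python
# (model name, approximate size in MB), ordered from least to most accurate
# according to the table above. Sizes are rounded estimates.
MODELS = [
    ("openai/whisper-base", 150),
    ("openai/whisper-small", 500),
    ("openai/whisper-medium", 1500),
    ("distil-whisper/distil-large-v3", 1500),
    ("openai/whisper-large-v3", 3000),
]


def pick_model(max_size_mb: int) -> str:
    """Return the most accurate listed model that fits within max_size_mb."""
    fitting = [name for name, size in MODELS if size <= max_size_mb]
    if not fitting:
        raise ValueError(f"no listed model fits within {max_size_mb} MB")
    return fitting[-1]


if __name__ == "__main__":
    print(pick_model(2000))
```

The returned name can be passed to the --model flag or written to the model key in the config file.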
Troubleshooting
"Status: Model load failed"
Verify you have sufficient disk space and RAM for the selected model. Run with --debug for detailed error logs.
No text appears after transcription
Confirm that Accessibility permission is granted in System Settings > Privacy & Security > Accessibility. Restart Calliope after granting.
Wrong microphone selected
Run calliope setup to choose a different input device, or set the device index in the config file. Use python -m sounddevice to list available devices.
Hotkeys not responding
Ensure no other application is capturing the same key combination. Reconfigure hotkeys via calliope setup.
License
TBD