Calliope

A locally running voice transcription app built specifically for macOS. Fully MLX accelerated, fully free.

Calliope lives in your menu bar and turns speech into text in any application. Press a hotkey, speak, and your words appear wherever your cursor is. No cloud services, no API keys, no subscriptions — everything runs on-device using Apple Silicon acceleration.

Features

  • Menu bar native — Runs quietly in the macOS menu bar, always one hotkey away
  • Universal text input — Types transcribed text directly into any focused application via Quartz events or clipboard paste
  • On-device transcription — Powered by OpenAI Whisper models via mlx-whisper, natively accelerated on Apple Silicon with no MPS/PyTorch overhead
  • Auto-stop on silence — Recording stops automatically after a configurable period of silence, so you don't have to press the hotkey again
  • LLM post-processing — Optional grammar and punctuation correction using local MLX language models
  • Live waveform overlay — Floating visual feedback showing audio levels during recording and a pulsing indicator during transcription
  • Dual hotkey modes — Push-to-talk (hold to record) and toggle (tap to start/stop), both fully configurable
  • Multi-language support — Transcribe in English, Spanish, French, German, Japanese, Chinese, Korean, Portuguese, Italian, Dutch, Russian, or auto-detect
  • Context prompting — Provide domain-specific vocabulary to improve transcription accuracy for technical or specialized content
  • Interactive setup wizard — Rich terminal UI that walks through microphone selection, hotkey configuration, model download, and permission checks on first run
  • Configurable models — Choose from multiple Whisper model sizes to balance speed and accuracy, from whisper-base to whisper-large-v3

Installation

git clone https://github.com/yourname/calliope.git
cd calliope
pip install -e .

Requirements

  • macOS on Apple Silicon (M1 or later)
  • Python 3.10+
  • Accessibility permission (for typing into other apps)
  • Microphone permission (for audio capture)

Usage

calliope                # Launch (runs setup wizard on first run)
calliope setup          # Re-run the setup wizard
calliope --debug        # Launch with verbose logging
calliope --device 2 --model mlx-community/whisper-large-v3  # Override config for this session
calliope --version      # Print version

Hotkeys

Mode          Default            Behavior
Push-to-talk  Ctrl+Shift (hold)  Records while held, transcribes on release
Toggle        Ctrl+Space         Tap to start recording, tap again to stop and transcribe

Hotkeys are fully configurable through the setup wizard or by editing the config file directly.

Configuration

All settings are stored at ~/.config/calliope/config.yaml:

device: null                        # Microphone index (null = system default)
model: mlx-community/whisper-large-v3-turbo
language: auto                      # Language code or "auto" for detection
hotkeys:
  ptt: ctrl+shift
  toggle: ctrl+space
context: ""                         # Domain-specific terms to improve accuracy
typing_mode: char                   # "char" (keystroke simulation) or "clipboard" (Cmd+V paste)
typing_delay: 0.005                 # Seconds between keystrokes in char mode
max_recording_seconds: 300          # Maximum recording duration
silence_threshold: 0.005            # RMS energy below which audio is considered silence
auto_stop_silence: true             # Automatically stop recording after sustained silence
silence_timeout_seconds: 1.5        # Seconds of silence before auto-stop triggers
notifications: true                 # macOS notification banners
postprocessing:
  enabled: false                    # LLM grammar/punctuation correction
  model: null                       # Active MLX model
  system_prompt: "..."              # Custom post-processing instructions
debug: false
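
To illustrate how silence_threshold and silence_timeout_seconds might interact, here is a minimal sketch of RMS-based auto-stop. It is not Calliope's actual implementation: the SilenceDetector class, its method names, and the block-by-block feeding model are assumptions made for illustration, and samples are assumed to be floats in the range -1.0 to 1.0.

```python
import math


def rms(samples):
    """Root-mean-square level of one block of float samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


class SilenceDetector:
    """Hypothetical helper, named for this sketch only."""

    def __init__(self, threshold=0.005, timeout_seconds=1.5):
        self.threshold = threshold              # mirrors silence_threshold
        self.timeout_seconds = timeout_seconds  # mirrors silence_timeout_seconds
        self.silent_for = 0.0                   # continuous silence so far, in seconds

    def feed(self, samples, block_seconds):
        """Process one audio block; return True once recording should stop."""
        if rms(samples) < self.threshold:
            self.silent_for += block_seconds
        else:
            self.silent_for = 0.0  # any sound above the threshold resets the timer
        return self.silent_for >= self.timeout_seconds
```

With the default values, three consecutive 0.5-second blocks below the threshold would trigger a stop. A real recorder would probably also wait until some speech has been heard before arming the timer; that detail is left out here.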

CLI flags override config values for that session.
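
One way such session-only overrides could work is sketched below. This is an assumption about the design, not the project's code: the effective_settings function and the DEFAULTS dictionary are hypothetical, only three of the settings are modeled, and the config is passed in as an already-parsed dictionary rather than read from the YAML file.

```python
import argparse

# Defaults for a subset of the settings shown in the config file.
DEFAULTS = {
    "device": None,
    "model": "mlx-community/whisper-large-v3-turbo",
    "debug": False,
}


def build_parser():
    parser = argparse.ArgumentParser(prog="calliope")
    # default=None lets us tell "flag not given" apart from an explicit value.
    parser.add_argument("--device", type=int, default=None)
    parser.add_argument("--model", default=None)
    parser.add_argument("--debug", action="store_true", default=None)
    return parser


def effective_settings(config, argv):
    """Merge defaults, file config, and CLI flags, in increasing priority."""
    args = vars(build_parser().parse_args(argv))
    merged = {**DEFAULTS, **config}
    # Only flags that were actually passed replace the stored values.
    merged.update({key: value for key, value in args.items() if value is not None})
    return merged
```

Because the merged dictionary is built fresh on each launch and never written back, the config file itself stays unchanged after the session ends.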

Available Models

All models are sourced from Hugging Face and run natively via mlx-whisper on Apple Silicon.

Model                                 Size     Speed     Accuracy
mlx-community/whisper-base            ~150 MB  Fastest   Basic
mlx-community/whisper-small           ~500 MB  Fast      Good
mlx-community/whisper-medium          ~1.5 GB  Moderate  Better
mlx-community/whisper-large-v3-turbo  ~1.6 GB  Fast      High (default)
mlx-community/whisper-large-v3        ~3 GB    Slower    Highest

Troubleshooting

"Status: Model load failed"
Verify you have sufficient disk space and RAM for the selected model. Run with --debug for detailed error logs.

No text appears after transcription
Confirm that Accessibility permission is granted in System Settings > Privacy & Security > Accessibility. Restart Calliope after granting.

Wrong microphone selected
Run calliope setup to choose a different input device, or set the device index in the config file. Use python -m sounddevice to list available devices.
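
If the full python -m sounddevice listing is hard to scan, a short script can narrow it to input-capable devices. The sketch below assumes the sounddevice package is installed and that the device index in the config file matches sounddevice's own numbering; both helper functions are hypothetical names introduced here.

```python
def input_devices(devices):
    """Return (index, name) pairs for devices that can record audio."""
    return [
        (index, dev["name"])
        for index, dev in enumerate(devices)
        if dev["max_input_channels"] > 0
    ]


def print_input_devices():
    """Print input devices reported by sounddevice (third-party package)."""
    import sounddevice as sd  # imported lazily so the filter above works without it

    for index, name in input_devices(sd.query_devices()):
        print(f"{index}: {name}")
```

Calling print_input_devices() prints one line per microphone; the number at the start of a line is the candidate value for the device setting.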

Hotkeys not responding
Ensure no other application is capturing the same key combination. Reconfigure hotkeys via calliope setup.

License

TBD
