Calliope lives in your menu bar and turns speech into text in any application. Press a hotkey, speak, and your words appear wherever your cursor is. No cloud services, no API keys, no subscriptions — everything runs on-device using Apple Silicon acceleration.

Features

Menu bar native — Runs quietly in the macOS menu bar, always one hotkey away
Universal text input — Types transcribed text directly into any focused application via Quartz events or clipboard paste
On-device transcription — Powered by OpenAI Whisper models via mlx-whisper, natively accelerated on Apple Silicon with no MPS/PyTorch overhead
Auto-stop on silence — Recording stops automatically after a configurable period of silence, so you don't have to press the hotkey again
LLM post-processing — Optional grammar and punctuation correction using local MLX language models
Live waveform overlay — Floating visual feedback showing audio levels during recording and a pulsing indicator during transcription
Dual hotkey modes — Push-to-talk (hold to record) and toggle (tap to start/stop), both fully configurable
Multi-language support — Transcribe in English, Spanish, French, German, Japanese, Chinese, Korean, Portuguese, Italian, Dutch, Russian, or auto-detect
Context prompting — Provide domain-specific vocabulary to improve transcription accuracy for technical or specialized content
Interactive setup wizard — Rich terminal UI that walks through microphone selection, hotkey configuration, model download, and permission checks on first run
Configurable models — Choose from multiple Whisper model sizes to balance speed and accuracy, from whisper-base to whisper-large-v3
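
The model, language, and context settings above map onto keyword arguments of mlx-whisper's transcribe function. The sketch below shows one way such a call could be assembled. It is an assumption for illustration, not Calliope's actual code, and build_transcribe_kwargs and transcribe_file are hypothetical helper names.

```python
from typing import Any, Dict


def build_transcribe_kwargs(model: str, language: str = "auto", context: str = "") -> Dict[str, Any]:
    """Translate Calliope-style settings into keyword arguments for mlx_whisper.transcribe."""
    kwargs: Dict[str, Any] = {"path_or_hf_repo": model}
    # Whisper detects the language itself when none is given, so "auto" omits the key.
    if language != "auto":
        kwargs["language"] = language
    # An initial prompt biases decoding toward the vocabulary it contains.
    if context.strip():
        kwargs["initial_prompt"] = context.strip()
    return kwargs


def transcribe_file(path: str, **settings: Any) -> str:
    """Transcribe an audio file on-device (requires mlx-whisper on Apple Silicon)."""
    import mlx_whisper  # imported lazily so the helper above works without MLX installed

    result = mlx_whisper.transcribe(path, **build_transcribe_kwargs(**settings))
    return result["text"].strip()
```

Keeping the argument building separate from the model call means the settings logic can be checked on any machine, while only transcribe_file needs Apple Silicon.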

Installation

git clone https://github.com/yourname/calliope.git
cd calliope
pip install -e .

Requirements

macOS on Apple Silicon (M1 or later)
Python 3.10+
Accessibility permission (for typing into other apps)
Microphone permission (for audio capture)

Usage

calliope                # Launch (runs setup wizard on first run)
calliope setup          # Re-run the setup wizard
calliope --debug        # Launch with verbose logging
calliope --device 2 --model mlx-community/whisper-large-v3  # Override config for this session
calliope --version      # Print version
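
A session override like the --device and --model example above can be modelled as a thin layer on top of the saved config. The following is a minimal sketch using argparse, assuming that flags left off the command line should fall back to the file values. The function names are hypothetical, and the setup subcommand and --version flag are omitted for brevity.

```python
import argparse
from typing import Any, Dict, List, Optional


def parse_overrides(argv: Optional[List[str]] = None) -> Dict[str, Any]:
    """Parse session flags and return only the ones the user actually passed."""
    parser = argparse.ArgumentParser(prog="calliope")
    # default=None lets us tell "flag not given" apart from an explicit value.
    parser.add_argument("--device", type=int, default=None)
    parser.add_argument("--model", default=None)
    parser.add_argument("--debug", action="store_true", default=None)
    args = parser.parse_args(argv)
    return {key: value for key, value in vars(args).items() if value is not None}


def effective_config(file_config: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Layer session overrides on top of the file config without mutating it."""
    return {**file_config, **overrides}
```

Because effective_config returns a new dictionary, the values loaded from disk stay untouched and nothing from the session leaks back into the config file.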

Hotkeys

Mode	Default	Behavior
Push-to-talk	`Ctrl+Shift` (hold)	Records while held, transcribes on release
Toggle	`Ctrl+Space`	Tap to start recording, tap again to stop and transcribe

Hotkeys are fully configurable through the setup wizard or by editing the config file directly.
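
The two modes can be pictured as a small state machine. The sketch below is an illustration under assumptions, not Calliope's implementation: HotkeyRecorder is a hypothetical class, and the rule that toggle taps are ignored while push-to-talk is held is a design choice made here for clarity.

```python
class HotkeyRecorder:
    """Tracks whether recording is active under push-to-talk and toggle hotkeys."""

    def __init__(self) -> None:
        self.recording = False
        self._ptt_held = False

    def ptt_down(self) -> None:
        # Push-to-talk starts a recording only if none is already running.
        if not self.recording:
            self._ptt_held = True
            self.recording = True

    def ptt_up(self) -> bool:
        """Release push-to-talk; returns True when a transcription should start."""
        if self._ptt_held:
            self._ptt_held = False
            self.recording = False
            return True
        return False

    def toggle_tap(self) -> bool:
        """Tap the toggle hotkey; returns True when a transcription should start."""
        if self._ptt_held:
            return False  # ignore toggle taps while push-to-talk owns the recording
        self.recording = not self.recording
        return not self.recording
```

Each method that ends a recording returns True, so the caller knows exactly when to hand the captured audio to the transcriber.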

Configuration

All settings are stored in ~/.config/calliope/config.yaml:

device: null                        # Microphone index (null = system default)
model: mlx-community/whisper-large-v3-turbo
language: auto                      # Language code or "auto" for detection
hotkeys:
  ptt: ctrl+shift
  toggle: ctrl+space
context: ""                         # Domain-specific terms to improve accuracy
typing_mode: char                   # "char" (keystroke simulation) or "clipboard" (Cmd+V paste)
typing_delay: 0.005                 # Seconds between keystrokes in char mode
max_recording_seconds: 300          # Maximum recording duration
silence_threshold: 0.005            # RMS energy below which audio is considered silence
auto_stop_silence: true             # Automatically stop recording after sustained silence
silence_timeout_seconds: 1.5        # Seconds of silence before auto-stop triggers
notifications: true                 # macOS notification banners
postprocessing:
  enabled: false                    # LLM grammar/punctuation correction
  model: null                       # Active MLX model
  system_prompt: "..."              # Custom post-processing instructions
debug: false

CLI flags override config values for that session.
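
To make silence_threshold and silence_timeout_seconds concrete, here is a minimal sketch of RMS-based auto-stop. It assumes audio arrives as fixed-length blocks of float samples in the range [-1.0, 1.0] and that any block at or above the threshold resets the timer. The function names and the block_seconds parameter are hypothetical, and the real detector may differ.

```python
import math
from typing import Iterable, Sequence


def rms(block: Sequence[float]) -> float:
    """Root-mean-square level of one block of samples in the range [-1.0, 1.0]."""
    if not block:
        return 0.0
    return math.sqrt(sum(s * s for s in block) / len(block))


def should_auto_stop(
    blocks: Iterable[Sequence[float]],
    block_seconds: float,
    silence_threshold: float = 0.005,
    silence_timeout_seconds: float = 1.5,
) -> bool:
    """Return True once consecutive quiet blocks add up to the silence timeout."""
    quiet_for = 0.0
    for block in blocks:
        if rms(block) < silence_threshold:
            quiet_for += block_seconds
            if quiet_for >= silence_timeout_seconds:
                return True
        else:
            quiet_for = 0.0  # any loud block resets the silence timer
    return False
```

With the defaults shown in the config, three consecutive quiet blocks of 0.5 seconds each would trigger the stop. Raising silence_threshold helps in noisy rooms, while raising silence_timeout_seconds tolerates longer pauses between sentences.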

Available Models

All models are sourced from Hugging Face and run natively via mlx-whisper on Apple Silicon.

Model	Size	Speed	Accuracy
`mlx-community/whisper-base`	~150 MB	Fastest	Basic
`mlx-community/whisper-small`	~500 MB	Fast	Good
`mlx-community/whisper-medium`	~1.5 GB	Moderate	Better
`mlx-community/whisper-large-v3-turbo`	~1.6 GB	Fast	High (default)
`mlx-community/whisper-large-v3`	~3 GB	Slower	Highest
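
If disk space is tight, the approximate sizes in the table can drive the choice of model. The sketch below is an illustration only: largest_model_within is a hypothetical helper, and the sizes are the rounded figures from the table rather than exact download sizes.

```python
from typing import Optional

# Approximate download sizes in MB, copied from the table above.
MODEL_SIZES_MB = {
    "mlx-community/whisper-base": 150,
    "mlx-community/whisper-small": 500,
    "mlx-community/whisper-medium": 1500,
    "mlx-community/whisper-large-v3-turbo": 1600,
    "mlx-community/whisper-large-v3": 3000,
}


def largest_model_within(budget_mb: float) -> Optional[str]:
    """Return the largest listed model whose approximate size fits the budget."""
    fitting = [(size, name) for name, size in MODEL_SIZES_MB.items() if size <= budget_mb]
    return max(fitting)[1] if fitting else None
```

A budget could come from shutil.disk_usage on the home directory, converted to megabytes. Leave headroom beyond the download size, since the model also has to fit in RAM when loaded.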

Troubleshooting

"Status: Model load failed"
Verify you have sufficient disk space and RAM for the selected model. Run with --debug for detailed error logs.

No text appears after transcription
Confirm that Accessibility permission is granted in System Settings > Privacy & Security > Accessibility. Restart Calliope after granting.

Wrong microphone selected
Run calliope setup to choose a different input device, or set the device index in the config file. Use python -m sounddevice to list available devices.

Hotkeys not responding
Ensure no other application is capturing the same key combination. Reconfigure hotkeys via calliope setup.

License

TBD