calliope/README.md
2026-02-17 17:15:08 +01:00

Calliope

A locally running voice transcription app built specifically for macOS. Accelerated on Apple Silicon (MPS for transcription, MLX for optional post-processing) and completely free.

Calliope lives in your menu bar and turns speech into text in any application. Press a hotkey, speak, and your words appear wherever your cursor is. No cloud services, no API keys, no subscriptions — everything runs on-device using Apple Silicon acceleration.

Features

  • Menu bar native — Runs quietly in the macOS menu bar, always one hotkey away
  • Universal text input — Types transcribed text directly into any focused application via Quartz events or clipboard paste
  • On-device transcription — Powered by OpenAI Whisper models via Hugging Face Transformers, accelerated with MPS on Apple Silicon
  • LLM post-processing — Optional grammar and punctuation correction using local MLX language models
  • Live waveform overlay — Floating visual feedback showing audio levels during recording and a pulsing indicator during transcription
  • Dual hotkey modes — Push-to-talk (hold to record) and toggle (tap to start/stop), both fully configurable
  • Multi-language support — Transcribe in English, Spanish, French, German, Japanese, Chinese, Korean, Portuguese, Italian, Dutch, Russian, or auto-detect
  • Context prompting — Provide domain-specific vocabulary to improve transcription accuracy for technical or specialized content
  • Interactive setup wizard — Rich terminal UI that walks through microphone selection, hotkey configuration, model download, and permission checks on first run
  • Configurable models — Choose from multiple Whisper model sizes to balance speed and accuracy, from whisper-base to whisper-large-v3

Installation

git clone https://github.com/yourname/calliope.git
cd calliope
pip install -e .

Requirements

  • macOS on Apple Silicon (M1 or later)
  • Python 3.10+
  • Accessibility permission (for typing into other apps)
  • Microphone permission (for audio capture)

Usage

calliope                # Launch (runs setup wizard on first run)
calliope setup          # Re-run the setup wizard
calliope --debug        # Launch with verbose logging
calliope --device 2 --model openai/whisper-large-v3  # Override config for this session
calliope --version      # Print version

Hotkeys

| Mode | Default | Behavior |
| --- | --- | --- |
| Push-to-talk | Ctrl+Shift (hold) | Records while held, transcribes on release |
| Toggle | Ctrl+Space | Tap to start recording, tap again to stop and transcribe |

Hotkeys are fully configurable through the setup wizard or by editing the config file directly.
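
The difference between the two modes can be pictured as a small state machine. The sketch below is illustrative only: `HotkeyRecorder`, `on_ptt`, and `on_toggle_tap` are hypothetical names, not Calliope's actual internals, and the returned strings stand in for whatever the app does at each step.

```python
from dataclasses import dataclass


@dataclass
class HotkeyRecorder:
    """Hypothetical state holder for illustration; not Calliope's real class."""

    recording: bool = False

    def on_ptt(self, pressed: bool) -> str:
        """Push-to-talk: record while the keys are held, transcribe on release."""
        if pressed and not self.recording:
            self.recording = True
            return "start"
        if not pressed and self.recording:
            self.recording = False
            return "transcribe"
        # Key-repeat events while held, or a release with nothing recorded.
        return "ignore"

    def on_toggle_tap(self) -> str:
        """Toggle: the first tap starts recording, the second stops and transcribes."""
        self.recording = not self.recording
        return "start" if self.recording else "transcribe"


recorder = HotkeyRecorder()
print(recorder.on_ptt(True), recorder.on_ptt(False))        # hold, then release
print(recorder.on_toggle_tap(), recorder.on_toggle_tap())   # tap, then tap again
```

Both modes end in the same place (a finished recording handed to transcription); they differ only in which key event closes the recording.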

Configuration

All settings are stored at ~/.config/calliope/config.yaml:

device: null                    # Microphone index (null = system default)
model: distil-whisper/distil-large-v3
language: auto                  # Language code or "auto" for detection
hotkeys:
  ptt: ctrl+shift
  toggle: ctrl+space
context: ""                     # Domain-specific terms to improve accuracy
typing_mode: char               # "char" (keystroke simulation) or "clipboard" (Cmd+V paste)
typing_delay: 0.005             # Seconds between keystrokes in char mode
max_recording_seconds: 300      # Maximum recording duration
silence_threshold: 0.005        # RMS energy below which audio is considered silence
notifications: true             # macOS notification banners
postprocessing:
  enabled: false                # LLM grammar/punctuation correction
  model: null                   # Active MLX model
  system_prompt: "..."          # Custom post-processing instructions
debug: false
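
To give a feel for what silence_threshold means, here is a minimal sketch of an RMS check. It assumes audio samples are floats normalized to the range -1.0 to 1.0; the README does not state the sample format, and `rms` and `is_silence` are hypothetical helper names rather than Calliope functions.

```python
import math


def rms(samples):
    """Root-mean-square energy of a sequence of float samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_silence(samples, threshold=0.005):
    """True when RMS energy is below the threshold (default taken from the config above)."""
    return rms(samples) < threshold


# Made-up sample values, chosen only to land on either side of the threshold.
quiet = [0.001, -0.002, 0.001, -0.001]
loud = [0.2, -0.3, 0.25, -0.1]
print(is_silence(quiet), is_silence(loud))
```

Raising the threshold makes more low-level audio count as silence, which may help in a noisy room; lowering it makes the check more permissive for quiet speakers.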

CLI flags override config values for that session.
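
One common way to get this behavior is to layer parsed flags over the loaded config without writing anything back to disk. The following is a sketch under that assumption, not Calliope's actual argument handling; `CONFIG`, `build_parser`, and `effective_settings` are hypothetical names, and only three of the settings are modeled.

```python
import argparse

# Stand-in for values loaded from config.yaml (a subset of the keys shown above).
CONFIG = {"device": None, "model": "distil-whisper/distil-large-v3", "debug": False}


def build_parser():
    parser = argparse.ArgumentParser(prog="calliope")
    # default=None lets us tell "flag not given" apart from an explicit value.
    parser.add_argument("--device", type=int, default=None)
    parser.add_argument("--model", default=None)
    parser.add_argument("--debug", action="store_true", default=None)
    return parser


def effective_settings(config, argv):
    """Return a new dict with CLI flags layered over config; config is not modified."""
    args = vars(build_parser().parse_args(argv))
    overrides = {key: value for key, value in args.items() if value is not None}
    return {**config, **overrides}


settings = effective_settings(
    CONFIG, ["--device", "2", "--model", "openai/whisper-large-v3"]
)
print(settings)
```

Because the merge produces a new dictionary, the stored configuration is unchanged and the next launch without flags falls back to the file's values.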

Available Models

| Model | Size | Speed | Accuracy |
| --- | --- | --- | --- |
| openai/whisper-base | ~150 MB | Fastest | Basic |
| openai/whisper-small | ~500 MB | Fast | Good |
| openai/whisper-medium | ~1.5 GB | Moderate | Better |
| distil-whisper/distil-large-v3 (default) | ~1.5 GB | Fast | High |
| openai/whisper-large-v3 | ~3 GB | Slower | Highest |

Troubleshooting

"Status: Model load failed"
Verify you have sufficient disk space and RAM for the selected model. Run with --debug for detailed error logs.

No text appears after transcription
Confirm that Accessibility permission is granted in System Settings > Privacy & Security > Accessibility. Restart Calliope after granting.

Wrong microphone selected
Run calliope setup to choose a different input device, or set the device index in the config file. Use python -m sounddevice to list available devices.

Hotkeys not responding
Ensure no other application is capturing the same key combination. Reconfigure hotkeys via calliope setup.
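
For the wrong-microphone case above, the device list can also be inspected from Python. This sketch assumes the device index in the config file corresponds to the index reported by the sounddevice library, which the README implies but does not state outright. `input_devices` and `list_microphones` are hypothetical helpers; `sounddevice.query_devices()` is the library's own call and returns entries with `name` and `max_input_channels` fields.

```python
def input_devices(devices):
    """Return (index, name) pairs for devices that can capture audio."""
    return [
        (index, dev["name"])
        for index, dev in enumerate(devices)
        if dev["max_input_channels"] > 0
    ]


def list_microphones():
    """Query the real hardware; requires the sounddevice package."""
    # Imported lazily so input_devices() works without sounddevice installed.
    import sounddevice as sd

    return input_devices(sd.query_devices())


# Made-up device entries, shaped like sounddevice's output, for demonstration.
sample = [
    {"name": "Example Speakers", "max_input_channels": 0},
    {"name": "Example Microphone", "max_input_channels": 1},
]
print(input_devices(sample))
```

On a real machine, calling `list_microphones()` and picking one of the returned indices for the `device` setting (or the `--device` flag) should select that input, provided the indexing assumption holds.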

License

TBD