chore: update readme
106
README.md
@@ -1,8 +1,21 @@
 # Calliope
 
-Voice-to-text for macOS — speak and type into any app.
+**A locally running voice transcription app built specifically for macOS. Fully MLX accelerated, fully free.**
 
-Calliope sits in your menu bar, listens when you hold a hotkey, transcribes your speech with Whisper, and types the result into whatever app is focused. No cloud, no API keys — everything runs locally on your Mac.
+Calliope lives in your menu bar and turns speech into text in any application. Press a hotkey, speak, and your words appear wherever your cursor is. No cloud services, no API keys, no subscriptions — everything runs on-device using Apple Silicon acceleration.
 
+## Features
+
+- **Menu bar native** — Runs quietly in the macOS menu bar, always one hotkey away
+- **Universal text input** — Types transcribed text directly into any focused application via Quartz events or clipboard paste
+- **On-device transcription** — Powered by OpenAI Whisper models via Hugging Face Transformers, accelerated with MPS on Apple Silicon
+- **LLM post-processing** — Optional grammar and punctuation correction using local MLX language models
+- **Live waveform overlay** — Floating visual feedback showing audio levels during recording and a pulsing indicator during transcription
+- **Dual hotkey modes** — Push-to-talk (hold to record) and toggle (tap to start/stop), both fully configurable
+- **Multi-language support** — Transcribe in English, Spanish, French, German, Japanese, Chinese, Korean, Portuguese, Italian, Dutch, Russian, or auto-detect
+- **Context prompting** — Provide domain-specific vocabulary to improve transcription accuracy for technical or specialized content
+- **Interactive setup wizard** — Rich terminal UI that walks through microphone selection, hotkey configuration, model download, and permission checks on first run
+- **Configurable models** — Choose from multiple Whisper model sizes to balance speed and accuracy, from `whisper-base` to `whisper-large-v3`
+
 ## Installation
 
@@ -12,75 +25,82 @@ cd calliope
 pip install -e .
 ```
 
+### Requirements
+
+- macOS on Apple Silicon (M1 or later)
+- Python 3.10+
+- Accessibility permission (for typing into other apps)
+- Microphone permission (for audio capture)
+
 ## Usage
 
 ```bash
-# First run — launches the setup wizard, then starts the app
-calliope
-
-# Re-run the setup wizard
-calliope setup
-
-# Launch with overrides
-calliope --device 2 --model openai/whisper-large-v3 --debug
-
-# Print version
-calliope --version
+calliope # Launch (runs setup wizard on first run)
+calliope setup # Re-run the setup wizard
+calliope --debug # Launch with verbose logging
+calliope --device 2 --model openai/whisper-large-v3 # Override config for this session
+calliope --version # Print version
 ```
 
 ## Hotkeys
 
-| Action | Default | Description |
-|--------|---------|-------------|
+| Mode | Default | Behavior |
+|------|---------|----------|
 | Push-to-talk | `Ctrl+Shift` (hold) | Records while held, transcribes on release |
-| Toggle | `Ctrl+Space` | Start/stop recording |
+| Toggle | `Ctrl+Space` | Tap to start recording, tap again to stop and transcribe |
 
-Hotkeys are configurable via the setup wizard or `~/.config/calliope/config.yaml`.
+Hotkeys are fully configurable through the setup wizard or by editing the config file directly.
 
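Note on the two hotkey modes in the table above: the difference can be pictured as a small state machine. The sketch below is illustrative only. `HotkeyRecorder` and its method names are hypothetical and are not taken from Calliope's source; the real app presumably wires something similar to keyboard events and an audio recorder.

```python
from dataclasses import dataclass, field


@dataclass
class HotkeyRecorder:
    """Hypothetical model of push-to-talk and toggle recording."""

    recording: bool = False
    events: list[str] = field(default_factory=list)

    def _start(self) -> None:
        self.recording = True
        self.events.append("start")

    def _stop_and_transcribe(self) -> None:
        self.recording = False
        self.events.append("transcribe")

    # Push-to-talk: record while the combo is held, transcribe on release.
    def ptt_down(self) -> None:
        if not self.recording:  # ignore key-repeat while already recording
            self._start()

    def ptt_up(self) -> None:
        if self.recording:
            self._stop_and_transcribe()

    # Toggle: the first tap starts recording, the next tap stops and transcribes.
    def toggle_tap(self) -> None:
        if self.recording:
            self._stop_and_transcribe()
        else:
            self._start()
```

In this sketch both modes share one `recording` flag, so a toggle tap during a push-to-talk hold would end the recording. Whether Calliope behaves that way is not stated in the README.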
-## Permissions
-
-Calliope needs two macOS permissions:
-
-- **Accessibility** — to type text into other apps (System Settings > Privacy & Security > Accessibility)
-- **Microphone** — to record audio (System Settings > Privacy & Security > Microphone)
-
-The setup wizard checks for these and can open System Settings for you.
-
 ## Configuration
 
-Config lives at `~/.config/calliope/config.yaml`:
+All settings are stored at `~/.config/calliope/config.yaml`:
 
 ```yaml
-device: null # sounddevice index; null = system default
+device: null # Microphone index (null = system default)
 model: distil-whisper/distil-large-v3
+language: auto # Language code or "auto" for detection
 hotkeys:
   ptt: ctrl+shift
   toggle: ctrl+space
-context: "" # domain-specific terms to help Whisper
+context: "" # Domain-specific terms to improve accuracy
+typing_mode: char # "char" (keystroke simulation) or "clipboard" (Cmd+V paste)
+typing_delay: 0.005 # Seconds between keystrokes in char mode
+max_recording_seconds: 300 # Maximum recording duration
+silence_threshold: 0.005 # RMS energy below which audio is considered silence
+notifications: true # macOS notification banners
+postprocessing:
+  enabled: false # LLM grammar/punctuation correction
+  model: null # Active MLX model
+  system_prompt: "..." # Custom post-processing instructions
 debug: false
 ```
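Note on `silence_threshold`: the config comment describes it as an RMS energy cutoff. A minimal sketch of such a check follows, assuming samples are floats normalized to the range -1.0 to 1.0. The function names are hypothetical, and whether Calliope applies the check per chunk or to the whole recording is not stated in the README.

```python
import math
from typing import Sequence

SILENCE_THRESHOLD = 0.005  # default value shown in the config file


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square energy of a block of normalized samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_silence(samples: Sequence[float], threshold: float = SILENCE_THRESHOLD) -> bool:
    """True when the block's RMS energy falls below the threshold."""
    return rms(samples) < threshold
```

Raising the threshold treats more quiet audio as silence, which may help in a noisy room; lowering it keeps softer speech.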
 
 CLI flags override config values for that session.
 
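Note on the override rule: one plausible way to get "CLI flags override config values for that session" is to merge only the flags that were actually passed over the loaded config, without writing anything back to disk. The sketch below uses `argparse` with the `--device`, `--model`, and `--debug` flags shown under Usage. The function names are hypothetical, the `setup` subcommand and `--version` are omitted, and this is not taken from Calliope's source.

```python
import argparse
from typing import Any


def parse_overrides(argv: list[str]) -> dict[str, Any]:
    """Return only the options the user explicitly passed."""
    parser = argparse.ArgumentParser(prog="calliope")
    parser.add_argument("--device", type=int, default=None)
    parser.add_argument("--model", default=None)
    # default=None lets us tell "flag absent" apart from "flag set".
    parser.add_argument("--debug", action="store_true", default=None)
    args = parser.parse_args(argv)
    return {key: value for key, value in vars(args).items() if value is not None}


def effective_config(file_config: dict[str, Any], argv: list[str]) -> dict[str, Any]:
    """Build the session config; the file-backed dict is left untouched."""
    # Later keys win, so CLI values replace file values for this run only.
    return {**file_config, **parse_overrides(argv)}
```

For example, `effective_config(cfg, ["--device", "2"])` would change only `device` in the returned dict and leave `cfg` as loaded.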
+## Available Models
+
+| Model | Size | Speed | Accuracy |
+|-------|------|-------|----------|
+| `openai/whisper-base` | ~150 MB | Fastest | Basic |
+| `openai/whisper-small` | ~500 MB | Fast | Good |
+| `openai/whisper-medium` | ~1.5 GB | Moderate | Better |
+| `distil-whisper/distil-large-v3` | ~1.5 GB | Fast | High (default) |
+| `openai/whisper-large-v3` | ~3 GB | Slower | Highest |
+
 ## Troubleshooting
 
 **"Status: Model load failed"**
-Check that you have enough disk space and RAM. The default model needs ~1.5 GB. Run with `--debug` for detailed logs.
+Verify you have sufficient disk space and RAM for the selected model. Run with `--debug` for detailed error logs.
 
-**No text appears after transcribing**
-Make sure Accessibility permission is granted. Restart Calliope after granting it.
+**No text appears after transcription**
+Confirm that Accessibility permission is granted in System Settings > Privacy & Security > Accessibility. Restart Calliope after granting.
 
-**Wrong microphone**
-Run `calliope setup` to pick a different input device, or set `device` in the config file. Use `python -m sounddevice` to list devices.
+**Wrong microphone selected**
+Run `calliope setup` to choose a different input device, or set the `device` index in the config file. Use `python -m sounddevice` to list available devices.
 
-**Hotkeys not working**
-Ensure no other app is capturing the same key combo. Customize hotkeys via `calliope setup`.
+**Hotkeys not responding**
+Ensure no other application is capturing the same key combination. Reconfigure hotkeys via `calliope setup`.
 
-## Remaining TODOs
+## License
 
-- LICENSE file
-- Unit tests
-- CI/CD pipeline
-- Homebrew formula
-- `.app` bundle for drag-and-drop install
-- Changelog
+TBD