From c73fe3c2c4a31b8fe35de689f2e540ed58c261e0 Mon Sep 17 00:00:00 2001
From: syntaxbullet
Date: Tue, 17 Feb 2026 17:15:08 +0100
Subject: [PATCH] chore: update readme

---
 README.md | 106 ++++++++++++++++++++++++++++++++----------------------
 1 file changed, 63 insertions(+), 43 deletions(-)

diff --git a/README.md b/README.md
index ecd93ed..b4d8968 100644
--- a/README.md
+++ b/README.md
@@ -1,8 +1,21 @@
 # Calliope
 
-Voice-to-text for macOS — speak and type into any app.
+**A locally running voice transcription app built specifically for macOS. Fully MLX accelerated, fully free.**
 
-Calliope sits in your menu bar, listens when you hold a hotkey, transcribes your speech with Whisper, and types the result into whatever app is focused. No cloud, no API keys — everything runs locally on your Mac.
+Calliope lives in your menu bar and turns speech into text in any application. Press a hotkey, speak, and your words appear wherever your cursor is. No cloud services, no API keys, no subscriptions — everything runs on-device using Apple Silicon acceleration.
+
+## Features
+
+- **Menu bar native** — Runs quietly in the macOS menu bar, always one hotkey away
+- **Universal text input** — Types transcribed text directly into any focused application via Quartz events or clipboard paste
+- **On-device transcription** — Powered by OpenAI Whisper models via Hugging Face Transformers, accelerated with MPS on Apple Silicon
+- **LLM post-processing** — Optional grammar and punctuation correction using local MLX language models
+- **Live waveform overlay** — Floating visual feedback showing audio levels during recording and a pulsing indicator during transcription
+- **Dual hotkey modes** — Push-to-talk (hold to record) and toggle (tap to start/stop), both fully configurable
+- **Multi-language support** — Transcribe in English, Spanish, French, German, Japanese, Chinese, Korean, Portuguese, Italian, Dutch, Russian, or auto-detect
+- **Context prompting** — Provide domain-specific vocabulary to improve transcription accuracy for technical or specialized content
+- **Interactive setup wizard** — Rich terminal UI that walks through microphone selection, hotkey configuration, model download, and permission checks on first run
+- **Configurable models** — Choose from multiple Whisper model sizes to balance speed and accuracy, from `whisper-base` to `whisper-large-v3`
 
 ## Installation
 
@@ -12,75 +25,82 @@ cd calliope
 pip install -e .
 ```
 
+### Requirements
+
+- macOS on Apple Silicon (M1 or later)
+- Python 3.10+
+- Accessibility permission (for typing into other apps)
+- Microphone permission (for audio capture)
+
 ## Usage
 
 ```bash
-# First run — launches the setup wizard, then starts the app
-calliope
-
-# Re-run the setup wizard
-calliope setup
-
-# Launch with overrides
-calliope --device 2 --model openai/whisper-large-v3 --debug
-
-# Print version
-calliope --version
+calliope # Launch (runs setup wizard on first run)
+calliope setup # Re-run the setup wizard
+calliope --debug # Launch with verbose logging
+calliope --device 2 --model openai/whisper-large-v3 # Override config for this session
+calliope --version # Print version
 ```
 
 ## Hotkeys
 
-| Action | Default | Description |
-|--------|---------|-------------|
+| Mode | Default | Behavior |
+|------|---------|----------|
 | Push-to-talk | `Ctrl+Shift` (hold) | Records while held, transcribes on release |
-| Toggle | `Ctrl+Space` | Start/stop recording |
+| Toggle | `Ctrl+Space` | Tap to start recording, tap again to stop and transcribe |
 
-Hotkeys are configurable via the setup wizard or `~/.config/calliope/config.yaml`.
-
-## Permissions
-
-Calliope needs two macOS permissions:
-
-- **Accessibility** — to type text into other apps (System Settings > Privacy & Security > Accessibility)
-- **Microphone** — to record audio (System Settings > Privacy & Security > Microphone)
-
-The setup wizard checks for these and can open System Settings for you.
+Hotkeys are fully configurable through the setup wizard or by editing the config file directly.
 
 ## Configuration
 
-Config lives at `~/.config/calliope/config.yaml`:
+All settings are stored at `~/.config/calliope/config.yaml`:
 
 ```yaml
-device: null # sounddevice index; null = system default
+device: null # Microphone index (null = system default)
 model: distil-whisper/distil-large-v3
+language: auto # Language code or "auto" for detection
 hotkeys:
   ptt: ctrl+shift
   toggle: ctrl+space
-context: "" # domain-specific terms to help Whisper
+context: "" # Domain-specific terms to improve accuracy
+typing_mode: char # "char" (keystroke simulation) or "clipboard" (Cmd+V paste)
+typing_delay: 0.005 # Seconds between keystrokes in char mode
+max_recording_seconds: 300 # Maximum recording duration
+silence_threshold: 0.005 # RMS energy below which audio is considered silence
+notifications: true # macOS notification banners
+postprocessing:
+  enabled: false # LLM grammar/punctuation correction
+  model: null # Active MLX model
+  system_prompt: "..." # Custom post-processing instructions
 debug: false
 ```
 
 CLI flags override config values for that session.
 
+## Available Models
+
+| Model | Size | Speed | Accuracy |
+|-------|------|-------|----------|
+| `openai/whisper-base` | ~150 MB | Fastest | Basic |
+| `openai/whisper-small` | ~500 MB | Fast | Good |
+| `openai/whisper-medium` | ~1.5 GB | Moderate | Better |
+| `distil-whisper/distil-large-v3` | ~1.5 GB | Fast | High (default) |
+| `openai/whisper-large-v3` | ~3 GB | Slower | Highest |
+
 ## Troubleshooting
 
 **"Status: Model load failed"**
-Check that you have enough disk space and RAM. The default model needs ~1.5 GB. Run with `--debug` for detailed logs.
+Verify you have sufficient disk space and RAM for the selected model. Run with `--debug` for detailed error logs.
 
-**No text appears after transcribing**
-Make sure Accessibility permission is granted. Restart Calliope after granting it.
+**No text appears after transcription**
+Confirm that Accessibility permission is granted in System Settings > Privacy & Security > Accessibility. Restart Calliope after granting.
 
-**Wrong microphone**
-Run `calliope setup` to pick a different input device, or set `device` in the config file. Use `python -m sounddevice` to list devices.
+**Wrong microphone selected**
+Run `calliope setup` to choose a different input device, or set the `device` index in the config file. Use `python -m sounddevice` to list available devices.
 
-**Hotkeys not working**
-Ensure no other app is capturing the same key combo. Customize hotkeys via `calliope setup`.
+**Hotkeys not responding**
+Ensure no other application is capturing the same key combination. Reconfigure hotkeys via `calliope setup`.
 
-## Remaining TODOs
+## License
 
-- LICENSE file
-- Unit tests
-- CI/CD pipeline
-- Homebrew formula
-- `.app` bundle for drag-and-drop install
-- Changelog
+TBD