Initial commit: Calliope voice-to-text macOS menu bar app

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
syntaxbullet
2026-02-17 15:08:53 +01:00
commit 7cbf2d04a9
15 changed files with 1431 additions and 0 deletions

CLAUDE.md Normal file

@@ -0,0 +1,42 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## What is Calliope?
A macOS menu bar app for local voice-to-text. Users press a hotkey, speak, and the transcribed text is typed into the focused app. Once the model has been downloaded, transcription runs entirely offline using Whisper models via Hugging Face Transformers + PyTorch.
## Setup & Running
```bash
pip install -e . # Install in dev mode
calliope # Launch (runs setup wizard on first run)
calliope setup # Re-run setup wizard
calliope --debug # Launch with debug logging
calliope --device 2 --model openai/whisper-large-v3 # Override config
```
No test suite or linter is configured yet.
## Architecture
**Entry point:** `calliope/cli.py` → Click CLI → `calliope/app.py:main()`
**Data flow:** Hotkey press → Record audio → Transcribe with Whisper → Type into focused app
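The data flow above can be sketched as one function that chains three stages. This is a minimal illustration, not the code in `app.py`: `run_once` and its `record`, `transcribe`, and `type_text` parameters are hypothetical names, and skipping empty audio and stripping whitespace are assumptions of this sketch.

```python
from typing import Callable, Sequence

Audio = Sequence[float]  # stand-in for the recorded samples


def run_once(
    record: Callable[[], Audio],
    transcribe: Callable[[Audio], str],
    type_text: Callable[[str], None],
) -> str:
    """Run one hotkey-triggered cycle: record, transcribe, type.

    All three callables are hypothetical stand-ins for the recorder,
    transcriber, and typer modules.
    """
    audio = record()
    if len(audio) == 0:
        # Nothing was captured, so skip transcription entirely.
        return ""
    text = transcribe(audio).strip()
    if text:
        # Only type when the model produced non-empty text.
        type_text(text)
    return text


if __name__ == "__main__":
    typed = []
    run_once(lambda: [0.0, 0.1], lambda audio: " hello world ", typed.append)
    print(typed)
```

Passing the stages in as callables keeps the sketch independent of the audio, model, and macOS APIs, so each stage can be replaced with a stub when experimenting.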
Key modules in `calliope/`:
- **app.py** — `CalliopeApp(rumps.App)`: main orchestrator, manages menu bar UI and coordinates all components
- **recorder.py** — Audio capture via `sounddevice` at 16kHz mono float32, with chunk consolidation
- **transcriber.py** — Whisper STT using HF `transformers.pipeline("automatic-speech-recognition")`
- **hotkeys.py** — `HotkeyListener` using `pynput`: supports push-to-talk (Ctrl+Shift hold) and toggle (Ctrl+Space) modes
- **typer.py** — Outputs text via Quartz CGEvents (character mode) or clipboard paste (Cmd+V)
- **overlay.py** — `WaveformOverlay`: floating NSPanel with scrolling waveform during recording, pulsing dots during transcription
- **setup_wizard.py** — Rich-based interactive first-run config (mic, hotkeys, model download)
- **config.py** — Loads/saves YAML config at `~/.config/calliope/config.yaml`
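The two hotkey modes listed for `hotkeys.py` differ only in how key events map to start and stop. The sketch below models that as a small state machine, under the assumption that keys arrive as plain strings such as `"ctrl"`; the real `HotkeyListener` receives `pynput` key objects, and the class name `HotkeyStateMachine` and its methods are hypothetical.

```python
from typing import Callable, FrozenSet, Set


class HotkeyStateMachine:
    """Hypothetical model of push-to-talk and toggle hotkey handling."""

    def __init__(
        self,
        on_start: Callable[[], None],
        on_stop: Callable[[], None],
        mode: str = "push_to_talk",
        ptt_combo: FrozenSet[str] = frozenset({"ctrl", "shift"}),
        toggle_combo: FrozenSet[str] = frozenset({"ctrl", "space"}),
    ) -> None:
        self.on_start = on_start
        self.on_stop = on_stop
        self.mode = mode
        self.ptt_combo = ptt_combo
        self.toggle_combo = toggle_combo
        self.recording = False
        self._held: Set[str] = set()

    def press(self, key: str) -> None:
        is_repeat = key in self._held  # OS key repeat sends extra presses
        self._held.add(key)
        if self.mode == "push_to_talk":
            # Start once the whole combo is held down.
            if self.ptt_combo <= self._held and not self.recording:
                self._set_recording(True)
        elif self.mode == "toggle":
            # Flip state when a fresh press completes the combo.
            if (
                not is_repeat
                and key in self.toggle_combo
                and self.toggle_combo <= self._held
            ):
                self._set_recording(not self.recording)

    def release(self, key: str) -> None:
        self._held.discard(key)
        # In push-to-talk, releasing any combo key ends the recording.
        if self.mode == "push_to_talk" and self.recording:
            if not self.ptt_combo <= self._held:
                self._set_recording(False)

    def _set_recording(self, value: bool) -> None:
        self.recording = value
        (self.on_start if value else self.on_stop)()
```

Tracking the set of held keys is what lets the sketch ignore key-repeat events, which would otherwise flip toggle mode on and off while the combo is held.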
## Platform Constraints
- **macOS only** — uses `pyobjc` bindings (Quartz, AppKit, AVFoundation, ApplicationServices)
- **MPS (Apple Silicon):** must use float32; float16 produces garbled Whisper output
- Requires Accessibility and Microphone permissions in macOS System Settings
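The MPS constraint above could be enforced with a small guard that runs before the model is loaded. This is a sketch only: `resolve_dtype_name` is a hypothetical helper, and it works with dtype names as strings so that it does not depend on how `transcriber.py` actually configures the pipeline.

```python
def resolve_dtype_name(device: str, requested: str = "float32") -> str:
    """Return the dtype name to load Whisper with on `device`.

    Hypothetical helper: falls back to float32 on MPS because float16
    produces garbled Whisper output there.
    """
    if device == "mps" and requested == "float16":
        return "float32"
    return requested


if __name__ == "__main__":
    print(resolve_dtype_name("mps", "float16"))
```

The returned name can be turned into a PyTorch dtype object with `getattr(torch, name)` before it is handed to the model loader.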