# Voice Output
Skippy can speak responses aloud using text-to-speech (TTS). Enable this for a more conversational experience.
## Quick Start
- Click the 🔇 button (top-right) to enable voice output
- The button changes to 🔊 (green)
- Skippy now speaks responses aloud
- Click the button again to disable
## TTS Providers
### Edge TTS (Recommended)
Microsoft's neural voices: natural-sounding and free.
| Feature | Details |
|---|---|
| Quality | Excellent, neural voices |
| Speed | Fast |
| Cost | Free |
| Requirements | Internet connection |
Available Voices:
| Voice ID | Description |
|---|---|
| en-GB-RyanNeural | Ryan (UK Male) - Default |
| en-US-GuyNeural | Guy (US Male) |
| en-US-JennyNeural | Jenny (US Female) |
| en-US-AriaNeural | Aria (US Female, Conversational) |
| en-US-DavisNeural | Davis (US Male) |
| en-US-JaneNeural | Jane (US Female) |
| en-GB-SoniaNeural | Sonia (UK Female) |
| en-AU-WilliamNeural | William (AU Male) |
| en-AU-NatashaNeural | Natasha (AU Female) |
### pyttsx3 (Fallback)
Windows SAPI voices: work offline but sound less natural.
| Feature | Details |
|---|---|
| Quality | Robotic, old-style |
| Speed | Instant |
| Cost | Free |
| Requirements | None (Windows built-in) |
## Selecting a Voice
### Via Tray Menu
- Right-click the tray icon
- Go to the 🎤 Voice submenu
- Click 🗣️ Select Voice
- Choose from the available voices
### Via Config
Edit config.json:
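A minimal fragment, assuming the `edge_voice` key from the Configuration Reference selects the Edge TTS voice. The voice ID shown is one of those listed under Available Voices; other keys in `config.json` stay unchanged.

```json
{
  "edge_voice": "en-US-JennyNeural"
}
```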
## Voice Controls
### Toggle TTS
| Method | Action |
|---|---|
| 🔇/🔊 Button | Click to toggle |
| Tray Menu | Voice → 🔊 Speak Responses |
### Visual Feedback
| State | Button Display |
|---|---|
| Off | 🔇 (gray) |
| On | 🔊 (green) |
| Speaking | 🔈🔉🔊 (animated) |
## Speech Rate
Adjust how fast Skippy speaks:
### Edge TTS Rate
Set `edge_rate` to a signed percentage: `"+20%"` is 20% faster, `"-10%"` is 10% slower, and `"+0%"` is normal speed.
```json
{
  "edge_rate": "+20%"
}
```
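If a speed is kept elsewhere as a multiplier (1.0 = normal), it can be converted to the signed-percent form used by `edge_rate`. The following is a minimal sketch; `to_edge_rate` is a hypothetical helper for illustration, not part of Skippy or of the edge-tts package.

```python
def to_edge_rate(multiplier: float) -> str:
    """Convert a speed multiplier (1.0 = normal) to a signed percent string."""
    if multiplier <= 0:
        raise ValueError("multiplier must be positive")
    # Round to the nearest whole percent; the '+d' format always emits a sign.
    percent = round((multiplier - 1.0) * 100)
    return f"{percent:+d}%"


print(to_edge_rate(1.2))  # 20% faster
print(to_edge_rate(0.9))  # 10% slower
```

The explicit sign matters because normal speed is written `"+0%"`, not `"0%"`.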
### pyttsx3 Rate
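A minimal fragment, assuming the `tts_rate` key from the Configuration Reference (words per minute, default 175) controls pyttsx3 speed. The value 200 is only an illustration.

```json
{
  "tts_rate": 200
}
```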
## Text Processing
Before speaking, Skippy cleans the text:
### What's Removed
- Code blocks (fenced with triple backticks)
- Inline code (`code`)
- Markdown formatting (bold, italic)
- Headers (# ## ###)
- Links (`[text](url)`)
- Bullet points
- Excessive whitespace
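The cleanup rules above can be sketched with a few regular expressions. This is an illustrative approximation, not Skippy's actual implementation: `clean_for_speech` is a hypothetical name, and the choices to keep inline-code and link text while replacing fenced blocks with the words "code block" are assumptions based on the example in this section.

```python
import re


def clean_for_speech(text: str) -> str:
    """Strip Markdown so only speakable words remain (illustrative rules)."""
    # Fenced code blocks become the spoken placeholder "code block".
    text = re.sub(r"```.*?```", " code block ", text, flags=re.DOTALL)
    # Inline code: drop the backticks, keep the content.
    text = re.sub(r"`([^`]*)`", r"\1", text)
    # Links: keep the link text, drop the URL.
    text = re.sub(r"\[([^\]]+)\]\([^)]*\)", r"\1", text)
    # Headers: remove leading '#' markers.
    text = re.sub(r"^\s*#{1,6}\s*", "", text, flags=re.MULTILINE)
    # Bullets and numbered-list markers at the start of a line.
    text = re.sub(r"^\s*(?:[-*+]|\d+\.)\s+", "", text, flags=re.MULTILINE)
    # Bold and italic written with asterisks.
    text = re.sub(r"(\*\*|\*)(.+?)\1", r"\2", text)
    # Collapse runs of whitespace, including newlines.
    return re.sub(r"\s+", " ", text).strip()


fence = "`" * 3  # built at runtime so this example contains no literal fence
sample = (
    "## Fix\nHere's how to **fix** it:\n"
    f"- Run `pip install package`\n{fence}python\nprint('hi')\n{fence}"
)
print(clean_for_speech(sample))
```

The order of the substitutions matters: list markers are removed before emphasis so that a leading `*` bullet is not mistaken for the start of italic text.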
### Example
**Original Response:**
````markdown
Here's how to **fix** the issue:
1. Run `pip install package`
2. Check the `config.json` file
```python
print("Hello")
```
````
**Spoken:**
> "Here's how to fix the issue. Run pip install package. Check the config.json file. code block"
---
## Audio Playback
### How It Works
1. Text split into sentence chunks (streaming)
2. Edge TTS generates MP3 for each chunk
3. MP3 decoded to PCM audio (via miniaudio/pydub)
4. `sounddevice` plays audio directly (no external player)
5. Chunks play sequentially while next chunk generates
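The steps above can be sketched as a producer/consumer pipeline. This is a minimal illustration under stated assumptions, not Skippy's code: `synthesize` and `play` are hypothetical stand-ins for Edge TTS generation and `sounddevice` playback, so the example runs without a network connection or an audio device.

```python
from __future__ import annotations

import asyncio
import re


def split_sentences(text: str) -> list[str]:
    """Split text into sentence chunks on '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


async def synthesize(chunk: str) -> bytes:
    """Hypothetical stand-in for TTS generation (would return MP3 bytes)."""
    await asyncio.sleep(0.01)  # simulated network latency
    return chunk.encode("utf-8")


async def play(audio: bytes, played: list[str]) -> None:
    """Hypothetical stand-in for decoding and playback."""
    await asyncio.sleep(0.01)  # simulated playback time
    played.append(audio.decode("utf-8"))


async def speak(text: str) -> list[str]:
    """Generate chunks ahead of playback while playing them in order."""
    queue: asyncio.Queue = asyncio.Queue(maxsize=2)  # small buffer of ready chunks
    played: list[str] = []

    async def producer() -> None:
        for chunk in split_sentences(text):
            await queue.put(await synthesize(chunk))
        await queue.put(None)  # sentinel: no more audio

    async def consumer() -> None:
        while (audio := await queue.get()) is not None:
            await play(audio, played)

    await asyncio.gather(producer(), consumer())
    return played


print(asyncio.run(speak("One. Two. Three.")))
```

A bounded queue keeps generation only slightly ahead of playback, so a long response does not synthesize audio that may never be played.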
### Streaming Architecture
Chunks are generated and played as a pipeline, so audio can start playing while the response is still being generated.
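One way to feed such a pipeline is to emit each sentence as soon as it is complete while the response text is still arriving. The following is a minimal sketch, assuming the response arrives as an iterable of text fragments; `sentences_from_stream` is a hypothetical helper, not a Skippy component.

```python
import re
from typing import Iterable, Iterator

# A sentence ends at '.', '!' or '?' followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences_from_stream(fragments: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as they appear in a fragment stream."""
    buffer = ""
    for fragment in fragments:
        buffer += fragment
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last is a finished sentence.
        for sentence in parts[:-1]:
            yield sentence
        buffer = parts[-1]  # keep the unfinished tail for the next fragment
    if buffer.strip():
        yield buffer.strip()  # flush whatever remains at end of stream


stream = ["Hel", "lo there. ", "How are", " you? I am", " fine."]
for sentence in sentences_from_stream(stream):
    print(sentence)
```

Each yielded sentence can be handed to TTS generation immediately, without waiting for the rest of the response.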
### Audio Components
| Component | Purpose |
|-----------|---------|
| `UnifiedAudioManager` | Coordinates all audio |
| `AudioPlayer` | Single-threaded playback via sounddevice |
| `TTSGenerator` | Persistent async Edge TTS loop |
| `SoundEffectCache` | Pre-generated effects (DING!, WHIR!) |
---
## Configuration Reference
### config.json Settings
```json
{
  "voice_output_enabled": false,
  "tts_provider": "edge",
  "edge_voice": "en-US-GuyNeural",
  "edge_rate": "+0%",
  "tts_rate": 175,
  "tts_voice": null
}
```
| Setting | Description | Default |
|---|---|---|
| `voice_output_enabled` | Enable TTS | `false` |
| `tts_provider` | `"edge"` or `"pyttsx3"` | `"edge"` |
| `edge_voice` | Edge TTS voice ID | `"en-US-GuyNeural"` |
| `edge_rate` | Speed adjustment | `"+0%"` |
| `tts_rate` | pyttsx3 words/min | `175` |
| `tts_voice` | pyttsx3 voice name | `null` (system default) |
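The settings and defaults above can be read with a small loader that falls back to the documented defaults for any missing key. This is a sketch for illustration; `load_voice_settings` and `VOICE_DEFAULTS` are hypothetical names, and the assumption that `config.json` is a flat JSON object comes from the example above rather than from Skippy's source.

```python
import json
from pathlib import Path

# Defaults copied from the settings table above.
VOICE_DEFAULTS = {
    "voice_output_enabled": False,
    "tts_provider": "edge",
    "edge_voice": "en-US-GuyNeural",
    "edge_rate": "+0%",
    "tts_rate": 175,
    "tts_voice": None,
}


def load_voice_settings(path: str = "config.json") -> dict:
    """Return voice settings, using defaults for keys the file omits."""
    settings = dict(VOICE_DEFAULTS)
    config_file = Path(path)
    if config_file.exists():
        user = json.loads(config_file.read_text(encoding="utf-8"))
        # Only voice-related keys are copied; unrelated keys are ignored.
        settings.update({k: v for k, v in user.items() if k in VOICE_DEFAULTS})
    if settings["tts_provider"] not in ("edge", "pyttsx3"):
        raise ValueError(f"unknown tts_provider: {settings['tts_provider']!r}")
    return settings
```

Calling `load_voice_settings()` with no `config.json` present simply returns a copy of the defaults.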
## Troubleshooting
### No Sound
- Check volume - System volume and app volume
- Check speaker - Ensure output device is correct
- Check TTS enabled - Button should show 🔊
### Garbled/Distorted Audio
- Try a different voice
- Update audio drivers
- Check for conflicting audio apps
### Edge TTS Fails
Error: Network-related issues
Fix:
- Check internet connection
- Edge TTS requires connectivity
Fallback:
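If Edge TTS stays unreachable, one option is to switch to the offline provider. A minimal fragment, assuming the `tts_provider` key from the Configuration Reference accepts `"pyttsx3"`:

```json
{
  "tts_provider": "pyttsx3"
}
```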
### Voice Not Changing
After changing the voice in the menu:
- The voice is saved to config
- The next response uses the new voice
- Or restart Skippy to apply it immediately
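## Advanced Usage
### All Edge TTS Voices
To get the complete list, run `edge-tts --list-voices` in a terminal (the command is installed with the `edge-tts` package).
Filter by language: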
```python
import asyncio

import edge_tts


async def list_english():
    """Print the ID and friendly name of every English Edge TTS voice."""
    voices = await edge_tts.list_voices()
    for v in voices:
        if v['Locale'].startswith('en'):
            print(f"{v['ShortName']}: {v['FriendlyName']}")

asyncio.run(list_english())
```
### Custom pyttsx3 Voice
List available voices:
```python
import pyttsx3

# List the voices installed on this system (name and ID).
engine = pyttsx3.init()
for voice in engine.getProperty('voices'):
    print(f"{voice.name}: {voice.id}")
```
Set in config:
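A minimal fragment, assuming `tts_voice` takes a voice name as printed by the listing above. The name shown is illustrative and varies by system.

```json
{
  "tts_provider": "pyttsx3",
  "tts_voice": "Microsoft Zira Desktop - English (United States)"
}
```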
## Best Practices
Voice Selection
- Use Guy or Ryan for authoritative responses
- Use Jenny or Aria for conversational tone
- Match accent to your preference (US/UK/AU)
When to Use TTS
- Hands-free computing
- While doing other tasks
- For accessibility needs
When to Disable
- In quiet environments
- When reading code/technical content
- During meetings/calls