Voice Output¶

Skippy can speak responses aloud using text-to-speech (TTS). Enable this for a more conversational experience.

Quick Start¶

1. Click the 🔇 button (top-right) to enable voice output.
2. The button changes to 🔊 (green).
3. Skippy now speaks responses aloud.
4. Click the button again to disable voice output.

TTS Providers¶

Edge TTS (Recommended)¶

Microsoft's neural voices: natural-sounding and free.

Feature	Details
Quality	Excellent, neural voices
Speed	Fast
Cost	Free
Requirements	Internet connection

Available Voices:

Voice ID	Description
en-GB-RyanNeural	Ryan (UK Male) - Default
en-US-GuyNeural	Guy (US Male)
en-US-JennyNeural	Jenny (US Female)
en-US-AriaNeural	Aria (US Female, Conversational)
en-US-DavisNeural	Davis (US Male)
en-US-JaneNeural	Jane (US Female)
en-GB-SoniaNeural	Sonia (UK Female)
en-AU-WilliamNeural	William (AU Male)
en-AU-NatashaNeural	Natasha (AU Female)

pyttsx3 (Fallback)¶

Windows SAPI voices: they work offline but sound less natural.

Feature	Details
Quality	Robotic, old-style
Speed	Instant
Cost	Free
Requirements	None (Windows built-in)

Selecting a Voice¶

1. Right-click the tray icon.
2. Open the 🎤 Voice submenu.
3. Click 🗣️ Select Voice.
4. Choose one of the available voices.

Via Config¶

Edit config.json:

{
    "tts_provider": "edge",
    "edge_voice": "en-GB-RyanNeural",
    "edge_rate": "+0%"
}

Voice Controls¶

Toggle TTS¶

Method	Action
🔇/🔊 Button	Click to toggle
Tray Menu	Voice → 🔊 Speak Responses

Visual Feedback¶

State	Button Display
Off	🔇 (gray)
On	🔊 (green)
Speaking	🔈🔉🔊 (animated)

Speech Rate¶

Adjust how fast Skippy speaks:

Edge TTS Rate¶

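Set edge_rate to a signed percentage relative to normal speed: "+20%" is 20% faster, "-10%" is 10% slower, and "+0%" is normal speed. Use only one value at a time, and note that JSON does not allow comments.

{
    "edge_rate": "+20%"
}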
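The rate string follows a simple sign, digits, percent shape, so it can be built and checked in code. The sketch below is illustrative only: `format_edge_rate` and `is_valid_edge_rate` are hypothetical helper names, not part of Skippy.

```python
import re

# Hypothetical helpers for illustration; these names are not part of Skippy.
_RATE_PATTERN = re.compile(r"^[+-]\d+%$")


def format_edge_rate(percent: int) -> str:
    """Turn a signed integer such as 20 or -10 into "+20%" or "-10%"."""
    return f"{percent:+d}%"


def is_valid_edge_rate(value: str) -> bool:
    """Check that a string has the sign, digits, percent shape shown above."""
    return bool(_RATE_PATTERN.match(value))


print(format_edge_rate(20))   # +20%
print(format_edge_rate(-10))  # -10%
print(format_edge_rate(0))    # +0%
```

Validating the string before writing it to config.json catches a common slip: leaving out the leading sign, as in "20%".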

pyttsx3 Rate¶

Set tts_rate in words per minute. The default is 175, and typical values range from 125 to 200.

{
    "tts_rate": 175
}

Text Processing¶

Before speaking, Skippy cleans the text:

What's Removed¶

Code blocks (```code```)
Inline code markers (`code`)
Markdown formatting (bold, italic)
Headers (# ## ###)
Link URLs (only the link text is kept)
Bullet points
Excessive whitespace
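A cleanup pass like this is commonly written as a chain of regular-expression substitutions. The sketch below is a minimal illustration, not Skippy's actual implementation: `clean_for_speech` is a hypothetical name, and replacing fenced code with the words "code block" is an assumption based on the spoken example in this section. It does not try to insert sentence punctuation.

```python
import re


def clean_for_speech(text: str) -> str:
    """Strip Markdown so only speakable words remain (illustrative rules)."""
    # Fenced code blocks: swap for a spoken placeholder (an assumption).
    text = re.sub(r"```.*?```", " code block ", text, flags=re.DOTALL)
    # Inline code: keep the content, drop the backticks.
    text = re.sub(r"`([^`]*)`", r"\1", text)
    # Links: keep the visible text, drop the URL.
    text = re.sub(r"\[([^\]]+)\]\([^)]*\)", r"\1", text)
    # Headers: drop leading '#' markers.
    text = re.sub(r"^\s*#{1,6}\s*", "", text, flags=re.MULTILINE)
    # Bullet and numbered-list markers at the start of a line.
    text = re.sub(r"^\s*(?:[-*+]|\d+\.)\s+", "", text, flags=re.MULTILINE)
    # Bold and italic asterisks: keep the wrapped words.
    text = re.sub(r"(\*\*|\*)(.+?)\1", r"\2", text)
    # Collapse runs of whitespace, including newlines, into single spaces.
    return re.sub(r"\s+", " ", text).strip()


sample = (
    "# Title\n\n"
    "Here's how to **fix** it:\n\n"
    "- Run `pip install package`\n"
    "- See [the docs](https://example.com)\n\n"
    "```python\nprint('hi')\n```\n"
)
print(clean_for_speech(sample))
```

The order matters: fenced blocks are handled first so that their contents are never mistaken for inline code, list markers, or emphasis.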

Example¶

Original Response:

Here's how to **fix** the issue:

1. Run `pip install package`
2. Check the `config.json` file

```python
print("Hello")
```

**Spoken:**
> "Here's how to fix the issue. Run pip install package. Check the config.json file. code block"

---

## Audio Playback

### How It Works

1. Text split into sentence chunks (streaming)
2. Edge TTS generates MP3 for each chunk
3. MP3 decoded to PCM audio (via miniaudio/pydub)
4. `sounddevice` plays audio directly (no external player)
5. Chunks play sequentially while next chunk generates
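
Step 1 can be approximated with a small splitter. This is a sketch under a simplifying assumption: a sentence ends at ".", "!" or "?" followed by whitespace. `split_into_sentences` is a hypothetical name, and Skippy's real chunking rules may differ, for example around abbreviations.

```python
import re


def split_into_sentences(text: str) -> list[str]:
    """Split after ., ! or ? when followed by whitespace; keep the punctuation."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    # Drop the empty string that re.split returns for empty input.
    return [part for part in parts if part]


print(split_into_sentences("Hello there. How are you? Fine!"))
# ['Hello there.', 'How are you?', 'Fine!']
```

Small chunks keep the delay before the first audible word short, because only the first sentence has to be synthesized before playback can begin.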

### Streaming Architecture

    Text Chunk → Edge TTS → MP3 → PCM → sounddevice → Speaker
         ↓
    Next chunk generates in parallel

This allows audio to start playing while the response is still being generated.
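
The overlap between generation and playback is a producer/consumer pattern. The sketch below shows that pattern with `asyncio` alone, under stated assumptions: `fake_synthesize` and `fake_play` are hypothetical stand-ins for the Edge TTS request and the `sounddevice` playback, and the real components may use separate threads rather than a single event loop.

```python
import asyncio


async def fake_synthesize(chunk: str) -> bytes:
    """Stand-in for TTS generation plus MP3 decoding (hypothetical)."""
    await asyncio.sleep(0.01)      # simulate network latency
    return chunk.encode("utf-8")   # pretend these bytes are PCM audio


async def fake_play(audio: bytes, played: list[str]) -> None:
    """Stand-in for audio playback (hypothetical)."""
    await asyncio.sleep(0.01)      # simulate playback time
    played.append(audio.decode("utf-8"))


async def producer(chunks: list[str], queue: asyncio.Queue) -> None:
    """Generate audio for each chunk and hand it to the player."""
    for chunk in chunks:
        await queue.put(await fake_synthesize(chunk))
    await queue.put(None)          # sentinel: no more audio


async def consumer(queue: asyncio.Queue, played: list[str]) -> None:
    """Play queued audio in order until the sentinel arrives."""
    while (audio := await queue.get()) is not None:
        await fake_play(audio, played)


async def speak(chunks: list[str]) -> list[str]:
    played: list[str] = []
    queue: asyncio.Queue = asyncio.Queue(maxsize=2)  # bounded look-ahead
    await asyncio.gather(producer(chunks, queue), consumer(queue, played))
    return played


print(asyncio.run(speak(["First sentence.", "Second sentence.", "Third."])))
# ['First sentence.', 'Second sentence.', 'Third.']
```

The bounded queue keeps generation only a couple of chunks ahead of playback, which limits memory use and wasted work if speech is interrupted.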

### Audio Components

| Component | Purpose |
|-----------|---------|
| `UnifiedAudioManager` | Coordinates all audio |
| `AudioPlayer` | Single-threaded playback via sounddevice |
| `TTSGenerator` | Persistent async Edge TTS loop |
| `SoundEffectCache` | Pre-generated effects (DING!, WHIR!) |

---

## Configuration Reference

### config.json Settings

```json
{
    "voice_output_enabled": false,
    "tts_provider": "edge",
    "edge_voice": "en-US-GuyNeural",
    "edge_rate": "+0%",
    "tts_rate": 175,
    "tts_voice": null
}
```

Setting	Description	Default
`voice_output_enabled`	Enable TTS	false
`tts_provider`	"edge" or "pyttsx3"	"edge"
`edge_voice`	Edge TTS voice ID	"en-US-GuyNeural"
`edge_rate`	Speed adjustment	"+0%"
`tts_rate`	pyttsx3 words/min	175
`tts_voice`	pyttsx3 voice name	null (system default)
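
One way to consume these settings is to overlay the user's file on the documented defaults. The sketch below assumes config.json sits in the working directory and that unknown keys can be ignored. `load_voice_config` and `DEFAULTS` are hypothetical names, and the default values are copied from the table above.

```python
import json
from pathlib import Path

# Defaults copied from the settings table above.
DEFAULTS = {
    "voice_output_enabled": False,
    "tts_provider": "edge",
    "edge_voice": "en-US-GuyNeural",
    "edge_rate": "+0%",
    "tts_rate": 175,
    "tts_voice": None,
}


def load_voice_config(path: str = "config.json") -> dict:
    """Return the defaults overlaid with whatever the file provides."""
    config = dict(DEFAULTS)
    file = Path(path)
    if file.exists():
        user = json.loads(file.read_text(encoding="utf-8"))
        # Only voice-related keys are copied; other keys are ignored here.
        config.update({k: v for k, v in user.items() if k in DEFAULTS})
    if config["tts_provider"] not in ("edge", "pyttsx3"):
        raise ValueError(f"Unknown tts_provider: {config['tts_provider']!r}")
    return config
```

A missing file simply yields the defaults, so a partial config.json containing only the keys you want to change is enough.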

Troubleshooting¶

No Sound¶

Check volume - System volume and app volume
Check speaker - Ensure output device is correct
Check TTS enabled - Button should show 🔊

Garbled/Distorted Audio¶

Try a different voice
Update audio drivers
Check for conflicting audio apps

Edge TTS Fails¶

Error: Network-related issues

Fix:

Check your internet connection; Edge TTS cannot generate speech offline.

Fallback (switch to the offline pyttsx3 provider in config.json):

{
    "tts_provider": "pyttsx3"
}
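
The same fallback can be expressed in code. The sketch below only illustrates the pattern: this page does not say whether Skippy switches providers automatically, `speak_with_fallback` is a hypothetical name, and the two backends are passed in as plain callables so the example runs without a network or audio device.

```python
from typing import Callable


def speak_with_fallback(
    text: str,
    primary: Callable[[str], None],
    fallback: Callable[[str], None],
) -> str:
    """Try the primary TTS backend; on any error, use the fallback."""
    try:
        primary(text)
        return "primary"
    except Exception as exc:  # network errors would surface here
        print(f"Primary TTS failed ({exc}); using fallback")
        fallback(text)
        return "fallback"


# Stand-ins for the real backends (hypothetical).
def broken_edge(text: str) -> None:
    raise ConnectionError("no network")


spoken: list[str] = []
result = speak_with_fallback("Hello", broken_edge, spoken.append)
print(result)  # fallback
```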

Voice Not Changing¶

After you change the voice in the menu:

The selection is saved to config.json
The next response uses the new voice
To apply it immediately, restart Skippy

Advanced Usage¶

All Edge TTS Voices¶

Get the complete list:

python -c "import asyncio; import edge_tts; print(asyncio.run(edge_tts.list_voices()))"

Filter by language:

import asyncio
import edge_tts

async def list_english():
    voices = await edge_tts.list_voices()
    for v in voices:
        if v['Locale'].startswith('en'):
            print(f"{v['ShortName']}: {v['FriendlyName']}")

asyncio.run(list_english())

Custom pyttsx3 Voice¶

List available voices:

import pyttsx3
engine = pyttsx3.init()
for voice in engine.getProperty('voices'):
    print(f"{voice.name}: {voice.id}")

Set in config:

{
    "tts_provider": "pyttsx3",
    "tts_voice": "Microsoft David Desktop"
}
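
To show how a configured name could map onto an installed voice, here is a minimal sketch. It assumes a case-insensitive substring match on the voice name, and Skippy's actual matching rule is not documented here. `find_voice_id`, `apply_voice` and `demo` are hypothetical names; the `getProperty` and `setProperty` calls are the standard pyttsx3 engine methods.

```python
def find_voice_id(voices, wanted_name: str):
    """Return the id of the first voice whose name contains wanted_name."""
    wanted = wanted_name.lower()
    for voice in voices:
        if wanted in voice.name.lower():
            return voice.id
    return None


def apply_voice(engine, wanted_name: str, rate: int = 175):
    """Select a voice by name if it exists, then set the speech rate."""
    voice_id = find_voice_id(engine.getProperty("voices"), wanted_name)
    if voice_id is not None:
        engine.setProperty("voice", voice_id)
    engine.setProperty("rate", rate)  # words per minute
    return voice_id


def demo() -> None:
    """Run on a machine with pyttsx3 installed; not called automatically."""
    import pyttsx3

    engine = pyttsx3.init()
    apply_voice(engine, "Microsoft David Desktop")
    engine.say("Voice check")
    engine.runAndWait()
```

If no installed voice matches, `apply_voice` returns `None` and leaves the system default voice in place, which mirrors the `null` default for `tts_voice`.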

Best Practices¶

Voice Selection

Use Guy or Ryan for authoritative responses
Use Jenny or Aria for conversational tone
Match accent to your preference (US/UK/AU)

When to Use TTS

Hands-free computing
While doing other tasks
For accessibility needs

When to Disable

In quiet environments
When reading code/technical content
During meetings/calls