Speech-to-Text
Dictate messages to AI agents using voice input.
Overview
OpenCode Manager supports two STT providers:
- Built-in Browser - Uses your browser's Web Speech API
- External API - OpenAI-compatible STT endpoints (Whisper)
Built-in Browser STT
Uses your browser's built-in speech recognition via the Web Speech API.
Advantages
- No API key required
- No STT server or endpoint to configure (some browsers, such as Chrome, send audio to their own recognition service)
- Free to use
- Interim results while speaking
Limitations
- Voice recognition quality varies by browser/OS
- Not supported in all browsers
- Language support depends on browser
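Because support differs between browsers, a page can check for the Web Speech API before offering the built-in provider. The following is a minimal sketch of such a check, not OpenCode Manager's actual code. The names `RecognizerLike`, `findSpeechRecognition`, and `isBrowserSttSupported` are hypothetical; `SpeechRecognition` and `webkitSpeechRecognition` are the standard and prefixed global names that browsers expose.

```typescript
// Minimal shape of the recognizer this sketch relies on (hypothetical type name).
interface RecognizerLike {
  lang: string;
  interimResults: boolean;
  continuous: boolean;
  start(): void;
  stop(): void;
  abort(): void;
}

type RecognizerCtor = new () => RecognizerLike;

// Returns the recognition constructor if the given global scope exposes one.
// The unprefixed name is checked first, then the webkit-prefixed name.
function findSpeechRecognition(
  scope: Record<string, unknown>
): RecognizerCtor | undefined {
  const ctor = scope["SpeechRecognition"] ?? scope["webkitSpeechRecognition"];
  return typeof ctor === "function" ? (ctor as RecognizerCtor) : undefined;
}

// True when built-in browser STT can be offered in this environment.
function isBrowserSttSupported(scope: Record<string, unknown>): boolean {
  return findSpeechRecognition(scope) !== undefined;
}
```

In a browser, the scope argument would be `window as unknown as Record<string, unknown>`. Passing the scope in explicitly keeps the check usable outside a browser, for example in unit tests.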
Setup
- Go to Settings > Voice
- Under Speech-to-Text, select Built-in Browser provider
- Optionally configure language
- Click the microphone button in the chat input to test
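For readers curious how the language setting and interim results map onto the Web Speech API, here is a rough sketch. It assumes the recognizer's standard `lang` and `interimResults` properties and the standard result list shape (`isFinal` on each result, `transcript` on its first alternative). The helper names `configureRecognizer` and `splitTranscript` are hypothetical and do not come from OpenCode Manager.

```typescript
// The two recognizer properties this sketch sets.
interface ConfigurableRecognizer {
  lang: string;
  interimResults: boolean;
}

// Applies an optional language (a BCP 47 tag such as "en-US") and turns on
// interim results so text can be shown while the user is still speaking.
function configureRecognizer<T extends ConfigurableRecognizer>(
  recognizer: T,
  language?: string
): T {
  if (language) recognizer.lang = language;
  recognizer.interimResults = true;
  return recognizer;
}

// Structural stand-ins for the result objects delivered with a "result" event.
interface AlternativeLike { transcript: string }
interface ResultLike { isFinal: boolean; length: number; 0: AlternativeLike }
interface ResultListLike { length: number; [index: number]: ResultLike }

// Separates finalized text from text that may still change.
function splitTranscript(
  results: ResultListLike
): { finalText: string; interimText: string } {
  let finalText = "";
  let interimText = "";
  for (let i = 0; i < results.length; i++) {
    const result = results[i];
    if (!result) continue;
    const text = result[0].transcript;
    if (result.isFinal) finalText += text;
    else interimText += text;
  }
  return { finalText, interimText };
}
```

In a browser, `splitTranscript(event.results)` would be called from the recognizer's `onresult` handler. The interim text can be shown dimmed until it is finalized.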
External API STT
Connect to an OpenAI-compatible STT endpoint for higher-accuracy transcription than the built-in browser provider typically offers.
Advantages
- Higher-accuracy transcription
- Consistent across devices
- Supports many languages
Limitations
- Requires API key and endpoint
- Requires network connection
- Audio is sent to a server for processing
Setup
- Go to Settings > Voice
- Under Speech-to-Text, select External API provider
- Enter the STT Server URL:
  - OpenAI: https://api.openai.com
- Enter your API Key
- Wait for model discovery
- Choose a model (e.g., whisper-1)
- Optionally set language
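The model discovery step can be pictured as a request to the endpoint's model list. The sketch below assumes the server exposes the OpenAI-style `GET /v1/models` route under the configured server URL and accepts a bearer token. How OpenCode Manager actually performs discovery is not documented here, and the names `modelsUrl`, `parseModelIds`, and `discoverModels` are hypothetical.

```typescript
// Minimal fetch signature so the sketch does not depend on a global fetch.
type FetchLike = (
  url: string,
  init: { headers: Record<string, string> }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Assumption: the model list lives at <server URL>/v1/models.
function modelsUrl(serverUrl: string): string {
  return serverUrl.replace(/\/+$/, "") + "/v1/models";
}

// Pulls string ids out of an OpenAI-style { data: [{ id: "..." }] } body.
// Anything that does not match that shape is ignored.
function parseModelIds(body: unknown): string[] {
  const data = (body as { data?: unknown } | null)?.data;
  if (!Array.isArray(data)) return [];
  return data
    .map((entry: unknown) => (entry as { id?: unknown } | null)?.id)
    .filter((id): id is string => typeof id === "string");
}

// Fetches the model list and returns the model ids it contains.
async function discoverModels(
  serverUrl: string,
  apiKey: string,
  fetchFn: FetchLike
): Promise<string[]> {
  const res = await fetchFn(modelsUrl(serverUrl), {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!res.ok) throw new Error(`Model discovery failed with HTTP ${res.status}`);
  return parseModelIds(await res.json());
}
```

Calling `discoverModels("https://api.openai.com", apiKey, fetch)` returns every model id the key can see. Narrowing that list to transcription models such as whisper-1 is left to the caller.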
Compatible Services
Any OpenAI-compatible transcription API works:
- OpenAI Whisper
- Azure OpenAI
- Self-hosted Whisper servers
- Local STT servers with OpenAI-compatible API
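As a rough illustration of what "OpenAI-compatible" means here, the sketch below assembles a transcription request in the shape used by OpenAI's `POST /v1/audio/transcriptions` route: a multipart form with `file`, `model`, and an optional `language` field, answered with a JSON body containing `text`. It assumes the service mirrors that path and bearer-token authentication, which some services do not (for example, they may use a different URL layout or header). The names `buildTranscriptionRequest` and `extractText` are hypothetical.

```typescript
interface TranscriptionRequest {
  url: string;
  headers: Record<string, string>;
  // Text fields of the multipart form. The recorded audio is added
  // separately by the caller under the "file" field.
  fields: Array<[string, string]>;
}

// Builds the URL, headers, and text fields for a transcription request.
function buildTranscriptionRequest(
  serverUrl: string,
  apiKey: string,
  model: string,
  language?: string
): TranscriptionRequest {
  const fields: Array<[string, string]> = [["model", model]];
  if (language) fields.push(["language", language]);
  return {
    url: serverUrl.replace(/\/+$/, "") + "/v1/audio/transcriptions",
    headers: { Authorization: `Bearer ${apiKey}` },
    fields,
  };
}

// Reads the transcript from a { text: "..." } response body.
function extractText(body: unknown): string {
  const text = (body as { text?: unknown } | null)?.text;
  if (typeof text !== "string") {
    throw new Error("Transcription response did not contain a text field");
  }
  return text;
}
```

In a browser, the caller would copy `fields` into a `FormData` object, append the recorded audio as `file`, and send the form with `fetch` using the returned URL and headers.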
Using Voice Input
Recording
- Click the microphone button in the chat input
- Speak your message
- Click the stop button when finished
- Your speech is transcribed into the input field
- Review and send
Recording Overlay
While recording, a visual overlay indicates active recording status.
Aborting
Click the cancel button during recording to discard without transcribing.
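The recording flow described in this section can be modeled as a small state machine. This is only an illustrative sketch; the state and action names are hypothetical and not taken from OpenCode Manager. It shows stopping handing the audio over for transcription, and cancelling returning straight to idle without transcribing.

```typescript
type RecordingState = "idle" | "recording" | "transcribing";
type RecordingAction = "start" | "stop" | "cancel" | "transcribed";

// Returns the next state. Actions that do not apply to the current state
// leave it unchanged.
function nextState(state: RecordingState, action: RecordingAction): RecordingState {
  if (state === "idle" && action === "start") return "recording";
  if (state === "recording" && action === "stop") return "transcribing";
  if (state === "recording" && action === "cancel") return "idle"; // discard audio
  if (state === "transcribing" && action === "transcribed") return "idle";
  return state;
}

// The overlay is visible only while audio is actively being captured.
function isOverlayVisible(state: RecordingState): boolean {
  return state === "recording";
}
```

With the built-in browser provider, the stop and cancel paths correspond naturally to the Web Speech API's `stop()` method, which still returns a result for the audio captured so far, and `abort()` method, which does not.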