Speech-to-Text

Dictate messages to AI agents using voice input.

Overview

OpenCode Manager supports two STT providers:

Built-in Browser - Uses your browser's Web Speech API
External API - OpenAI-compatible STT endpoints (Whisper)

Built-in Browser STT

Uses your browser's built-in speech recognition via the Web Speech API.

Advantages

No API key required
No separate STT server or endpoint to configure (some browsers, such as Chrome, still send audio to a cloud service for recognition)
Free to use
Interim results while speaking

Limitations

Voice recognition quality varies by browser/OS
Not supported in all browsers
Language support depends on browser

Setup

Go to Settings > Voice
Under Speech-to-Text, select Built-in Browser provider
Optionally configure language
Click the microphone button in the chat input to test
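
The following is a minimal sketch of how a page can detect Web Speech API support and separate interim from final text. It is not OpenCode Manager's actual code: the helper names `findRecognitionCtor` and `splitTranscript` and the structural types are hypothetical, chosen so the example compiles without DOM typings.

```typescript
// Minimal structural types mirroring the parts of the Web Speech API used here.
interface RecognitionAlternative { transcript: string }
interface RecognitionResult { isFinal: boolean; 0: RecognitionAlternative }
interface RecognitionEventLike {
  resultIndex: number;
  results: ArrayLike<RecognitionResult>;
}

type RecognitionCtor = new () => unknown;

// Returns the recognition constructor if the given global scope exposes one,
// otherwise null. Some browsers only expose the webkit-prefixed name.
function findRecognitionCtor(scope: Record<string, unknown>): RecognitionCtor | null {
  const ctor = scope["SpeechRecognition"] ?? scope["webkitSpeechRecognition"];
  return typeof ctor === "function" ? (ctor as RecognitionCtor) : null;
}

// Splits the results reported by one `result` event into final and interim text.
function splitTranscript(event: RecognitionEventLike): { finalText: string; interimText: string } {
  let finalText = "";
  let interimText = "";
  for (let i = event.resultIndex; i < event.results.length; i++) {
    const result = event.results[i];
    if (result.isFinal) finalText += result[0].transcript;
    else interimText += result[0].transcript;
  }
  return { finalText, interimText };
}
```

In a browser, the constructor would be looked up on `window`, the instance configured with `interimResults = true` and an optional `lang` such as "en-US", and `splitTranscript` called from its `onresult` handler. A `null` return from `findRecognitionCtor` corresponds to the "not supported in all browsers" limitation above.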

External API STT

Connect to an OpenAI-compatible STT endpoint for higher-accuracy transcription.

Advantages

Higher-accuracy transcription
Consistent across devices
Supports many languages

Limitations

Requires API key and endpoint
Requires network connection
Audio is sent to the configured server for processing

Setup

Go to Settings > Voice
Under Speech-to-Text, select External API provider
Enter the STT Server URL:
- OpenAI: https://api.openai.com
Enter your API Key
Wait for model discovery
Choose a model (e.g., whisper-1)
Optionally set language
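
As an illustration of the model discovery step, the sketch below lists models from an OpenAI-style server. It assumes the server exposes the OpenAI `GET /v1/models` endpoint and that the path is appended to the STT Server URL; how OpenCode Manager actually performs discovery is not documented here, and the function names are hypothetical.

```typescript
// Shape of an OpenAI-style model list: { "data": [{ "id": "whisper-1" }, ...] }
interface ModelList { data: { id: string }[] }

// Joins the configured server URL with an API path, tolerating trailing slashes.
function joinUrl(baseUrl: string, path: string): string {
  return baseUrl.replace(/\/+$/, "") + path;
}

// Extracts the model ids from a parsed model-list response, sorted for display.
function modelIds(body: ModelList): string[] {
  return body.data.map((m) => m.id).sort();
}

// Fetches the model list. The "/v1/models" path is an assumption about the server.
async function discoverModels(baseUrl: string, apiKey: string): Promise<string[]> {
  const res = await fetch(joinUrl(baseUrl, "/v1/models"), {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!res.ok) throw new Error(`Model discovery failed: HTTP ${res.status}`);
  return modelIds((await res.json()) as ModelList);
}
```

If discovery returns no models or fails, the usual causes are a wrong server URL or an invalid API key, so those are worth checking first.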

Compatible Services

Any OpenAI-compatible transcription API works:

OpenAI Whisper
Azure OpenAI
Self-hosted Whisper servers
Local STT servers with OpenAI-compatible API
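
For reference, an OpenAI-compatible transcription request can be sketched as follows. The `/v1/audio/transcriptions` path and the `file`, `model`, and `language` form fields follow the OpenAI API; the option names, the `recording.webm` filename, and the assumption that the path is appended to the configured server URL are illustrative only.

```typescript
interface TranscriptionOptions {
  baseUrl: string;   // e.g. "https://api.openai.com"
  apiKey: string;
  model: string;     // e.g. "whisper-1"
  language?: string; // optional ISO 639-1 code such as "en"
}

// Builds the multipart body. The filename is a placeholder for this sketch.
function buildForm(audio: Blob, opts: TranscriptionOptions): FormData {
  const form = new FormData();
  form.append("file", audio, "recording.webm");
  form.append("model", opts.model);
  if (opts.language) form.append("language", opts.language);
  return form;
}

// Sends the audio and returns the transcript from the JSON response's `text` field.
async function transcribe(audio: Blob, opts: TranscriptionOptions): Promise<string> {
  const url = opts.baseUrl.replace(/\/+$/, "") + "/v1/audio/transcriptions";
  const res = await fetch(url, {
    method: "POST",
    // Content-Type is left unset so fetch can add the multipart boundary itself.
    headers: { Authorization: `Bearer ${opts.apiKey}` },
    body: buildForm(audio, opts),
  });
  if (!res.ok) throw new Error(`Transcription failed: HTTP ${res.status}`);
  const body = (await res.json()) as { text: string };
  return body.text;
}
```

A server that accepts this request shape and returns `{ "text": "..." }` should behave like the services listed above. Services that use different paths or authentication headers may need a proxy or compatibility layer.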

Using Voice Input

Tap-to-Start / Tap-to-Stop

Tap the microphone button in the chat input to begin recording
The button shows active recording status
Tap the stop button when you have finished speaking
The transcribed text is inserted into the input field
Review and send

Recording States

State	Indicator	When it appears
Recording	"Recording…"	Microphone is active; audio is being captured
Processing	"Processing…"	Audio sent to STT backend; waiting for transcript (external provider only)
Interim text	Live partial transcript	Browser is streaming partial results in real time (built-in provider only)
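
The states in the table can be modeled as a small state machine. This is a sketch for illustration, not the application's implementation: the type and event names are hypothetical, and it assumes the built-in provider returns to idle as soon as recording stops while the external provider passes through Processing.

```typescript
type VoiceState =
  | { kind: "idle" }
  | { kind: "recording"; interimText: string }
  | { kind: "processing" };

type VoiceEvent =
  | { type: "start" }
  | { type: "interim"; text: string }                 // built-in provider only
  | { type: "stop"; provider: "builtin" | "external" }
  | { type: "transcribed" }                           // external provider only
  | { type: "cancel" };

// Pure transition function: events that do not apply to the current state are ignored.
function nextState(state: VoiceState, event: VoiceEvent): VoiceState {
  switch (event.type) {
    case "start":
      return state.kind === "idle" ? { kind: "recording", interimText: "" } : state;
    case "interim":
      return state.kind === "recording" ? { kind: "recording", interimText: event.text } : state;
    case "stop":
      if (state.kind !== "recording") return state;
      return event.provider === "external" ? { kind: "processing" } : { kind: "idle" };
    case "transcribed":
      return state.kind === "processing" ? { kind: "idle" } : state;
    case "cancel":
      return state.kind === "recording" ? { kind: "idle" } : state;
  }
}
```

Keeping the transitions in one pure function makes it easy to check that, for example, cancelling during recording never reaches the Processing state.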

Cancelling

Tap the cancel (×) button during recording to discard the recording without transcribing.

Errors

If recording fails — microphone permission denied, startup timeout, or transcription error — a brief error message appears and auto-dismisses after 3 seconds. No text is inserted.
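
The auto-dismiss behaviour can be sketched as below. The 3-second delay comes from the description above; everything else (the `createTransientError` helper, the `Timers` interface, and the `setError` callback) is a hypothetical illustration rather than the application's code.

```typescript
const ERROR_DISMISS_MS = 3000;

// Timer functions are injectable so the behaviour can be tested without waiting.
interface Timers {
  set(fn: () => void, ms: number): unknown;
  clear(handle: unknown): void;
}

const realTimers: Timers = {
  set: (fn, ms) => setTimeout(fn, ms),
  clear: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

// Returns a function that shows a message and clears it after ERROR_DISMISS_MS.
// A newer error cancels the pending dismissal of the previous one, so the newer
// message is not cleared early by a stale timer.
function createTransientError(
  setError: (message: string | null) => void,
  timers: Timers = realTimers,
): (message: string) => void {
  let pending: unknown = null;
  return (message: string) => {
    if (pending !== null) timers.clear(pending);
    setError(message);
    pending = timers.set(() => {
      pending = null;
      setError(null);
    }, ERROR_DISMISS_MS);
  };
}
```

A caller would create the function once, for example `const showError = createTransientError(setBannerText)`, and call `showError("Microphone permission denied")` from each failure path.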