VoiceAgent — Voice AI Agents Platform

Introduction

Overview

VoiceAgents is a multi-tenant voice agents platform that lets you deploy AI-powered phone agents in minutes. Configure your agent once, assign a phone number, and start handling inbound and outbound calls — no telephony expertise required.

Every agent runs a real-time STT → LLM → TTS pipeline: speech from the caller is transcribed, sent to your chosen language model, and the response is synthesized back as natural voice — all within milliseconds.

Developer-first: Everything you can do in the dashboard is also available via the REST API. Build workflows, trigger calls, and retrieve transcripts programmatically using your API key.

Key Features

Multi-provider AI

Mix and match STT, LLM, and TTS providers per agent. OpenAI, Anthropic, ElevenLabs, Deepgram, and more.

PSTN / Phone Calls

Real phone calls via Twilio. Inbound and outbound — callers use their regular phone, no app required.

Real-time Transcripts

Full conversation transcripts saved per call. Searchable, exportable, accessible via API.

Call Recordings

Optional call recording with presigned S3 URLs for secure playback and archival.

Knowledge Bases

Attach PDFs and URLs to an agent. It retrieves relevant context before responding.

Developer API

REST API with API key auth. Manage agents and trigger calls from your own application.

Webhooks

Receive events (call started, completed, transcript ready) at your own endpoint.

Multi-tenancy

Full tenant isolation. Every agent, call, and key is scoped to your account.

Architecture

Each call runs through three stages in real-time:

Caller speaks→STT transcribes→LLM generates reply→TTS synthesizes voice→Caller hears response

Layer	Role	Supported Providers
STT	Transcribes caller audio to text	Deepgram, AssemblyAI, OpenAI Whisper, Spitch
LLM	Generates agent response	OpenAI, Anthropic, Google
TTS	Synthesizes text to voice	ElevenLabs, Deepgram, OpenAI TTS, Cartesia, Spitch
Telephony	PSTN bridging	Twilio, Africa's Talking

Quick Start

Create an Account

1Go to the sign-up page and enter your name, email, and a strong password (min 6 chars, one uppercase, one number, one special character).
2After sign-up, you'll be prompted to create your organisation. Enter your company name — this sets up your tenant.
3You'll land on the Dashboard. A free trial credit is applied automatically so you can start making calls right away.

Create Your First Agent

1Navigate to Agents in the sidebar and click New Agent.
2Enter an agent name and write a system prompt describing the agent's role and behaviour.
3Select your STT, LLM, and TTS providers (see for details).
4Click Save. Your agent is ready to handle calls.

Tip: Use Auto-Build to generate your system prompt and greeting automatically from a plain-English description of your agent's goal.

Get an API Key

API keys let your applications interact with VoiceAgents programmatically.

1Go to Developer → API Keys in the sidebar.
2Click Create New Key, give it a descriptive name (e.g. "Production App").
3Copy the key immediately — it starts with eq_live_ and is shown only once.

Make Your First API Call

Initiate an outbound call using your API key:

bash

curl -X POST https://your-api-url.com/v1/calls/outbound \
  -H "x-api-key: eq_live_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "agentId": "your-agent-id",
    "toNumber": "+15551234567",
    "fromNumber": "+15559876543"
  }'

Response:

json

{
  "callId": "550e8400-e29b-41d4-a716-446655440000",
  "status": "initiated"
}

Use the callId to poll for status, retrieve the transcript, or fetch the recording.

Agent Setup

Speech-to-Text (STT)

The STT provider transcribes the caller's audio in real time. Choose the provider and model that best fits your language, accuracy, and latency requirements.

Provider	Recommended Model	Best For
Deepgram	nova-3	Low latency, English, most use cases
AssemblyAI	best	High accuracy, async transcription
OpenAI Whisper	whisper-1	Multilingual, 90+ languages
Spitch	—	African languages (Swahili, Yoruba, etc.)

Deepgram Options

Setting	Default	Description
model	nova-3	STT model (nova-3, nova-2, enhanced)
language	en	BCP-47 language code
detect_language	false	Auto-detect language per utterance
smart_format	true	Numbers, punctuation formatting
smart_endpointing	true	Intelligent sentence-end detection
endpointing_min_words	2	Min words before endpoint fires (1–10)

Language Model (LLM)

The LLM generates the agent's responses based on the system prompt, conversation history, and any retrieved knowledge base context.

Provider	Models	Notes
OpenAI	gpt-4.1-mini, gpt-4o, gpt-4.1	Best balance of speed and quality
Anthropic	claude-sonnet-4-5, claude-3-5-haiku	Strong reasoning, longer context
Google	gemini-2.0-flash, gemini-1.5-pro	Fast, multimodal capable

Parameters

Parameter	Range	Recommended	Description
temperature	0 – 2	0.3 – 0.7	Higher = more creative / unpredictable. Keep low for support agents.
max_tokens	0 – 4096	200 – 400	Max tokens per response. Keep short for voice — 200 ≈ 30 spoken seconds.
top_p	0 – 1	—	Nucleus sampling. Alternative to temperature (OpenAI only).
frequency_penalty	-2 – 2	—	Reduce word repetition (OpenAI only).

Voice tip: Keep max_tokens between 150–400 for voice. Longer responses feel unnatural on a phone call. The agent speaks at roughly 150 words/min — 300 tokens ≈ 30 seconds of speech.

Text-to-Speech (TTS)

The TTS provider converts the LLM's text response into speech that the caller hears.

Provider	Key Setting	Best For
ElevenLabs	voice_id (required)	Most natural voice, wide voice library
OpenAI TTS	voice (alloy/nova/shimmer/echo/onyx/fable)	Fast, consistent, good quality
Deepgram	model (aura-asteria-en, aura-*)	Low latency streaming
Cartesia	voice_id + model	Emotion control, ultra-low latency
Spitch	language + voice	African languages support

Common Settings

Setting	Range	Description
speed	0.25 – 4.0	Playback speed multiplier. 1.0 = normal. 1.1–1.2 recommended for efficiency.
stability	0 – 1	ElevenLabs: higher = more consistent, less expressive.
similarity_boost	0 – 1	ElevenLabs: how closely to match the original voice.
style	0 – 1	ElevenLabs: speaking style intensity.

Call Settings

These settings control how the agent behaves during a call.

Setting	Default	Description
Greeting Message	—	What the agent says when the call connects. Leave blank to start listening immediately.
Interruption Threshold (ms)	100	How many ms of caller speech before the agent stops talking. Lower = more interruptible.
Silence Timeout (ms)	800	How many ms of silence before the agent considers the caller done speaking.
Max Call Duration (s)	3600	Hard limit on call length. Prevents runaway calls. Max 7200 (2 hrs).
End Call Phrases	goodbye, bye…	If the caller says any of these, the agent hangs up gracefully.
Transfer Number	—	Phone number to transfer to if the agent says a transfer phrase.
Record Calls	true	Save a recording to S3. Accessible via API or dashboard.

Response Rate Presets

Preset	Interruption	Silence Timeout	Use When
Rapid	50ms	500ms	High-energy, fast-paced conversations
Normal	100ms	800ms	General use (default)
Patient	200ms	1200ms	Elderly callers, slow speakers, complex topics

API Reference

Authentication

All /v1/* endpoints require an API key passed in the x-api-key header.

bash

curl https://your-api-url.com/v1/agents \
  -H "x-api-key: eq_live_your_key_here"

API keys follow the format:

eq_live_<64 lowercase hex characters>

Keys are exactly 72 characters long. Any key that does not match this exact format is rejected before any database lookup.

Security: Store API keys in environment variables, never in source code. Keys are shown only once at creation — if lost, create a new key and revoke the old one.

Rate Limits

Rate limits are applied per tenant (all API keys for an account share the same limit).

Plan	Requests / minute	On Exceed
Trial	30	429 Too Many Requests
Starter	100	429 Too Many Requests
Growth	200	429 Too Many Requests
Business / Enterprise	500	429 Too Many Requests

When rate limited, the response includes a Retry-After header and a retryAfter field (seconds) in the body.

Agents

GET/v1/agents

List all active agents for your account.

Query Parameters

Name	Type	Default	Description
limit	integer	50	Max results (max 100)
offset	integer	0	Pagination offset

POST/v1/agents

Create a new agent.

GET/v1/agents/:id

Get a single agent by ID.

PATCH/v1/agents/:id

Update agent configuration. Only include fields you want to change.

DEL/v1/agents/:id

Soft-delete an agent (sets isActive to false).

POST/v1/agents/:id/clone

Duplicate an agent. The copy gets "(Copy)" appended to the name.

Create Agent — Request Body

json

{
  "name": "Support Agent",
  "systemPrompt": "You are a helpful support assistant for Acme Corp...",
  "greetingMessage": "Hello! How can I help you today?",
  "sttConfig": { "provider": "deepgram", "model": "nova-3", "language": "en" },
  "llmConfig": { "provider": "openai", "model": "gpt-4.1-mini", "temperature": 0.5 },
  "ttsConfig": { "provider": "openai-tts", "voice": "nova" },
  "maxCallDurationS": 3600,
  "recordCalls": true
}

Calls

GET/v1/calls

List calls with optional filters.

Filters: direction, status, startDate, endDate, limit, offset

POST/v1/calls/outbound

Initiate an outbound call to a phone number.

GET/v1/calls/:id

Get full call details including status, duration, and recording info.

DEL/v1/calls/:id

End an active call. Only works on calls with status: initiated, ringing, or in_progress.

GET/v1/calls/:id/transcript

Get the full conversation transcript as an ordered array of {role, content, timestampMs}.

GET/v1/calls/:id/recording

Get a presigned S3 URL for the call recording. Valid for 1 hour.

Outbound Call — Request Body

json

{
  "agentId": "550e8400-e29b-41d4-a716-446655440000",
  "toNumber": "+15551234567",
  "fromNumber": "+15559876543",
  "overrides": {
    "systemPrompt": "Optional per-call system prompt override"
  }
}

Call Status Values

initiatedringingin_progresscompletedfailedended

Error Codes

All errors return a consistent JSON body:

json

{
  "error": "not_found",
  "code": "AGENT_NOT_FOUND",
  "message": "Agent xyz not found",
  "statusCode": 404
}

Code	HTTP	When it occurs
`MISSING_API_KEY`	401	No x-api-key header provided
`INVALID_API_KEY`	401	Key not found, wrong format, or bcrypt mismatch
`EXPIRED_API_KEY`	401	Key is past its expiry date
`TENANT_SUSPENDED`	403	Account has been suspended
`RATE_LIMIT_EXCEEDED`	429	Plan rate limit hit — check Retry-After header
`AGENT_NOT_FOUND`	404	Agent ID does not exist or belongs to another tenant
`CALL_NOT_FOUND`	404	Call ID does not exist or belongs to another tenant
`CALL_NOT_ACTIVE`	400	Trying to end a call that is already completed or ended
`CALL_NO_TRANSCRIPT`	404	Transcript not yet available (call in progress or just ended)
`CALL_NO_RECORDING`	404	Recording not available (disabled or not yet processed)
`VALIDATION_ERROR`	400	Request body failed validation — check issues field
`INTERNAL_ERROR`	500	Unexpected server error

Interactive API Explorer

Try all endpoints directly in your browser using the developer docs. Click Authorize, paste your API key, and run requests without writing any code.

Open API Explorer