Deepgram

Voice AI that listens, transcribes, and speaks for your business

Your AI agent gains ears and a voice. Deepgram transcribes customer call recordings in seconds, summarizes audio content, and converts text to natural-sounding speech. Build voice-enabled support that handles inquiries across 30+ languages without a human on the line.

Chosen by 800+ global brands across industries

Speech intelligence at conversation speed

From transcribing customer calls to generating audio responses, your AI agent taps into Deepgram's Nova models to process voice data during live interactions.

Transcribe Audio Files

A customer submits a voicemail or call recording. Your AI agent sends the audio to Deepgram's pre-recorded Speech-to-Text API with smart formatting, punctuation, and speaker diarization enabled, then returns a clean, readable transcript within seconds.
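
For illustration, here is a minimal sketch of what that request could look like against Deepgram's pre-recorded /v1/listen endpoint, using Python with the requests library. The API key placeholder, audio URL, and response parsing are illustrative assumptions.

```python
import requests

DEEPGRAM_API_KEY = "YOUR_DEEPGRAM_API_KEY"  # placeholder, not a real key

# Pre-recorded transcription: point Deepgram at a hosted audio URL and
# enable smart formatting, punctuation, and speaker diarization.
response = requests.post(
    "https://api.deepgram.com/v1/listen",
    params={
        "model": "nova-3",
        "smart_format": "true",
        "punctuate": "true",
        "diarize": "true",
    },
    headers={
        "Authorization": f"Token {DEEPGRAM_API_KEY}",
        "Content-Type": "application/json",
    },
    json={"url": "https://example.com/voicemail.wav"},  # illustrative URL
)

result = response.json()
transcript = result["results"]["channels"][0]["alternatives"][0]["transcript"]
print(transcript)
```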

Summarize Conversations

A support manager needs a quick recap of a lengthy call. Your agent sends the audio file to Deepgram's Summarize Audio endpoint and delivers a concise summary highlighting key topics, decisions, and action items from the conversation.
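
As a rough sketch, summarization can be requested on the same /v1/listen call by adding the summarize parameter; the audio URL and the exact response field paths shown here are assumptions for illustration.

```python
import requests

DEEPGRAM_API_KEY = "YOUR_DEEPGRAM_API_KEY"  # placeholder

# Request a summary alongside the transcript by adding summarize=v2.
response = requests.post(
    "https://api.deepgram.com/v1/listen",
    params={"model": "nova-3", "summarize": "v2"},
    headers={"Authorization": f"Token {DEEPGRAM_API_KEY}"},
    json={"url": "https://example.com/support-call.mp3"},  # illustrative URL
)

data = response.json()
print(data["results"]["summary"]["short"])  # concise recap of the call
```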

Generate Spoken Responses

Your AI agent composes a reply and needs to deliver it as audio. Using Deepgram's Text-to-Speech REST API, the agent converts text into natural-sounding speech with customizable voice, pitch, and speed settings for phone or messaging channels.
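
A minimal sketch of a TTS request: POST the reply text to /v1/speak and pick an Aura voice. Only the model and output encoding are set here; other voice settings would be passed as additional parameters, and the voice name and reply text are just examples.

```python
import requests

DEEPGRAM_API_KEY = "YOUR_DEEPGRAM_API_KEY"  # placeholder

# Text-to-Speech: send the composed reply and save the returned audio.
response = requests.post(
    "https://api.deepgram.com/v1/speak",
    params={"model": "aura-asteria-en", "encoding": "mp3"},
    headers={
        "Authorization": f"Token {DEEPGRAM_API_KEY}",
        "Content-Type": "application/json",
    },
    json={"text": "Thanks for calling. Your refund was processed today."},
)

with open("reply.mp3", "wb") as f:
    f.write(response.content)  # raw audio bytes come back in the response body
```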

Detect Audio Topics

A recorded customer interaction arrives for categorization. Your agent routes it through Deepgram's Topic Detection API, which transcribes and identifies the main subjects discussed, enabling automatic tagging and routing to the right team.
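
Sketched below using the topics parameter on /v1/listen; the audio URL and the segment/topic response structure are assumptions for illustration.

```python
import requests

DEEPGRAM_API_KEY = "YOUR_DEEPGRAM_API_KEY"  # placeholder

# Enable topic detection so tagging and routing can key off the results.
response = requests.post(
    "https://api.deepgram.com/v1/listen",
    params={"model": "nova-3", "topics": "true"},
    headers={"Authorization": f"Token {DEEPGRAM_API_KEY}"},
    json={"url": "https://example.com/customer-call.wav"},  # illustrative URL
)

data = response.json()
for segment in data["results"]["topics"]["segments"]:
    for topic in segment["topics"]:
        print(topic["topic"], topic["confidence_score"])
```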

List Available Models

Your team wants to know which Deepgram models support a specific language. The agent queries the Get Models endpoint and returns a list of available speech-to-text models with their language support and accuracy tiers.
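
A sketch of that lookup, assuming the GET /v1/models endpoint returns separate stt and tts lists with a languages field per model; the Spanish language-code filter is illustrative.

```python
import requests

DEEPGRAM_API_KEY = "YOUR_DEEPGRAM_API_KEY"  # placeholder

# List available speech-to-text models, then keep those supporting Spanish.
response = requests.get(
    "https://api.deepgram.com/v1/models",
    headers={"Authorization": f"Token {DEEPGRAM_API_KEY}"},
)

models = response.json()
for model in models.get("stt", []):
    if "es" in model.get("languages", []):
        print(model["name"], model.get("languages"))
```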

Monitor Project Usage

An operations lead asks about this month's transcription volume. Your agent retrieves the usage summary for the Deepgram project, reporting total minutes submitted, processed, and billed, filtered by date range or model.
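
Roughly, the agent could call the project usage endpoint with a date range, as in the sketch below; the project ID placeholder, dates, and response field names (usage reported in hours here) are assumptions.

```python
import requests

DEEPGRAM_API_KEY = "YOUR_DEEPGRAM_API_KEY"  # placeholder
PROJECT_ID = "YOUR_PROJECT_ID"              # placeholder

# Usage summary for a date range; model or tag filters can be added
# as extra query parameters to break the numbers down further.
response = requests.get(
    f"https://api.deepgram.com/v1/projects/{PROJECT_ID}/usage",
    params={"start": "2024-06-01", "end": "2024-06-30"},
    headers={"Authorization": f"Token {DEEPGRAM_API_KEY}"},
)

summary = response.json()
for bucket in summary.get("results", []):
    print(bucket["start"], bucket["end"], bucket.get("total_hours"))
```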

Deepgram Use Cases

Voice-first support scenarios

Real examples of how businesses use Deepgram-powered AI agents to convert spoken customer interactions into actionable data and automated responses.

After-Hours Voicemail Transcription and Triage

A customer leaves a voicemail at midnight about a billing dispute. Your AI Agent picks up the recording, sends it through Deepgram's Speech-to-Text API with smart formatting and diarization, and produces a clean transcript. The agent then categorizes the issue, creates a support ticket with the full transcript attached, and prioritizes it for the morning shift. Your team starts the day with organized, actionable voicemail data instead of a cluttered inbox.

Multilingual Call Summaries for Global Teams

A support center handles calls in Spanish, French, and English. After each call, the AI Agent sends the recording to Deepgram's Summarize Audio endpoint with automatic language detection. The agent extracts a concise summary of the customer's issue and resolution, then logs it in your CRM. Managers across regions review standardized summaries without listening to full recordings, saving hours of manual review every week.

Voice Responses for Accessibility-First Experiences

A visually impaired customer interacts with your website chat. Your AI Agent composes the answer and converts it to audio using Deepgram's Text-to-Speech API with the Aura voice model. The customer hears a natural spoken response instead of reading text. Your business delivers inclusive, accessible support that serves every customer, regardless of how they prefer to consume information.

Try Deepgram

Deepgram FAQs

Frequently Asked Questions

Which Deepgram speech models does the AI agent use for transcription?

The agent defaults to Deepgram's Nova-3 model, which delivers a 53% lower word error rate than competing services. You can also configure it to use specialized models like the phonecall model for call center audio or the general model for broader use cases. The model parameter is adjustable per request.

Can the agent transcribe audio with multiple speakers and label who said what?

Yes. The agent enables Deepgram's diarize parameter when processing multi-speaker recordings. Each word in the transcript gets assigned a speaker label, so your team can see exactly who said what. This works with all Deepgram ASR models including Nova and is included at no extra charge.

What audio formats and languages does the transcription support?

Deepgram supports WAV, MP3, FLAC, OGG, and most common audio formats. The content_type parameter specifies the MIME type. Language support covers 30+ languages including English, Spanish, French, German, Portuguese, and Mandarin. Enable detect_language for automatic language identification.
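
For example, a raw-bytes upload might look like the sketch below, where the Content-Type header carries the MIME type and detect_language is enabled; the file name, model choice, and detected_language response field are illustrative assumptions.

```python
import requests

DEEPGRAM_API_KEY = "YOUR_DEEPGRAM_API_KEY"  # placeholder

# Send raw audio bytes: Content-Type declares the MIME type and
# detect_language asks Deepgram to identify the spoken language.
with open("call.mp3", "rb") as f:
    audio_bytes = f.read()

response = requests.post(
    "https://api.deepgram.com/v1/listen",
    params={"model": "nova-2", "detect_language": "true", "smart_format": "true"},
    headers={
        "Authorization": f"Token {DEEPGRAM_API_KEY}",
        "Content-Type": "audio/mp3",
    },
    data=audio_bytes,
)

result = response.json()
print(result["results"]["channels"][0]["detected_language"])
```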

How does Deepgram's Text-to-Speech work within conversations?

The agent calls Deepgram's TTS REST API with the text content, selecting from available Aura voice models. You can customize pitch (0.5 to 2.0), speed (0.25 to 4.0), output format (MP3, WAV), and sample rate. The audio file is generated and delivered to the customer within the conversation thread.

Does Tars store the audio files or transcripts from Deepgram?

Tars does not store your audio files. Audio URLs are sent to Deepgram for processing, and the resulting transcripts or summaries are used within the active conversation. Deepgram processes audio in-memory by default. You control data retention policies through your Deepgram project settings.

What is the topic detection feature and how does it categorize conversations?

Topic Detection transcribes the audio and then identifies the main subjects discussed using Deepgram's audio intelligence layer. The agent receives a list of detected topics with confidence scores. This enables automatic categorization of customer calls, such as tagging billing inquiries, technical issues, or sales conversations without manual review.

How accurate is the transcription for noisy call center recordings?

Deepgram's Nova-3 model is specifically optimized for noisy, real-world audio. You can boost accuracy further using the keywords parameter to prioritize industry-specific terms like product names or technical jargon. The enhanced tier option provides additional accuracy for challenging audio quality.
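
As a rough sketch, keyword boosting repeats the keywords query parameter with an optional boost intensifier per term; the model choice, example terms, and boost values below are illustrative assumptions.

```python
import requests

DEEPGRAM_API_KEY = "YOUR_DEEPGRAM_API_KEY"  # placeholder

# Boost recognition of domain-specific terms on a noisy recording.
response = requests.post(
    "https://api.deepgram.com/v1/listen",
    params=[
        ("model", "nova-2"),
        ("smart_format", "true"),
        ("keywords", "Acme:2"),      # hypothetical product name with boost factor
        ("keywords", "SKU-4471:2"),  # hypothetical jargon term
    ],
    headers={"Authorization": f"Token {DEEPGRAM_API_KEY}"},
    json={"url": "https://example.com/noisy-call.wav"},  # illustrative URL
)

print(response.json()["results"]["channels"][0]["alternatives"][0]["transcript"])
```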

Can I track how many transcription minutes my AI agent consumes?

Yes. The agent can query Deepgram's Usage Summary API to retrieve minutes submitted, processed, and billed for any time range. Filter by model, accessor, or tag to break down usage across different agent workflows. This helps you monitor costs and optimize which audio gets processed.

How to add Tools to your AI Agent

Supercharge your AI Agent with Tool Integrations

Don't limit your AI Agent to basic conversations. Watch how to configure and add powerful tools that make your agent smarter and more functional.

Privacy & Security

We’ll never let you lose sleep over privacy and security concerns

At Tars, we take privacy and security very seriously. We are compliant with GDPR, ISO, SOC 2, and HIPAA.

GDPR
ISO
SOC 2
HIPAA

Still scrolling? We both know you're interested.

Let's chat about AI Agents the old-fashioned way. Get a demo tailored to your requirements.

Schedule a Demo