
LMNT
Your AI agent does more than type responses. With LMNT, it speaks in lifelike, ultra-low-latency voices tailored to your brand. Clone custom voices from short audio samples, synthesize conversational speech in real time, and deliver audio experiences that feel human.




From cloning brand voices to generating speech on the fly, your AI agent taps directly into LMNT's voice engine during live conversations.
Real scenarios where your AI agent uses LMNT to turn text into natural speech, creating audio experiences customers remember.
A visually impaired customer asks about your return policy. Instead of sending a wall of text, your AI agent converts the policy explanation into natural speech using LMNT's synthesis API and delivers it as an audio message. The customer hears the information clearly, and your business becomes more accessible without any manual effort from your team.
Your marketing team wants a consistent brand voice on WhatsApp and your website. Your AI agent uses a cloned brand voice in LMNT to generate personalized audio greetings for each visitor. Every customer hears the same warm, on-brand welcome, and your team never has to record an audio file by hand.
An international shopper browses your catalog and asks to hear a product description. Your AI agent pulls the product details, translates them if needed, and sends the text to LMNT for speech synthesis with the conversational tone enabled. The shopper gets an engaging audio walkthrough that builds confidence and reduces purchase hesitation.

FAQs
The agent sends text and a voice ID to LMNT's synthesize speech endpoint. LMNT returns high-fidelity audio in milliseconds. You can enable conversational mode for natural-sounding dialogue or use standard synthesis for formal content. The audio is delivered directly within the customer conversation.
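As a rough illustration of the request described above, the sketch below assembles the text, a voice ID, and an optional conversational flag into a JSON payload. The function name `build_synthesis_request` and the field names `text`, `voice`, and `conversational` are assumptions made for this example; check LMNT's API reference for the actual parameter names before relying on them.

```python
import json


def build_synthesis_request(text: str, voice_id: str, conversational: bool = False) -> str:
    """Assemble a hypothetical speech-synthesis payload as a JSON string."""
    if not text.strip():
        raise ValueError("text must not be empty")
    if not voice_id:
        raise ValueError("voice_id is required")
    payload = {
        "text": text,                      # the words to be spoken
        "voice": voice_id,                 # assumed field name for the voice ID
        "conversational": conversational,  # assumed flag for conversational mode
    }
    return json.dumps(payload)
```

Validating the inputs before the call means an empty message or a missing voice is caught locally rather than by a failed request in the middle of a customer conversation.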
You can clone a custom voice directly through the agent. Upload a short audio sample, and LMNT creates an instant voice clone. For higher fidelity, provide around five minutes of clean audio. The cloned voice then becomes available in your LMNT account for all future synthesis requests from the agent.
LMNT is built for ultra-low latency, delivering streaming audio fast enough for real-time conversations. The output quality matches professional voice-over standards. For noisy source audio during voice cloning, LMNT offers an enhance option that processes the recording to improve clarity.
Tars does not store the generated audio. LMNT generates audio on the fly, and it is delivered to the customer during the conversation, with no persistent cache of synthesized files kept by Tars. Each request produces fresh output based on the current text and voice selection.
The agent can work with multiple voices. It can list all available voices, filtering by owner type: system voices provided by LMNT, your custom clones, or all voices combined. You can star your favorites for quick access and assign different voices to different conversation types.
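The owner-type filter and starred favorites mentioned above could look something like the following sketch. The owner labels `system`, `me`, and `all`, and the `owner` and `starred` keys on each voice record, are assumptions for illustration, not a confirmed description of LMNT's response format.

```python
def filter_voices(voices, owner="all"):
    """Return voices matching the owner type, with starred favorites first."""
    allowed = {"system", "me", "all"}  # assumed owner labels
    if owner not in allowed:
        raise ValueError(f"owner must be one of {sorted(allowed)}")
    matches = [v for v in voices if owner == "all" or v.get("owner") == owner]
    # sorted() is stable, so voices keep their original order within each group.
    return sorted(matches, key=lambda v: not v.get("starred", False))


voices = [
    {"id": "narrator", "owner": "system"},
    {"id": "brand-voice", "owner": "me", "starred": True},
    {"id": "support-voice", "owner": "me"},
]
custom_ids = [v["id"] for v in filter_voices(voices, owner="me")]
```

Here `custom_ids` contains only the two custom clones, with the starred brand voice listed first.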
When used through Tars, voice synthesis happens contextually during customer conversations. The agent decides when speech is appropriate, selects the right voice, and delivers audio seamlessly. You skip building the orchestration layer and focus on the customer experience instead of API plumbing.
The agent handles rate limit responses gracefully. If LMNT returns a limit error, the agent falls back to text-based responses and retries speech synthesis when capacity is available. Your customers still get answers without interruption.
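The fallback behavior described above can be sketched as a small wrapper that answers in text while the speech provider is rate limited and tries speech again once a cooldown has passed. This is a minimal sketch, not how Tars implements it: `RateLimitError`, `SpeechWithFallback`, the `synthesize` callable, and the 30-second cooldown are all hypothetical names and values chosen for the example.

```python
import time


class RateLimitError(Exception):
    """Hypothetical error raised when the speech provider reports a rate limit."""


class SpeechWithFallback:
    """Answer with audio when possible, and with plain text while rate limited."""

    def __init__(self, synthesize, cooldown_seconds=30.0, clock=time.monotonic):
        self._synthesize = synthesize      # callable: text -> audio bytes (assumed)
        self._cooldown = cooldown_seconds  # assumed wait before retrying speech
        self._clock = clock
        self._retry_at = 0.0               # earliest time speech may be tried again

    def respond(self, text):
        if self._clock() >= self._retry_at:
            try:
                return {"type": "audio", "content": self._synthesize(text)}
            except RateLimitError:
                # Pause speech attempts until the cooldown has elapsed.
                self._retry_at = self._clock() + self._cooldown
        # The customer still gets an answer, just as text.
        return {"type": "text", "content": text}
```

Passing the clock in as a parameter keeps the cooldown logic easy to test without real waiting, and the customer-facing reply never blocks on a retry.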
Voice management is available through the agent as well. It can rename voices, update descriptions, change gender tags, unfreeze voices to upgrade them to the latest LMNT model, and delete voices that are no longer needed. All voice management actions happen through authenticated API calls using your LMNT key.
Don't limit your AI agent to basic conversations. Watch how to configure and add powerful tools that make your agent smarter and more functional.

Privacy & Security
At Tars, we take privacy and security very seriously. We are compliant with GDPR, ISO, SOC 2, and HIPAA.