
Gemini
Your AI agent generates text, creates images, produces videos, and builds semantic search indexes, all powered by Google's Gemini and Veo models. Customers get richer answers with AI-generated visuals. Internal teams produce content from simple prompts. One integration, limitless creative capacity.




From text generation to 4K image creation and Veo 3 video production, your AI agent wields Google Gemini's full model suite to deliver intelligent, media-rich responses.
See how teams use Gemini through AI agents to generate marketing copy, product images, training videos, and intelligent search capabilities without leaving the chat.
A marketing manager messages 'Create a lifestyle image of our wireless headphones on a minimalist desk.' The AI agent sends the prompt to Gemini's image generation model, selects 4K resolution for print quality, and returns a downloadable image within the conversation. The team gets campaign-ready visuals without scheduling a photoshoot or waiting for a design queue.
A social media manager needs a 6-second product teaser video. They describe the scene to the AI agent. The agent generates the video through Veo 3 with portrait aspect ratio, complete with synchronized audio. The manager downloads the clip and posts it to Instagram Reels within minutes. Video production that once took days now takes a single conversation.
A support team wants customers to find answers through semantic search instead of exact keyword matches. The AI agent generates text embeddings for all knowledge base articles using Gemini's embedding model. When a customer asks a question, the agent converts it to an embedding and finds the closest matching article. Relevant answers surface even when the customer uses different words than the documentation.
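The matching step in the last scenario can be sketched in a few lines of Python. This is a minimal illustration, not the agent's actual implementation: the three-dimensional vectors are made-up stand-ins for real embeddings (which have hundreds of dimensions or more), and the function names `cosine_similarity` and `closest_article` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def closest_article(query_vec, article_vecs):
    """Return the title whose embedding is most similar to the query."""
    return max(
        article_vecs,
        key=lambda title: cosine_similarity(query_vec, article_vecs[title]),
    )

# Toy vectors standing in for real embeddings (hypothetical values).
articles = {
    "Reset your password": [0.9, 0.1, 0.0],
    "Update billing details": [0.1, 0.9, 0.1],
    "Track an order": [0.0, 0.2, 0.9],
}

# Pretend this is the embedding of the question "I can't log in".
query = [0.8, 0.2, 0.1]
best = closest_article(query, articles)
print(best)
```

The question shares no keywords with the article title, yet the vectors point in nearly the same direction, so the password article ranks first.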

FAQs
The agent supports the full Gemini model lineup, including Gemini 2.5 Flash (fast and efficient, the default), Gemini 2.5 Pro (advanced reasoning), and Gemini 2.5 Flash-Lite (cost-optimized). Legacy models like Gemini 1.5 Flash and 1.5 Pro are also available. Use the List Models endpoint to see all currently supported models.
Yes. The image generation endpoint supports 1K, 2K, and 4K resolutions when using the Gemini 3 Pro Image Preview model. Custom aspect ratios are also supported across most image models. The agent can adjust resolution and aspect ratio based on whether the image is for web, social media, or print use.
The agent sends a text prompt to a Veo model (Veo 2, Veo 3, or Veo 3 Fast) and receives an operation ID. Video generation runs asynchronously with durations of 4, 6, or 8 seconds in landscape or portrait. The agent polls for completion and returns a downloadable video URL with natively generated audio.
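The submit-then-poll flow described here might look roughly like the sketch below. It assumes a hypothetical `check_status` callable that returns a dictionary with a `done` flag and either a `video_url` or an `error`; the real response shape and the helper name `wait_for_video` are assumptions for illustration only.

```python
import time

def wait_for_video(check_status, operation_id, timeout=300, interval=5,
                   sleep=time.sleep):
    """Poll an operation until it finishes or the timeout elapses.

    check_status is a hypothetical callable that takes the operation ID and
    returns a dict such as {"done": False}, {"done": True, "video_url": ...}
    or {"done": True, "error": ...}.
    """
    waited = 0
    while waited <= timeout:
        status = check_status(operation_id)
        if status.get("done"):
            if "error" in status:
                raise RuntimeError(status["error"])
            return status["video_url"]
        sleep(interval)
        waited += interval
    raise TimeoutError(f"operation {operation_id} not finished after {timeout}s")

# Stand-in for the real status call: reports completion on the third poll.
responses = iter([
    {"done": False},
    {"done": False},
    {"done": True, "video_url": "https://example.com/clip.mp4"},
])
url = wait_for_video(lambda op_id: next(responses), "op-123",
                     sleep=lambda seconds: None)
print(url)
```

Passing `sleep` as a parameter keeps the loop testable without real waiting; in practice the default `time.sleep` would be used.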
Embeddings convert text into numerical vectors that capture semantic meaning. The agent creates them for tasks like semantic search (finding related content), document classification, clustering, and similarity comparison. They are useful when you want customers to find answers even when they phrase questions differently from your documentation.
Yes. Both text and image generation endpoints accept safety settings for categories including harassment, hate speech, sexually explicit content, and dangerous content. You configure threshold levels from BLOCK_NONE (least restrictive) to BLOCK_LOW_AND_ABOVE (most restrictive). The agent applies these filters on every generation request.
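As a rough illustration, a safety configuration can be expressed as one category/threshold pair per category. The enum strings below follow the naming used in Google's public Gemini API as best understood here; the helper `build_safety_settings` and the exact payload shape the agent's tool expects are assumptions.

```python
# Category and threshold names as used by the Gemini API (assumed payload shape).
CATEGORIES = [
    "HARM_CATEGORY_HARASSMENT",
    "HARM_CATEGORY_HATE_SPEECH",
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "HARM_CATEGORY_DANGEROUS_CONTENT",
]

# Ordered from least to most restrictive.
THRESHOLDS = [
    "BLOCK_NONE",
    "BLOCK_ONLY_HIGH",
    "BLOCK_MEDIUM_AND_ABOVE",
    "BLOCK_LOW_AND_ABOVE",
]

def build_safety_settings(threshold="BLOCK_MEDIUM_AND_ABOVE", overrides=None):
    """Build one setting per category, with optional per-category overrides."""
    overrides = overrides or {}
    settings = []
    for category in CATEGORIES:
        level = overrides.get(category, threshold)
        if level not in THRESHOLDS:
            raise ValueError(f"unknown threshold: {level}")
        settings.append({"category": category, "threshold": level})
    return settings

# Strict everywhere, slightly looser for dangerous-content matches.
settings = build_safety_settings(
    threshold="BLOCK_LOW_AND_ABOVE",
    overrides={"HARM_CATEGORY_DANGEROUS_CONTENT": "BLOCK_ONLY_HIGH"},
)
```

Validating the threshold up front surfaces typos before a request is sent rather than as an API error afterwards.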
Tars does not add charges for the Gemini integration itself. You pay Google directly for API usage based on your Gemini API plan. Costs vary by model: Flash models are cheaper, Pro models cost more. Veo 3 Fast video generation is approximately $0.40 per second. The agent's token counting capability helps estimate costs before generating.
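For a quick back-of-the-envelope check, the per-second figure can be turned into a small estimator. This sketch uses only the approximate $0.40 rate quoted above and the 4, 6, or 8 second durations the video endpoint accepts; the function name is hypothetical, and actual pricing should be confirmed with Google.

```python
# Approximate rate quoted above; check Google's current pricing before relying on it.
VEO3_FAST_USD_PER_SECOND = 0.40

def estimate_video_cost(duration_seconds, clips=1, rate=VEO3_FAST_USD_PER_SECOND):
    """Estimate the cost in USD of generating `clips` videos of a given length."""
    if duration_seconds not in (4, 6, 8):
        raise ValueError("supported durations are 4, 6, or 8 seconds")
    return round(duration_seconds * clips * rate, 2)

single_clip = estimate_video_cost(8)             # 8 s x $0.40
weekly_teasers = estimate_video_cost(6, clips=5)  # 5 clips x 6 s x $0.40
print(single_clip, weekly_teasers)
```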
Yes. The system_instruction parameter lets you set persistent behavioral guidance for the model. You can instruct it to write in a specific tone, avoid certain topics, follow formatting rules, or adopt a persona. System instructions apply to both text and image generation requests.
The agent checks the video operation status and reports any failures with error details. You can adjust the timeout parameter (default 300 seconds, maximum 600) for complex prompts. If a job fails, the agent provides the error reason and you can retry with a modified prompt or different Veo model.
Don't limit your AI agent to basic conversations. Watch how to configure and add powerful tools that make your agent smarter and more functional.

Privacy & Security
At Tars, we take privacy and security very seriously. We are compliant with GDPR, ISO, SOC 2, and HIPAA.