
MetaText AI
Your AI agent talks to thousands of customers daily. MetaText AI ensures those conversations stay safe, on-brand, and compliant. Classify customer messages, enforce content policies, detect prompt injection attacks, and moderate responses before they reach the customer.

MetaText AI adds NLP superpowers to your agent, from text classification and content moderation to guardrail enforcement and vulnerability scanning.
How businesses use MetaText AI to ensure their customer-facing AI agents never say the wrong thing, leak sensitive data, or fall victim to adversarial prompts.
A customer asks your AI Agent to read back their stored credit card number. Before the agent responds, MetaText AI evaluates the proposed response against your 'no PII disclosure' policy. The guardrail triggers, the response is blocked, and the agent replies with a safe alternative: 'For security, I cannot share full payment details. You can view this in your account settings.' Data breach prevented.
Hundreds of messages arrive every hour across different topics. Your AI Agent runs each message through MetaText AI's classification model to identify intent: billing question, technical support, sales inquiry, or complaint. Messages get routed to the right team with confidence scores. High-confidence billing questions get automated answers. Low-confidence edge cases go to humans. Response time drops across the board.
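To make the routing step concrete, here is a minimal sketch in Python. The endpoint URL, request fields, and response shape are assumptions made for illustration, not the documented MetaText AI API.

```python
import requests

# Hypothetical classify endpoint and payload: illustrative only,
# not the documented MetaText AI API.
CLASSIFY_URL = "https://api.metatext.ai/v1/classify"  # assumed URL
API_KEY = "your-api-key"
CONFIDENCE_THRESHOLD = 0.85  # tune per label from observed accuracy

def route_message(message: str) -> str:
    """Classify an incoming message and decide where it goes."""
    resp = requests.post(
        CLASSIFY_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"project_id": "support-intents", "text": message},  # assumed fields
        timeout=10,
    )
    resp.raise_for_status()
    result = resp.json()
    label, confidence = result["label"], result["confidence"]  # assumed shape

    # High-confidence billing questions get automated answers;
    # anything the model is unsure about goes to a human.
    if label == "billing" and confidence >= CONFIDENCE_THRESHOLD:
        return "automated_billing_answer"
    if confidence < CONFIDENCE_THRESHOLD:
        return "human_review_queue"
    return f"{label}_team"
```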
Before launching your customer-facing agent, your security team runs MetaText AI's red team scan against the application. Prompt injection probes, jailbreak attempts, and data extraction attacks are simulated automatically. The scan report identifies three policy gaps. Your team adds targeted guardrails for each vulnerability. The agent launches hardened against real-world adversarial attacks.

FAQs
How does MetaText AI moderate a response before the customer sees it?
When the agent formulates a response, it passes the conversation (the user message plus the proposed reply) through MetaText AI's Evaluate endpoint. The system checks the response against your configured policies. If a violation is detected, the agent receives either a corrected version or a custom override message. All of this happens in milliseconds, before the customer sees anything.
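Here is a minimal sketch of that check, assuming a REST-style Evaluate endpoint; the URL, payload fields, and response keys are illustrative guesses rather than the documented contract.

```python
import requests

# Assumed endpoint and field names for the Evaluate call; treat them
# as placeholders, not the documented contract.
EVALUATE_URL = "https://api.metatext.ai/v1/evaluate"

def moderate_reply(user_message: str, proposed_reply: str, api_key: str) -> str:
    """Return the proposed reply, a corrected version, or the override."""
    r = requests.post(
        EVALUATE_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        json={"messages": [
            {"role": "user", "content": user_message},
            {"role": "assistant", "content": proposed_reply},
        ]},
        timeout=10,
    )
    r.raise_for_status()
    verdict = r.json()  # assumed response shape

    if verdict.get("violation"):
        # Prefer a corrected reply; fall back to the policy's override message.
        return verdict.get("corrected") or verdict["override_message"]
    return proposed_reply
```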
Can incoming messages be classified into my own custom categories?
Yes. MetaText AI supports project-based classification with trained models. You define your labels (like 'billing,' 'technical,' 'complaint,' 'sales') and train the model on your data. The agent then classifies every incoming message using your custom model, returning labels with confidence scores for each category.
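As a rough sketch, the define-labels-then-train flow could look like this; every endpoint path and field name here is an assumption made for illustration.

```python
import requests

# Hypothetical project setup: endpoint paths and field names are
# assumptions, shown only to illustrate the define-then-train flow.
BASE = "https://api.metatext.ai/v1"  # assumed base URL
HEADERS = {"Authorization": "Bearer your-api-key"}

# 1. Create a classification project with your own labels.
project = requests.post(
    f"{BASE}/projects",
    headers=HEADERS,
    json={"name": "support-intents",
          "labels": ["billing", "technical", "complaint", "sales"]},
    timeout=10,
).json()

# 2. Upload labeled examples drawn from your own conversations.
requests.post(
    f"{BASE}/projects/{project['id']}/examples",
    headers=HEADERS,
    json=[{"text": "I was charged twice this month", "label": "billing"},
          {"text": "The app crashes on login", "label": "technical"}],
    timeout=10,
).raise_for_status()

# 3. Train; once finished, the agent classifies with this custom model.
requests.post(f"{BASE}/projects/{project['id']}/train",
              headers=HEADERS, timeout=10).raise_for_status()
```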
What kinds of policies can I configure?
Policies can target user input, assistant output, context, or system messages. Rules can block PII disclosure, prevent off-topic responses, enforce brand voice, restrict discussion of competitors, block harmful content, and more. Each policy supports custom override responses and can be applied to specific conversation roles.
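For illustration, a pair of policies along these lines might be declared like the sketch below; the field names are assumptions, not the documented schema.

```python
# Illustrative policy objects: field names are assumptions, not the
# documented MetaText AI schema.
policies = [
    {
        "name": "no-pii-disclosure",
        "target": "assistant",  # which conversation role the rule inspects
        "rule": "Never reveal stored payment details or other PII.",
        "override_response": (
            "For security, I cannot share full payment details. "
            "You can view this in your account settings."
        ),
    },
    {
        "name": "no-competitor-talk",
        "target": "assistant",
        "rule": "Do not discuss or compare against competitor products.",
        "override_response": "I can only speak to our own products and plans.",
    },
]
```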
How does the red team scan work?
The red team scan sends real adversarial probes against your application through MetaText AI's API. It simulates actual attack patterns including prompt injection, jailbreaking, and data extraction. The probes test your configured policies under realistic conditions. Results show which probes succeeded and which were blocked, giving you actionable hardening data.
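A sketch of triggering such a scan and reading its results; the endpoint, probe names, and response shape are placeholders for illustration.

```python
import requests

# Hypothetical scan trigger: the endpoint, probe names, and response
# shape are placeholders, not the documented API.
SCAN_URL = "https://api.metatext.ai/v1/redteam/scan"

def run_red_team_scan(application_id: str, api_key: str) -> None:
    r = requests.post(
        SCAN_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        json={"application_id": application_id,
              "probes": ["prompt_injection", "jailbreak", "data_extraction"]},
        timeout=300,  # probes run against the live application
    )
    r.raise_for_status()
    for probe in r.json().get("results", []):  # assumed response shape
        status = "BLOCKED" if probe["blocked"] else "SUCCEEDED: add a guardrail"
        print(f"{probe['name']}: {status}")
```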
Will guardrail checks slow down my agent?
MetaText AI runs classification and guardrail evaluation as part of the normal API round trip. The fail_fast option stops evaluation at the first violation for maximum speed. For most conversations, the added latency is under 200 milliseconds, which is imperceptible to customers. You can tune the balance between thoroughness and speed through the evaluation settings.
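The fail_fast option is the one named above; the settings object around it in this sketch is an assumed shape, shown only to illustrate the tradeoff.

```python
# fail_fast comes from the FAQ above; the settings object around it
# is an assumed shape.
evaluation_settings = {
    "fail_fast": True,  # stop at the first violation: lowest latency
    # Set to False to run every policy and report all violations,
    # trading a few extra milliseconds for a complete report.
}
```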
How is this different from OpenAI's moderation endpoint?
OpenAI's moderation flags broad content categories like violence or hate speech. MetaText AI lets you define custom policies specific to your business: no discussing competitors, no sharing internal pricing, no medical advice without disclaimers. MetaText AI also adds classification, extraction, red team testing, and full guardrail management, not just binary moderation flags.
Can MetaText AI generate responses as well as moderate them?
Yes. MetaText AI provides both chat completion and text generation endpoints that are OpenAI-compatible. You can use these for generating responses, summaries, or content while simultaneously running guardrail checks on the output. Generation and safety work together in the same API ecosystem.
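Because the endpoints are OpenAI-compatible, the standard OpenAI Python client should work against them; the base URL and model name below are placeholders, not documented values.

```python
from openai import OpenAI

# OpenAI-compatible endpoints mean the standard OpenAI client works;
# base_url and model are placeholders, not documented values.
client = OpenAI(
    base_url="https://api.metatext.ai/v1",  # assumed base URL
    api_key="your-metatext-api-key",
)

completion = client.chat.completions.create(
    model="metatext-chat",  # placeholder model name
    messages=[
        {"role": "system", "content": "You are a helpful support agent."},
        {"role": "user", "content": "How do I update my billing address?"},
    ],
)
print(completion.choices[0].message.content)
```

Pointing an existing OpenAI integration at a different base_url keeps the rest of the code unchanged, which is the practical benefit of compatibility.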
What if a guardrail blocks legitimate responses?
You can enable the correction_enabled flag, which returns a modified version of the response that satisfies the policy rather than blocking it outright. You can also adjust policy rules to be more or less strict, update definitions, and monitor policy performance. Guardrails are iterative: you refine them as you see real conversation patterns.
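A small sketch of the difference correction makes; correction_enabled is the flag named above, while the surrounding request fields are assumed.

```python
# correction_enabled is named in the answer above; the fields around it
# are assumed for illustration.
payload = {
    "messages": [
        {"role": "user", "content": "What's the card number on my account?"},
        {"role": "assistant", "content": "Sure, your card on file is 4242 4242 ..."},
    ],
    "correction_enabled": True,  # ask for a compliant rewrite, not a hard block
}
# With correction enabled, a violating reply comes back rewritten to satisfy
# the policy instead of being swapped for the override message.
```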
Don't limit your AI Agent to basic conversations. Watch how to configure and add powerful tools that make your agent smarter and more capable.

Privacy & Security
At Tars, we take privacy and security very seriously. We are compliant with GDPR, ISO, SOC 2, and HIPAA.