
Many conversational AI platforms give LLMs the wheel. In these cases, LLMs decide what to say, when to trigger actions, and when code-based rules should apply. But as many contact centers have learned the hard way: guardrails don’t cut it if the LLM is still driving.
When LLMs are in complete control, “guardrails” are typically limited to prompt instructions, soft constraints, post-hoc checks, or “don’t do X” guidance. When the LLM reasons incorrectly, skips a step, or hallucinates, the system still proceeds, often without detection.
At scale, required actions get missed, compliance breaks, and enterprises become hesitant to trust these systems beyond simple workflows. At Replicant, we invert the control model. We force AI agents through deterministic checkpoints (required data collection, enforced workflows, hard-coded actions, etc.), before they are allowed to proceed.
This safeguards compliance, prevents hallucinations, and guarantees critical steps are never skipped. Ultimately, it’s why we can leverage LLM intelligence and safely automate complex, high-risk workflows while ensuring your next AI project doesn’t veer off the road.
The inevitable failure of LLM-driven agents
It’s never been easier to build and deploy a simple AI agent using LLMs. But challenges arise when the LLM inevitably fails to follow its instructions, leading to misinformation or even harmful actions. When an agent goes awry, it usually fails in one of these ways:
- Providing inaccurate information (damaging user trust or creating liability)
- Misrepresenting its capabilities (creating false expectations)
- Misrepresenting its actions: claiming it has taken an action it has not or vice versa
- Disclosing private information
- Performing an unauthorized action
Fortunately, there are many defensive measures, or guardrails, we use that exponentially reduce the likelihood of such violations. Let’s take a look at a few.
Generative intelligence backed by deterministic execution
The single most important principle you can follow when building a reliable AI agent is:
Don’t let an LLM make a decision that you don’t need an LLM for.
Fundamental to our architecture is a separation between LLM-based execution and deterministic execution. Our agents are designed to use LLM-based execution for the parts of the interaction that demand flexibility and naturalness, and deterministic execution for those that demand correctness and predictability.
What sorts of decisions are good candidates for deterministic execution?
1. Preventing an action that should NEVER happen
For example, a caller may wish to cancel an order that is not eligible for cancellation. So if we have a “cancellation” workflow, we can run a deterministic step before we even begin this workflow that checks if an order is eligible for cancellation. If it isn’t, we never start the cancellation workflow, and instead remind the LLM that the order is ineligible.
This turns our LLM’s “should not” into a “cannot" and avoids costly mishaps like erroneous discounts.
2. Triggering an action that MUST happen
In other cases, we may need to ensure that certain steps are taken when certain criteria are met. For example, an agent may be required to notify any caller from New York of their eligibility for a certain health plan. Rather than instructing an LLM to do this and simply hoping it will remember to satisfy this hard requirement, we can run a deterministic check whenever the caller’s state of residence is collected. If it matches New York, we can deterministically trigger a script that the agent speaks to notify the caller of their options.
In this way, compliance becomes the responsibility of code and not of prompt engineering.

3. Hiding private information from the LLM
There are certain pieces of information relevant to a workflow that are not appropriate for an LLM to be aware of. For example, if the caller needs to provide some form of authentication such as the account’s email address, it is not appropriate for the LLM to see the account email address, because it might be tricked into disclosing it. Instead, we can collect the email address from the user and run a deterministic step to match the provided email against the private account email.
You can’t leak what you have never seen.
Guardrail modules on caller inputs and agent outputs
When a violation of constraints occurs because an LLM does not follow its instructions, we call that a hallucination. One of the most conventional approaches to preventing hallucinations is to use another “guardrail” LLM to check the output from the “talker” LLM (and/or the input from the human).
- On the LLM output, we are checking for hallucinations
- On the human input, we are checking not only for adversarial behavior (trying to trick the LLM into doing something wrong) but also attempts to move the conversation in a direction that the agent does not support
When a guardrail LLM detects a hallucination, we can re-prompt the “talker” to change its response based on the problem we identified.

Why does it help to use a guardrail LLM to “check the work” of another LLM? This is an example of separation of concerns, a far better approach than asking a single model to both reason and police itself.
First, the guardrail LLM can focus exclusively on deciding whether the “talker” LLM is creating an appropriate response based on known facts and capabilities. Because the guardrail LLM’s job is simpler it becomes more likely to do it accurately, as opposed to the talker LLM which must balance many different considerations to create the ideal response.
Another great benefit of having a guardrail step is that we can build many different guardrails, each with even further specialization, and run them in parallel. By doing this we can guard against a variety of behaviors while keeping accuracy high and latency low (the latter of which is critical for effective voice conversations).
Basic foundational considerations
In addition, there are a few simple (if not always easy) measures we take, which we consider table stakes for any LLM-powered AI agent:
- Use best-in-class LLMs that are already tuned to maximize instruction following
- Provide clear instructions to the LLM
- Continuously monitor and refine
Multi-tiered approach
None of the different concepts we have discussed are mutually exclusive, nor are they an exhaustive list of the strategies we use at Replicant. In fact, combining several guardrail concepts is the best way to achieve extremely high reliability while operating under other constraints:
- Move critical logic into deterministic execution
- Run real-time guardrails on caller inputs and agent outputs
- Constantly reevaluate LLMs and associated prompts
Wrap-up
An agent that is carefully designed from first principles is orders of magnitude safer than an agent that is just composed of a prompt and an LLM. Segregating critical logic and using a layered series of guardrails results in a reliable agent that benefits from the power of LLMs without exposing the business to unnecessary risk. This is why Replicant is trusted by customers to handle complex use cases involving authentication, legal disclosures, sensitive data, and more.
Schedule time with an expert to learn more about how Replicant can transform your contact center with AI.