Guardrails

Guardrails are safety controls that govern how AI agents interact with your connected business systems. They let you define rules on individual tool calls — requiring approval before a high-value action runs, blocking certain operations entirely, redacting sensitive fields from results, or flagging calls for review after execution.

Without guardrails, agents can use any tool that has been assigned to them, with no restrictions. Guardrails give you fine-grained control over which calls proceed automatically and which ones need human involvement.

When you connect a business system via the MCP Hub, its tools become available to your AI agents. An order management connection might expose tools like “create_refund”, “update_order”, and “get_customer”. Some of these are low-risk lookups. Others can change data or trigger financial transactions.

Guardrails let you draw the line. You decide which actions the agent can take on its own and which ones require a person to review first. For example:

  • Refunds over a certain amount need manager approval
  • Agents should never delete customer records
  • Customer phone numbers should be hidden from the agent pipeline
  • High-value quotes should be flagged for a human to review after the agent creates them

Each guardrail uses one of four policy types. Choose the type that matches the level of control you need.

| Type | What happens | When to use |
| --- | --- | --- |
| Require approval | The tool call is paused. An operator must approve or reject it before it runs. | High-value or irreversible actions (e.g. refunds above a threshold, contract modifications). |
| Deny | The tool call is blocked entirely. The agent receives a denial and cannot retry. | Operations that should never happen through AI (e.g. deleting accounts, overriding pricing). |
| Mask | The tool call runs, but specified fields are redacted from the result before the agent sees it. | Sensitive data that the agent does not need (e.g. raw phone numbers, national ID numbers, internal cost prices). |
| Escalate | The tool call runs normally, but the conversation is flagged for human review afterward. | Situations that need post-hoc oversight (e.g. a VIP customer’s return was denied, a warranty claim exceeded coverage). |

Guardrails are configured per MCP connection. To add one:

  1. Go to Settings > MCP Hub.
  2. Click on the connection you want to add a guardrail to. This opens the connection detail page.
  3. Scroll down to the Guardrails section.
  4. Click Add guardrail. A panel opens on the right side of the page.
  5. Fill in the guardrail fields (see below).
  6. Click Add guardrail to save.

The guardrail takes effect immediately. Any future tool calls matching the guardrail’s criteria will be governed by it.

| Field | Description |
| --- | --- |
| Tool | Choose which tool this guardrail applies to. Select a specific tool from the dropdown (e.g. “create_refund”) or choose All tools to apply the guardrail to every tool on this connection. |
| Action | The policy type: Require approval, Deny, Mask, or Escalate. |
| Condition | An optional expression that determines when the guardrail activates. When left empty, the guardrail applies to every call of the selected tool. See “Conditions” below. |
| Fields to mask | Only shown when the action is Mask. A comma-separated list of field names to redact from the tool’s result (e.g. “email, phone, ssn”). |
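The four fields above can be pictured as a small record. The following is a minimal Python sketch, not the product's actual data model: the `Guardrail` class, the `ALL_TOOLS` marker, and `parse_mask_fields` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional

ALL_TOOLS = "*"  # hypothetical marker standing in for the "All tools" choice


@dataclass
class Guardrail:
    tool: str                        # a specific tool name, or ALL_TOOLS
    action: str                      # "require_approval", "deny", "mask", or "escalate"
    condition: Optional[str] = None  # None means the guardrail applies to every call
    mask_fields: list[str] = field(default_factory=list)  # only used when action is "mask"

    def applies_to(self, tool_name: str) -> bool:
        """True when this guardrail covers the given tool."""
        return self.tool == ALL_TOOLS or self.tool == tool_name


def parse_mask_fields(raw: str) -> list[str]:
    """Split the comma-separated "Fields to mask" input into clean field names."""
    return [name.strip() for name in raw.split(",") if name.strip()]
```

For instance, `parse_mask_fields("email, phone, ssn")` yields the three field names without surrounding spaces, and a guardrail created with `tool=ALL_TOOLS` covers every tool on the connection.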

By default, a guardrail applies to every call of the selected tool. A condition restricts it so that it only activates in specific situations. Click Add condition to add one.

The condition builder has three fields in a single row:

| Field | What to enter |
| --- | --- |
| Field name | The name of the argument you want to check (e.g. “amount”, “total”, “quantity”). This is the name your business system uses for the value. |
| Operator | How to compare the value. Options: equals, does not equal, is greater than, is less than, is at least, is at most, contains. |
| Value | The threshold or value to compare against (e.g. “5000”, “delete”, “premium”). |

For example, to require approval for refunds over 5,000, you would enter: amount / is greater than / 5000.

Some common condition setups:

| Field | Operator | Value | Effect |
| --- | --- | --- | --- |
| amount | is greater than | 5000 | Activates when the amount exceeds 5,000 |
| quantity | is at least | 100 | Activates when quantity is 100 or more |
| status | equals | shipped | Activates when status is “shipped” |
| (no condition) | | | Activates on every call |
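To make the seven operators concrete, here is a minimal Python sketch of how a single builder row could be evaluated against a tool call's arguments. It is an illustration only: the `OPERATORS` mapping and `condition_matches` function are hypothetical, the value is assumed to be already converted to the right type, and treating a missing argument as "does not activate" is an assumption rather than documented behavior.

```python
import operator

# Hypothetical mapping from the builder's operator labels to Python comparisons.
OPERATORS = {
    "equals": operator.eq,
    "does not equal": operator.ne,
    "is greater than": operator.gt,
    "is less than": operator.lt,
    "is at least": operator.ge,
    "is at most": operator.le,
    # Assumes the argument is a string or a list.
    "contains": lambda value, target: target in value,
}


def condition_matches(args: dict, field_name: str, op: str, value) -> bool:
    """Return True when the named argument satisfies the comparison."""
    if field_name not in args:
        return False  # assumption: a missing argument never activates the guardrail
    return OPERATORS[op](args[field_name], value)
```

With the row amount / is greater than / 5000, a call with `{"amount": 8000}` matches, while a call with `{"amount": 5000}` does not, because "is greater than" excludes the threshold itself.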

If you are unsure which field names your tool uses, check the tool description on the connection detail page. The description typically mentions the key arguments and their names. You can also ask your technical team or MCP server provider for the field names.

Conditions are written in filtrex, a sandboxed expression language. The condition builder covers common cases, but you can write filtrex expressions directly for more complex rules. Expressions can reference args (the tool input), result (the tool output, available for Escalate guardrails only), and context (the conversation and account context).

Supported syntax:

  • Dot notation: args.order.total
  • Comparison: >, <, >=, <=, ==, !=
  • Boolean: and, or, not
  • Arithmetic: args.quantity * args.unit_price > 50000
  • Quoted strings: context.account.tier == "vip"

| Expression | What it does |
| --- | --- |
| args.total > 50000 | Activates when the total exceeds 50,000 |
| args.quantity > 100 and args.unit_price > 500 | Activates when both quantity and unit price exceed thresholds |
| args.quantity * args.unit_price > 50000 | Activates when the calculated order value exceeds 50,000 |
| context.account.tier == "vip" | Activates for VIP accounts regardless of tool arguments |
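To see how these expressions differ on the same call, the sketch below writes each one as an equivalent Python comparison over sample data. This is not filtrex itself and does not use its API. The sample values and the shape of `args` and `context` are assumptions for illustration.

```python
# Sample inputs; the shape of args and context is assumed for illustration.
args = {"total": 54000, "quantity": 120, "unit_price": 450}
context = {"account": {"tier": "vip"}}

# Each key is the expression from the table; each value is its Python equivalent.
checks = {
    "args.total > 50000": args["total"] > 50000,
    "args.quantity > 100 and args.unit_price > 500":
        args["quantity"] > 100 and args["unit_price"] > 500,
    "args.quantity * args.unit_price > 50000":
        args["quantity"] * args["unit_price"] > 50000,
    'context.account.tier == "vip"': context["account"]["tier"] == "vip",
}

for expression, activates in checks.items():
    print(f"{expression} -> {activates}")
```

With these inputs the second expression does not activate, because a unit price of 450 does not exceed 500. The third does, because 120 × 450 = 54,000. An arithmetic rule can therefore catch large orders that per-field thresholds miss.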

Each tool on a connection has a risk classification that describes the type of operation it performs. You can set this in the tools table on the connection detail page by choosing from the dropdown next to each tool.

| Classification | Meaning | Example tools |
| --- | --- | --- |
| Read | The tool only retrieves data. It does not change anything. | get_order, search_products, get_customer |
| Write | The tool creates or modifies data. | create_quote, update_status, send_notification |
| Destructive | The tool deletes data or performs an irreversible action. | delete_customer, cancel_subscription, issue_refund |

Risk classification helps you and your team understand at a glance what each tool does. It also informs the safety pipeline: destructive tools are strong candidates for a Require approval guardrail.

All tools default to “write” when first discovered. Review each tool and adjust the classification to match what it actually does.
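A review of classifications can be sketched as follows. This is a minimal Python illustration under stated assumptions: `classify` and `unguarded_destructive` are hypothetical helpers, and the lowercase labels are just a convenient encoding of the three classifications.

```python
DEFAULT_CLASSIFICATION = "write"  # newly discovered tools start here


def classify(tools: list[str], overrides: dict[str, str]) -> dict[str, str]:
    """Apply reviewed classifications, leaving unreviewed tools at the default."""
    return {tool: overrides.get(tool, DEFAULT_CLASSIFICATION) for tool in tools}


def unguarded_destructive(classified: dict[str, str], guarded_tools: set[str]) -> list[str]:
    """List destructive tools that do not have a guardrail yet."""
    return sorted(
        tool
        for tool, level in classified.items()
        if level == "destructive" and tool not in guarded_tools
    )
```

A check like `unguarded_destructive` is one way to spot tools such as delete_customer that are marked destructive but still run without any guardrail.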

When a guardrail with the “Require approval” type activates, the tool call is paused and the conversation is handed to a human operator. Here is what happens from the operator’s perspective:

  1. The agent encounters a tool call that matches an approval gate guardrail (e.g. a refund of 8,000 against a threshold of 5,000).
  2. The tool call does not execute. The conversation pauses.
  3. An Approval needed card appears in the conversation view, showing which tool was called, what arguments were passed, and why the guardrail triggered.
  4. The operator reviews the details and clicks Approve or Reject.
  5. If approved, the agent resumes and the tool call executes with the original arguments.
  6. If rejected, the agent receives a denial and continues the conversation without that action. It may offer the customer an alternative or explain that the request needs to be handled differently.
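The steps above can be sketched as a paused call that is resolved by the operator's decision. This is a minimal Python sketch, not the product's implementation: `PendingApproval`, `resolve`, and the `run_tool` callback are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PendingApproval:
    tool: str
    args: dict
    reason: str  # why the guardrail triggered, as shown on the approval card


def resolve(
    pending: PendingApproval,
    approved: bool,
    run_tool: Callable[[str, dict], dict],
) -> dict:
    """Apply the operator's decision to a paused tool call."""
    if approved:
        # The call executes with the original, unmodified arguments.
        return {"status": "executed", "result": run_tool(pending.tool, pending.args)}
    # The tool never runs; the agent receives a denial and continues without it.
    return {"status": "denied", "result": None}
```

The key property is that the tool is only invoked on the approved branch, so a rejected refund of 8,000 never reaches the business system.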

The approval flow ensures that high-stakes actions always have a human in the loop, while routine operations continue to be handled automatically.

All guardrails for a connection are listed in the Guardrails section of the connection detail page. Each guardrail shows the tool it applies to, the policy type, and any condition or mask fields.

To remove a guardrail, click the delete icon next to it. The change takes effect immediately.

You can add multiple guardrails to the same connection. For example, you might have an approval gate on “create_refund” when the amount exceeds 5,000, a deny rule on “delete_customer” for all calls, and a mask rule on “get_customer” to redact the phone and email fields.
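As an illustration of several guardrails on one connection, the Python sketch below encodes the three rules from the example and looks up which of them activate for a given call. The dictionary layout and the `active_guardrails` function are hypothetical, and the sketch makes no claim about how the product orders or combines guardrails when more than one matches.

```python
# The three guardrails from the example above, in a hypothetical encoding.
guardrails = [
    {"tool": "create_refund", "action": "require_approval",
     "condition": lambda args: args.get("amount", 0) > 5000},
    {"tool": "delete_customer", "action": "deny", "condition": None},
    {"tool": "get_customer", "action": "mask", "condition": None,
     "mask_fields": ["phone", "email"]},
]


def active_guardrails(tool: str, args: dict) -> list[dict]:
    """Return every guardrail on this connection that activates for the call."""
    return [
        g for g in guardrails
        if g["tool"] == tool and (g["condition"] is None or g["condition"](args))
    ]
```

A refund of 8,000 activates the approval rule, a refund of 200 activates nothing, and a tool with no guardrail, such as update_order, always proceeds automatically.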

Here are some common guardrail configurations:

Approve large refunds - Tool: create_refund, Action: Require approval, Condition: amount / is greater than / 5000. Any refund above 5,000 needs operator approval.

Block account deletion - Tool: delete_account, Action: Deny, no condition. No agent can ever delete an account through this connection.

Hide personal contact details - Tool: get_customer, Action: Mask, Fields to mask: email, phone. The agent can look up customer records but will not see raw email addresses or phone numbers in the result.

Flag returns for review - Tool: process_return, Action: Escalate, no condition. Every return processed by the agent is flagged for a human to review after the fact.
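The "Hide personal contact details" configuration can be pictured with a short Python sketch of field redaction. It is an assumption-laden illustration: `mask_result` and the `[redacted]` placeholder are hypothetical, the actual redaction format is not specified here, and only top-level fields are handled.

```python
MASK = "[redacted]"  # hypothetical placeholder; the real redaction format may differ


def mask_result(result: dict, fields_to_mask: list[str]) -> dict:
    """Return a copy of the result with the listed top-level fields redacted."""
    return {
        key: (MASK if key in fields_to_mask else value)
        for key, value in result.items()
    }


customer = {"id": "C-1", "name": "Ada", "email": "ada@example.com", "phone": "555-0100"}
visible_to_agent = mask_result(customer, ["email", "phone"])
```

The tool call still runs and the original record is left unchanged. Only the copy passed on to the agent has the email and phone values replaced.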