Guardrails

Guardrails are safety controls that govern how AI agents interact with your connected business systems. They let you define rules on individual tool calls — requiring approval before a high-value action runs, blocking certain operations entirely, redacting sensitive fields from results, or flagging calls for review after execution.

Without guardrails, agents can use any tool that has been assigned to them, with no restrictions. Guardrails give you fine-grained control over which calls proceed automatically and which ones need human involvement.

Why guardrails matter

When you connect a business system via the MCP Hub, its tools become available to your AI agents. An order management connection might expose tools like “create_refund”, “update_order”, and “get_customer”. Some of these are low-risk lookups. Others can change data or trigger financial transactions.

Guardrails let you draw the line. You decide which actions the agent can take on its own and which ones require a person to review first. For example:

Refunds over a certain amount need manager approval
Agents should never delete customer records
Customer phone numbers should be hidden from the agent pipeline
High-value quotes should be flagged for a human to review after the agent creates them

The four guardrail types

Each guardrail uses one of four policy types. Choose the type that matches the level of control you need.

Type	What happens	When to use
Require approval	The tool call is paused. An operator must approve or reject it before it runs.	High-value or irreversible actions (e.g. refunds above a threshold, contract modifications).
Deny	The tool call is blocked entirely. The agent receives a denial and cannot retry.	Operations that should never happen through AI (e.g. deleting accounts, overriding pricing).
Mask	The tool call runs, but specified fields are redacted from the result before the agent sees it.	Sensitive data that the agent does not need (e.g. raw phone numbers, national ID numbers, internal cost prices).
Escalate	The tool call runs normally, but the conversation is flagged for human review afterward.	Situations that need post-hoc oversight (e.g. a VIP customer’s return was denied, a warranty claim exceeded coverage).
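
As a rough illustration of how the four types differ, the sketch below models what could happen to a single tool call under each policy. It is not the product's implementation: the names Policy, apply_policy, and run_tool, the outcome fields, and the "[REDACTED]" placeholder are all hypothetical.

```python
from enum import Enum
from typing import Any, Callable, Dict, Iterable


class Policy(Enum):
    REQUIRE_APPROVAL = "Require approval"
    DENY = "Deny"
    MASK = "Mask"
    ESCALATE = "Escalate"


def apply_policy(
    policy: Policy,
    run_tool: Callable[[], Dict[str, Any]],
    mask_fields: Iterable[str] = (),
) -> Dict[str, Any]:
    """Return a hypothetical outcome record for one guarded tool call."""
    if policy is Policy.REQUIRE_APPROVAL:
        # The call is paused; nothing runs until an operator decides.
        return {"executed": False, "status": "pending_approval"}
    if policy is Policy.DENY:
        # The call is blocked outright.
        return {"executed": False, "status": "denied"}

    result = run_tool()  # Mask and Escalate both let the call run.
    if policy is Policy.MASK:
        hidden = set(mask_fields)
        visible = {k: ("[REDACTED]" if k in hidden else v) for k, v in result.items()}
        return {"executed": True, "status": "ok", "result": visible}
    # Escalate: the result is unchanged, but the conversation is flagged.
    return {"executed": True, "status": "ok", "result": result, "flag_for_review": True}


def lookup() -> Dict[str, Any]:
    # Stand-in for a real tool such as get_customer.
    return {"name": "Ada", "phone": "555-0100"}


print(apply_policy(Policy.MASK, lookup, ["phone"]))
print(apply_policy(Policy.DENY, lookup))
```

The point of the sketch is the split between the two types that stop a call before it runs (Require approval, Deny) and the two that let it run and act on the result or the conversation afterward (Mask, Escalate).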

Adding a guardrail

Guardrails are configured per MCP connection. To add one:

Go to Settings > MCP Hub.
Click on the connection you want to add a guardrail to. This opens the connection detail page.
Scroll down to the Guardrails section.
Click Add guardrail. A panel opens on the right side of the page.
Fill in the guardrail fields (see “Guardrail fields” below).
Click Add guardrail to save.

The guardrail takes effect immediately. Any future tool calls matching the guardrail’s criteria will be governed by it.

Guardrail fields

Field	Description
Tool	Choose which tool this guardrail applies to. Select a specific tool from the dropdown (e.g. “create_refund”) or choose All tools to apply the guardrail to every tool on this connection.
Action	The policy type: Require approval, Deny, Mask, or Escalate.
Condition	An optional expression that determines when the guardrail activates. When left empty, the guardrail applies to every call of the selected tool. See “Conditions” below.
Fields to mask	Only shown when the action is Mask. A comma-separated list of field names to redact from the tool’s result (e.g. “email, phone, ssn”).
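
To make the relationship between these fields concrete, here is a minimal sketch of a guardrail as a data record, built from form-style input. The Guardrail class, the build_guardrail helper, and the validation rules are assumptions for illustration; the product's actual storage format and validation are not described in this article.

```python
from dataclasses import dataclass, field
from typing import List, Optional

ACTIONS = {"Require approval", "Deny", "Mask", "Escalate"}
ALL_TOOLS = "All tools"


@dataclass
class Guardrail:
    tool: str                         # a tool name, or ALL_TOOLS
    action: str                       # one of ACTIONS
    condition: Optional[str] = None   # None means "every call"
    mask_fields: List[str] = field(default_factory=list)


def build_guardrail(
    tool: str, action: str, condition: str = "", fields_to_mask: str = ""
) -> Guardrail:
    """Validate form-style input and return a Guardrail record."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    # "email, phone, ssn" -> ["email", "phone", "ssn"]
    mask_fields = [name.strip() for name in fields_to_mask.split(",") if name.strip()]
    # Assumption: a Mask guardrail is only meaningful with at least one field.
    if action == "Mask" and not mask_fields:
        raise ValueError("a Mask guardrail needs at least one field to mask")
    if action != "Mask" and mask_fields:
        raise ValueError("fields to mask only apply to the Mask action")
    return Guardrail(tool, action, condition.strip() or None, mask_fields)


mask_rule = build_guardrail("get_customer", "Mask", fields_to_mask="email, phone, ssn")
deny_rule = build_guardrail(ALL_TOOLS, "Deny")
print(mask_rule)
print(deny_rule)
```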

Conditions

By default, a guardrail applies to every call of the selected tool. A condition restricts it to specific situations, such as calls where an amount exceeds a threshold. To add one, click Add condition.

The condition builder has three fields in a single row:

Field	What to enter
Field name	The name of the argument you want to check (e.g. “amount”, “total”, “quantity”). This is the name your business system uses for the value.
Operator	How to compare the value. Options: equals, does not equal, is greater than, is less than, is at least, is at most, contains.
Value	The threshold or value to compare against (e.g. “5000”, “delete”, “premium”).

For example, to require approval for refunds over 5,000, you would enter: amount / is greater than / 5000.

Some common condition setups:

Field	Operator	Value	Effect
amount	is greater than	5000	Activates when the amount exceeds 5,000
quantity	is at least	100	Activates when quantity is 100 or more
status	equals	shipped	Activates when status is “shipped”
(no condition)			Activates on every call
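
The sketch below shows one way a field / operator / value condition could be checked against a tool call's arguments. It is an illustration only: the function names, the treatment of numeric-looking strings as numbers, and the choice that a missing argument does not activate the guardrail are all assumptions, not documented product behavior.

```python
import operator
from typing import Any, Dict, Optional, Tuple

# Comparison operators offered by the condition builder ("contains" is handled separately).
COMPARISONS = {
    "equals": operator.eq,
    "does not equal": operator.ne,
    "is greater than": operator.gt,
    "is less than": operator.lt,
    "is at least": operator.ge,
    "is at most": operator.le,
}


def coerce(value: Any) -> Any:
    """Treat numeric-looking values as numbers so "5000" compares with 8000."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return value


def condition_matches(
    args: Dict[str, Any], condition: Optional[Tuple[str, str, str]]
) -> bool:
    if condition is None:
        return True  # no condition: the guardrail applies to every call
    field_name, op_name, expected = condition
    if field_name not in args:
        return False  # assumption: a missing argument does not activate the guardrail
    actual = args[field_name]
    if op_name == "contains":
        return str(expected) in str(actual)
    try:
        return COMPARISONS[op_name](coerce(actual), coerce(expected))
    except TypeError:
        return False  # e.g. ordering text against a number


print(condition_matches({"amount": 8000}, ("amount", "is greater than", "5000")))
print(condition_matches({"status": "shipped"}, ("status", "equals", "shipped")))
```

With these assumptions, a refund of exactly 5,000 does not match "is greater than 5000", which is why a threshold that should include the boundary value would use "is at least" instead.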

If you are unsure which field names your tool uses, check the tool description on the connection detail page. The description typically mentions the key arguments and their names. You can also ask your technical team or MCP server provider for the field names.

Advanced conditions

Conditions are evaluated with filtrex, a sandboxed expression language. The condition builder covers common cases, but you can write filtrex expressions directly for more complex rules. Expressions can reference args (the tool input), result (the tool output, available for Escalate guardrails only), and context (conversation and account context).

Syntax: dot notation (args.order.total), comparison (>, <, >=, <=, ==, !=), boolean (and, or, not), arithmetic (args.quantity * args.unit_price > 50000), quoted strings (context.account.tier == "vip").

Expression	What it does
`args.total > 50000`	Activates when the total exceeds 50,000
`args.quantity > 100 and args.unit_price > 500`	Activates when both quantity and unit price exceed thresholds
`args.quantity * args.unit_price > 50000`	Activates when the calculated order value exceeds 50,000
`context.account.tier == "vip"`	Activates for VIP accounts regardless of tool arguments
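
To see how expressions like these behave against sample data, the sketch below evaluates them with a small whitelist-based evaluator. This is not filtrex and not how the product evaluates conditions. It relies on the fact that the four example expressions above also happen to be valid Python expression syntax, so Python's ast module can parse them. The evaluate function and the sample scope are hypothetical.

```python
import ast
import operator
from typing import Any, Dict

ARITHMETIC = {ast.Add: operator.add, ast.Sub: operator.sub,
              ast.Mult: operator.mul, ast.Div: operator.truediv}
COMPARE = {ast.Gt: operator.gt, ast.Lt: operator.lt, ast.GtE: operator.ge,
           ast.LtE: operator.le, ast.Eq: operator.eq, ast.NotEq: operator.ne}


def evaluate(expression: str, scope: Dict[str, Any]) -> Any:
    """Evaluate a whitelisted subset of expression syntax against scope."""

    def walk(node: ast.AST) -> Any:
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant):        # numbers and quoted strings
            return node.value
        if isinstance(node, ast.Name):            # args, result, context
            return scope[node.id]
        if isinstance(node, ast.Attribute):       # dot notation: args.order.total
            return walk(node.value)[node.attr]
        if isinstance(node, ast.BinOp) and type(node.op) in ARITHMETIC:
            return ARITHMETIC[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.BoolOp):          # and / or
            values = (walk(v) for v in node.values)
            return all(values) if isinstance(node.op, ast.And) else any(values)
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
            return not walk(node.operand)
        if (isinstance(node, ast.Compare) and len(node.ops) == 1
                and type(node.ops[0]) in COMPARE):
            return COMPARE[type(node.ops[0])](walk(node.left), walk(node.comparators[0]))
        # Anything else (function calls, subscripts, ...) is rejected.
        raise ValueError(f"unsupported syntax: {type(node).__name__}")

    return walk(ast.parse(expression, mode="eval"))


scope = {
    "args": {"quantity": 120, "unit_price": 600, "total": 72000},
    "context": {"account": {"tier": "vip"}},
}
for expr in (
    "args.total > 50000",
    "args.quantity > 100 and args.unit_price > 500",
    "args.quantity * args.unit_price > 50000",
    'context.account.tier == "vip"',
):
    print(expr, "->", evaluate(expr, scope))
```

With the sample scope, all four expressions evaluate to true. Real filtrex expressions may differ from Python in details, so test any production condition in the product itself.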

Risk classification

Each tool on a connection has a risk classification that describes the type of operation it performs. You can set this in the tools table on the connection detail page by choosing from the dropdown next to each tool.

Classification	Meaning	Example tools
Read	The tool only retrieves data. It does not change anything.	get_order, search_products, get_customer
Write	The tool creates or modifies data.	create_quote, update_status, send_notification
Destructive	The tool deletes data or performs an irreversible action.	delete_customer, cancel_subscription, issue_refund

Risk classification helps you and your team see at a glance what each tool does. It also informs the safety pipeline: destructive tools are strong candidates for a Require approval guardrail.

All tools default to Write when first discovered. Review each tool and adjust the classification to match what it actually does.
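
As an illustration of how a team might audit its classifications, the sketch below applies the default to newly discovered tools and lists destructive tools that have no guardrail yet. The classify and unguarded_destructive helpers and the sample data are hypothetical; the product does not necessarily expose anything like them.

```python
from typing import Dict, List, Set

DEFAULT_RISK = "Write"  # newly discovered tools start with this classification


def classify(discovered: List[str], reviewed: Dict[str, str]) -> Dict[str, str]:
    """Give each tool its reviewed classification, or the default."""
    return {tool: reviewed.get(tool, DEFAULT_RISK) for tool in discovered}


def unguarded_destructive(risk: Dict[str, str], guarded: Set[str]) -> List[str]:
    """List Destructive tools that do not have a guardrail yet."""
    return sorted(
        tool for tool, level in risk.items()
        if level == "Destructive" and tool not in guarded
    )


tools = ["get_order", "create_quote", "delete_customer", "issue_refund"]
risk = classify(tools, {
    "get_order": "Read",
    "delete_customer": "Destructive",
    "issue_refund": "Destructive",
})
todo = unguarded_destructive(risk, guarded={"issue_refund"})
print(risk)
print(todo)
```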

The approval flow

When a guardrail with the “Require approval” type activates, the tool call is paused and the conversation is handed to a human operator. Here is what happens from the operator’s perspective:

The agent attempts a tool call that matches a Require approval guardrail (e.g. a refund of 8,000 against a threshold of 5,000).
The tool call does not execute. The conversation pauses.
An Approval needed card appears in the conversation view, showing which tool was called, what arguments were passed, and why the guardrail triggered.
The operator reviews the details and clicks Approve or Reject.
If approved, the agent resumes and the tool call executes with the original arguments.
If rejected, the agent receives a denial and continues the conversation without that action. It may offer the customer an alternative or explain that the request needs to be handled differently.
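
The steps above can be sketched as a small state transition: a paused call stays pending until an operator decides, and the tool only runs on approval. The PendingCall record, the decide function, and the status names are hypothetical and only illustrate the flow.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional, Tuple


@dataclass
class PendingCall:
    tool: str
    args: Dict[str, Any]
    reason: str                   # why the guardrail triggered
    status: str = "pending"       # pending -> approved | rejected
    result: Optional[Any] = None


def decide(
    call: PendingCall,
    approve: bool,
    execute: Callable[[str, Dict[str, Any]], Any],
) -> PendingCall:
    """Record the operator's decision; run the tool only on approval."""
    if call.status != "pending":
        raise RuntimeError("this call has already been decided")
    if approve:
        call.result = execute(call.tool, call.args)  # original arguments, unchanged
        call.status = "approved"
    else:
        call.status = "rejected"  # the agent gets a denial and carries on
    return call


executed: List[Tuple[str, Dict[str, Any]]] = []


def fake_execute(tool: str, args: Dict[str, Any]) -> Dict[str, Any]:
    # Stand-in for the real tool call on the connected system.
    executed.append((tool, args))
    return {"ok": True}


approved = decide(
    PendingCall("create_refund", {"amount": 8000}, "amount is greater than 5000"),
    approve=True, execute=fake_execute,
)
rejected = decide(
    PendingCall("create_refund", {"amount": 9000}, "amount is greater than 5000"),
    approve=False, execute=fake_execute,
)
print(approved.status, rejected.status, executed)
```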

The approval flow ensures that high-stakes actions always have a human in the loop, while routine operations continue to be handled automatically.

Managing guardrails

All guardrails for a connection are listed in the Guardrails section of the connection detail page. Each guardrail shows the tool it applies to, the policy type, and any condition or mask fields.

To remove a guardrail, click the delete icon next to it. The change takes effect immediately.

You can add multiple guardrails to the same connection. For example, you might have a Require approval guardrail on “create_refund” when the amount exceeds 5,000, a Deny guardrail on “delete_customer” for all calls, and a Mask guardrail on “get_customer” to redact the phone and email fields.
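
A minimal sketch of that three-guardrail setup, with a helper that finds which guardrails apply to a given call. The Guardrail tuple, the "*" stand-in for All tools, and the matching function are hypothetical. How the product resolves several guardrails that match the same call is not covered in this article, so the sketch simply returns all matches.

```python
from typing import Any, Callable, Dict, List, NamedTuple, Optional, Tuple

ALL_TOOLS = "*"  # hypothetical stand-in for the "All tools" option


class Guardrail(NamedTuple):
    tool: str
    action: str
    condition: Optional[Callable[[Dict[str, Any]], bool]] = None  # None: every call
    mask_fields: Tuple[str, ...] = ()


GUARDRAILS = [
    Guardrail("create_refund", "Require approval", lambda args: args.get("amount", 0) > 5000),
    Guardrail("delete_customer", "Deny"),
    Guardrail("get_customer", "Mask", mask_fields=("phone", "email")),
]


def matching(
    tool: str, args: Dict[str, Any], guardrails: List[Guardrail] = GUARDRAILS
) -> List[Guardrail]:
    """Return every guardrail whose tool and condition match this call."""
    return [
        g for g in guardrails
        if g.tool in (tool, ALL_TOOLS) and (g.condition is None or g.condition(args))
    ]


print([g.action for g in matching("create_refund", {"amount": 8000})])
print([g.action for g in matching("create_refund", {"amount": 100})])
print([g.action for g in matching("get_order", {})])
```

A call that matches no guardrail, such as the small refund or the get_order lookup here, proceeds automatically.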

Examples

Here are some common guardrail configurations:

Approve large refunds. Tool: create_refund, Action: Require approval, Condition: amount / is greater than / 5000. Any refund above 5,000 needs operator approval.

Block account deletion. Tool: delete_account, Action: Deny, no condition. No agent can ever delete an account through this connection.

Hide personal contact details. Tool: get_customer, Action: Mask, Fields to mask: email, phone. The agent can look up customer records but will not see raw email addresses or phone numbers in the result.

Flag processed returns for review. Tool: process_return, Action: Escalate, no condition. Because there is no condition, every return processed by the agent is flagged for a human to review after the fact.