AI Agent Effect Firewall
Execution firewall for AI agents. Blocks untrusted inputs from triggering real-world actions by enforcing decision-level allow / deny / approval gates. Infrastructure safety layer for tool-enabled agents


Decision‑level execution firewall for tool‑enabled AI agents
(Execution boundaries, not prompt scanning)


TL;DR

AI Agent Effect Firewall is a drop‑in execution guard for agent systems.

It decides whether an agent is allowed to do something — not what it is allowed to say.

For every requested effect (tool call, API call, filesystem access, publish action), the firewall returns one of three decisions:

  • allow
  • deny
  • require_approval

with a risk score and explicit reasons.

This is infrastructure, not an agent.


Why this exists (real problem, not theory)

Modern AI systems increasingly look like this:

External input (email / RSS / web / docs)
  ↓
LLM reasoning
  ↓
Tool / API / filesystem / publish

This architecture is fundamentally unsafe by default.

Why?

Because natural language does not preserve trust boundaries.

An LLM can unintentionally turn:

  • scraped text
  • emails
  • RSS headlines
  • documents

into real‑world actions.

This is usually called prompt injection.

That name is misleading.

The real problem is:

Reasoning and execution are coupled without a mandatory decision gate.


What this firewall actually does

AI Agent Effect Firewall sits between reasoning and execution.

It evaluates effects, not text.

For every request it checks:

  • What effect is requested
    • filesystem.read
    • api.call
    • publish_post
    • delegation
  • Where the influence came from
    • system
    • internal agent
    • external input
  • What resource is targeted
    • file path
    • API host
    • external service
  • Policy
    • deny lists
    • allow lists
    • sensitivity patterns
    • risk thresholds

It returns a deterministic decision with audit logs.
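
If your orchestrator is written in TypeScript, the request and decision shapes can be modeled directly from the /evaluate examples later in this document. A minimal sketch, covering only the fields those examples actually show; anything beyond them is deliberately left out:

// Request shape, inferred from the /evaluate curl examples below.
type ContextOrigin = "system" | "internal_agent" | "external_input";

interface EffectRequest {
  requestedEffect: string;      // e.g. "filesystem.read", "api.call", "publish_post"
  contextOrigin: ContextOrigin; // where the influence came from
  resource?: string;            // e.g. a file path
  meta?: { host?: string };     // e.g. the target API host
  action?: unknown;             // tool-call descriptor, as shown in the examples
}

// Decision shape, inferred from the example responses.
interface FirewallDecision {
  decision: "allow" | "deny" | "require_approval";
  riskScore: number;            // e.g. 72 in the publish_post example below
  reasons: string[];            // explicit, auditable reasons
}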


What this firewall is NOT

  • ❌ Not a prompt scanner
  • ❌ Not a jailbreak detector
  • ❌ Not content moderation
  • ❌ Not sentiment or intent analysis
  • ❌ Not an LLM wrapper

It does not read prompts.

It enforces execution boundaries.


Concrete architecture placement

Untrusted input
  ↓
LLM planning / reasoning
  ↓
Requested effect (tool call)
  ↓
AI Agent Effect Firewall   ← YOU ARE HERE
  ↓
Execution OR approval OR block

This makes execution non‑bypassable.
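
Concretely, the gate is one call in your orchestrator between planning and execution. A minimal TypeScript sketch using the EffectRequest and FirewallDecision shapes above; executeTool and queueForApproval are hypothetical stand-ins for your own tool runner and approval queue, and the URL is a placeholder exactly as in the curl examples below:

// Hypothetical integration points: replace with your own implementations.
declare function executeTool(req: EffectRequest): Promise<void>;
declare function queueForApproval(req: EffectRequest, verdict: FirewallDecision): Promise<void>;

const FIREWALL_URL = "https://<RUN>.runs.apify.net"; // placeholder, as in the curl examples

// Every requested effect passes through the firewall before it can execute.
async function guardedExecute(request: EffectRequest): Promise<void> {
  const res = await fetch(`${FIREWALL_URL}/evaluate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(request),
  });
  const verdict = (await res.json()) as FirewallDecision;

  switch (verdict.decision) {
    case "allow":
      await executeTool(request);
      break;
    case "require_approval":
      await queueForApproval(request, verdict); // human-in-the-loop
      break;
    case "deny":
      console.warn("Effect blocked:", verdict.reasons);
      break;
  }
}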


Real usage example (practical)

Example: RSS → LLM → social media posting

Without the firewall:

  • An RSS headline influences the LLM
  • The LLM calls publish_post
  • The post goes live

With the firewall:

  1. The LLM proposes an action:

     publish_post(target="social_media")

  2. Your orchestrator sends this to the firewall:

     {
       "requestedEffect": "publish_post",
       "contextOrigin": "external_input",
       "action": {
         "type": "tool_call",
         "name": "publish_post",
         "target": "social_media"
       }
     }

  3. The firewall responds:

     {
       "decision": "require_approval",
       "riskScore": 72,
       "reasons": [
         "High‑impact effect influenced by external input"
       ]
     }

  4. The post does NOT go live automatically.

This is the difference between automation and controlled automation.
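
Wired through the guardedExecute sketch from the architecture section, the same flow becomes a single call. Illustrative only; the payload mirrors step 2 above:

// The orchestrator never calls publish_post directly; it asks the firewall first.
await guardedExecute({
  requestedEffect: "publish_post",
  contextOrigin: "external_input", // the idea originated from an RSS headline
  action: { type: "tool_call", name: "publish_post", target: "social_media" },
});
// Given the decision above, this lands in the approval queue instead of going live.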

Executable test examples (curl)

Health check

curl https://<RUN>.runs.apify.net/health

External input tries to read private SSH key (DENY)

curl -X POST https://<RUN>.runs.apify.net/evaluate \
  -H "Content-Type: application/json" \
  -d '{
    "requestedEffect": "filesystem.read",
    "contextOrigin": "external_input",
    "resource": "/home/user/.ssh/id_rsa",
    "action": { "type": "tool_call", "name": "filesystem.read" }
  }'

Expected:

  • decision: deny
  • high riskScore
  • reason: sensitive resource + external origin

External input performing read‑only navigation (ALLOW)

curl -X POST https://<RUN>.runs.apify.net/evaluate \
  -H "Content-Type: application/json" \
  -d '{
    "requestedEffect": "browser.navigate",
    "contextOrigin": "external_input",
    "action": {
      "type": "tool_call",
      "name": "browser.navigate",
      "params": { "url": "https://example.com" }
    }
  }'

Internal agent reading internal file (ALLOW)

curl -X POST https://<RUN>.runs.apify.net/evaluate \
  -H "Content-Type: application/json" \
  -d '{
    "requestedEffect": "filesystem.read",
    "contextOrigin": "internal_agent",
    "resource": "/data/report.csv",
    "action": { "type": "tool_call", "name": "filesystem.read" }
  }'

Allow‑listed API call (ALLOW)

curl -X POST https://<RUN>.runs.apify.net/evaluate \
  -H "Content-Type: application/json" \
  -d '{
    "requestedEffect": "api.call",
    "contextOrigin": "system",
    "meta": { "host": "api.notion.com" },
    "action": "api.call"
  }'

Non‑allow‑listed API call (REQUIRE_APPROVAL)

curl -X POST https://<RUN>.runs.apify.net/evaluate \
  -H "Content-Type: application/json" \
  -d '{
    "requestedEffect": "api.call",
    "contextOrigin": "system",
    "meta": { "host": "unknown.example.com" },
    "action": "api.call"
  }'

Real‑world case: Clawdbot / Moltbot (2026)

In early 2026, the Clawdbot project (later renamed Moltbot) became a widely discussed example of agent exposure gone wrong.

What happened:

  • Tool‑enabled agents were exposed to the internet
  • Agents ran with broad filesystem and execution privileges
  • No non‑bypassable execution gate existed
  • Untrusted influence could reach real effects

There was no sophisticated AI attack.

The failure was architectural.

Language crossed trust boundaries and directly caused execution.


Why this is NOT a prompt injection story

No clever prompts were required.

The system failed because:

  • reasoning and execution were coupled
  • agents ran with production privileges
  • no deterministic decision gate existed

This is a confused deputy problem, amplified by autonomy.


How this firewall would have changed the outcome

With AI Agent Effect Firewall:

  • external influence could not directly trigger tools
  • filesystem and API access would be gated
  • high‑impact effects would require approval
  • every attempt would be logged

The incident would have been contained, not viral.


Threat model (concise)

Assets

  • filesystem
  • APIs and credentials
  • public outputs
  • delegation capability

Trust levels

  1. system / developer
  2. internal agents
  3. external input

Security guarantee

No irreversible or privileged effect executes without passing a deterministic, auditable decision gate.
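
To make "deterministic and auditable" concrete, here is one way such a gate can be implemented. This is an illustrative sketch of the pattern, not this Actor's actual rule set: the patterns, allow list, weights, and thresholds below are assumptions chosen so the sketch reproduces the documented curl outcomes.

// Illustrative decision logic: deterministic rules, no model in the loop.
// All lists, weights, and thresholds are assumptions, not the Actor's real policy.
const SENSITIVE_PATTERNS = [/\.ssh\//, /\.env$/, /credentials/i]; // assumed sensitivity patterns
const ALLOWED_HOSTS = new Set(["api.notion.com"]);                // assumed allow list
const HIGH_IMPACT = new Set(["publish_post", "delegation"]);      // assumed high-impact effects
const APPROVAL_THRESHOLD = 50;                                    // assumed risk threshold

function decide(req: EffectRequest): FirewallDecision {
  const reasons: string[] = [];
  let riskScore = 0;

  // External influence raises risk for every effect.
  if (req.contextOrigin === "external_input") {
    riskScore += 40;
    reasons.push("Effect influenced by external input");
  }

  // Sensitive resources add enough weight that external influence means deny.
  const resource = req.resource;
  if (resource && SENSITIVE_PATTERNS.some((p) => p.test(resource))) {
    riskScore += 50;
    reasons.push("Sensitive resource targeted");
  }

  // High-impact effects (publishing, delegation) add weight.
  if (HIGH_IMPACT.has(req.requestedEffect)) {
    riskScore += 30;
    reasons.push("High-impact effect requested");
  }

  // Hosts off the allow list need a human decision.
  const host = req.meta?.host;
  if (req.requestedEffect === "api.call" && host && !ALLOWED_HOSTS.has(host)) {
    riskScore += 60;
    reasons.push(`Host not on allow list: ${host}`);
  }

  const decision =
    riskScore >= 90 ? "deny" :
    riskScore >= APPROVAL_THRESHOLD ? "require_approval" :
    "allow";

  return { decision, riskScore, reasons };
}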


Why this matters

LLMs are not the problem.

Unbounded execution is.

This firewall turns agent systems from hope‑based automation into controlled, observable systems.

License

Open pattern.
Free to use, modify, and adapt.