Experiment 01 results are live

We study what AI agents
do wrong.

AI agents are sending emails, updating CRM records, and taking real actions. We test how they fail — and build open-source tools to control them.

Finding: ~1 in 5 agent-generated emails contained risky content — credentials, PII, internal strategy.

Open-source tools for evaluating and controlling agent actions.

What we're seeing in real tests

We ran 19 real-world agent email scenarios using GPT-4o-mini. No guardrails. No oversight. Here's what happened.

~1 in 5

agent emails contained risky content

Credentials, sensitive data, internal strategy

~70%

of scenarios had sensitive data in context

The agent had access — it just didn't always leak it

Random

not consistent failure — unpredictable failure

Same scenario, different runs, different results

That's not safety. That's randomness.

When your agent doesn't leak data, it's not because it knows better — it's because it got lucky.

AI agents are powerful.
That's the problem.

Send the wrong email

Leak sensitive data

Update the wrong deal

Take actions you didn't expect

One bad action breaks trust.

What we're studying

We focus on what agents do — not just what they say.

Sales agents

Wrong pricing, wrong recipients, leaked internal strategy

Support agents

Bad replies, PII exposure, credential leaks

Internal agents

Incorrect CRM updates, unauthorized data changes

What catching a bad action looks like

An agent tries to send sensitive customer data to an external recipient. A control layer intercepts it.

tether — agent-control-layer
Agent>    sendEmail({ to: "client@external.com" })
Body>     "Attaching full customer dataset…"
Blocked   Sensitive data detected in email body
Flagged   External recipient — not in approved domain list
Action    Approval required — Slack notification sent to manager@company.com
Approved  manager@company.com approved — email sent — event logged
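A control check like the one above can be sketched in a few lines. This is a minimal illustration, not Tether's actual implementation — the domain allowlist and sensitive-content patterns are placeholder assumptions:

```python
import re
from dataclasses import dataclass

# Assumption: a simple internal allowlist stands in for real policy config.
APPROVED_DOMAINS = {"company.com"}

# Assumption: illustrative patterns only — real detection would be broader.
SENSITIVE_PATTERNS = [
    re.compile(r"customer dataset", re.IGNORECASE),
    re.compile(r"password|api[_ ]?key", re.IGNORECASE),
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # SSN-like number
]

@dataclass
class Decision:
    allowed: bool
    reasons: list

def check_send_email(to: str, body: str) -> Decision:
    """Intercept a sendEmail action before execution."""
    reasons = []
    domain = to.rsplit("@", 1)[-1]
    if domain not in APPROVED_DOMAINS:
        reasons.append(f"External recipient — {domain} not in approved domain list")
    if any(p.search(body) for p in SENSITIVE_PATTERNS):
        reasons.append("Sensitive data detected in email body")
    # Any finding holds the action for human approval instead of sending.
    return Decision(allowed=not reasons, reasons=reasons)
```

Run against the scenario above, the check blocks the send and records both findings, so the action can be routed to a human for approval rather than executed silently.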
Tether Research

We test how AI agents behave
when they send emails, update CRM records, and take real actions.

Everyone is building AI agents. Nobody is measuring what goes wrong when those agents take real-world actions. We run the experiments and publish everything.

Published

The Unsupervised Agent

What happens when you give an AI agent email access with zero guardrails?

19 scenarios. GPT-4o-mini. No rules. ~1 in 5 emails contained risky content — credentials, internal data, PII.

Coming soon

Guardrails vs. No Guardrails

Same agent, same scenarios — what changes when a control layer is in the middle?

A direct before-and-after comparison. How many risky actions get caught? How many slip through?

Coming soon

The 100 Email Test

At scale, how often do agent emails require intervention?

100 email scenarios. One agent. Full audit trail. Statistical confidence on failure rates.

Without control vs. with control

The difference between an agent you hope works and one you know works.

Without control

With control

Unpredictable behavior

Controlled actions

Silent failures

Explicit decisions

No visibility

Full audit trail

Hope the model behaves

Verify before execution
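"Verify before execution" with a full audit trail can be sketched as a wrapper around any agent tool. This is a hypothetical illustration — the in-memory log and the `internal_only` policy are placeholder assumptions, not a real API:

```python
import time
from typing import Callable

# Assumption: an in-memory list stands in for durable audit storage.
AUDIT_LOG = []

def verified(check: Callable[..., bool]):
    """Wrap a tool so every call is checked — and logged — before it runs."""
    def wrap(tool: Callable):
        def run(**kwargs):
            allowed = check(**kwargs)
            AUDIT_LOG.append({
                "tool": tool.__name__,
                "args": kwargs,
                "decision": "allowed" if allowed else "blocked",
                "ts": time.time(),
            })
            if not allowed:
                # Explicit decision — a blocked action is visible, never silent.
                return {"status": "blocked"}
            return tool(**kwargs)
        return run
    return wrap

def internal_only(to: str, **_) -> bool:
    # Assumption: a trivial policy — real checks would inspect content too.
    return to.endswith("@company.com")

@verified(internal_only)
def send_email(to: str, body: str):
    return {"status": "sent", "to": to}
```

Every call — allowed or blocked — lands in the audit log with its arguments and decision, which is the difference between hoping the model behaves and verifying before execution.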

Study the problem publicly.
Build the solution openly.

We're publishing experiments, sharing findings, and building open-source tools for controlling agent actions.