Action execution is where agents earn trust
Anyone can wire a model to a tool. The hard part is the action layer: the governed boundary where an agent stops generating text and starts changing your database, your ledger, your customers' accounts — safely, idempotently, and on the record.
- Typed, versioned tool registry
- Validate → authorize → execute → record
- Idempotency keys & compensating steps
- Reversibility-aware routing
Where generation ends and consequences begin
A chatbot's worst failure is a wrong sentence. An agent's worst failure is a wrong wire transfer.
The moment you give a model tools, you've crossed a line: its output now has side effects in systems of record. The action-execution layer is the seam at that line — a thin, deliberate piece of engineering that sits between the model's intent and your infrastructure's reality.
We treat the model as an untrusted planner. It proposes an action and arguments; it never touches your systems directly. The execution layer validates the request, checks who is allowed to do what, decides whether the action can run autonomously, runs it through a real adapter, and writes down exactly what happened. None of that logic lives in the prompt — it lives in code you can test, review, and audit.
What the action layer actually does
Every tool call passes through the same pipeline before it changes anything. The model never skips a stage.
Validate
Arguments are checked against a typed schema. Malformed or hallucinated payloads fail closed at the boundary — they never reach your systems.
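To make "fail closed" concrete, here is a minimal Python sketch of schema validation at the boundary. It is an illustration under stated assumptions, not the production implementation: the `refund` tool, its fields `order_id` and `amount_cents`, and the names `RefundArgs`, `validate_refund`, and `ValidationError` are all hypothetical.

```python
from dataclasses import dataclass


class ValidationError(Exception):
    """Raised when a proposed tool call does not match its schema."""


@dataclass(frozen=True)
class RefundArgs:
    # Hypothetical schema for a hypothetical "refund" tool.
    order_id: str
    amount_cents: int


def validate_refund(payload: dict) -> RefundArgs:
    """Return typed arguments, or raise before anything downstream runs."""
    expected = {"order_id": str, "amount_cents": int}
    if set(payload) != set(expected):
        # Missing and extra fields are both rejected, never silently dropped.
        raise ValidationError(f"fields do not match schema: {sorted(map(str, payload))}")
    for name, typ in expected.items():
        value = payload[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, typ):
            raise ValidationError(f"{name} must be {typ.__name__}")
    if payload["amount_cents"] <= 0:
        raise ValidationError("amount_cents must be positive")
    return RefundArgs(**payload)
```

The design choice worth noting is that the function either returns a typed value or raises. There is no path where a partially valid payload continues to the adapter.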
Authorize
Each action runs under a scoped identity with least-privilege permissions, so the agent can only ever do what that workflow is allowed to do.
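A least-privilege check can be very small. The Python sketch below assumes a hypothetical `ScopedIdentity` per workflow and made-up scope strings such as `refunds:create`; a real deployment would more likely delegate this to an existing IAM or policy system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScopedIdentity:
    # Hypothetical identity: one per workflow, carrying only the scopes it needs.
    workflow: str
    scopes: frozenset


def authorize(identity: ScopedIdentity, required_scope: str) -> bool:
    """Deny by default: allow only if the exact scope was granted."""
    return required_scope in identity.scopes


# Illustrative identity for a support workflow that may read orders and issue refunds.
support_bot = ScopedIdentity(
    workflow="support-refunds",
    scopes=frozenset({"orders:read", "refunds:create"}),
)
```

With this identity, `authorize(support_bot, "refunds:create")` is allowed, while `authorize(support_bot, "accounts:delete")` is denied because that scope was never granted.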
Route by risk
We classify actions by reversibility and blast radius, then send high-stakes ones to an approval gate or a risk threshold first.
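One way to picture risk routing is as a small pure function. This Python sketch is an assumption-laden illustration: it treats blast radius as a count of records an action could touch, uses an arbitrary `auto_limit` threshold, and the `Route` names are hypothetical labels for the autonomous, staged, and gated paths.

```python
from enum import Enum


class Route(Enum):
    AUTO = "auto"      # run without a human
    STAGED = "staged"  # prepare and preview before applying
    GATED = "gated"    # wait for explicit approval


def route_action(reversible: bool, blast_radius: int, auto_limit: int = 10) -> Route:
    """Pick a route from two hypothetical inputs.

    blast_radius is assumed to be a count of records the action could touch;
    auto_limit is an illustrative threshold, not a recommended value.
    """
    if not reversible:
        # Anything that cannot be undone always waits for a person.
        return Route.GATED
    if blast_radius > auto_limit:
        # Undoable but wide-reaching: stage it so it can be previewed first.
        return Route.STAGED
    return Route.AUTO
```

Because the function has no side effects, the routing policy itself can be unit-tested and reviewed like any other code.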
Execute
A real adapter performs the side effect — an API call, a DB write, a message — behind an idempotency key so retries can't duplicate it.
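The core of an idempotency key fits in a few lines. The Python sketch below is deliberately simplified: `IdempotentExecutor` is a hypothetical name, results are held in memory, and it does not handle a crash between running the effect and storing the result. A real system would typically keep keys in a durable store and, where the downstream API supports it, pass the key through to that API as well.

```python
from typing import Any, Callable


class IdempotentExecutor:
    """Runs each side effect at most once per key (in-memory sketch)."""

    def __init__(self) -> None:
        # Assumption: a real system would keep this in a durable store.
        self._completed: dict[str, Any] = {}

    def run(self, key: str, effect: Callable[[], Any]) -> Any:
        if key in self._completed:
            # Retry of a finished action: return the stored result, do nothing.
            return self._completed[key]
        result = effect()
        self._completed[key] = result
        return result
```

Calling `run` twice with the same key invokes `effect` once. The second call returns the stored result, which is what lets a retry after a dropped connection stay harmless.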
Gate when needed
Irreversible or high-value actions pause for human approval, with the full proposed payload shown before anyone clicks yes.
Record
Inputs, outputs, the deciding model version, and the human approver (when an action was gated) are written to an immutable lineage trail for every action.
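The page does not say how immutability is achieved, so the following Python sketch shows one common technique as an assumption: a hash chain, where each entry commits to the one before it. This makes edits detectable rather than impossible, and the `LineageTrail` class and record fields are hypothetical.

```python
import hashlib
import json


class LineageTrail:
    """Append-only log; each entry commits to the one before it."""

    GENESIS = "0" * 64  # placeholder hash for the first entry

    def __init__(self) -> None:
        self._entries = []

    def append(self, record: dict) -> str:
        prev_hash = self._entries[-1]["hash"] if self._entries else self.GENESIS
        # sort_keys gives a stable serialization, so the hash is reproducible.
        body = json.dumps(record, sort_keys=True)
        digest = hashlib.sha256((prev_hash + body).encode()).hexdigest()
        self._entries.append({"prev": prev_hash, "body": body, "hash": digest})
        return digest

    def verify(self) -> bool:
        """Recompute every hash; any edited or reordered entry breaks the chain."""
        prev_hash = self.GENESIS
        for entry in self._entries:
            expected = hashlib.sha256((prev_hash + entry["body"]).encode()).hexdigest()
            if entry["prev"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True
```

A record might carry the tool name, arguments, result, model version, and approver. Storing the latest hash somewhere outside the log is what lets an auditor detect later rewrites.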
From proposed action to recorded result
The same four moves run on every action, whether it's reading a row or refunding a customer.
Propose
The model emits a structured tool call. It's a request, not a command — nothing has happened yet.
Adjudicate
We validate the schema, check permissions, and look up the action's risk tier to decide auto vs. gated vs. staged.
Effect
An idempotent adapter runs the side effect. Multi-step actions execute as a saga with compensating steps defined up front.
Reconcile
The real result — success, failure, or partial — is written to lineage and fed back so the agent plans against reality, not assumptions.
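Put together, the four moves can be sketched as a single function. This Python example is a toy under stated assumptions: `ToolCall`, `Pipeline`, the `auto_tools` allow-list, and the status strings are all hypothetical, and adjudication is reduced to "is the tool known, and may it run without a gate".

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    # Propose: what the model emits. A request, nothing more.
    tool: str
    args: dict


@dataclass
class Pipeline:
    adapters: dict   # tool name -> callable(args) that performs the side effect
    auto_tools: set  # tools cleared to run without approval (illustrative policy)
    lineage: list = field(default_factory=list)

    def handle(self, call: ToolCall) -> dict:
        # Adjudicate: decide before anything runs.
        if call.tool not in self.adapters:
            observation = {"status": "rejected", "reason": "unknown tool"}
        elif call.tool not in self.auto_tools:
            observation = {"status": "pending_approval"}
        else:
            # Effect: only now does a side effect happen.
            try:
                result = self.adapters[call.tool](call.args)
                observation = {"status": "success", "result": result}
            except Exception as exc:
                observation = {"status": "failure", "reason": str(exc)}
        # Reconcile: record what really happened and return it to the agent.
        self.lineage.append({"tool": call.tool, "args": call.args, **observation})
        return observation
```

Every branch, including rejections, ends in the same reconcile step, so the agent's next plan is based on the recorded outcome instead of what it expected to happen.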
Idempotency, retries, and rollback
Distributed actions fail in the middle. The network drops after the charge succeeds but before you record it; the agent retries and charges twice. A naive tool loop does this by default.
We design every action to be safely repeatable. Each gets a stable execution key, so a retry of an already-completed action is a no-op rather than a second side effect. Actions that span multiple systems run as sagas: an ordered set of steps, each with an explicit compensating action that undoes it. Anything we genuinely can't compensate — a sent email, a filed report — is routed to a human gate instead of retried on hope.
- Idempotency keys on every side effect
- Sagas with compensating steps for multi-system actions
- Non-compensable actions gated, never blind-retried
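The saga pattern described above can be sketched in Python as a list of (do, undo) pairs. This is a minimal illustration, assuming compensating steps themselves succeed; `run_saga` and the step names in the usage example are hypothetical.

```python
from typing import Callable, List, Tuple

Step = Tuple[Callable[[], None], Callable[[], None]]  # (do, undo)


def run_saga(steps: List[Step]) -> bool:
    """Run steps in order; on failure, undo the completed ones in reverse.

    Returns True if every step ran, False if the saga was rolled back.
    """
    completed_undos = []
    for do, undo in steps:
        try:
            do()
        except Exception:
            # Compensate newest-first so later steps are undone before earlier ones.
            for compensate in reversed(completed_undos):
                compensate()
            return False
        completed_undos.append(undo)
    return True
```

For example, if the steps are "reserve inventory" then "charge card" and the charge raises, only the reservation is released. The failed step's own undo never runs because that step never completed.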
Naive tool loop vs. a governed action layer
Both let an agent call functions. Only one is safe to point at production.
| | A naive tool loop | An Automatic.co action layer |
|---|---|---|
| Trust model | Model output executes directly | Model proposes; code validates and executes |
| Bad arguments | Reach your systems and error | Fail closed at a schema boundary |
| Retries | Duplicate the side effect | Idempotency keys make them safe |
| High-risk actions | Run like any other | Routed to approval gates or thresholds |
| Partial failure | Inconsistent state | Compensating steps or a human gate |
| Auditability | Logs, if you're lucky | Immutable lineage per action |
Frequently asked questions
What's the difference between tool calling and action execution?
Tool calling is the model emitting a structured request to invoke a function. Action execution is everything that happens after: validating arguments, enforcing permissions, running the side effect, handling partial failure, and recording what changed. The model proposes; the execution layer disposes — and that layer is plain, testable code you own.
How do you stop an agent from doing something irreversible?
We classify every action by reversibility and blast radius, then route accordingly. Read-only and easily-undone actions run autonomously; irreversible or high-value ones pass through an approval gate or a risk threshold first. Destructive operations are wrapped so they can be staged, previewed, and rolled back.
What happens when an action fails halfway through?
We make actions idempotent and assign each one an execution key, so a retry can't double-charge or double-send. Multi-step actions get a saga with explicit compensating steps, and anything we can't safely compensate is gated behind a human checkpoint rather than retried blindly.
Can the agent call tools that don't exist yet, or pick the wrong one?
The model only ever sees a typed, versioned tool registry. A call to a tool that isn't in the registry is rejected, and arguments are schema-validated before execution and rejected if they don't conform, so a hallucinated tool name or malformed payload fails closed at the boundary instead of reaching your systems.
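As a rough Python sketch of what a typed, versioned registry lookup might look like: the registry contents, the `refund_order` tool, and the helper names `resolve_tool` and `check_args` are all hypothetical, chosen only to show the shape of the idea.

```python
class UnknownTool(Exception):
    """Raised when a proposed call names a tool or version not in the registry."""


# Hypothetical registry: (tool name, version) -> argument schema.
TOOL_REGISTRY = {
    ("refund_order", 1): {"order_id": str, "amount_cents": int},
    ("refund_order", 2): {"order_id": str, "amount_cents": int, "reason": str},
}


def resolve_tool(name: str, version: int) -> dict:
    """Look up a schema; a hallucinated name or version fails closed here."""
    schema = TOOL_REGISTRY.get((name, version))
    if schema is None:
        raise UnknownTool(f"{name} v{version} is not registered")
    return schema


def check_args(schema: dict, args: dict) -> bool:
    """True only if args has exactly the schema's fields with matching types."""
    return set(args) == set(schema) and all(
        isinstance(args[key], typ) for key, typ in schema.items()
    )
```

Keying on both name and version means a payload written for version 1 is checked against the version 1 schema, and the same payload is rejected by version 2 because the `reason` field is missing.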
Show us the action you're afraid to automate
Bring the one operation that scares you — the refund, the wire, the delete. We'll design the execution layer that makes it safe to hand to an agent.