What AI Agents Should Never Do on Their Own

Most of the conversation about AI agents focuses on what they can do.

Autonomy gets framed as the goal: give them tools, give them access, let them run.

The more freedom, the better the output.

That framing is generally correct. I use agents daily. They’ve genuinely increased my output. I’m a believer!

And I’ve also lost two hours of work to an agent that was doing exactly what I asked.

I was working on a feature branch cleanup.

The task description said “remove unused files and clean up the repo.” The agent interpreted “unused” broadly, deleted a config directory I hadn’t touched in months but that was still referenced from the deploy script, and kept going.

I caught it during the diff review. The config wasn’t in version control, so I spent two hours reconstructing it from memory and git history.

The task was clear and the agent followed instructions. The only problem was that nothing told it where to stop.

Knowing which tasks to gate is part of running agents well. Give them full freedom on the wrong category and you’ll spend the afternoon undoing what took them thirty seconds.

Hey there! My name is Sara Nóbrega and I teach you how to become an AI power user on Learn AI. Free to subscribe!

What the agent should never touch alone

Some tasks are reversible. For example, a refactored function can be reverted and a new unit test can be removed. The cost of a mistake is low.

Recovery cost varies by task. A refactored function takes seconds to undo, because you just revert the commit, but a dropped production table might take your whole week, if recovery is even possible.

The question to ask before you run a task: can this be undone?

If yes, let the agent move. If no, add a checkpoint before it runs.

Here’s the permission matrix I work from:

Table showing recommended agent autonomy levels and human review requirements by task type. Small refactors and unit tests can have high agent autonomy, while API changes, dependencies, migrations, security, infrastructure, and production deployment require increasing levels of human review. Image by Author and ChatGPT.
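One way to make a matrix like this machine-readable is a small lookup table. This is a minimal sketch, not the matrix itself: the category names follow the task types listed above, but the four review levels and the level assigned to each category are assumptions for illustration. Adjust them to your own risk tolerance.

```python
from enum import IntEnum


class ReviewLevel(IntEnum):
    """Hypothetical review levels, ordered from least to most human involvement."""
    NONE = 0         # agent proceeds; a human reads the diff afterwards
    DIFF_REVIEW = 1  # a human reviews the diff before merge
    APPROVAL = 2     # a human approves before the agent runs anything
    HUMAN_ONLY = 3   # a human performs the step; the agent may only draft


# Categories come from the matrix; the assigned levels are assumptions.
PERMISSION_MATRIX = {
    "small_refactor": ReviewLevel.NONE,
    "unit_tests": ReviewLevel.NONE,
    "api_change": ReviewLevel.DIFF_REVIEW,
    "dependency_update": ReviewLevel.DIFF_REVIEW,
    "migration": ReviewLevel.APPROVAL,
    "security": ReviewLevel.APPROVAL,
    "infrastructure": ReviewLevel.APPROVAL,
    "production_deploy": ReviewLevel.HUMAN_ONLY,
}


def required_review(task_type: str) -> ReviewLevel:
    """Look up a task type; anything unrecognized gets the strictest level."""
    return PERMISSION_MATRIX.get(task_type, ReviewLevel.HUMAN_ONLY)
```

Defaulting unknown task types to the strictest level means a new or misspelled category fails closed rather than open.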

The categories that should always require a human

Some categories require a human checkpoint regardless of how well-specified the task is.

The risk of a mistake is too high, and the recovery cost too steep, to let an agent decide on its own.

What AI agents should not tackle alone, part 1. Image generated with DALL-E. Image by Author and ChatGPT.

1. Destructive file operations

`rm -rf`, `git clean -fd`, `git reset --hard`.

These delete or discard work that may not be recoverable.

An agent will run them if the task description implies cleanup.

I’ve had one run `git clean -fd` in the middle of a refactor because the task said “clean up temporary files.”

My uncommitted work was gone. There was no malfunction: the agent did exactly what the words said. The safeguard is an explicit block list with a confirmation step, not trusting the agent to infer where “clean up” ends.
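A block list with a confirmation step can be sketched in a few lines. The version below is an illustration under stated assumptions: `run_gated`, `execute`, and `confirm` are hypothetical names, and matching on leading tokens is deliberately naive. It will not catch reordered flags such as `rm -fr`, commands prefixed with `sudo`, or commands chained with `&&`, so treat it as a starting point rather than a complete safeguard.

```python
import shlex

# Token prefixes for the commands named above. Prefix matching is an assumption.
BLOCKED_PREFIXES = [
    ["rm", "-rf"],
    ["git", "clean"],
    ["git", "reset", "--hard"],
]


def is_blocked(command: str) -> bool:
    """Return True if the command starts with any blocked token prefix."""
    tokens = shlex.split(command)
    return any(tokens[: len(prefix)] == prefix for prefix in BLOCKED_PREFIXES)


def run_gated(command, execute, confirm):
    """Run `command` via `execute`, asking `confirm` first if it is blocked.

    `execute` and `confirm` are hypothetical callables supplied by the caller,
    e.g. a subprocess wrapper and a prompt that returns True or False.
    """
    if is_blocked(command) and not confirm(command):
        return "skipped"
    execute(command)
    return "executed"
```

In practice `confirm` would pause and ask a human; in an unattended pipeline it can simply return `False` so blocked commands never run.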

2. Database writes and migrations

Any DELETE without a WHERE clause, any DROP or TRUNCATE, any schema migration touching production data.

A typo in a WHERE clause can wipe a table. A migration that runs out of order can corrupt data that’s impossible to reconstruct. Always review before running.
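The statement types above can be flagged with a rough pre-check before anything reaches the database. This is a sketch that uses regular expressions, not a SQL parser: it splits on semicolons, so a semicolon or the word WHERE inside a string literal will confuse it. Treating `ALTER` as a stand-in for "schema migration" is an assumption, and `needs_review` is a hypothetical name.

```python
import re


def needs_review(sql: str) -> bool:
    """Flag DROP, TRUNCATE, ALTER, and DELETE statements that lack a WHERE clause."""
    for statement in sql.split(";"):
        normalized = statement.strip().upper()
        if not normalized:
            continue
        # Schema-level or table-wiping statements always need a human.
        if re.match(r"(DROP|TRUNCATE|ALTER)\b", normalized):
            return True
        # DELETE is flagged only when no WHERE clause is present.
        if re.match(r"DELETE\b", normalized) and not re.search(r"\bWHERE\b", normalized):
            return True
    return False
```

A DELETE that does have a WHERE clause passes this check, but the typo risk described above still applies, so the check narrows what a human must look at rather than replacing the review.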

3. Cloud infrastructure

`terraform apply`, `kubectl delete`, `aws iam *`, `gcloud iam *`.

Infrastructure changes affect live systems and often other teams. Permission changes are especially dangerous because the damage can be invisible until something fails.

What AI agents should not tackle alone, part 2. Image generated with DALL-E. Image by Author and ChatGPT.

4. Production deployments

Any deployment to a production environment should go through a human review step, even if the code was agent-generated.

CI/CD pipelines can run agent output automatically, and that’s fine. The decision to deploy to production is yours.

You know what’s in flight, what incidents are open, what maintenance is scheduled. The agent doesn’t have any of that context, and it can’t ask for it mid-pipeline.

5. Auth and security logic

Authentication flows, authorization rules, token handling, session management.

Bugs here don’t show up in unit tests. They show up in incident reports, sometimes months later.

An agent writing auth logic will produce something that looks correct and passes the happy path.

The dangerous cases are the edge conditions: a token that doesn’t expire under a specific sequence of API calls, a route that bypasses middleware when a parameter is missing.

These are exactly what unit tests miss and what security review catches. Every auth change needs a human who’s specifically looking for these gaps, not one who’s satisfied that the happy path is covered.

6. Secrets, `.env` files, API keys

An agent reading or writing credentials creates exposure risk. Keep this category off-limits by default and handle it manually.
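Keeping secrets off-limits by default can start with a path check that runs before the agent reads or writes a file. A minimal sketch: the patterns cover only `.env` and `.env.*` variants, and both the pattern list and the name `is_secret_path` are assumptions. A real list would also need whatever key and credential files your project actually uses.

```python
from fnmatch import fnmatchcase
from pathlib import PurePath

# Assumed patterns; extend with the credential files your project uses.
SECRET_PATTERNS = [".env", ".env.*"]


def is_secret_path(path: str) -> bool:
    """Return True if the file name matches a secret pattern, in any directory."""
    name = PurePath(path).name
    return any(fnmatchcase(name, pattern) for pattern in SECRET_PATTERNS)
```

Matching on the file name rather than the full path means a `.env` nested deep in a subdirectory is caught the same way as one at the repo root.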

`git push --force` sits in its own category because it rewrites history on the remote. Once pushed, other contributors’ local branches diverge. Recovery is painful and sometimes impossible.

Humans should be careful with all of these commands too. Agents just make them easier to trigger by accident, buried inside a long sequence of otherwise safe steps.

AGENTS.md: write the contract

Give agents specific structure from the start. An AGENTS.md file at the root of your repo tells the agent what the project is, how to run it, and what it’s not allowed to touch without asking.

A vague AGENTS.md gets you an agent filling gaps with guesses. I learned this on a codebase that had no AGENTS.md at all.

The task was “organize the project structure.” The agent moved files across directories based on naming conventions that made sense to it. Everything that referenced those paths broke.

The task took the agent twenty minutes; the cleanup took me two hours. Three lines of scope constraints would have prevented it entirely.

Here’s the template I use:

# AGENTS.md

## Project

[Brief description of the project and tech stack]

## Setup

```bash
# Install
npm install  # or pip install -r requirements.txt

# Run
npm run dev

# Test
npm test

# Lint
npm run lint
```

## Coding rules

- Make minimal changes. Do not refactor unrelated code.

- If behavior changes, add or update tests.

- Do not touch files outside the scope of the task.

- Keep diffs readable. One concern per commit.

## Safety rules

Ask before running any command in blocked_commands.md.

If you're unsure whether a command is safe, stop and ask.

## Definition of done

- Tests pass

- Diff is explainable in one sentence

- Final report provided (see below)

## Final report format

After every task, provide:

1. Summary of changes

2. Files modified

3. Tests run and result

4. Risks or assumptions

5. Anything not completed


The companion file, blocked_commands.md, lists exactly what needs human approval before running:

# blocked_commands.md

## Destructive file operations

- rm -rf

- git clean -fd

- git reset --hard

## Git operations

- git push --force

- git push --force-with-lease

## Database operations

- DROP TABLE

- TRUNCATE TABLE

- DELETE without WHERE clause

- Any migration that alters a production schema

## Cloud / infrastructure

- terraform apply

- kubectl delete

- aws iam *

- gcloud iam *

## Secrets

- Any command reading or writing .env files

- Any command touching API keys or credentials

When the AGENTS.md is vague, the agent guesses. When it’s specific, the agent executes. The file is your contract. Write it before you start the task, not after something breaks.
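If you want tooling to enforce the contract as well as the agent reading it, the block list can be parsed straight from the markdown. The sketch below assumes the file uses `##` section headings and `- ` bullets as in the template. `parse_blocklist` and `blocked_section` are hypothetical helper names, and the prefix matching is naive: prose entries such as "Any command touching API keys" will never match a real command and still need a human or the agent to interpret them.

```python
from typing import Optional


def parse_blocklist(markdown: str) -> dict:
    """Parse '## Section' headings and '- entry' bullets into {section: [entries]}."""
    sections = {}
    current = None
    for raw_line in markdown.splitlines():
        line = raw_line.strip()
        if line.startswith("## "):
            current = line[3:]
            sections[current] = []
        elif line.startswith("- ") and current is not None:
            sections[current].append(line[2:])
    return sections


def blocked_section(command: str, sections: dict) -> Optional[str]:
    """Return the section whose entry prefixes the command, or None if none does."""
    for name, entries in sections.items():
        for entry in entries:
            # A trailing '*' (as in 'aws iam *') is treated as "any suffix".
            prefix = entry.rstrip("*").strip()
            if prefix and command.strip().startswith(prefix):
                return name
    return None
```

Returning the section name rather than a bare boolean lets the confirmation prompt say why a command was stopped.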

Check my two latest articles, where you can learn how to give your AI unlimited context and discover six common hard decisions AI Engineers must make in production.

The two-agent loop

For anything medium-complexity or above, don’t use one agent; use two.

Agent 1 implements. Agent 2 reviews. Then Agent 1 applies only the critical feedback.

Implementer prompt:

You are a senior software engineer implementing a specific task.

Task: [describe the task]

Context: [link to AGENTS.md or paste relevant sections]

Rules:

- Make minimal changes.

- Stay in scope.

- Do not refactor unrelated code.

- Add tests if behavior changes.

- When done, provide a final report: summary, files modified,

  tests run, risks, anything incomplete.

Reviewer prompt:

You are a code reviewer with no attachment to the implementation.

Review this diff: [paste diff]

Check for:

- Bugs and edge cases

- Missing tests

- Security issues

- Unintended behavior changes

- Anything outside the stated scope

Output:

- Critical issues (must fix)

- Minor issues (optional)

- Anything you'd flag for a human

Do not rewrite the code. Flag, do not fix.

The reviewer agent has no ego investment in the code. It looks for bugs, edge cases, test coverage, and security issues without trying to redo the work.

Code review is how you catch what you missed. The two-agent loop is the same process, automated.
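The control flow of the loop is small enough to sketch. In the code below, `implement` and `review` are hypothetical callables standing in for however you invoke your two agents. No particular agent framework or API is assumed, and `ReviewResult` is an illustrative container for the reviewer's three output buckets.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ReviewResult:
    """Reviewer output, mirroring the three buckets in the reviewer prompt."""
    critical: List[str] = field(default_factory=list)
    minor: List[str] = field(default_factory=list)
    flag_for_human: List[str] = field(default_factory=list)


def two_agent_pass(
    task: str,
    implement: Callable[[str, List[str]], str],
    review: Callable[[str], ReviewResult],
) -> Tuple[str, ReviewResult]:
    """Implement, review, then re-implement with only the critical feedback."""
    diff = implement(task, [])  # first attempt, no feedback yet
    result = review(diff)
    if result.critical:
        # Minor issues are deliberately not passed back to the implementer.
        diff = implement(task, result.critical)
    return diff, result
```

The returned `ReviewResult` still carries the minor issues and human flags, so nothing the reviewer raised is lost even though only the critical items are acted on.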

The final report

Require a final report for every agent task:

1. Summary of changes

2. Files modified

3. Tests run and result

4. Risks or assumptions

5. Anything not completed

This makes the agent accountable. If it can’t summarize what it did in clear terms, that’s a signal the task wasn’t clear.

It also builds up documentation without you writing it manually. The reports stack. When something breaks a week later, you can trace back exactly what changed and why.
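If reports are going to stack into documentation, it helps to give them a fixed shape and reject incomplete ones. A minimal sketch, assuming the agent's report has already been parsed into fields: the `FinalReport` class and `missing_sections` helper are hypothetical names, and the rule that every section must be non-empty (with an explicit "none" where nothing applies) is an assumption.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class FinalReport:
    """The five sections required after every agent task."""
    summary: str
    files_modified: List[str]
    tests_run: str
    risks: str
    not_completed: str  # expect an explicit "none" rather than a blank


def missing_sections(report: FinalReport) -> List[str]:
    """Return the names of sections the agent left empty."""
    return [f.name for f in fields(report) if not getattr(report, f.name)]
```

For example, a report with an empty `risks` field yields `["risks"]`, which is a cue to send the task back before accepting the work.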

The unglamorous work

The hype around AI agents is here to stay, and mostly earned. They do increase your output.

The practitioners getting the most from them are those who did the setup work: wrote the AGENTS.md, thought through the permission levels, built the blocked commands list, set up the two-agent loop.

Agents work well when they have clear instructions. That part is on you.

Thanks for reading!

You can find me on LinkedIn and Substack, where I share more about AI and LLMs.
