
    What AI Agents Should Never Do on Their Own

    By Editor Times Featured · June 3, 2026 · 9 min read


    Most discussion of AI agents focuses on what they can do.

    Autonomy gets framed as the goal: give them tools, give them access, let them run.

    The more freedom, the better the output.

    That framing is mostly right. I use agents daily. They’ve genuinely increased my output. I’m a believer!

    And I’ve also lost two hours of work to an agent that was doing exactly what I asked.

    I was working on a feature branch cleanup.

    The task description said “remove unused files and clean up the repo.” The agent interpreted “unused” broadly, deleted a config directory I hadn’t touched in months but that the deploy script still referenced, and kept going.

    I caught it during the diff review. The config wasn’t in version control. I spent two hours reconstructing it from memory and git history.

    The task was clear and the agent followed instructions; the only problem was that nothing told it where to stop.

    Knowing which tasks to gate is part of running agents well. Give them full freedom on the wrong category and you’ll spend the afternoon undoing what took them thirty seconds.


    Hey there! My name is Sara Nóbrega and I teach you how to become an AI power user on Learn AI. Free to subscribe!


    What the agent should never touch alone

    Some tasks are reversible. For example, a refactored function can be reverted or a new unit test can be removed. The cost of a mistake is low.

    Recovery cost varies by task. A refactored function takes seconds to restore, since you just revert the commit, but a dropped production table might take your entire week, if recovery is even possible.

    The question before you run a task: can this be undone?

    If yes, let the agent move. If no, add a checkpoint before it runs.

    Here’s the permission matrix I work from:

    Table showing recommended agent autonomy levels and human review requirements by task type. Small refactors and unit tests can have high agent autonomy, while API changes, dependencies, migrations, security, infrastructure, and production deployment require increasing levels of human review. Image by Author and ChatGPT.
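One way to make a matrix like this machine-checkable is a small lookup table. The sketch below is illustrative only: the task-type names and the three review levels are assumptions, and only the ordering (refactors and tests lowest, production deployment highest) is taken from the description above.

```python
from enum import IntEnum


class Review(IntEnum):
    """How much human review a task type gets (higher means more review)."""
    NONE = 0          # agent may proceed on its own
    DIFF_REVIEW = 1   # a human reads the diff afterwards
    APPROVAL = 2      # a human approves before the agent runs


# Hypothetical assignments: the exact levels are assumptions, not the
# values from the original table.
PERMISSION_MATRIX = {
    "small_refactor": Review.NONE,
    "unit_test": Review.NONE,
    "api_change": Review.DIFF_REVIEW,
    "dependency": Review.DIFF_REVIEW,
    "migration": Review.APPROVAL,
    "security": Review.APPROVAL,
    "infrastructure": Review.APPROVAL,
    "production_deploy": Review.APPROVAL,
}


def required_review(task_type: str) -> Review:
    """Look up the review level; unknown task types get the strictest one."""
    return PERMISSION_MATRIX.get(task_type, Review.APPROVAL)
```

Defaulting unknown task types to the strictest level follows the "can this be undone?" rule: when in doubt, add the checkpoint.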

    The categories that should always require a human

    Some categories require a human checkpoint regardless of how well-specified the task is.

    The risk of a mistake is too high, and the recovery cost too steep, to let an agent decide on its own.

    What AI Agents should not tackle alone, part 1. Image generated with DALL-E.
    Image by Author and ChatGPT.

    1. Destructive file operations

    `rm -rf`, `git clean -fd`, `git reset --hard`.

    These delete or discard work that may not be recoverable.

    An agent will run them if the task description implies cleanup.

    I’ve had one run `git clean -fd` in the middle of a refactor because the task said “clean up temporary files.”

    My uncommitted work was gone. There was no malfunction: the agent did exactly what the words said. The safeguard is an explicit block list with a confirmation step, not trusting the agent to infer where “clean up” ends.
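A block list with a confirmation step can be as small as the sketch below. It assumes you control the wrapper that executes the agent's shell commands; `gate` and the `confirm` callback are hypothetical names, and the patterns cover only the three commands named above.

```python
import re
from typing import Callable

# Literal patterns for the commands named above; extend for your own repo.
# Variants such as `rm -fr` are not caught by this simple matching.
BLOCKED_PATTERNS = [
    r"\brm\s+-rf\b",
    r"\bgit\s+clean\s+-fd",
    r"\bgit\s+reset\s+--hard\b",
]


def is_blocked(command: str) -> bool:
    """Return True if the command matches any blocked pattern."""
    return any(re.search(p, command) for p in BLOCKED_PATTERNS)


def gate(command: str, confirm: Callable[[str], bool]) -> bool:
    """Allow safe commands; ask a human before a blocked one.

    `confirm` is a hypothetical callback (for example, a CLI prompt)
    that returns True only when a human approves the command.
    """
    if not is_blocked(command):
        return True
    return confirm(command)
```

For example, `gate("git status", ask)` returns True without calling `ask`, while `gate("git clean -fd", ask)` returns whatever the human answers.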

    2. Database writes and migrations

    Any DELETE without a WHERE clause, any DROP or TRUNCATE, any schema migration touching production data.

    A typo in a WHERE clause can wipe a table. A migration that runs out of order can corrupt data that’s impossible to reconstruct. Always review before running.
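The three statement types above can be flagged mechanically before a human looks at them. This is a rough heuristic under stated assumptions: `risky_sql` is a hypothetical helper, it uses regular expressions rather than a SQL parser, and it expects one statement at a time.

```python
import re


def risky_sql(statement: str) -> list[str]:
    """Return the reasons a SQL statement should get human review.

    Regex-based, not a parser: comments, string literals, or several
    statements in one input can fool it.
    """
    # Collapse whitespace and uppercase so matching is layout-insensitive.
    sql = " ".join(statement.split()).upper()
    reasons = []
    if re.match(r"DROP\b", sql):
        reasons.append("DROP statement")
    if re.match(r"TRUNCATE\b", sql):
        reasons.append("TRUNCATE statement")
    if re.match(r"DELETE\b", sql) and not re.search(r"\bWHERE\b", sql):
        reasons.append("DELETE without WHERE clause")
    return reasons
```

A check like this cannot tell whether a WHERE clause is the right one, which is why a typo there still needs a human reviewer.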

    3. Cloud infrastructure

    `terraform apply`, `kubectl delete`, `aws iam *`, `gcloud iam *`.

    Infrastructure changes affect live systems and often other teams. Permission changes are especially dangerous because the damage can be invisible until something fails.

    What AI Agents should not tackle alone, part 2. Image generated with DALL-E.
    Image by Author and ChatGPT.

    4. Production deployments

    Any deployment to a production environment should go through a human review step, even if the code was agent-generated.

    CI/CD pipelines can run agent output automatically, and that’s fine. The decision to deploy to production is yours.

    You know what’s in flight, what incidents are open, what maintenance is scheduled. The agent doesn’t have any of that context, and it can’t ask for it mid-pipeline.

    5. Auth and security logic

    Authentication flows, authorization rules, token handling, session management.

    Bugs here don’t show up in unit tests; they show up in incident reports, sometimes months later.

    An agent writing auth logic will produce something that looks correct and passes the happy path.

    The dangerous cases are the edge conditions: a token that doesn’t expire under a specific sequence of API calls, a route that bypasses middleware when a parameter is missing.

    These are exactly what unit tests miss and what security review catches. Every auth change needs a human who’s specifically looking for these gaps, not one who’s satisfied that the happy path is covered.

    6. Secrets, `.env` files, API keys

    An agent reading or writing credentials creates exposure risk. Keep this category off-limits by default and handle it manually.
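If your agent wrapper sees the file paths a task wants to read or write, a path check is one way to enforce "off-limits by default". The pattern list and the `touches_secret` name below are assumptions for illustration, not a complete inventory of where credentials live.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Hypothetical patterns; adjust to how your project stores credentials.
SECRET_PATTERNS = [".env", ".env.*", "*.pem", "*credentials*"]


def touches_secret(path: str) -> bool:
    """Return True if the file name matches a secret-looking pattern."""
    name = PurePosixPath(path).name
    return any(fnmatch(name, pattern) for pattern in SECRET_PATTERNS)
```

Matching on the file name only keeps the check simple; secrets stored under ordinary-looking names would need their own patterns.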

    `git push --force` sits in its own category because it rewrites history on the remote. Once pushed, other contributors’ local branches diverge. Recovery is painful and sometimes impossible.

    Humans should be careful with all of these commands too. Agents just make them easier to trigger by accident, buried inside a long sequence of otherwise safe steps.

    AGENTS.md: write the contract

    Give agents specific structure from the start. An AGENTS.md file at the root of your repo tells the agent what the project is, how to run it, and what it’s not allowed to touch without asking.

    A vague AGENTS.md gets you an agent filling gaps with guesses. I learned this on a codebase that had no AGENTS.md at all.

    The task was “organize the project structure.” The agent moved files across directories based on naming conventions that made sense to it. Everything that referenced those paths broke.

    The task took the agent twenty minutes; the cleanup took me two hours. Three lines of scope constraints would have prevented it entirely.

    Here’s the template I use:

    # AGENTS.md

    ## Project

    [Brief description of the project and tech stack]

    ## Setup

    ```bash
    # Install
    npm install  # or pip install -r requirements.txt

    # Run
    npm run dev

    # Test
    npm test

    # Lint
    npm run lint
    ```

    ## Coding rules

    - Make minimal changes. Don't refactor unrelated code.
    - If behavior changes, add or update tests.
    - Don't touch files outside the scope of the task.
    - Keep diffs readable. One concern per commit.

    ## Safety rules

    Ask before running any command in blocked_commands.md.

    If you're unsure whether a command is safe, stop and ask.

    ## Definition of done

    - Tests pass
    - Diff is explainable in one sentence
    - Final report provided (see below)

    ## Final report format

    After every task, provide:

    1. Summary of changes
    2. Files changed
    3. Tests run and result
    4. Risks or assumptions
    5. Anything not completed

    The companion file, blocked_commands.md, lists exactly what needs human approval before running:

    # blocked_commands.md

    ## Destructive file operations

    - rm -rf
    - git clean -fd
    - git reset --hard

    ## Git operations

    - git push --force
    - git push --force-with-lease

    ## Database operations

    - DROP TABLE
    - TRUNCATE TABLE
    - DELETE without WHERE clause
    - Any migration that alters a production schema

    ## Cloud / infrastructure

    - terraform apply
    - kubectl delete
    - aws iam *
    - gcloud iam *

    ## Secrets

    - Any command reading or writing .env files
    - Any command touching API keys or credentials
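The file above is written for the agent to read, but the same file can feed a script. As a minimal sketch, assuming the layout shown (`## ` headings followed by `- ` bullets), a hypothetical `parse_blocked_commands` helper could group the entries by section:

```python
def parse_blocked_commands(markdown: str) -> dict[str, list[str]]:
    """Group the "- item" bullets of a markdown file under their "## " headings."""
    sections: dict[str, list[str]] = {}
    current = None
    for raw in markdown.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif line.startswith("- ") and current is not None:
            sections[current].append(line[2:].strip())
    return sections


# A shortened copy of the file, used only to demonstrate the parser.
example = """\
# blocked_commands.md

## Git operations
- git push --force
- git push --force-with-lease

## Cloud / infrastructure
- terraform apply
"""

blocked = parse_blocked_commands(example)
```

Entries such as "DELETE without WHERE clause" are descriptions rather than literal commands, so a script can match only some of the list; the rest still depends on the agent following the written rule.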

    When the AGENTS.md is vague, the agent guesses. When it’s specific, the agent executes. The file is your contract. Write it before you start the task, not after something breaks.




    The two-agent loop

    For anything medium-complexity or above, don’t use one agent; use two.

    Agent 1 implements. Agent 2 reviews. Then Agent 1 applies only the critical feedback.

    Implementer prompt:

    You are a senior software engineer implementing a specific task.

    Task: [describe the task]

    Context: [link to AGENTS.md or paste relevant sections]

    Rules:

    - Make minimal changes.
    - Stay in scope.
    - Don't refactor unrelated code.
    - Add tests if behavior changes.
    - When done, provide a final report: summary, files changed,
      tests run, risks, anything incomplete.

    Reviewer prompt:

    You are a code reviewer with no attachment to the implementation.

    Review this diff: [paste diff]

    Check for:

    - Bugs and edge cases
    - Missing tests
    - Security issues
    - Unintended behavior changes
    - Anything outside the stated scope

    Output:

    - Critical issues (must fix)
    - Minor issues (optional)
    - Anything you'd flag for a human

    Do not rewrite the code. Flag, don't fix.
    

    The reviewer agent has no ego investment in the code. It looks for bugs, edge cases, test coverage, and security issues without trying to redo the work.

    Code review is how you catch what you missed. The two-agent loop is the same process, automated.
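The loop itself is only three calls. The sketch below treats each agent as a plain function from prompt text to response text; how that function reaches a model is left out on purpose. The `two_agent_loop` name and the `CRITICAL:` line prefix are assumptions made for illustration, not part of the prompts above.

```python
from typing import Callable

# An "agent" here is any function from prompt text to response text.
Agent = Callable[[str], str]


def two_agent_loop(task: str, implementer: Agent, reviewer: Agent) -> dict[str, str]:
    """Implement, review, then apply only the critical feedback."""
    diff = implementer(f"Task: {task}\nMake minimal changes. Stay in scope.")
    review = reviewer(f"Review this diff:\n{diff}\nFlag, don't fix.")

    # Assumes the reviewer puts critical items on lines starting with "CRITICAL:".
    critical = [ln for ln in review.splitlines() if ln.startswith("CRITICAL:")]
    if not critical:
        return {"diff": diff, "review": review}

    feedback = "\n".join(critical)
    revised = implementer(
        f"Task: {task}\nApply only this critical feedback:\n{feedback}\n"
        f"Previous diff:\n{diff}"
    )
    return {"diff": revised, "review": review}
```

Minor issues stay in the returned review for a human to read, but they are never sent back to the implementer, which keeps the second pass from growing into a rewrite.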

    The final report

    Require a final report for every agent task:

    1. Summary of changes

    2. Files changed

    3. Tests run and result

    4. Risks or assumptions

    5. Anything not completed

    This makes the agent accountable. If it can’t summarize what it did in clear terms, that’s a signal the task wasn’t clear.

    It also builds up documentation without you writing it manually. The reports stack. When something breaks a week later, you can trace back exactly what changed and why.
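If reports are stored rather than just read, a small structure makes incomplete ones easy to spot. This is a minimal sketch: the `FinalReport` class and its field names are hypothetical, chosen to mirror the five sections listed above.

```python
from dataclasses import dataclass, fields


@dataclass
class FinalReport:
    """The five report sections listed above, as plain strings."""
    summary: str = ""
    files_changed: str = ""
    tests_run: str = ""
    risks: str = ""
    not_completed: str = ""

    def missing_sections(self) -> list[str]:
        """Names of sections the agent left empty or whitespace-only."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]
```

A report with a non-empty `missing_sections()` result can be bounced back to the agent before anyone reviews the diff.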

    The unglamorous work

    The unglamorous work behind agentic AI. Image generated with DALL-E.
    Image by Author and ChatGPT.

    The hype around AI agents is here to stay, and mostly earned. They do increase your output.

    The practitioners getting the most from them are the ones who did the setup work: wrote the AGENTS.md, thought through the permission levels, built the blocked commands list, set up the two-agent loop.

    Agents work well when they have clear instructions. That part is on you.

    Thanks for reading!


    You can find me on LinkedIn and Substack, where I share more details about AI and LLMs.


