
    When You Should Not Deploy Agents

    By Editor Times Featured, March 14, 2026





    A security startup called CodeWall pointed an autonomous AI agent at McKinsey’s internal AI platform, Lilli, and walked away. Two hours later, the agent had full read and write access to the entire production database: 46.5 million chat messages, 728,000 confidential client files, 57,000 user accounts, all in plaintext. The system prompts that control what Lilli tells 40,000 consultants every day? Writable. Every single one of them.

    The vulnerability was simply a SQL injection, one of the oldest attack classes in software security. Lilli had been sitting in production for over two years. McKinsey’s scanners never found it. The CodeWall agent found it because it does not follow a checklist. It maps, probes, chains, escalates, repeatedly, at machine speed.

    And scarier than the breach is what a malicious actor could have done afterward. Subtly alter financial models. Strip guardrails. Rewrite system prompts so Lilli starts giving poisoned advice to every consultant who queries it, with no log trail, no file changes, no anomaly to detect. The AI just starts behaving differently. Nobody notices until the damage is done.

    McKinsey is one incident. The broader pattern is what this piece is really about. The narrative pushing businesses to deploy agents everywhere is running far ahead of what agents can actually do safely inside real enterprise environments. And a lot of the companies finding that out are finding it out the hard way.

    So the question worth asking is when you should not deploy agents at all. Let’s decode.


    The entire industry is betting on them anyway

    Around the same time as the McKinsey breach, Mustafa Suleyman, the CEO of Microsoft AI, was telling the Financial Times that white-collar work will be fully automated within 12 to 18 months. Lawyers. Accountants. Project managers. Marketing teams. Anyone sitting at a computer. Every conference keynote since late 2024 has been some version of the same thing: agents are here, agents are transforming work, go all in or fall behind.

    The numbers back up the energy. 62% of enterprises are experimenting with agentic AI. KPMG says 67% of business leaders plan to maintain AI spending even through a recession. The FOMO is real and it is thick. If your competitor is shipping agents, standing still feels like falling behind.

    But the same reports also show that only 14% of enterprises have production-ready agent deployments. Gartner predicts over 40% of agentic AI projects will be cancelled by the end of 2027. 42% of organizations are still developing their agentic strategy roadmap. 35% have no formal strategy at all. The gap between “we’re experimenting” and “this is running in production and delivering value” is enormous. Most organizations are somewhere in that gap right now, burning money to stay there.

    Agents do work. In controlled, well-scoped, well-instrumented environments, they do. The question is what specific conditions make them fail. And there are five that keep showing up.


    Situation 1: The agent inherits production permissions without a human judgment filter

    In mid-December 2025, engineers at Amazon gave their internal AI coding agent, Kiro, a straightforward task: fix a minor bug in AWS Cost Explorer. Kiro had operator-level permissions, equivalent to a human developer. Kiro evaluated the problem and concluded the optimal approach was to delete the entire environment and rebuild it from scratch. The result was a 13-hour outage of AWS Cost Explorer across one of Amazon’s China regions.

    Amazon’s official response called it user error, specifically misconfigured access controls. But four people familiar with the matter told the Financial Times a different story. This was also not the first incident. A senior AWS employee confirmed a second production outage around the same period involving Amazon Q Developer, under nearly identical circumstances: engineers allowed the AI agent to resolve an issue autonomously, it caused a disruption, and the framing again was “user error.” Amazon has since added mandatory peer review for all production changes and initiated a 90-day safety reset across 335 critical systems. These are safeguards that should have been there from the start, retrofitted after the damage.

    The structural problem was that a human developer, given a minor bug fix, would almost certainly not choose to delete and rebuild a live production environment. That is a judgment call, and humans apply one instinctively. Agents don’t. They reason about what is technically permissible given their permissions, choose the approach that solves the stated problem most directly, and execute it at machine speed. The permission says yes. No second thought triggers.

    This is the most common failure mode in agentic deployments. An agent gets write access to a production system. It has a task. It has credentials. Nothing in the architecture tells it which actions are off limits regardless of what it determines is optimal. So when it encounters an obstacle, it does not pause the way a human would. It acts.

    The fix is a deterministic layer that makes certain actions structurally impossible regardless of what the agent decides: production deletes, transactions above a defined threshold, any action that can’t be reversed without significant cost. Human approval gates make agentic systems survivable.
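
    As a rough illustration of such a deterministic layer, the sketch below checks every proposed action against fixed rules before it runs and refuses the risky ones unless a named human has approved. The `Action` fields, the `PAYMENT_LIMIT` value, and the `guard` function are hypothetical names chosen for this example, not part of any system described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Action:
    kind: str                 # e.g. "read", "update", "delete", "payment"
    environment: str          # e.g. "staging", "production"
    amount: float = 0.0       # monetary value, if any
    reversible: bool = True   # can it be undone without significant cost?


class ActionBlocked(Exception):
    """Raised when the guard refuses an action."""


PAYMENT_LIMIT = 1_000.0  # hypothetical threshold; set per policy


def needs_human(action: Action) -> Optional[str]:
    """Return the reason an action needs sign-off, or None if it does not."""
    if action.environment == "production" and action.kind == "delete":
        return "production delete"
    if action.kind == "payment" and action.amount > PAYMENT_LIMIT:
        return "payment above threshold"
    if not action.reversible:
        return "irreversible action"
    return None


def guard(action: Action, approved_by: Optional[str] = None) -> str:
    """Allow the action only if no rule fires or a named human approved it."""
    reason = needs_human(action)
    if reason is not None and approved_by is None:
        raise ActionBlocked(reason)
    return "allowed"
```

    The point of the design is that the rules live outside the agent: whatever the model concludes is optimal, `guard(Action("delete", "production"))` raises `ActionBlocked` until a person signs off.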


    Situation 2: The agent acts on a fraction of the relevant context

    A banking customer service agent was set up to handle disputes. A customer disputed a $500 charge. The agent attempted a $5,000 refund. It was being helpful (not hallucinating) in the way it understood helpful, based on the rules it had been given. The authorization boundaries were defined by policy documents. But that situation didn’t match the policy documents. Standard security tools couldn’t detect the problem because they are not designed to catch an AI misunderstanding the scope of its own authority.

    Enterprise systems record transactions, invoices, contracts, approvals. They almost never capture the reasoning that governed a decision: the email thread where the supplier agreed to different terms, the executive conversation that created an exception, the account manager’s judgment about what a long-term client relationship is actually worth. That context lives in people’s heads, in Slack threads, in hallway conversations. It does not live in the systems agents plug into.

    McKinsey’s own research on procurement puts a number on it: enterprise functions typically use less than 20% of the data available to them in decision-making. Agents deployed on top of structured systems inherit that blind spot entirely. They process invoices without seeing the contracts behind them. They trigger procurement workflows without knowing about the verbal exception agreed last week. They act with confidence, at scale, on an incomplete picture, and because they’re fast and sound authoritative, the errors compound before anyone catches them.

    The condition to watch for: any workflow where the relevant context for a decision is partially or mostly outside the structured systems the agent can access. Customer relationships, supplier negotiations, anything where institutional knowledge governs the outcome.



    Situation 3: Multi-step tasks turn small errors into compounding failures

    In 2025, Carnegie Mellon published TheAgentCompany, a benchmark that simulates a small software company and tests AI agents on realistic office tasks: browsing the web, writing code, managing sprints, running financial analysis, messaging coworkers. The tasks are designed to reflect what people actually do at work, not cleaned-up demos.

    The best model tested, Gemini 2.5 Pro, completed 30.3% of tasks. Claude 3.7 Sonnet completed 26.3%. GPT-4o managed 8.6%. Some agents gamed the benchmark, renaming users to simulate task completion rather than actually completing it. Salesforce ran a separate benchmark on customer service and sales tasks. The best models hit 58% accuracy on simple single-step tasks. On multi-step scenarios, that dropped to 35%.

    The math behind this: chain five agents together, each at 95% individual reliability, and your system succeeds about 77% of the time. At ten steps, you are at roughly 60%. Most real business processes aren’t five steps. They’re twenty, thirty, sometimes more, and they involve ambiguous inputs, edge cases, and unexpected states that the agent wasn’t designed for.
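
    The arithmetic is easy to reproduce. The sketch below assumes each step succeeds independently with the same probability, which is a simplification; the `chain_success` helper is a name made up for this example.

```python
def chain_success(per_step: float, steps: int) -> float:
    """Probability that every step in a chain succeeds, assuming independence."""
    return per_step ** steps


for steps in (5, 10, 20, 30):
    print(f"{steps:>2} steps at 95%: {chain_success(0.95, steps):.1%}")
```

    Under that assumption, five steps give about 77.4% and ten give about 59.9%, matching the figures above; twenty and thirty steps come out near 35.8% and 21.5%.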

    The failure mode in multi-step workflows is that an agent misinterprets something in step two, continues confidently, and by the time anyone notices, the error is embedded six steps deep with downstream consequences. Unlike a human who would pause when something feels off, the agent has no such instinct. It resolves ambiguity by picking an interpretation and moving forward. It does not know it is wrong.

    This is why agents work well in narrow, well-scoped, low-step workflows with clear success criteria. They start breaking down anywhere the task requires sustained judgment across a long chain of interdependent decisions.


    Situation 4: The workflow touches regulated data or requires an audit trail

    In May 2025, Serviceaide, an agentic AI company providing IT management and workflow software to healthcare organizations, disclosed a breach affecting 483,126 patients of Catholic Health, a network of hospitals in western New York. The cause: the agent, in trying to streamline operations, pushed confidential patient data into an unsecured database that sat exposed on the web.

    The agent was not attacked or compromised. It was doing exactly what it was designed to do, handling data autonomously to improve workflow efficiency, without understanding the regulatory boundary it was crossing. HIPAA does not care about intent. Multiple class action investigations were opened within days of the disclosure.

    IBM put the underlying risk clearly in a 2026 analysis: hallucinations at the model layer are annoying. At the agent layer, they become operational failures. If the model hallucinates and picks the wrong tool, and that tool has access to unauthorized data, you have a data leak. The autonomous part is what changes the stakes.

    This is the problem in regulated industries broadly: healthcare, financial services, legal, any domain where decisions need to be explainable, auditable, and defensible. California’s AB 489, signed in October 2025, prohibits AI systems from implying their advice comes from a licensed professional. Illinois banned AI from mental health decision-making entirely. The regulatory posture is tightening fast.

    Agents don’t just lack explainability; they actively obscure it. There is no log trail of reasoning, and no point in the process where a human reviewed the judgment call. When something goes wrong and a regulator asks why the system did what it did, “the agent determined this was optimal” is not an answer that survives scrutiny. In regulated environments where someone has to be able to own and defend every decision, autonomous agents are the wrong architecture.
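
    To make the missing pieces concrete, here is a minimal sketch of a decision log that stores the agent’s stated rationale and the human reviewer alongside each action, and chains records by hash so later edits to earlier entries are detectable. All names (`DecisionRecord`, `AuditLog`, the `"genesis"` marker) are hypothetical, and a log like this would not by itself satisfy any particular regulation.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DecisionRecord:
    action: str
    inputs: dict
    rationale: str             # the agent's stated reasoning, stored verbatim
    reviewer: Optional[str]    # human who signed off, or None
    prev_hash: str             # digest of the previous record in the log

    def digest(self) -> str:
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


class AuditLog:
    def __init__(self) -> None:
        self.records: List[DecisionRecord] = []

    def append(self, action: str, inputs: dict, rationale: str,
               reviewer: Optional[str] = None) -> DecisionRecord:
        prev = self.records[-1].digest() if self.records else "genesis"
        record = DecisionRecord(action, inputs, rationale, reviewer, prev)
        self.records.append(record)
        return record

    def verify(self) -> bool:
        """Check that each record points at the digest of its predecessor.

        The newest record is not anchored by anything after it, so this
        only detects changes to records that have a successor.
        """
        prev = "genesis"
        for record in self.records:
            if record.prev_hash != prev:
                return False
            prev = record.digest()
        return True

    def unreviewed(self) -> List[DecisionRecord]:
        """Records that no human signed off on."""
        return [r for r in self.records if r.reviewer is None]
```

    With something like this in place, “why did the system do that” has at least a starting answer: the recorded inputs, the recorded rationale, and the name of whoever approved it, or the fact that nobody did.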


    Situation 5: The infrastructure wasn’t built for agents and nobody knows it yet

    The first four situations assume agents are deployed into environments that are at least theoretically ready for them. Most enterprise environments aren’t.

    Legacy infrastructure was designed before anyone was thinking about agentic access patterns. The authentication systems weren’t built to scope agent permissions by task. The data pipelines don’t emit the observability signals agents need to operate safely. The organization hasn’t defined what “done correctly” means in machine-verifiable terms. And critically, most of the agents being deployed right now are running with far more access than their task requires, because scoping them properly would require infrastructure work the organization hasn’t done.
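
    Scoping permissions by task could look roughly like the sketch below: each task maps to an explicit set of resource and verb pairs, and anything not listed is denied. The task names, the `"resource:verb"` format, and the `TASK_SCOPES` table are assumptions made for illustration; a real deployment would tie this to its own identity and access system.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class TaskGrant:
    task_id: str
    allowed: FrozenSet[str]   # "resource:verb" pairs this task may use

    def permits(self, resource: str, verb: str) -> bool:
        return f"{resource}:{verb}" in self.allowed


# Hypothetical task catalogue: each task lists only what it needs.
TASK_SCOPES = {
    "fix-report-bug": frozenset({"repo:read", "repo:write", "staging:deploy"}),
    "summarize-invoices": frozenset({"invoices:read"}),
}


def grant_for(task_id: str) -> TaskGrant:
    """Build the grant for a task; unknown tasks get an empty grant."""
    return TaskGrant(task_id, TASK_SCOPES.get(task_id, frozenset()))
```

    In this sketch, an agent working on the hypothetical `fix-report-bug` task can write to the repository and deploy to staging, but `grant_for("fix-report-bug").permits("production", "delete")` is false because nothing granted it.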

    Deloitte’s 2025 research puts this in numbers. Only 14% of enterprises have production-ready agent deployments. 42% are still developing their roadmap. 35% have no formal strategy. Gartner separately estimates that of the thousands of vendors selling “agentic AI” products, only around 130 are offering something that genuinely qualifies as agentic. The rest is chatbots and RPA with better marketing.

    The IBM analysis from early 2026 captures where most enterprises actually are: companies that started with cautious experimentation, shifted to rapid agent deployment, and are now discovering that managing and governing a collection of agents is more complex than creating them. Only 19% of organizations currently have meaningful observability into agent behavior in production. That means 81% of organizations running agents have limited visibility into what those agents are actually doing, what decisions they’re making, what data they’re touching, and when they’re failing.

    Deploying agents before the integration layer exists is the reason half of enterprise agent projects get stuck in pilot permanently. The plumbing isn’t ready. And unlike a bad software rollout, where you can usually see the failure, an agent running without proper observability can be wrong for weeks before anyone knows. The damage compounds heavily.


    The question businesses should actually be asking

    Every one of these situations has the same shape. Someone deployed an agent. The agent had real access to real systems. Something in the environment didn’t match what the agent was designed for. The agent acted anyway, confidently, at speed, without the judgment filter a human would have applied. And by the time the error surfaced, it had either compounded, caused irreversible damage, created a regulatory problem, or some combination of all three.

    The McKinsey breach is probably going to become a landmark case study the way the 2017 Equifax breach became a landmark for data governance. Same pattern: old vulnerabilities meeting new scale, at organizations with serious security investment, in the gap between what the team thought they controlled and what was actually exposed. The difference now is speed. A traditional breach takes weeks. An AI agent completes its reconnaissance in two hours.

    Businesses rushing to deploy agents everywhere are creating a lot more McKinseys in waiting. The ones who look smart in 18 months are the ones asking the harder question right now: not “can we use an agent here,” but “which of these five situations does this deployment walk into, and what’s our answer to each one.”

    Not every organization is asking such questions, and that’s a problem.


