    Finding “Silver Bullet” Agentic AI Flows with syftr

    By Editor Times Featured, August 19, 2025


    TL;DR

    The fastest way to stall an agentic AI project is to reuse a workflow that no longer fits. Using syftr, we identified “silver bullet” flows for both low-latency and high-accuracy priorities that consistently perform well across multiple datasets. These flows outperform random seeding and transfer learning early in optimization. They recover about 75% of the performance of a full syftr run at a fraction of the cost, which makes them a fast starting point but still leaves room to improve.

    If you have ever tried to reuse an agentic workflow from one project in another, you know how often it falls flat. The model’s context length might not be enough. The new use case might require deeper reasoning. Or latency requirements might have changed.

    Even when the old setup works, it may be overbuilt, and overpriced, for the new problem. In those cases, a simpler, faster setup might be all you need.

    We set out to answer a simple question: Are there agentic flows that perform well across many use cases, so you can choose one based on your priorities and move forward?

    Our research suggests the answer is yes, and we call them “silver bullets.”

    We identified silver bullets for both low-latency and high-accuracy goals. In early optimization, they consistently beat transfer learning and random seeding, while avoiding the cost of a full syftr run.

    In the sections that follow, we explain how we found them and how they stack up against other seeding strategies.

    A quick primer on Pareto-frontiers

    You don’t need a math degree to follow along, but understanding the Pareto-frontier will make the rest of this post much easier to follow.

    Figure 1 is an illustrative scatter plot (not from our experiments) showing completed syftr optimization trials. Sub-plot A and Sub-plot B are identical, but B highlights the first three Pareto-frontiers: P1 (pink), P2 (green), and P3 (blue).

    • Each trial: A specific flow configuration is evaluated on accuracy and average latency (higher accuracy, lower latency are better).
    • Pareto-frontier (P1): No other flow has both higher accuracy and lower latency. These are non-dominated.
    • Non-Pareto flows: At least one Pareto flow beats them on both metrics. These are dominated.
    • P2, P3: If you remove P1, P2 becomes the next-best frontier, then P3, and so on.

    You might choose between Pareto flows depending on your priorities (e.g., favoring low latency over maximum accuracy), but there’s no reason to choose a dominated flow: there’s always a better option on the frontier.
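
    The peeling of frontiers described above can be sketched in a few lines of Python. This is an illustrative sketch, not syftr code: the trial values are made up, the names `dominates` and `pareto_layers` are hypothetical, and it uses the common convention that a flow dominates another if it is at least as good on both metrics and strictly better on one.

```python
from typing import List, Tuple

Flow = Tuple[str, float, float]  # (name, accuracy, latency_seconds)


def dominates(a: Flow, b: Flow) -> bool:
    """True if `a` is at least as good as `b` on both metrics and strictly better on one."""
    _, acc_a, lat_a = a
    _, acc_b, lat_b = b
    return acc_a >= acc_b and lat_a <= lat_b and (acc_a > acc_b or lat_a < lat_b)


def pareto_layers(flows: List[Flow]) -> List[List[Flow]]:
    """Peel off successive Pareto-frontiers: P1, then P2 from what remains, and so on."""
    remaining = list(flows)
    layers = []
    while remaining:
        # A flow is on the current frontier if nothing still in play dominates it.
        front = [f for f in remaining if not any(dominates(g, f) for g in remaining)]
        layers.append(sorted(front, key=lambda f: f[2]))  # order by latency
        remaining = [f for f in remaining if f not in front]
    return layers


# Made-up trials for illustration only.
trials = [
    ("a", 0.60, 1.0),
    ("b", 0.80, 3.0),
    ("c", 0.90, 6.0),
    ("d", 0.55, 2.0),
    ("e", 0.75, 5.0),
    ("f", 0.50, 4.0),
]
layers = pareto_layers(trials)
layer_names = [[name for name, _, _ in layer] for layer in layers]
for i, names in enumerate(layer_names, start=1):
    print(f"P{i}: {names}")
```

    With these toy values, flows a, b, and c form P1, flows d and e form P2, and flow f is left for P3.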

    Optimizing agentic AI flows with syftr

    Throughout our experiments, we used syftr to optimize agentic flows for accuracy and latency.

    This approach allows you to:

    • Select datasets containing question–answer (QA) pairs
    • Define a search space for flow parameters
    • Set objectives such as accuracy and cost, or in this case, accuracy and latency

    In short, syftr automates the exploration of flow configurations against your chosen objectives.

    Figure 2 shows the high-level syftr architecture.

    Figure 2: High-level syftr architecture. For a set of QA pairs, syftr can automatically explore agentic flows using multi-objective Bayesian optimization by comparing flow responses with actual answers.

    Given the practically infinite number of possible agentic flow parametrizations, syftr relies on two key techniques:

    • Multi-objective Bayesian optimization to navigate the search space efficiently.
    • ParetoPruner to stop evaluation of likely suboptimal flows early, saving time and compute while still surfacing the most effective configurations.
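
    To make the early-stopping idea concrete, here is a minimal sketch. It is not syftr's ParetoPruner: the function name `evaluate_with_pruning`, its parameters, and the stopping rule are assumptions made for illustration. The rule stops a flow once even its best-case accuracy (all remaining questions answered correctly) could not beat a flow already on the frontier, and it treats the running mean latency as representative, which is a heuristic.

```python
from typing import Iterable, List, Optional, Tuple


def evaluate_with_pruning(
    results: Iterable[Tuple[bool, float]],
    n_total: int,
    frontier: List[Tuple[float, float]],
    min_evals: int = 5,
) -> Tuple[Optional[float], Optional[float], int]:
    """Score a flow question by question and stop early if it cannot reach the frontier.

    results   -- per-question (is_correct, latency_seconds) for the candidate flow
    n_total   -- number of QA pairs in a full evaluation
    frontier  -- (accuracy, latency) pairs of the current non-dominated flows
    min_evals -- questions to evaluate before pruning is considered
    Returns (accuracy, mean_latency, n_evaluated); the first two are None if pruned.
    """
    correct, latency_sum, n = 0, 0.0, 0
    for is_correct, latency in results:
        n += 1
        correct += int(is_correct)
        latency_sum += latency
        if n < min_evals or n >= n_total:
            continue
        # Best case: every remaining question is answered correctly.
        best_case_accuracy = (correct + (n_total - n)) / n_total
        # Heuristic assumption: the running mean latency is representative.
        mean_latency = latency_sum / n
        if any(acc >= best_case_accuracy and lat <= mean_latency for acc, lat in frontier):
            return None, None, n  # pruned: a frontier flow is at least as good
    if n == 0:
        raise ValueError("no results to evaluate")
    return correct / n, latency_sum / n, n


# Made-up data: one frontier flow, one clearly weak candidate, one strong candidate.
frontier = [(0.9, 1.0)]
weak_flow = [(False, 2.0)] * 10
strong_flow = [(True, 0.5)] * 10
pruned = evaluate_with_pruning(weak_flow, n_total=10, frontier=frontier)
kept = evaluate_with_pruning(strong_flow, n_total=10, frontier=frontier)
print(pruned)
print(kept)
```

    In this toy run the weak candidate is dropped after five questions, while the strong candidate is evaluated on all ten.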

    Silver bullet experiments

    Our experiments followed a four-part process (Figure 3).

    Figure 3: The workflow begins with a two-step data generation phase:
    A: Run syftr using simple random sampling for seeding.
    B: Run all finished flows on all other experiments. The resulting data then feeds into the next step.
    C: Identifying silver bullets and conducting transfer learning.
    D: Running syftr on four held-out datasets three times, using three different seeding strategies.

    Step 1: Optimize flows per dataset

    We ran several hundred trials on each of the following datasets:

    • CRAG Task 3 Music
    • FinanceBench
    • HotpotQA
    • MultihopRAG

    For each dataset, syftr searched for Pareto-optimal flows, optimizing for accuracy and latency (Figure 4).

    Figure 4: Optimization results for four datasets. Each dot represents a parameter combination evaluated on 50 QA pairs. Red lines mark Pareto-frontiers with the best accuracy–latency tradeoffs found by the TPE estimator.

    Step 3: Identify silver bullets

    Once we had identical flows across all training datasets, we could pinpoint the silver bullets: the flows that are Pareto-optimal on average across all datasets.

    Figure 5: Silver bullet generation process, detailing the “Identify Silver Bullets” step from Figure 3.

    Process:

    1. Normalize results per dataset. For each dataset, we normalize accuracy and latency scores by the highest values in that dataset.
    2. Group identical flows. We then group matching flows across datasets and calculate their average accuracy and latency.
    3. Identify the Pareto-frontier. Using this averaged dataset (see Figure 6), we select the flows that build the Pareto-frontier.

    These 23 flows are our silver bullets: the ones that perform well across all training datasets.
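
    The three steps can be sketched as follows. This is a simplified illustration with made-up scores, not the code used for the experiments; the data layout and the helper names (`normalize`, `average_across_datasets`, `pareto_front`) are assumptions, as is the choice to keep only flows that were scored on every dataset.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Scores = Tuple[float, float]  # (accuracy, latency_seconds)

# results[dataset][flow_id] = (accuracy, latency); made-up numbers.
results: Dict[str, Dict[str, Scores]] = {
    "dataset_1": {"flow_x": (0.40, 2.0), "flow_y": (0.80, 8.0), "flow_z": (0.60, 10.0)},
    "dataset_2": {"flow_x": (0.30, 1.0), "flow_y": (0.60, 5.0), "flow_z": (0.30, 5.0)},
}


def normalize(per_flow: Dict[str, Scores]) -> Dict[str, Scores]:
    """Step 1: divide accuracy and latency by the highest value in this dataset."""
    max_acc = max(acc for acc, _ in per_flow.values())
    max_lat = max(lat for _, lat in per_flow.values())
    return {f: (acc / max_acc, lat / max_lat) for f, (acc, lat) in per_flow.items()}


def average_across_datasets(all_results: Dict[str, Dict[str, Scores]]) -> Dict[str, Scores]:
    """Step 2: group identical flows and average their normalized scores."""
    grouped: Dict[str, List[Scores]] = defaultdict(list)
    for per_flow in all_results.values():
        for flow_id, scores in normalize(per_flow).items():
            grouped[flow_id].append(scores)
    n_datasets = len(all_results)
    return {
        f: (sum(a for a, _ in s) / n_datasets, sum(l for _, l in s) / n_datasets)
        for f, s in grouped.items()
        if len(s) == n_datasets  # assumption: keep flows scored on every dataset
    }


def pareto_front(avg: Dict[str, Scores]) -> List[str]:
    """Step 3: keep flows that no other flow dominates on the averaged scores."""
    def dominated(flow_id: str) -> bool:
        acc, lat = avg[flow_id]
        return any(
            a >= acc and l <= lat and (a > acc or l < lat)
            for other, (a, l) in avg.items()
            if other != flow_id
        )
    return sorted((f for f in avg if not dominated(f)), key=lambda f: avg[f][1])


averaged = average_across_datasets(results)
silver_bullets = pareto_front(averaged)
print(silver_bullets)
```

    In this toy example flow_z drops out because flow_y is more accurate and faster on average, leaving flow_x (fast) and flow_y (accurate) as the two ends of the frontier.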

    Figure 6: Normalized and averaged scores across datasets. The 23 flows on the Pareto-frontier perform well across all training datasets.

    Step 4: Seed with transfer learning

    In our original syftr paper, we explored transfer learning as a way to seed optimizations. Here, we compared it directly against silver bullet seeding.

    In this context, transfer learning simply means selecting specific high-performing flows from historical (training) studies and evaluating them on held-out datasets. The data we use here is the same as for silver bullets (Figure 3).

    Process:

    1. Select candidates. From each training dataset, we took the top-performing flows from the top two Pareto-frontiers (P1 and P2).
    2. Embed and cluster. Using the embedding model BAAI/bge-large-en-v1.5, we converted each flow’s parameters into numerical vectors. We then applied K-means clustering (K = 23) to group similar flows (Figure 7).
    3. Match experiment constraints. We limited each seeding strategy (silver bullets, transfer learning, random sampling) to 23 flows for a fair comparison, since that’s how many silver bullets we identified.

    Note: Transfer learning for seeding is not yet fully optimized. We could use more Pareto-frontiers, select more flows, or try different embedding models.
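
    The clustering step can be illustrated with a small, dependency-free K-means. This is only a sketch under stated assumptions: the 2-D vectors below are made-up stand-ins for real embeddings of flow parameters, K is 3 rather than 23, the function names are hypothetical, and the deterministic farthest-point initialization is a choice made here so the example is reproducible. How a flow is then picked from each cluster is not specified above and is left out.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence

Vector = Sequence[float]


def farthest_point_init(points: List[Vector], k: int) -> List[List[float]]:
    """Deterministic start: first point, then repeatedly the point farthest from all centroids."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    return centroids


def kmeans(points: List[Vector], k: int, max_iters: int = 100) -> List[int]:
    """Plain Lloyd's algorithm; returns the cluster index of each point."""
    centroids = farthest_point_init(points, k)
    labels: List[int] = []
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


# Made-up 2-D stand-ins for flow embeddings.
flow_vectors: Dict[str, Vector] = {
    "flow_1": (0.0, 0.0), "flow_2": (0.1, 0.0),
    "flow_3": (5.0, 5.0), "flow_4": (5.1, 5.0),
    "flow_5": (0.0, 5.0), "flow_6": (0.0, 5.2),
}
names = list(flow_vectors)
labels = kmeans([flow_vectors[n] for n in names], k=3)
clusters: Dict[int, List[str]] = defaultdict(list)
for name, label in zip(names, labels):
    clusters[label].append(name)
groups = sorted(clusters.values())
print(groups)
```

    With real embeddings the vectors would have many more dimensions, but the grouping logic stays the same.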

    Figure 7: Clustered trials from Pareto-frontiers P1 and P2 across the training datasets.

    Step 5: Testing it all

    In the final evaluation phase (Step D in Figure 3), we ran ~1,000 optimization trials on four test datasets (Bright Biology, DRDocs, InfiniteBench, and PhantomWiki), repeating the process three times for each of the following seeding strategies:

    • Silver bullet seeding
    • Transfer learning seeding
    • Random sampling

    For each trial, GPT-4o-mini served as the judge, verifying an agent’s response against the ground-truth answer.

    Results

    We set out to answer:

    Which seeding approach (random sampling, transfer learning, or silver bullets) delivers the best performance for a new dataset in the fewest trials?

    For each of the four held-out test datasets (Bright Biology, DRDocs, InfiniteBench, and PhantomWiki), we plotted:

    • Accuracy
    • Latency
    • Cost
    • Pareto-area: a measure of how close results are to the optimal result
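
    The exact Pareto-area formula is not spelled out here. One common way to turn a frontier into a single number, shown below purely as an assumption, is the share of the accuracy × latency box that is dominated by at least one flow (a two-dimensional hypervolume). The function name `pareto_area`, the `max_latency` bound, and the sample values are all hypothetical.

```python
from typing import List, Tuple


def pareto_area(flows: List[Tuple[float, float]], max_latency: float) -> float:
    """Share of the box [0, 1] accuracy x [0, max_latency] dominated by the flows.

    flows -- (accuracy in [0, 1], latency) pairs; flows slower than max_latency are ignored.
    A point in the box counts as dominated if some flow is at least as accurate and as fast.
    """
    points = sorted((lat, acc) for acc, lat in flows if lat <= max_latency)
    area, best_accuracy = 0.0, 0.0
    # Sweep from fast to slow; at each latency, track the best accuracy reached so far.
    for i, (lat, acc) in enumerate(points):
        best_accuracy = max(best_accuracy, acc)
        next_lat = points[i + 1][0] if i + 1 < len(points) else max_latency
        area += best_accuracy * (next_lat - lat)
    return area / max_latency


# Made-up (accuracy, latency) values: a seeded run, then the same run after more trials.
after_seeding = [(0.5, 2.0), (0.8, 6.0)]
after_more_trials = after_seeding + [(0.9, 4.0)]
seed_area = pareto_area(after_seeding, max_latency=10.0)
final_area = pareto_area(after_more_trials, max_latency=10.0)
print(seed_area, final_area, seed_area / final_area)
```

    Under this definition a larger value means the frontier sits closer to the ideal corner (accuracy 1, latency 0), and the ratio of two areas shows how much of a later frontier was already covered by an earlier one.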

    In each plot, the vertical dotted line marks the point when all seeding trials have completed. After seeding, silver bullets showed on average:

    • 9% higher maximum accuracy
    • 84% lower minimum latency
    • 28% larger Pareto-area

    compared to the other strategies.

    Bright Biology

    Silver bullets had the highest accuracy, lowest latency, and largest Pareto-area after seeding. Some random seeding trials did not finish. Pareto-areas for all methods increased over time, but the gap between them narrowed as optimization progressed.

    Figure 8: Bright Biology results

    DRDocs

    Similar to Bright Biology, silver bullets reached an 88% Pareto-area after seeding vs. 71% (transfer learning) and 62% (random).

    Figure 9: DRDocs results

    InfiniteBench

    Other methods needed ~100 additional trials to match the silver bullet Pareto-area, and still didn’t match the fastest flows found via silver bullets by the end of ~1,000 trials.

    Figure 10: InfiniteBench results

    PhantomWiki

    Silver bullets again performed best after seeding. This dataset showed the widest cost divergence. After ~70 trials, the silver bullet run briefly focused on more expensive flows.

    Figure 11: PhantomWiki results

    Pareto-fraction evaluation

    In runs seeded with silver bullets, the 23 silver bullet flows accounted for ~75% of the final Pareto-area after 1,000 trials, on average.

    • Red area: Gains from optimization over initial silver bullet performance.
    • Blue area: Silver bullet flows still dominating at the end.

    Figure 12: Pareto-fraction for silver bullet seeding across all datasets

    Our takeaway

    Seeding with silver bullets delivers consistently strong results and even outperforms transfer learning, despite that method pulling from a diverse set of historical Pareto-frontier flows.

    For our two objectives (accuracy and latency), silver bullets always start with higher accuracy and lower latency than flows from other strategies.

    In the long run, the TPE sampler reduces the initial advantage. Within a few hundred trials, results from all strategies often converge, which is expected since each should eventually find optimal flows.

    So, do agentic flows exist that work well across many use cases? Yes, to some extent:

    • On average, a small set of silver bullets recovers about 75% of the Pareto-area from a full optimization.
    • Performance varies by dataset, such as 92% recovery for Bright Biology compared to 46% for PhantomWiki.

    Bottom line: silver bullets are a cheap and efficient way to approximate a full syftr run, but they are not a replacement. Their impact could grow with more training datasets or longer training optimizations.

    Silver bullet parametrizations

    We used the following:

    LLMs

    • microsoft/Phi-4-multimodal-instruct
    • deepseek-ai/DeepSeek-R1-Distill-Llama-70B
    • Qwen/Qwen2.5
    • Qwen/Qwen3-32B
    • google/gemma-3-27b-it
    • nvidia/Llama-3_3-Nemotron-Super-49B

    Embedding models

    • BAAI/bge-small-en-v1.5
    • thenlper/gte-large
    • mixedbread-ai/mxbai-embed-large-v1
    • sentence-transformers/all-MiniLM-L12-v2
    • sentence-transformers/paraphrase-multilingual-mpnet-base-v2
    • BAAI/bge-base-en-v1.5
    • BAAI/bge-large-en-v1.5
    • TencentBAC/Conan-embedding-v1
    • Linq-AI-Research/Linq-Embed-Mistral
    • Snowflake/snowflake-arctic-embed-l-v2.0
    • BAAI/bge-multilingual-gemma2

    Flow types

    • vanilla RAG
    • ReAct RAG agent
    • Critique RAG agent
    • Subquestion RAG

    Here’s the full list of all 23 silver bullets, sorted from low accuracy / low latency to high accuracy / high latency: silver_bullets.json.

    Try it yourself

    Want to experiment with these parametrizations? Use the running_flows.ipynb notebook in our syftr repository; just make sure you have access to the models listed above.

    For a deeper dive into syftr’s architecture and parameters, check out our technical paper or explore the codebase.

    We’ll also be presenting this work at the International Conference on Automated Machine Learning (AutoML) in September 2025 in New York City.


