
    Building Research Agents for Tech Insights

    By Editor Times Featured · September 13, 2025 · 11 Mins Read


    If you asked ChatGPT something like: “Please scout all of tech for me and summarize trends and patterns based on what you think I’d be interested in,” you’d get something generic, where it searches a few websites and news sources and hands you those.

    This is because ChatGPT is built for general use cases. It applies standard search methods to fetch information, often limiting itself to a few web pages.

    This article will show you how to build a niche agent that can scout all of tech, aggregate millions of texts, filter data based on a persona, and find patterns and themes you can act on.

    The goal of this workflow is to avoid sitting and scrolling through forums and social media on your own. The agent should do it for you, grabbing whatever is useful.


    We’ll be able to pull this off using a unique data source, a controlled workflow, and some prompt-chaining techniques.

    The three different processes: the API, fetching/filtering data, summarizing | Image by author

    By caching data, we can keep the cost down to a few cents per report.

    If you want to try the bot without booting it up yourself, you can join this Discord channel. You’ll find the repository here if you want to build it on your own.

    This article focuses on the general architecture and how to build it, not the smaller coding details, as you can find those on GitHub.

    Notes on building

    If you’re new to building with agents, you might feel like this one isn’t groundbreaking enough.

    However, if you want to build something that works, you’ll need to apply a fair amount of software engineering to your AI applications. Even if LLMs can now act on their own, they still need guidance and guardrails.

    For workflows like this, where there’s a clear path the system should take, you should build more structured “workflow-like” systems. If you have a human in the loop, you can work with something more dynamic.

    The reason this workflow works so well is that I have a good data source behind it. Without this data moat, the workflow wouldn’t be able to do better than ChatGPT.

    Preparing and caching data

    Before we can build an agent, we need to prepare a data source it can tap into.

    Something I think a lot of people get wrong when they work with LLM systems is the belief that AI can process and aggregate data entirely on its own.

    At some point, we may be able to give them enough tools to build on their own, but we’re not there yet in terms of reliability.

    So when we build systems like this, we need data pipelines to be just as clean as for any other system.

    The system I’ve built here uses a data source I already had available, which means I understand how to teach the LLM to tap into it.

    It ingests thousands of texts from tech forums and websites per day and uses small NLP models to break down the main keywords, categorize them, and analyze sentiment.

    This lets us see which keywords are trending within different categories over a specific time period.
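    The article doesn’t show how “trending” is computed, but the idea can be sketched as comparing mention counts across two periods. The function below is a minimal illustration, not the author’s code; the name and the baseline-of-1 rule for previously unseen keywords are assumptions:

```python
from collections import Counter

def trending_keywords(current_counts: Counter, previous_counts: Counter, top_n: int = 5):
    """Rank keywords by growth in mentions between two periods.

    A keyword unseen in the previous period gets a baseline of 1
    to avoid division by zero (and so brand-new terms rank high).
    """
    growth = {
        kw: count / max(previous_counts.get(kw, 0), 1)
        for kw, count in current_counts.items()
    }
    return sorted(growth, key=growth.get, reverse=True)[:top_n]

# Example: "rust" is brand new, "agents" tripled, "python" stayed flat.
current = Counter({"agents": 30, "python": 50, "rust": 8})
previous = Counter({"agents": 10, "python": 50})
print(trending_keywords(current, previous, top_n=2))  # → ['rust', 'agents']
```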


    To build this agent, I added another endpoint that collects “facts” for each of these keywords.

    This endpoint receives a keyword and a time period, and the system sorts comments and posts by engagement. Then it processes the texts in chunks with smaller models that can decide which “facts” to keep.

    The “facts” extracting process for each keyword | Image by author
    We apply a final LLM to summarize which facts are most important, keeping the source citations intact.


    This is a kind of prompt-chaining process, and I built it to mimic LlamaIndex’s citation engine.

    The first time the endpoint is called for a keyword, it can take up to half a minute to complete. But since the system caches the result, any repeat request takes only a few milliseconds.
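    A cache keyed on (keyword, time period) with a daily TTL is one way to get this slow-first-call, fast-repeat behavior. The class below is a hypothetical in-memory sketch, not the repository’s implementation:

```python
import time

class FactsCache:
    """Cache expensive keyword fact extraction with a daily TTL."""

    def __init__(self, ttl_seconds: int = 24 * 3600):
        self.ttl = ttl_seconds
        self._store: dict[tuple[str, str], tuple[float, list[str]]] = {}

    def get_or_compute(self, keyword: str, period: str, compute):
        key = (keyword, period)
        hit = self._store.get(key)
        if hit and time.time() - hit[0] < self.ttl:
            return hit[1]  # cache hit: milliseconds
        facts = compute(keyword, period)  # cache miss: ~30s of LLM work
        self._store[key] = (time.time(), facts)
        return facts

cache = FactsCache()
calls = []

def slow_extract(kw, period):
    calls.append(kw)  # record how often the expensive path actually runs
    return [f"fact about {kw}"]

cache.get_or_compute("agents", "weekly", slow_extract)
cache.get_or_compute("agents", "weekly", slow_extract)
print(len(calls))  # → 1, the second call was served from cache
```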

    As long as the models are small enough, the cost of running this on a few hundred keywords per day is minimal. Later, we can have the system run multiple keywords in parallel.

    You can probably imagine by now that we can build a system to fetch these keywords and facts to build different reports with LLMs.

    When to work with small vs. larger models

    Before moving on, let’s just note that choosing the right model size matters.

    I think this is on everyone’s mind right now.

    There are quite advanced models you can use for any workflow, but as we start to apply more and more LLMs in these applications, the number of calls per run adds up quickly, and this can get expensive.

    So, when you can, use smaller models.

    You saw that I used smaller models to cite and group sources in chunks. Other tasks that are great for small models include routing and parsing natural language into structured data.

    If you find that the model is faltering, you can break the task down into smaller problems and use prompt chaining: first do one thing, then use that result to do the next, and so on.
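    That chaining pattern can be sketched in a few lines. The `chain` helper and the stub LLM below are illustrative assumptions, standing in for real API calls:

```python
def chain(steps, initial: str, llm):
    """Run a list of prompt templates in order, feeding each result
    into the next step's {input} slot."""
    text = initial
    for template in steps:
        text = llm(template.format(input=text))
    return text

# Stub LLM so the chain is runnable without an API key.
def fake_llm(prompt: str) -> str:
    return prompt.upper()

steps = [
    "Extract the key claims from: {input}",
    "Rank these claims by relevance: {input}",
]
result = chain(steps, "GPU prices dropped 20% this quarter.", fake_llm)
```

Each step sees only the previous step’s output, which keeps every individual prompt small enough for a cheap model.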

    You still want to use larger LLMs when you have to find patterns in very large texts, or when you’re talking with humans.

    In this workflow, the cost is minimal because the data is cached, we use smaller models for most tasks, and the only large LLM calls are the final ones.

    How this agent works

    Let’s go through how the agent works under the hood. I built the agent to run inside Discord, but that’s not the focus here. We’ll concentrate on the agent architecture.

    I split the process into two parts: one for setup, and one for news. The first process asks the user to set up their profile.

    Since I already know how to work with the data source, I’ve built a fairly extensive system prompt that helps the LLM translate these inputs into something we can fetch data with later.

    PROMPT_PROFILE_NOTES = """
    You are tasked with defining a user persona based on the user's profile summary.
    Your job is to:
    1. Write a short persona description for the user.
    2. Select the most relevant categories (major and minor).
    3. Choose keywords the user should track, strictly following the rules below (max 6).
    4. Decide on a time period (based only on what the user asks for).
    5. Decide whether the user prefers concise or detailed summaries.
    Step 1. Persona
    - Write a short description of how we should think about the user.
    - Examples:
    - CMO for non-technical product → "non-technical, skip jargon, focus on product keywords."
    - CEO → "only include highly relevant keywords, no technical overload, straight to the point."
    - Developer → "technical, interested in detailed developer conversation and technical terms."
    [...]
    """
    

    I’ve also defined a schema for the outputs I need:

    from typing import List
    from pydantic import BaseModel

    class ProfileNotesResponse(BaseModel):
        persona: str
        major_categories: List[str]
        minor_categories: List[str]
        keywords: List[str]
        time_period: str
        concise_summaries: bool

    Without domain knowledge of the API and how it works, it’s unlikely that an LLM would figure out how to do this on its own.

    You could try building a more extensive system where the LLM first tries to learn the API or the systems it’s supposed to use, but that would make the workflow more unpredictable and costly.

    For tasks like this, I try to always use structured outputs in JSON format. That way we can validate the result, and if validation fails, we re-run it.

    This is the simplest way to work with LLMs in a system, especially when there’s no human in the loop to check what the model returns.
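    The validate-and-re-run loop might look like the sketch below. For self-containment it checks the schema’s fields with the stdlib `json` module only; the actual code presumably validates with the Pydantic model shown earlier. The helper name and the stubbed responses are hypothetical:

```python
import json

# Fields from the ProfileNotesResponse schema.
REQUIRED_KEYS = {"persona", "major_categories", "minor_categories",
                 "keywords", "time_period", "concise_summaries"}

def call_with_retries(llm, prompt: str, max_attempts: int = 3) -> dict:
    """Ask the LLM for JSON and re-run until it parses and has the
    expected keys; give up after max_attempts."""
    for _ in range(max_attempts):
        raw = llm(prompt)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # invalid JSON → re-run
        if REQUIRED_KEYS <= data.keys():
            return data
    raise ValueError("LLM never returned valid profile JSON")

# Stub that fails once, then returns valid JSON.
answers = iter([
    "not json at all",
    '{"persona": "dev", "major_categories": [], "minor_categories": [],'
    ' "keywords": ["agents"], "time_period": "weekly",'
    ' "concise_summaries": true}',
])
profile = call_with_retries(lambda p: next(answers), "describe the user")
print(profile["keywords"])  # → ['agents']
```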

    Once the LLM has translated the user profile into the properties we defined in the schema, we store the profile somewhere. I used MongoDB, but that’s optional.

    Storing the persona isn’t strictly required, but you do need to translate what the user says into a form that lets you generate data.

    Generating the reports

    Let’s look at what happens in the second step, when the user triggers the report.

    When the user hits the /news command, with or without a time period set, we first fetch the user profile data we’ve stored.

    This gives the system the context it needs to fetch relevant data, using both categories and keywords tied to the profile. The default time period is weekly.

    From this, we get a list of top and trending keywords for the chosen time period that may be interesting to the user.

    Example of trending keywords that can come up from the system in two different categories | Image by author
    Without this data source, building something like this would have been difficult. The data needs to be prepared in advance for the LLM to work with it properly.

    After fetching keywords, it would make sense to add an LLM step that filters out keywords irrelevant to the user. I didn’t do that here.

    The more unnecessary information an LLM is handed, the harder it becomes for it to focus on what really matters. Your job is to make sure that whatever you feed it is relevant to the user’s actual question.

    Next, we use the endpoint prepared earlier, which contains cached “facts” for each keyword. This gives us already vetted and sorted information for each one.

    We run keyword calls in parallel to speed things up, but the first person to request a new keyword still has to wait a bit longer.
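    Thread-based fan-out is one simple way to run the keyword calls in parallel, since each call is network-bound. A sketch with a stand-in for the facts endpoint (the function names here are assumptions):

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_facts(keyword: str) -> list[str]:
    # Stand-in for the cached facts endpoint; in reality this is an
    # HTTP call, which is why threads are a reasonable fit.
    return [f"fact about {keyword}"]

def fetch_all(keywords: list[str]) -> dict[str, list[str]]:
    """Fetch facts for all keywords concurrently, preserving order."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        results = pool.map(fetch_facts, keywords)
    return dict(zip(keywords, results))

facts = fetch_all(["agents", "rust", "gpu"])
print(sorted(facts))  # → ['agents', 'gpu', 'rust']
```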

    Once the results are in, we combine the facts, remove duplicates, and parse the citations so each fact links back to a specific source via a keyword number.
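    A hypothetical sketch of that merge step, deduplicating on normalized text and tagging each fact with a numeric citation that maps back to its keyword:

```python
def merge_facts(per_keyword: dict[str, list[str]]):
    """Combine facts across keywords, drop duplicates, and attach a
    citation index so each fact links back to its keyword."""
    seen = set()
    merged = []
    sources = {}
    for i, (keyword, facts) in enumerate(per_keyword.items(), start=1):
        sources[i] = keyword
        for fact in facts:
            normalized = fact.strip().lower()
            if normalized in seen:
                continue  # same fact surfaced under another keyword
            seen.add(normalized)
            merged.append(f"{fact} [{i}]")
    return merged, sources

merged, sources = merge_facts({
    "agents": ["Agent frameworks are consolidating.", "Prices fell."],
    "gpu": ["Prices fell."],  # duplicate, dropped on merge
})
```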

    We then run the facts through a prompt-chaining process. The first LLM finds 5 to 7 themes and ranks them by relevance, based on the user profile. It also pulls out the key points.

    Short chain of prompting, breaking the task into smaller ones | Image by author
    The second LLM pass uses both the themes and the original data to generate two different summary lengths, along with a title.

    We do this to reduce cognitive load on the model. This final step to build the report takes the most time, since I chose to use a reasoning model like GPT-5.

    You can swap it for something faster, but I find advanced models are better at this last part.

    The full process takes a few minutes, depending on how much has already been cached that day.

    Check out the finished result below.

    How the tech scouting bot works in Discord | Image by author
    If you want to look at the code and build this bot yourself, you can find it here. If you just want to generate a report, you can join this channel.

    I have some plans to improve it, but I’m happy to hear feedback if you find it useful.

    And if you want a challenge, you can rebuild it into something else, like a content generator.

    Notes on building agents

    Every agent you build will be different, so this is by no means a blueprint for building with LLMs. But you can see the level of software engineering this demands.

    LLMs, at least for now, don’t remove the need for good software and data engineers.

    For this workflow, I’m mostly using LLMs to translate natural language into JSON and then move that through the system programmatically. It’s the simplest way to control the agent process, but also not what people usually imagine when they think of AI applications.

    There are situations where a more free-moving agent is ideal, especially when there’s a human in the loop.

    Still, hopefully you learned something, or got inspiration to build something on your own.

    If you want to follow my writing, follow me here, on my website, Substack, or LinkedIn.

    ❤



    Source link
