
    How to Ensure Reliability in LLM Applications

By Editor Times Featured · July 15, 2025 · 8 Mins Read


LLMs have entered the world of computer science at a record pace. They are powerful models capable of effectively performing a wide variety of tasks. However, LLM outputs are stochastic, making them unreliable. In this article, I discuss how you can ensure reliability in your LLM applications by properly prompting the model and handling the output.

This infographic highlights the contents of this article. I'll mainly discuss ensuring output consistency and handling errors. Image by ChatGPT.

You can also read my articles on Attending NVIDIA GTC Paris 2025 and Creating Powerful Embeddings for Machine Learning.

Table of Contents

    Motivation

My motivation for this article is that I'm constantly creating new applications using LLMs. LLMs are generalized tools that can be applied to most text-dependent tasks such as classification, summarization, information extraction, and much more. Furthermore, the rise of vision language models also enables us to handle images similarly to how we handle text.

I often encounter the problem that my LLM applications are inconsistent. Sometimes the LLM doesn't respond in the desired format, or I'm unable to properly parse the LLM response. This is a huge problem when you are working in a production setting and are fully dependent on consistency in your application. I'll thus discuss the techniques I use to ensure reliability for my applications in a production setting.

Ensuring output consistency

    Markup tags

To ensure output consistency, I use a technique where my LLM answers in markup tags. I use a system prompt like:

prompt = f"""
Classify the text into "Cat" or "Dog"

Provide your response in <answer></answer> tags

{text}
"""

And the model will almost always respond with:

<answer>Cat</answer>

or

<answer>Dog</answer>

You can then easily parse out the response using the following code:

def _parse_response(response: str):
    return response.split("<answer>")[1].split("</answer>")[0]

The reason using markup tags works so well is that this is how the models are trained to behave. When OpenAI, Qwen, Google, and others train these models, they use markup tags. The models are thus highly effective at utilizing these tags and will, in almost all cases, adhere to the expected response format.

For example, with reasoning models, which have been on the rise lately, the models first do their thinking enclosed in <think>...</think> tags, and then provide their answer to the user.


Furthermore, I also try to use as many markup tags as possible elsewhere in my prompts. For example, if I'm providing few-shot examples to my model, I'll do something like:

prompt = f"""
Classify the text into "Cat" or "Dog"

Provide your response in <answer></answer> tags

<example>
This is an image showing a cat -> <answer>Cat</answer>
</example>

<example>
This is an image showing a dog -> <answer>Dog</answer>
</example>
"""

I do two things that help the model perform here:

1. I provide examples in <example></example> tags.
2. In my examples, I make sure to adhere to my own expected response format, using the <answer></answer> tags.

Using markup tags, you can thus ensure a high level of output consistency from your LLM.

    Output validation

Pydantic is a tool you can use to validate the output of your LLMs. You can define types and validate that the output of the model adheres to the type you expect. For example, you can follow the example below, based on this article:

from pydantic import BaseModel
from openai import OpenAI

client = OpenAI()


class Profile(BaseModel):
    name: str
    email: str
    phone: str


resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": "Return the `name`, `email`, and `phone` of user {user} in a json object."
        },
    ]
)

Profile.model_validate_json(resp.choices[0].message.content)

As you can see, we prompt GPT to respond with a JSON object, and we then run Pydantic to ensure the response is as we expect.


I'd also like to note that sometimes it's easier to simply create your own output validation function. In the last example, the only requirements for the response object are essentially that it contains the keys name, email, and phone, and that all of them are of the string type. You can validate this in Python with a function:

def validate_output(output: dict):
    assert "name" in output and isinstance(output["name"], str)
    assert "email" in output and isinstance(output["email"], str)
    assert "phone" in output and isinstance(output["phone"], str)

With this, you don't have to install any packages, and in many cases, it's easier to set up.

Tweaking the system prompt

You can also make a few other tweaks to your system prompt to ensure a more reliable output. I always recommend making your prompt as structured as possible, using:

• Markup tags, as mentioned earlier
• Lists, such as the one I'm writing in here

In general, you should also always ensure clear instructions. You can use the following test to check the quality of your prompt:

If you gave the prompt to another human who had never seen the task before and had no prior knowledge of it, would that human be able to perform the task effectively?

If you can't have a human do the task, you usually can't expect an AI to do it (at least for now).

Handling errors

Errors are inevitable when dealing with LLMs. If you perform enough API calls, it's almost certain that sometimes the response will not be in your required format, or some other issue will occur.

In these scenarios, it's important that you have a robust application equipped to handle such errors. I use the following techniques to handle errors:

• Retry mechanism
• Increase the temperature
• Have backup LLMs

Now, let me elaborate on each point.

    Exponential backoff retry mechanism

It's important to have a retry mechanism in place, considering the many issues that can occur when making an API call. You might encounter issues such as rate limiting, incorrect output format, or a slow response. In these scenarios, you should make sure to wrap the LLM call in a try-catch and retry. Usually, it's also smart to use an exponential backoff, especially for rate-limiting errors, so that you wait long enough to avoid further rate-limiting issues.
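As a minimal sketch of this pattern, here is a generic retry helper with exponential backoff and jitter. The `retry_with_backoff` name and the delay schedule are illustrative choices, not a specific SDK's API; `fn` stands in for any zero-argument callable wrapping your LLM call:

```python
import random
import time


def retry_with_backoff(fn, max_retries=5, base_delay=1.0):
    """Call fn, retrying on failure with exponential backoff plus jitter.

    fn is any zero-argument callable wrapping an LLM call. Delays grow as
    base_delay * 2^attempt, with random jitter to avoid synchronized retries.
    """
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error to the caller
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            time.sleep(delay)
```

In practice you would likely catch only the exceptions you consider retryable (for example, rate-limit and timeout errors) rather than a bare `Exception`.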

Temperature increase

I also sometimes recommend increasing the temperature a bit. If you set the temperature to 0, you tell the model to behave deterministically. However, sometimes this can have a negative effect.

For example, suppose you have an input where the model failed to respond in the proper output format. If you retry this using a temperature of 0, you are likely to just hit the same issue. I thus recommend setting the temperature a bit higher, for example 0.1, to introduce some stochasticity into the model, while still keeping its outputs relatively deterministic.

This is the same logic that many agents use: a higher temperature helps them avoid getting stuck in a loop, and thereby avoid repeating the same errors.
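To sketch this idea, here is a retry loop that starts deterministic and nudges the temperature up on each failed parse. `call_model` is a hypothetical stand-in for your provider call, and the 0.0 → 0.1 → 0.2 schedule is just an illustrative choice:

```python
def _extract_answer(response: str):
    """Pull the text between <answer> tags; raises IndexError if absent."""
    return response.split("<answer>")[1].split("</answer>")[0]


def classify_with_retries(call_model, text, max_retries=3):
    """Retry a parse-sensitive LLM call, raising the temperature each attempt.

    call_model(text, temperature) is a placeholder for a real provider call.
    """
    for attempt in range(max_retries):
        temperature = 0.1 * attempt  # deterministic first, then add randomness
        response = call_model(text, temperature=temperature)
        try:
            return _extract_answer(response)
        except IndexError:
            continue  # malformed output; retry with a higher temperature
    raise ValueError("Model never produced a parseable response")
```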

    Backup LLMs

Another powerful method to deal with errors is to have backup LLMs. I recommend using a chain of LLM providers for all your API calls. For example, you first try OpenAI; if that fails, you use Gemini; and if that fails, you can use Claude.

This ensures reliability in the event of provider-specific issues. These could be issues such as:

• The server is down (for example, if OpenAI's API is unavailable for a period of time)
• Filtering (sometimes, an LLM provider will refuse to respond to your request if it believes your request violates jailbreak policies or content moderation)

In general, it's simply good practice not to be fully dependent on one provider.
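A provider chain like this can be sketched as a simple fallback loop. The `(name, callable)` shape is an assumption for illustration, not any particular SDK's interface; each callable takes the prompt and returns a string:

```python
def call_with_fallbacks(prompt, providers):
    """Try each provider in order until one succeeds.

    providers is a list of (name, callable) pairs. If every provider fails,
    raise a single error summarizing what went wrong at each step.
    """
    errors = []
    for name, call in providers:
        try:
            return call(prompt)
        except Exception as e:
            errors.append(f"{name}: {e}")  # record and fall through to the next
    raise RuntimeError("All providers failed: " + "; ".join(errors))
```

Each entry would wrap a real client call (OpenAI, Gemini, Claude) in practice, normalizing the response to a plain string so the fallbacks are interchangeable.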

    Conclusion

In this article, I've discussed how you can ensure reliability in your LLM application. LLM applications are inherently stochastic since you can't directly control the output of an LLM. It's thus important to have proper policies in place, both to minimize the errors that occur and to handle those errors when they do occur.

I've discussed the following approaches to minimize and handle errors:

• Markup tags
• Output validation
• Tweaking the system prompt
• Retry mechanism
• Increasing the temperature
• Backup LLMs

If you combine these techniques in your application, you can achieve both a powerful and robust LLM application.

👉 Follow me on socials:

    🧑‍💻 Get in touch
    🌐 Personal Blog
    🔗 LinkedIn
    🐦 X / Twitter
    ✍️ Medium
    🧵 Threads


