
    Generating Structured Outputs from LLMs

    By Editor Times Featured | August 8, 2025 | 12 min read


    The interface for interacting with LLMs is usually the classic chat UI found in ChatGPT, Gemini, or DeepSeek. The interface is quite simple: the user inputs a body of text and the model responds with another body, which may or may not follow a specific structure. Since humans can understand unstructured natural language, this interface is suitable and quite effective for the target audience it was designed for.

    However, the user base of LLMs is much larger than the 8 billion humans living on Earth. It extends to millions of software programs that can potentially harness the power of such large generative models. Unlike humans, software programs cannot reliably interpret unstructured data, which prevents them from exploiting the knowledge generated by these neural networks.

    To address this issue, various techniques have been developed to generate outputs from LLMs following a predefined schema. This article gives an overview of three of the most popular approaches for producing structured outputs from LLMs. It is written for engineers interested in integrating LLMs into their software applications.

    Structured Output Generation

    Structured output generation from LLMs involves using these models to produce data that adheres to a predefined schema, rather than generating unstructured text. The schema can be defined in various formats, with JSON and regex being the most common. For example, when using JSON format, the schema specifies the expected keys and the data types (such as int, string, float, etc.) for each value. The LLM then outputs a JSON object that includes only the defined keys and correctly formatted values.

    There are many situations where structured output is needed from LLMs. Formatting unstructured bodies of text is one large application area of this technology. You can use a model to extract specific information from large bodies of text or even images (using VLMs). For example, you can use a general VLM to extract the purchase date, total price, and store name from receipts.
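As a rough illustration of what "adhering to a schema" means in practice, the sketch below checks a model's raw JSON reply against an expected set of keys and value types for the receipt example. The `RECEIPT_SCHEMA` mapping and the `conforms` helper are hypothetical names invented for this example; real projects would more likely rely on JSON Schema or Pydantic.

```python
import json

# Hypothetical schema: each expected key mapped to the Python type of its value.
RECEIPT_SCHEMA = {"store_name": str, "purchase_date": str, "total_price": float}


def conforms(raw_output: str, schema: dict) -> bool:
    """Return True if raw_output is a JSON object with exactly the schema's keys and types."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        return False  # not valid JSON at all
    if not isinstance(data, dict) or set(data) != set(schema):
        return False  # missing or extra keys
    return all(isinstance(data[key], expected) for key, expected in schema.items())


good = '{"store_name": "ACME", "purchase_date": "2025-08-01", "total_price": 19.99}'
bad = '{"store_name": "ACME", "total_price": "19.99"}'  # missing key, wrong type
print(conforms(good, RECEIPT_SCHEMA))
print(conforms(bad, RECEIPT_SCHEMA))
```

The check is deliberately strict: a reply with an extra key, a missing key, or a price given as a string is rejected, which is the behaviour a downstream program usually needs.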

    There are various techniques to generate structured outputs from LLMs. This article will discuss three.

    1. Relying on API Providers
    2. Prompting and Reprompting Strategies
    3. Constrained Decoding

    Relying on API Providers' 'Magic'

    Several LLM service API providers, including OpenAI and Google's Gemini, allow users to define a schema for the model's output. This schema is usually defined using a Pydantic class and provided to the API endpoint. If you are using LangChain, you can follow the LangChain structured output tutorial to integrate structured outputs into your application.

    Simplicity is the greatest aspect of this particular approach. You define the required schema in a manner familiar to you, pass it to the API provider, and sit back and relax as the service provider performs all the magic for you.

    Using this approach, however, will limit you to API providers that offer the described service. This limits the growth and flexibility of your projects, as it shuts the door to using multiple models, particularly open source ones. If the API provider suddenly decides to spike the price of the service, you will be forced either to accept the extra costs or to look for another provider.

    Moreover, it is not exactly Hogwarts magic that the service provider does. The provider follows a certain approach to generate the structured output for you. Knowledge of the underlying technology will facilitate app development and speed up debugging and error understanding. For these reasons, grasping the underlying science is probably worth the effort.

    Prompting and Reprompting-Based Techniques

    If you have chatted with an LLM before, then this technique is probably on your mind. If you want a model to follow a certain structure, just tell it to do so! In the system prompt, instruct the model to follow a certain structure, provide a few examples, and ask it not to add any extra text or description.

    After the model responds to the user request and the system receives the output, you should use a parser to transform the sequence of bytes into an appropriate representation in the system. If parsing succeeds, then congratulate yourself and thank the power of prompt engineering. If parsing fails, then your system will have to recover from the error.

    Prompting is Not Enough

    The problem with prompting is unreliability. On its own, prompting is not enough to trust a model to follow a required structure. The model might add extra explanation, disregard certain fields, or use an incorrect data type. Prompting can and should be coupled with error recovery techniques that handle the case where the model defies the schema, which is detected by a parsing failure.

    Some people might think that a parser acts like a boolean function. It takes a string as input, checks its adherence to predefined grammar rules, and returns a simple 'yes' or 'no' answer. In reality, parsers are more complex than that and provide much richer information than 'follows' or 'does not follow' the structure.

    Parsers can detect errors and incorrect tokens in input text according to grammar rules (Aho et al. 2007, 192–96). This gives us valuable information on the specifics of misalignments in the input string. For example, the parser is what detects a missing semicolon error when you are compiling Java code.
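To make the point about richer parser feedback concrete, here is a small sketch using Python's standard `json` module. The `describe_parse_error` helper is a hypothetical name for this example; the `msg`, `lineno`, `colno`, and `pos` attributes are part of the standard `json.JSONDecodeError`.

```python
import json
from typing import Optional


def describe_parse_error(text: str) -> Optional[str]:
    """Return None if text is valid JSON, else a description of what failed and where."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # The parser reports the kind of error and its location, not just "invalid".
        return f"{err.msg} at line {err.lineno}, column {err.colno} (char {err.pos})"
    return None


# A model reply with a structural mistake: the comma between the two fields is missing.
reply = '{"name": "John" "age": 30}'
feedback = describe_parse_error(reply)
print(feedback)
```

A message like this can be passed back to the model in a reprompt, which tends to be more useful than simply saying the output was wrong.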

    Figure 1 depicts the flow used in the prompting-based techniques.

    Figure 1: General Flow of Prompting and Reprompting Techniques. Generated using Mermaid by the author.
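The flow in Figure 1 can be sketched as a short loop. This is a minimal sketch, not how any particular library implements it: `ask_model` is a hypothetical callable standing in for a real LLM call, and the wording of the reprompt message is an assumption.

```python
import json
from typing import Callable, Optional


def generate_with_reprompts(
    ask_model: Callable[[str], str],  # hypothetical: takes a prompt, returns the raw reply
    prompt: str,
    max_attempts: int = 3,
) -> Optional[dict]:
    """Prompt, parse, and reprompt with the parser's error until parsing succeeds."""
    current_prompt = prompt
    for _ in range(max_attempts):
        reply = ask_model(current_prompt)
        try:
            return json.loads(reply)  # parsing succeeded: we are done
        except json.JSONDecodeError as err:
            # Feed the parser's diagnosis back to the model and try again.
            current_prompt = (
                f"{prompt}\n\nYour previous reply was not valid JSON "
                f"({err.msg} at char {err.pos}). Reply with JSON only."
            )
    return None  # give up once max_attempts is reached


# Stand-in for a real LLM: the first reply has extra text, the second is valid JSON.
replies = iter(['Sure! {"name": "John"}', '{"name": "John", "age": 30}'])
result = generate_with_reprompts(lambda _prompt: next(replies), "Extract: John is 30")
print(result)
```

A fuller version would also validate keys and types after parsing, and treat a validation failure the same way as a parsing failure.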

    Prompting Tools

    One of the most popular libraries for prompt-based structured output generation from LLMs is Instructor. Instructor is a Python library with over 11k stars on GitHub. It supports data definition with Pydantic, integrates with over 15 providers, and provides automatic retries on parsing failure. In addition to Python, the package is also available in TypeScript, Go, Ruby, and Rust (2).

    The beauty of Instructor lies in its simplicity. All you need is to define a Pydantic class, initialize a client using only its name and API key (if required), and pass your request. The sample code below, from the docs, shows the simplicity of Instructor.

    import instructor
    from pydantic import BaseModel
    from openai import OpenAI


    # The schema the model's answer must follow.
    class Person(BaseModel):
        name: str
        age: int
        occupation: str


    # Wrap the OpenAI client so that it accepts a response_model.
    client = instructor.from_openai(OpenAI())
    person = client.chat.completions.create(
        model="gpt-4o-mini",
        response_model=Person,
        messages=[
            {
              "role": "user",
              "content": "Extract: John is a 30-year-old software engineer"
            }
        ],
    )
    print(person)  # Person(name='John', age=30, occupation='software engineer')

    The Cost of Reprompting

    As convenient as the reprompting technique might be, it comes at a hefty cost. LLM usage cost, whether service provider API fees or GPU usage, scales linearly with the number of input tokens and the number of generated tokens.

    As mentioned earlier, prompting-based techniques might require reprompting. A reprompt has roughly the same cost as the original prompt. Hence, the cost scales linearly with the number of reprompts.

    If you are going to use this approach, you have to keep the cost problem in mind. No one wants to be surprised by a large bill from an API provider. One way to cut surprising costs is to put emergency brakes into the system by applying a hard-coded limit on the number of allowed reprompts. This puts an upper limit on the cost of a single prompt and reprompt cycle.
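The upper limit can be estimated with simple arithmetic. The sketch below assumes, as stated above, that every reprompt costs roughly as much as the original request; the function name and the per-token prices are made up for illustration and are not any provider's actual pricing.

```python
def worst_case_cost(
    input_tokens: int,
    output_tokens: int,
    max_reprompts: int,
    price_per_input_token: float,
    price_per_output_token: float,
) -> float:
    """Upper bound on one request's cost if every allowed reprompt is used."""
    cost_per_attempt = (
        input_tokens * price_per_input_token + output_tokens * price_per_output_token
    )
    # One original attempt plus at most max_reprompts retries.
    return (1 + max_reprompts) * cost_per_attempt


# Made-up prices for illustration only; check your provider's actual pricing.
bound = worst_case_cost(
    input_tokens=1_000,
    output_tokens=200,
    max_reprompts=2,
    price_per_input_token=1e-6,
    price_per_output_token=4e-6,
)
print(f"${bound:.4f}")
```

In practice a reprompt is usually slightly longer than the original prompt because it carries the error message, so the real bound is a little higher than this estimate.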

    Constrained Decoding

    Unlike prompting, constrained decoding does not need retries to generate a valid, structure-following output. Constrained decoding uses computational linguistics techniques and knowledge of the token generation process in LLMs to generate outputs that are guaranteed to follow the required schema.

    How It Works

    LLMs are autoregressive models. They generate one token at a time and the generated tokens are used as inputs to the same model.

    The last layer of an LLM is essentially a multinomial logistic regression model that calculates, for each token in the model's vocabulary, the probability of it following the input sequence. The model first computes a logit for each token; the softmax function then scales these values and transforms them into probabilities.
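For readers who prefer code to prose, the softmax step can be sketched as follows. The four-token vocabulary and the logit values are invented for this example; a real model has tens of thousands of tokens.

```python
import math


def softmax(logits: list) -> list:
    """Turn raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy vocabulary of four tokens with made-up logits.
vocab = ["a", "b", "c", "d"]
probs = softmax([2.0, 1.0, 0.5, -1.0])
for token, p in zip(vocab, probs):
    print(f"{token}: {p:.3f}")
```

The token with the largest logit always receives the largest probability, and the next token is then sampled from this distribution.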

    Constrained decoding produces structured outputs by limiting the available tokens at each generation step. The tokens are picked so that the final output obeys the required structure. To figure out how the set of possible next tokens can be determined, we need to visit RegEx.

    Regular expressions (RegEx) are used to define specific patterns of text. They are used to check whether a sequence of text matches an expected structure or schema. So RegEx is, in effect, a language for defining the structures we expect from LLMs. Because of its popularity, there is a wide array of tools and libraries that transform other forms of data structure definition, such as Pydantic classes and JSON, into RegEx. Given this flexibility and the wide availability of conversion tools, we can narrow our goal and focus on using LLMs to generate outputs that follow a RegEx pattern.

    Deterministic Finite Automata (DFA)

    One of the ways a RegEx pattern can be compiled and tested against a body of text is by transforming the pattern into a deterministic finite automaton (DFA). A DFA is simply a state machine that is used to check whether a string follows a certain structure or pattern.

    A DFA consists of 5 components.

    1. A set of tokens (called the alphabet of the DFA)
    2. A set of states
    3. A set of transitions. Each transition connects two states (maybe connecting a state with itself) and is annotated with a token from the alphabet
    4. A start state (marked with an input arrow)
    5. One or more final states (marked as double circles)

    A string is a sequence of tokens. To test a string against the pattern defined by a DFA, you begin at the start state and loop over the string’s tokens, taking the transition corresponding to the token at each move. If at any point you have a token for which no corresponding transition exists from the current state, parsing fails and the string defies the schema. If parsing ends at one of the final states, then the string matches the pattern; otherwise it also fails.

    Figure 2: Example of a DFA with the alphabet {a, b}, states {q0, q1, q2}, and a single final state, q2. Generated using Graphviz by the author.

    For example, the string abab matches the pattern in Figure 2 because starting at q0 and following the transitions marked with a, b, a, and b, in this order, lands us at q2, which is a final state.

    On the other hand, the string abba does not match the pattern because its path ends at q0, which is not a final state.
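The matching procedure can be sketched in a few lines of Python. Because the transitions of Figure 2 are not spelled out in the text, this sketch instead uses the small DFA for the RegEx a(b|c)*d that appears later in the article. The dictionary encoding and the `accepts` helper are choices made for this example, not a standard API.

```python
# Transition table: (current state, token) -> next state.
TRANSITIONS = {
    ("q0", "a"): "q1",
    ("q1", "b"): "q1",
    ("q1", "c"): "q1",
    ("q1", "d"): "q2",
}
START_STATE = "q0"
FINAL_STATES = {"q2"}


def accepts(tokens: str) -> bool:
    """Walk the DFA over the tokens; accept only if we end in a final state."""
    state = START_STATE
    for token in tokens:
        key = (state, token)
        if key not in TRANSITIONS:
            return False  # no transition for this token: the string defies the pattern
        state = TRANSITIONS[key]
    return state in FINAL_STATES


print(accepts("abcbd"))  # True: ends at q2
print(accepts("abc"))    # False: ends at q1, which is not final
print(accepts("adb"))    # False: no transition leaves q2
```

Here each character is treated as one token, which keeps the example small; in an LLM setting the tokens come from the model's vocabulary.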

    A great thing about RegEx is that it can be compiled into a DFA; after all, they are just two different ways to specify patterns. Discussion of such a transformation is out of scope for this article. The reader can check Aho et al. (2007, 152–66) for a discussion of two techniques to perform the transformation.

    DFA for Valid Next Tokens Set

    Figure 3: Example of a DFA generated from the RegEx a(b|c)*d. Generated using Graphviz by the author.

    Let's recap what we have reached so far. We wanted a way to determine the set of valid next tokens that follow a certain schema. We defined the schema using RegEx and transformed it into a DFA. Now we are going to show that a DFA tells us the set of possible tokens at any point during parsing, which is exactly what we need.

    After building the DFA, we can easily determine in O(1) the set of valid next tokens while standing at any state. It is the set of tokens annotating the transitions exiting from the current state.

    Consider the DFA in Figure 3, for example. The following table shows the set of valid next tokens for each state.

    State   Valid Next Tokens
    q0      {a}
    q1      {b, c, d}
    q2      {}
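The table above can be reproduced with a short sketch. It encodes the DFA for a(b|c)*d as a dictionary (an encoding chosen for this example) and precomputes, for every state, the set of tokens on its outgoing transitions, so that each lookup is a single dictionary access.

```python
from collections import defaultdict

# Transition table for the RegEx a(b|c)*d: (current state, token) -> next state.
TRANSITIONS = {
    ("q0", "a"): "q1",
    ("q1", "b"): "q1",
    ("q1", "c"): "q1",
    ("q1", "d"): "q2",
}

# Built once up front: state -> tokens annotating its outgoing transitions.
VALID_NEXT = defaultdict(set)
for state, token in TRANSITIONS:
    VALID_NEXT[state].add(token)


def valid_next_tokens(state: str) -> set:
    """Constant-time lookup of the tokens allowed from the given state."""
    return VALID_NEXT.get(state, set())


for state in ("q0", "q1", "q2"):
    print(state, sorted(valid_next_tokens(state)))
# q0 ['a']
# q1 ['b', 'c', 'd']
# q2 []
```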

    Applying the DFA to LLMs

    Getting back to our problem of structured output from LLMs, we can transform our schema to a RegEx and then to a DFA. The alphabet of this DFA is set to the LLM's vocabulary (the set of all tokens the model can generate). While the model generates tokens, we move through the DFA, beginning at the start state. At each step, we can determine the set of valid next tokens.

    The trick now happens at the softmax scaling stage. By masking out the logits of all tokens that are not in the valid tokens set (in practice, setting them to negative infinity so that their probability becomes zero), we calculate probabilities only for valid tokens, forcing the model to generate a sequence of tokens that follows the schema. That way, we can generate structured outputs without any retries and at essentially no extra generation cost!
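The whole idea can be sketched end to end with a toy setup. Everything here is an assumption made for illustration: the vocabulary has four single-character tokens, `fake_logits` returns random numbers in place of a real model's forward pass, and the DFA is the one for a(b|c)*d. Invalid tokens get a logit of negative infinity, which is what makes their softmax probability exactly zero.

```python
import math
import random

# DFA for the RegEx a(b|c)*d: (current state, token) -> next state.
TRANSITIONS = {
    ("q0", "a"): "q1",
    ("q1", "b"): "q1",
    ("q1", "c"): "q1",
    ("q1", "d"): "q2",
}
FINAL_STATES = {"q2"}
VOCAB = ["a", "b", "c", "d"]


def masked_softmax(logits, allowed):
    """Softmax over the allowed token indices; every other token gets probability 0."""
    masked = [x if i in allowed else -math.inf for i, x in enumerate(logits)]
    peak = max(masked)
    exps = [math.exp(x - peak) for x in masked]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]


def fake_logits(rng):
    """Stand-in for the LLM's forward pass: one random logit per vocabulary token."""
    return [rng.uniform(-2.0, 2.0) for _ in VOCAB]


def generate(rng, max_tokens=20):
    """Sample tokens while walking the DFA, stopping at a final state or the cap."""
    state, output = "q0", []
    while state not in FINAL_STATES and len(output) < max_tokens:
        allowed = {i for i, tok in enumerate(VOCAB) if (state, tok) in TRANSITIONS}
        probs = masked_softmax(fake_logits(rng), allowed)
        index = rng.choices(range(len(VOCAB)), weights=probs)[0]
        output.append(VOCAB[index])
        state = TRANSITIONS[(state, VOCAB[index])]
    return "".join(output)


print(generate(random.Random(0)))
```

Even though the logits are random, every sampled token is a legal transition, so the output always follows the pattern as far as it goes. The `max_tokens` cap is a simplification of this sketch and can cut a string short before it reaches the final state.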

    Constrained Decoding Tools

    One of the most popular Python libraries for constrained decoding is Outlines (Willard and Louf 2023). It is very simple to use and integrates with many LLM providers like OpenAI, Anthropic, Ollama, and vLLM.

    You can define the schema using a Pydantic class, for which the library handles the RegEx transformation, or directly using a RegEx pattern.

    from pydantic import BaseModel
    from typing import Literal
    import outlines
    import openai

    # The schema the output must follow.
    class Customer(BaseModel):
        name: str
        urgency: Literal["high", "medium", "low"]
        issue: str

    client = openai.OpenAI()
    model = outlines.from_openai(client, "gpt-4o")

    customer = model(
        "Alice needs help with login issues ASAP",
        Customer
    )
    # ✓ Always returns valid Customer object
    # ✓ No parsing, no errors, no retries

    The code snippet above, from the docs, shows the simplicity of using Outlines. For more information on the library, you can check the docs and the dottxt blogs.

    Conclusion

    Structured output generation from LLMs is a powerful tool that expands the possible use cases of LLMs beyond simple human chat. This article discussed three approaches: relying on API providers, prompting and reprompting strategies, and constrained decoding. For most scenarios, constrained decoding is the favoured method because of its flexibility and low cost. Moreover, the existence of popular libraries like Outlines simplifies the introduction of constrained decoding to software projects.

    If you want to learn more about constrained decoding, then I would highly recommend the course from deeplearning.ai and dottxt, the creators of the Outlines library. Using videos and code examples, this course will help you get hands-on experience getting structured outputs from LLMs using the techniques discussed in this post.

    References

    [1] Aho, Alfred V., Monica S. Lam, Ravi Sethi, and Jeffrey D. Ullman, Compilers: Principles, Techniques, & Tools (2007), Pearson/Addison Wesley

    [2] Willard, Brandon T., and Rémi Louf, Efficient Guided Generation for Large Language Models (2023), https://arxiv.org/abs/2307.09702.


