
    Hitchhiker’s Guide to RAG with ChatGPT API and LangChain

    By Editor Times Featured | June 27, 2025 | 7 min read


    Generative models can generate tons of words and responses based on general knowledge, but what happens when we need answers requiring accurate and specific knowledge? Purely generative models frequently struggle to provide answers to domain-specific questions for a bunch of reasons; maybe the data they were trained on are now outdated, maybe what we're asking for is really specific and specialized, maybe we want responses that take into account personal or corporate data that just aren't public… 🤷‍♀️ the list goes on.

    So, how can we leverage generative AI while keeping our responses accurate, relevant, and down-to-earth? A good answer to this question is the Retrieval-Augmented Generation (RAG) framework. RAG is a framework that consists of two key components: retrieval and generation (duh!). Unlike purely generative models, which rely only on the data they were pre-trained on, RAG incorporates an extra retrieval step that allows us to push additional information into the model from an external source, such as a database or a document. To put it differently, a RAG pipeline allows for providing coherent and natural responses (provided by the generation step), which are also factually accurate and grounded in a knowledge base of our choice (provided by the retrieval step).

    In this way, RAG can be an extremely valuable tool for applications where highly specialized data is needed, for instance customer support, legal advice, or technical documentation. One typical example of a RAG application is customer support chatbots, answering customer issues based on a company's database of support documents and FAQs. Another example would be complex software or technical products with extensive troubleshooting guides. One more example would be legal advice: a RAG model would access and retrieve custom data from law libraries, previous cases, or firm guidelines. The examples are really endless; however, in all these cases, access to external, specific, and contextually relevant data enables the model to provide more precise and accurate responses.

    So, in this post, I walk you through building a simple RAG pipeline in Python, using the ChatGPT API, LangChain, and FAISS.

    What about RAG?

    From a more technical perspective, RAG is a technique used to enhance an LLM's responses by injecting additional, domain-specific information into its prompt. In essence, RAG allows a model to also take into account additional external information (like a recipe book, a technical manual, or a company's internal knowledge base) while forming its responses.

    This is important because it allows us to mitigate a bunch of problems inherent to LLMs, for instance:

    • Hallucinations: making things up
    • Outdated information: if the model wasn't trained on recent data
    • Transparency: not knowing where responses are coming from

    To make this work, the external documents are first processed into vector embeddings and stored in a vector database. Then, when we submit a prompt to the LLM, any relevant data is retrieved from the vector database and passed to the LLM along with our prompt. As a result, the response of the LLM is formed by considering both our prompt and any relevant information present in the vector database in the background. Such a vector database can be hosted locally or in the cloud, using a service like Pinecone or Weaviate.

    Image by author
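To make the retrieval step more concrete, here is a minimal sketch in plain Python. It is not what LangChain or FAISS do internally: the `embed` function is a toy stand-in (a bag-of-words count vector) for a real embedding model, and the names `embed`, `cosine_similarity`, and `retrieve` are hypothetical helpers introduced only for illustration. The idea it shows is the same, though: turn documents and the query into vectors, rank the documents by similarity to the query, and keep the top matches.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vector, embed(doc)),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical mini knowledge base, used only for this illustration
documents = [
    "FAISS stores vectors and searches them by similarity",
    "LangChain loads documents and splits them into chunks",
    "The system prompt sets the behavior of the assistant",
]

top = retrieve("how are documents split into chunks", documents, k=1)
print(top)
```

In a real pipeline the retrieved text would then be passed to the LLM together with the user's prompt, and the toy `embed` would be replaced by an actual embedding model.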

    What about ChatGPT API, LangChain, and FAISS?

    The first component for building a RAG pipeline is the LLM that will generate the responses. This can be any LLM, like Gemini or Claude, but in this post, I will be using OpenAI's ChatGPT models via their API platform. In order to use their API, we need to sign up and obtain an API key. We also need to make sure the respective Python libraries are installed.

    pip install openai

    The other major component of building a RAG is processing external data: generating embeddings from documents and storing them in a vector database. One of the most popular frameworks for performing such a task is LangChain. In particular, LangChain allows us to:

    • Load and extract text from various document types (PDFs, DOCX, TXT, etc.)
    • Split the text into chunks suitable for generating the embeddings
    • Generate vector embeddings (in this post, with the assistance of OpenAI's API)
    • Store and search embeddings via vector stores like FAISS, Chroma, and Pinecone
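The chunking step in the list above can be sketched in a few lines of plain Python. This is only an illustration of the idea (fixed-size character windows with some overlap, so that a sentence cut at a boundary still appears whole in a neighboring chunk); it is not LangChain's own splitter, and `split_into_chunks` with its `chunk_size` and `overlap` parameters is a hypothetical helper.

```python
def split_into_chunks(text, chunk_size=200, overlap=50):
    """Split text into character windows of chunk_size, each sharing
    `overlap` characters with the previous window."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks

sample = "abcdefghij" * 3  # 30 characters of dummy text
chunks = split_into_chunks(sample, chunk_size=12, overlap=4)
for chunk in chunks:
    print(repr(chunk))
```

Smaller chunks tend to make retrieval more precise but carry less context each, so the chunk size and overlap are usually tuned per use case.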

    We can easily install the required LangChain libraries by:

    pip install langchain langchain-community langchain-openai

    In this post, I'll be using LangChain together with FAISS, a vector similarity-search library developed by Facebook AI Research that can serve as a local vector database. FAISS is a very lightweight package, and is thus appropriate for building a simple/small RAG pipeline. It can be easily installed with:

    pip install faiss-cpu

    Putting everything together

    So, in summary, I'll use:

    • ChatGPT models via OpenAI's API as the LLM
    • LangChain, along with OpenAI's API, to load the external files, process them, and generate the vector embeddings
    • FAISS to generate a local vector database

    The file that I will be feeding into the RAG pipeline for this post is a text file with some info about me. This text file is located in the folder 'rag_files'.

    Now we're all set up, and we can start by specifying our API key and initializing our model:

    from langchain_openai import ChatOpenAI

    # ChatGPT API key
    api_key = "your key"

    # initialize LLM
    llm = ChatOpenAI(openai_api_key=api_key, model="gpt-4o-mini", temperature=0.3)

    Then we can load the files we want to use for the RAG, generate the embeddings, and store them as a vector database as follows:

    import os
    from langchain_community.document_loaders import TextLoader
    from langchain_community.vectorstores import FAISS
    from langchain_openai import OpenAIEmbeddings

    # load the documents to be used for RAG
    text_folder = "rag_files"

    all_documents = []
    for filename in os.listdir(text_folder):
        if filename.lower().endswith(".txt"):
            file_path = os.path.join(text_folder, filename)
            loader = TextLoader(file_path)
            all_documents.extend(loader.load())

    # generate embeddings
    embeddings = OpenAIEmbeddings(openai_api_key=api_key)

    # create vector database with FAISS and expose it as a retriever
    vector_store = FAISS.from_documents(all_documents, embeddings)
    retriever = vector_store.as_retriever()

    Finally, we can wrap everything in a simple executable Python file:

    def main():
        print("Welcome to the RAG Assistant. Type 'exit' to quit.\n")

        while True:
            user_input = input("You: ").strip()
            if user_input.lower() == "exit":
                print("Exiting…")
                break

            # get relevant documents (invoke replaces the deprecated get_relevant_documents)
            relevant_docs = retriever.invoke(user_input)
            retrieved_context = "\n\n".join([doc.page_content for doc in relevant_docs])

            # system prompt
            system_prompt = (
                "You are a helpful assistant. "
                "Use ONLY the following knowledge base context to answer the user. "
                "If the answer is not in the context, say you don't know.\n\n"
                f"Context:\n{retrieved_context}"
            )

            # messages for LLM
            messages = [
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_input}
            ]

            # generate response
            response = llm.invoke(messages)
            assistant_message = response.content.strip()
            print(f"\nAssistant: {assistant_message}\n")

    if __name__ == "__main__":
        main()

    Notice how the system prompt is defined. Essentially, a system prompt is an instruction given to the LLM that sets the behavior, tone, or constraints of the assistant before the user interacts. For example, we could set the system prompt to make the LLM provide responses like talking to a 4-year-old or a rocket scientist; here we ask it to provide responses based only on the external data we provided, the 'Maria info'.
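To see the prompt assembly on its own, without calling the API, the same logic can be pulled into a small function. This is a sketch under the assumption that the retrieved chunks are already available as plain strings; `build_messages` is a hypothetical helper name, and the sample question and chunks are placeholders.

```python
def build_messages(user_input, retrieved_chunks):
    """Build the system and user messages for a context-restricted answer."""
    context = "\n\n".join(retrieved_chunks)
    system_prompt = (
        "You are a helpful assistant. "
        "Use ONLY the following knowledge base context to answer the user. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}"
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_input},
    ]

# Placeholder inputs, only to show the shape of the result
messages = build_messages("What is in the file?", ["chunk one", "chunk two"])
for message in messages:
    print(message["role"], "->", message["content"])
```

Keeping this step in its own function makes it easy to inspect exactly what the model receives, which helps when debugging why an answer did or did not use the retrieved context.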

    So, let’s see what we’ve cooked! 🍳

    Firstly, I ask a question that's irrelevant to the provided external data source, to make sure that the model only uses the provided data source when forming the responses and not general knowledge.


    … and then I asked some questions specifically about the file I provided…

    ✨✨✨✨

    On my mind

    Obviously, this is a very simplistic example of a RAG setup; there's much more to consider when implementing it in a real business environment, such as security concerns around how data is handled, or performance issues when dealing with a larger, more realistic knowledge corpus and increased token usage. Nonetheless, I believe OpenAI's API is really impressive and offers immense, untapped potential for building custom, context-specific AI applications.


