📙 This article is part of a multi-part series on creating web applications with generative AI integration. Part 1 focused on discussing the AI stack and why the application layer is the best place in the stack to be. Check it out here. Part 2 focused on why Ruby is the best web language for building AI MVPs. Check it out here. I highly recommend you read both parts before reading this article to get caught up on the terminology used here.
Table of Contents
Introduction
In this article, we will be conducting a fun thought experiment. We seek to answer the question:
How simple can we make a web application with AI integration?
My readers will know that I value simplicity very highly. Simple web apps are easier to understand, faster to build, and more maintainable. Of course, as the app scales, complexity arises out of necessity. But you always want to start simple.
We will take a typical case study for a web application with AI integration (RAG) and look at four different implementations. We will begin with the most complex setup, composed of the most popular tools, and attempt to simplify it step by step, until we end up with the simplest setup possible.
Why are we doing this?
I want to encourage developers to think more simply. Oftentimes, the "mainstream" path to building web apps or integrating AI is far too complex for the use case. Developers take inspiration from companies like Google or Apple, without acknowledging that the tools that work for them are often inappropriate for clients operating at a much smaller scale.
Grab a coffee or tea, and let's dive in.
Level 1: As Complex As It Gets
Suppose a client has asked you to build a RAG application for them. This application will have one page where users can upload their documents and another page where they can chat with their documents using RAG. Going with the most popular web stack currently in use, you decide on the MERN stack (MongoDB, Express.js, React, and Node.js) to build your application.
To build the RAG pipelines that will handle document parsing, chunking, embedding, retrieval, and more, you again decide to go with the most popular stack: LangChain deployed via FastAPI. The web app will make API calls to the endpoints defined in FastAPI. There will need to be at least two endpoints: one for calling the indexing pipeline and another for calling the query pipeline. In practice, you will also want upsert and delete endpoints, to ensure that the data in your database stays in sync with the embeddings in your vector store.
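To make the shape of those two pipelines concrete, here is a minimal sketch in plain Python. Everything here is a stand-in: the toy hash-based "embedding" and the in-memory list replace the real LangChain components and vector store, and the function names are illustrative. In the Level 1 architecture, each of these functions would sit behind its own FastAPI endpoint.

```python
import hashlib
import math

# Toy stand-in for a real embedding model: hash character bigrams
# into a small fixed-size vector, then normalize it.
def embed(text: str) -> list[float]:
    vec = [0.0] * 8
    for i in range(len(text) - 1):
        h = int(hashlib.md5(text[i:i + 2].encode()).hexdigest(), 16)
        vec[h % 8] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

# Toy stand-in for the vector store: a list of (chunk, embedding) pairs.
VECTOR_STORE: list[tuple[str, list[float]]] = []

def index_pipeline(document: str, chunk_size: int = 100) -> int:
    """Chunk, embed, and store a document. Returns the chunk count."""
    chunks = [document[i:i + chunk_size] for i in range(0, len(document), chunk_size)]
    for chunk in chunks:
        VECTOR_STORE.append((chunk, embed(chunk)))
    return len(chunks)

def query_pipeline(question: str, k: int = 2) -> list[str]:
    """Embed the question and return the k most similar chunks."""
    q = embed(question)
    scored = sorted(
        VECTOR_STORE,
        key=lambda item: -sum(a * b for a, b in zip(q, item[1])),
    )
    return [chunk for chunk, _ in scored[:k]]
```

The retrieved chunks would then be stuffed into a prompt for the model provider, a step omitted here just as it is omitted from the diagrams below.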
Note that you will be using JavaScript for the web application and Python for the AI integration. This dual-language app means you will likely be using a microservices architecture (see part 2 of this series for more on this). It isn't a strict requirement, but it is often encouraged in a setup like this.
There is one more choice to be made: which vector database will you use? The vector database is where you store the document chunks created by the indexing pipeline. Let's again go with the most popular choice out there: Pinecone. This is a managed cloud vector database that many AI developers are currently using.
The whole system might look something like the following:
Yikes! There are a lot of moving pieces here. Let's break things down:
- In the bottom rectangle, we have the web application and MongoDB backend. In the middle, we have the RAG pipelines built with LangChain and FastAPI. At the top, we have the Pinecone vector database. Each rectangle here represents a different service with its own separate deployment. While the Pinecone cloud vector database is managed for you, the rest is on you.
- I have wrapped example HTTP requests and their corresponding responses in a dotted border. Remember, this is a microservices architecture, so an HTTP request is needed any time inter-service communication occurs. For simplicity, I have only illustrated what the query pipeline calls would look like, and I have omitted any calls to OpenAI, Anthropic, etc. For clarity, I numbered the requests/responses in the order in which they would occur in a query scenario.
- To illustrate one pain point: ensuring the documents in your MongoDB database stay synced with their corresponding embeddings in the Pinecone index is doable, but it can be difficult. It takes several HTTP requests to get from your MongoDB database to the cloud vector database. This is a point of complexity and overhead for the developer.
A simple analogy: this is like trying to keep your physical bookshelf synced with a digital book catalog. Any time you get a new book or donate one from your shelf (turns out you only like the Game of Thrones show, not the books), you have to go and manually update the catalog to reflect the change. In the world of books, a small discrepancy won't really hurt you, but in the world of web applications it can be a big problem.
Level 2: Drop the Cloud
Can we make this architecture any simpler? Perhaps you recently read an article discussing how Postgres has an extension called pgvector. This means you can forgo Pinecone and just use Postgres as your vector database. Ideally, you can migrate your data over from MongoDB so that you are left with just one database. Great! You refactor your application to look like the following:

Now we only have two services to worry about: the web application + database, and the RAG pipelines. Once again, any calls to model providers have been omitted.
What have we gained with this simplification? Now your embeddings and the associated documents or chunks can live in the same table in the same database. For example, you can add an embeddings column to a table in PostgreSQL by running:
ALTER TABLE documents
ADD COLUMN embedding vector(1536);
Maintaining coherence between the documents and their embeddings should be much simpler now. Since both live in the same row, a single transaction can write a document and its embedding together, and Postgres triggers (on INSERT/UPDATE) can help keep them in sync, eliminating the cross-service "write document, then embed" dance entirely.
Returning to the bookshelf analogy, this is like ditching the digital catalog and instead attaching a label directly to every book. Now, when you move a book around or toss one out, there is no need to update a separate system, since the labels go wherever the books go.
Level 3: Microservices Begone!
You’ve done a good job simplifying things. However, you think you can do even better. Perhaps you can create a monolithic app instead of using the microservices architecture. A monolith simply means that your application and your RAG pipelines are developed and deployed together. An issue arises, however. You coded up the web app in JavaScript using the MERN stack, but the RAG pipelines were built using Python and LangChain deployed via FastAPI. Perhaps you could try to squeeze these into a single container, using something like Supervisor to oversee the Python and JavaScript processes, but that isn't a natural fit for polyglot stacks.
So what you decide to do is ditch React/Node and instead use Django, a Python web framework, to develop your app. Now your RAG pipeline code can simply live in a utility module inside your Django app. This means no more HTTP requests are being made, which removes complexity and latency. Any time you want to run your query or indexing pipelines, all you have to do is make a function call. Spinning up dev environments and deployments is now a breeze. Of course, if you read part 2, you know our preference is not an all-Python stack but an all-Ruby stack.
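The before/after difference is easy to see in code. With the microservices setup, invoking the query pipeline means serializing a request over HTTP; in the monolith, it is an ordinary function call. The names and URL below are illustrative, not from any real deployment:

```python
# Before (microservices): the web app calls the FastAPI service over HTTP,
# with something like:
#
#   response = requests.post(
#       "http://rag-service:8000/query",
#       json={"question": question},
#       timeout=30,
#   )
#   answer = response.json()["answer"]
#
# After (monolith): the pipeline lives in a utility module of the same app.

def query_pipeline(question: str) -> str:
    # Stand-in for the real retrieval + generation logic.
    return f"Answer to: {question}"

def chat_view(question: str) -> str:
    # A Django view calls the pipeline directly: no HTTP,
    # no serialization, no second deployment to keep alive.
    return query_pipeline(question)
```

Errors now surface as ordinary Python exceptions with full stack traces, instead of opaque 500 responses from a separate service.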
You have simplified even further, and now have the following architecture:

An important note: in earlier diagrams, I combined the web application and database into a single service for simplicity. At this point I think it's important to show that they are, in fact, separate services themselves! This does not mean you are still using a microservices architecture. As long as the two services are developed and deployed together, this is still a monolith.
Wow! Now you only have a single deployment to spin up and maintain. You have your database set up as an adjunct to your web application. This unfortunately means you will still likely want to use Docker Compose to develop and deploy your database and web application services together. But with the pipelines now running as plain functions instead of a separate service, you can ditch FastAPI! You no longer need to maintain those endpoints; just use function calls.
A bit of technical detail: in this chart, the legend indicates that the dotted line is not HTTP, but instead the Postgres frontend/backend protocol. These are two different protocols at the application layer of the Internet protocol suite. (This is a different "application layer" than the one I discussed in part 1.) Using an HTTP connection to transfer data between the application and the database is theoretically possible, but not optimal. Instead, the creators of Postgres built their own protocol that is lean and tightly coupled to the needs of the database.
Level 4: SQLite Enters the Chat
“Surely we’re done simplifying?”, you may be asking yourself.
Wrong!
There is one more simplification we can make. Instead of using Postgres, we can use SQLite! You see, currently your app and your database are two separate services deployed together. But what if they weren't two different services, and instead your database was just a file that lives inside your application? That is what SQLite gives you. With the recently released sqlite-vec library, it can even handle RAG, just like pgvector does for Postgres. The caveat here is that sqlite-vec is pre-v1, but that is still fine for an early-stage MVP.
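To make "the database is just a file" concrete, here is a sketch using only Python's built-in sqlite3 module. I store embeddings as BLOBs and do a brute-force similarity scan in Python; sqlite-vec replaces that scan with a virtual table you can query in SQL, but the overall shape is the same. All table and function names here are illustrative.

```python
import sqlite3
import struct

# The entire database is a single file next to your application code.
db = sqlite3.connect(":memory:")  # use e.g. "app.db" for a real file
db.execute("CREATE TABLE chunks (id INTEGER PRIMARY KEY, text TEXT, embedding BLOB)")

def pack(vec: list[float]) -> bytes:
    """Serialize a float vector into a BLOB."""
    return struct.pack(f"{len(vec)}f", *vec)

def unpack(blob: bytes) -> list[float]:
    """Deserialize a BLOB back into a float vector."""
    return list(struct.unpack(f"{len(blob) // 4}f", blob))

def add_chunk(text: str, vec: list[float]) -> None:
    db.execute("INSERT INTO chunks (text, embedding) VALUES (?, ?)", (text, pack(vec)))

def nearest(query_vec: list[float], k: int = 1) -> list[str]:
    # Brute-force dot-product scan; sqlite-vec would do this inside
    # SQLite itself via a vec0 virtual table and a MATCH query.
    rows = db.execute("SELECT text, embedding FROM chunks").fetchall()
    scored = sorted(
        rows,
        key=lambda row: -sum(a * b for a, b in zip(query_vec, unpack(row[1]))),
    )
    return [text for text, _ in scored[:k]]

add_chunk("cats purr", [1.0, 0.0])
add_chunk("dogs bark", [0.0, 1.0])
```

With this in place, `nearest([0.9, 0.1])` returns `["cats purr"]`: retrieval without any database server process at all.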

Truly amazing. Now you can ditch Docker Compose! This really is a single-service web application. The LangChain modules and your database are now all just functions and files living in your repository.
Concerned about the use of SQLite in a production web application? I wrote recently about how SQLite, once considered just a plaything in the world of web apps, can become production-ready with some tweaks to its configuration. In fact, Ruby on Rails 8 recently made these adjustments the default and is now pushing SQLite as the default database for new applications. Of course, as the app scales, you will likely need to migrate to Postgres or another database, but remember the mantra I mentioned at the beginning: only introduce complexity when absolutely necessary. Don't assume your app is going to explode with millions of concurrent writes when you are just trying to get your first few users.
Summary
In this article, we started with the standard stacks for building a web application with AI integration. We saw the amount of complexity involved, and decided to simplify piece by piece until we ended up with the Platonic ideal of a simple app.
But don't let the simplicity fool you; the app is still a beast. In fact, thanks to the simplicity, it can run much faster than the traditional app. If you notice the app starting to slow down, I would try sizing up the server before considering migrating to a new database or breaking up the monolith.
With such a lean application, you can truly move fast. Local development is a dream, and adding new capabilities can be done at lightning speed. You can still get backups of your SQLite database using something like Litestream. Once your app is showing real signs of strain, then move up the levels of complexity. But I advise against starting a new application at level 1.
I hope you have enjoyed this series on building web applications with AI integration. And I hope I have inspired you to think simple, not complicated!
🔥 If you’d like a custom web application with generative AI integration, visit losangelesaiapps.com