    Times Featured
    Why Every Analytics Engineer Needs to Understand Data Architecture

    By Editor Times Featured | February 19, 2026 | 11 Mins Read


    … but in reality, small nuances in how you design your data architecture can have costly implications. So in this article, I want to offer a crash course on the architectures that shape your daily decisions, from relational databases to event-driven systems.

    Data architecture is the blueprint that determines where your data lives, how it moves between systems, what happens to it during transformations, and who gets access to what. Think of it as the layout of a city: residential areas, commercial districts, industrial zones, and the transportation networks connecting everything together.

    Before we start, I want you to remember this: your architecture determines whether your organization is like a well-planned city with efficient highways and clear zoning, or like a city that grew without any planning and now has traffic jams everywhere.

    I’ve seen this firsthand. A company that had grown rapidly through acquisitions had inherited data systems from each company it bought, and nobody had ever taken the time to think about how it all fit together. Customer data lived in five different CRM systems, financial data was split between three different ERPs, and each system had its own definition of basic concepts like customer and revenue. Their “weekly” business review took two weeks to prepare. Six months later, after implementing a well-thought-out data architecture, they could generate the same review in under two hours.

    The difference wasn’t the latest technology or huge budgets. It was simply having a thoughtful approach to how data should be organized and managed.

    In this article, I’ll walk you through the core data architecture types, their strengths, their weaknesses, and where each one truly shines. Fasten your seatbelts!

    The data architecture evolution – image by author

    1. Relational Database: The Fine Old Wine

    Relational databases date all the way back to the 1970s, when Edgar F. Codd proposed the relational model. At its core, a relational database is a highly organized, digital filing cabinet. Each table is a drawer dedicated to one thing: customers, orders, products. Each row is a single record, each column a specific attribute.

    The relational part is where the power comes from. The database understands how tables are linked. It knows that Customer X in the Customers table is the same Customer X who placed an order in the Orders table. This structure is what allows us to ask complex questions using SQL.
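    To make the link between tables concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and sample rows are illustrative assumptions, not taken from any real system.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Hypothetical schema: each order points at exactly one customer.
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        amount      REAL NOT NULL
    );
""")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "Customer X"), (2, "Customer Y")])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(10, 1, 40.0), (11, 1, 60.0), (12, 2, 25.0)])

# The join is the "relational" part: orders are matched to customers by key.
total_per_customer = conn.execute("""
    SELECT c.name, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
print(total_per_customer)

# The database also refuses an order for a customer that does not exist.
try:
    conn.execute("INSERT INTO orders VALUES (13, 99, 5.0)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
```

    The query never states which order rows belong to Customer X; the shared key carries that relationship, which is the property the filing-cabinet analogy is pointing at.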

    When working with relational databases, you follow a strict rule called schema-on-write. Think of building a house: you need a detailed blueprint before you can start laying the foundation. You define every room, every window, and every doorway upfront. The data must fit this blueprint exactly when you save it. This upfront work ensures everything is consistent and the data is trustworthy.

    The opposite approach, called schema-on-read, is like dumping all your building materials into a big pile. There’s no blueprint to start. You only decide how to structure it when you need to build something. Flexible? Absolutely. But it puts the burden of making sense of the chaos on whoever analyzes the data later.
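    The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration under assumed names: the orders table, the raw_pile list, and the read_amounts helper are all hypothetical.

```python
import json
import sqlite3

# Schema-on-write: the blueprint exists before any row is stored.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        amount   REAL NOT NULL CHECK (amount >= 0)
    )
""")
conn.execute("INSERT INTO orders VALUES (1, 19.99)")      # fits the blueprint
try:
    conn.execute("INSERT INTO orders VALUES (2, NULL)")   # violates NOT NULL
    rejected_on_write = False
except sqlite3.IntegrityError:
    rejected_on_write = True

# Schema-on-read: anything is accepted at write time.
raw_pile = [
    '{"order_id": 1, "amount": 19.99}',
    '{"order_id": 2}',                      # amount is missing
    '{"order_id": 3, "amount": "7.50"}',    # amount was stored as text
]

def read_amounts(lines):
    """Apply structure only at read time; the reader handles every irregularity."""
    amounts = {}
    for line in lines:
        record = json.loads(line)
        value = record.get("amount")
        amounts[record["order_id"]] = float(value) if value is not None else None
    return amounts

amounts_on_read = read_amounts(raw_pile)
print(rejected_on_write, amounts_on_read)
```

    In the first half the database rejects the bad row at save time. In the second half nothing is rejected, so the missing value and the text-typed value become the reader's problem.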


    2. Relational Data Warehouse: The Analyst’s Playground

    Relational databases were (and still are) fantastic at running the daily operations of a business: processing orders, managing inventory, updating customer records. We call these operational (OLTP) systems, and they need to be lightning-fast.

    But this created a big challenge, which I like to call the “Don’t touch the live system!” problem. Business leaders needed to analyze data. But running complex analytical queries on the same live database that processes thousands of transactions a minute would grind everything to a halt. It’s like trying to do a deep inventory count in a busy supermarket during peak hours.

    The solution? Create a separate playground for analysts. The relational data warehouse was born: a dedicated database built specifically for analysis, where you centralize copies of data from various operational systems.
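    As a rough sketch of that idea, the snippet below copies rows from two stand-in operational databases into a separate warehouse database and runs the analytical query there. The systems (shop, crm), table names, and data are assumptions made up for illustration; a real pipeline would add scheduling, incremental loads, and error handling.

```python
import sqlite3

def make_db(ddl, insert_sql, rows):
    """Create a throwaway in-memory database with one populated table."""
    db = sqlite3.connect(":memory:")
    db.execute(ddl)
    db.executemany(insert_sql, rows)
    return db

# Two hypothetical operational (OLTP) systems, each owning its own data.
shop = make_db(
    "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL)",
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(10, 1, 40.0), (11, 2, 25.0), (12, 1, 60.0)],
)
crm = make_db(
    "CREATE TABLE customers (customer_id INTEGER, region TEXT)",
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "EMEA"), (2, "APAC")],
)

# The warehouse is a separate database that only receives copies.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE wh_orders (order_id INTEGER, customer_id INTEGER, amount REAL)")
warehouse.execute("CREATE TABLE wh_customers (customer_id INTEGER, region TEXT)")

# Extract from each source with a cheap read, then load into the warehouse.
warehouse.executemany(
    "INSERT INTO wh_orders VALUES (?, ?, ?)",
    shop.execute("SELECT order_id, customer_id, amount FROM orders").fetchall(),
)
warehouse.executemany(
    "INSERT INTO wh_customers VALUES (?, ?)",
    crm.execute("SELECT customer_id, region FROM customers").fetchall(),
)

# The heavy analytical query runs here, not on the live systems.
revenue_by_region = warehouse.execute("""
    SELECT c.region, SUM(o.amount)
    FROM wh_orders AS o
    JOIN wh_customers AS c USING (customer_id)
    GROUP BY c.region
    ORDER BY c.region
""").fetchall()
print(revenue_by_region)
```

    The operational databases only serve two simple reads. The join and aggregation, which combine data that never lived in the same system, happen entirely in the warehouse.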


    Two Schools of Thought: Inmon vs. Kimball

    There are two fundamental approaches to building a data warehouse. The top-down approach, introduced by Bill Inmon, “the father of the data warehouse”, starts with designing the overall, normalized data warehouse first, then creating department-specific data marts from it. It gives you consistent data representation and reduced duplication, but comes with high upfront costs and complexity.

    The bottom-up approach, championed by Ralph Kimball, flips this around. You start by building individual data marts for specific departments using denormalized fact and dimension tables. Over time, these connect via conformed dimensions to form a unified view. It’s faster to get started, more flexible, and cheaper, but risks inconsistencies and data silos if not managed carefully.
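    A small sketch may help picture fact tables, dimension tables, and a conformed dimension. It assumes two hypothetical data marts (sales and support) that share one customer dimension; all table and column names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conformed dimension: one definition of "customer" shared by both marts.
    CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT, segment TEXT);
    -- Sales mart: one row per sale.
    CREATE TABLE fact_sales (customer_key INTEGER, sale_date TEXT, amount REAL);
    -- Support mart: one row per ticket.
    CREATE TABLE fact_tickets (customer_key INTEGER, opened_date TEXT, minutes_to_resolve INTEGER);

    INSERT INTO dim_customer VALUES (1, 'Customer X', 'Enterprise'), (2, 'Customer Y', 'SMB');
    INSERT INTO fact_sales VALUES
        (1, '2026-01-05', 500.0), (2, '2026-01-06', 80.0), (1, '2026-01-09', 250.0);
    INSERT INTO fact_tickets VALUES
        (1, '2026-01-07', 30), (2, '2026-01-08', 90), (2, '2026-01-10', 60);
""")

# Both marts slice their facts by the same dimension attribute.
sales_by_segment = conn.execute("""
    SELECT d.segment, SUM(f.amount)
    FROM fact_sales AS f JOIN dim_customer AS d USING (customer_key)
    GROUP BY d.segment ORDER BY d.segment
""").fetchall()

tickets_by_segment = conn.execute("""
    SELECT d.segment, COUNT(*)
    FROM fact_tickets AS f JOIN dim_customer AS d USING (customer_key)
    GROUP BY d.segment ORDER BY d.segment
""").fetchall()

print(sales_by_segment, tickets_by_segment)
```

    Because both fact tables join to the same dim_customer, "Enterprise" means the same thing in the sales numbers and the support numbers. If each mart kept its own customer table, that agreement would be lost, which is the silo risk described above.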


    Neither approach is universally “better.” Top-down works well for large organizations needing enterprise-wide consistency. Bottom-up shines when you need quick wins and iterative delivery. Most real-world implementations end up being a pragmatic blend of both.

    3. Data Lake: The Promise and the Swamp

    Around 2010, a new concept emerged promising to solve all our problems (sounds familiar?): the data lake. The sales pitch was alluring: unlike a structured data warehouse, a data lake was essentially a huge, cheap storage space. You don’t need a plan, just dump everything in there: structured data, log files, PDFs, social media feeds, even images and videos. Store everything now, figure out what to do with it later.

    This is the schema-on-read approach in practice. And unlike relational data warehouses, which provide both storage and compute, a data lake is just storage, with no bundled compute engine. It relies on object storage, which doesn’t require data to be structured in tabular form.

    For a while, the hype was real. Then reality hit. Storing data in a lake was easy; getting it out in a useful way was extremely difficult. Business users were told: “Hey, all the data is in the lake! Just open a Jupyter notebook and use some Python to analyze it.” Most business users didn’t have advanced coding skills. The crystal-clear lake quickly became a murky, unusable data swamp.


    But the data lake didn’t disappear. The industry realized the initial vision was flawed, but the core technology remained extremely useful. Today, the data lake has found its true calling, not as a warehouse replacement, but as a staging and preparation area: the right place to land raw data before deciding what to clean, transform, and promote for reliable analysis.

    4. Data Lakehouse: The Best of Both Worlds

    When you combine a data warehouse and a data lake, what do you get? A data lakehouse. Databricks pioneered this term around 2020, and the concept has been gaining serious traction ever since.

    I can almost hear you asking: “Wait, didn’t you just say data lakes failed spectacularly trying to meet exactly these requirements? Why would this work now?”

    Fair question. There was a single change to the classic data lake approach, but it was big enough to shift the entire paradigm: adding a transactional storage layer on top of existing data lake storage. This layer, exemplified by Delta Lake, Apache Iceberg, and Apache Hudi, enables the data lake to work more like a traditional relational database management system, with ACID transactions, schema enforcement, and time travel.
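    To give a feel for what such a layer adds, here is a deliberately tiny, in-memory toy. It is not how Delta Lake, Iceberg, or Hudi are implemented and it uses none of their APIs; the ToyTable class and everything in it are assumptions for illustration. The general idea it mimics is immutable data files plus an append-only commit log that decides which files are visible.

```python
from dataclasses import dataclass, field

@dataclass
class ToyTable:
    """Toy stand-in for a lakehouse table: immutable files plus an append-only log."""
    schema: dict                                 # column name -> expected Python type
    files: dict = field(default_factory=dict)    # file name -> rows (never rewritten)
    log: list = field(default_factory=list)      # log[v] = file names visible at version v

    def append(self, rows):
        # Schema enforcement: validate every row before anything becomes visible.
        for row in rows:
            if set(row) != set(self.schema):
                raise ValueError(f"columns {sorted(row)} do not match the schema")
            for column, expected in self.schema.items():
                if not isinstance(row[column], expected):
                    raise ValueError(f"{column} must be {expected.__name__}")
        # Atomic commit: the new file only counts once one log entry lists it.
        name = f"part-{len(self.files):05d}"
        self.files[name] = list(rows)
        previous = self.log[-1] if self.log else []
        self.log.append(previous + [name])
        return len(self.log) - 1                 # the new version number

    def read(self, version=None):
        # Time travel: choose an older log entry to see the table as it was.
        if not self.log:
            return []
        visible = self.log[-1 if version is None else version]
        return [row for name in visible for row in self.files[name]]

table = ToyTable(schema={"order_id": int, "amount": float})
v0 = table.append([{"order_id": 1, "amount": 40.0}])
v1 = table.append([{"order_id": 2, "amount": 25.0}])
try:
    table.append([{"order_id": 3, "amount": "oops"}])   # wrong type, rejected
except ValueError:
    pass

print(len(table.read()), len(table.read(version=v0)))
```

    The rejected write leaves no trace, the latest read sees both committed batches, and reading at version v0 returns the table as it looked after the first commit.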

    The one change that shifted the entire paradigm – image by author

    The lakehouse promotes a compelling idea: remove the need for a separate relational data warehouse and use only a data lake for your entire architecture. All data formats (structured, semi-structured, and unstructured) are stored in the lake, and all analysis happens directly from it. The transactional layer is the missing ingredient that makes this feasible.

    5. Data Mesh: Decentralizing Data Ownership

    So data lakehouses solved the storage and analysis problem. Case closed, right? Not exactly. As companies grew, even a great centralized data platform created a new bottleneck.

    Think of your central data team as the kitchen of a very popular restaurant. Marketing, Sales, Finance, and Logistics all place complex “orders” (data requests). The kitchen staff, your data engineers, are skilled but swamped. They don’t have a deep, nuanced understanding of every “dish.” The marketing team asks for a customer segmentation, and the kitchen has to first ask: “What do you mean by an active customer?” The result? A long line of frustrated “customers” and a burned-out kitchen staff.

    Data mesh asks a radical question: what if, instead of one central kitchen, we gave each department its own specialized kitchen station? And what if we made the domain experts, the people who actually know their own data, responsible for preparing high-quality data products for everyone else?

    The four pillars of data mesh architecture – image by author

    Data mesh rests on four key principles: domain-oriented ownership (the people closest to the data own it), data as a product (treated with the same care as any customer-facing product), a self-serve data platform (a central team provides the infrastructure, domains build the products), and federated computational governance (global standards enforced by a council with domain representatives).

    A word of caution: data mesh isn’t a technology you buy and install. It’s a sociotechnical shift, as much about organizational structure and culture as it is about technology. It’s not for every company, especially smaller ones. Domain teams need real technical expertise and long-term commitment, and data sharing can easily become a political issue.

    6. Event-Driven Architecture: The Gossipy Neighbor

    Now let’s switch gears. Think of event-driven architecture as the gossipy neighbor approach to data: systems that react instantly to things happening, rather than constantly checking for updates. Instead of System B asking System A every five minutes “Hey, did anything happen yet?” (like checking your fridge hoping food has magically appeared), an event-driven system taps you on the shoulder the moment something important occurs.

    A customer places an order? That’s an event. The system that creates it is the producer. The systems that listen and react are consumers. And the intermediary where events get posted is the event broker: think Apache Kafka, Azure Event Hubs, or Eventstream in Microsoft Fabric.


    The beauty lies in the words loosely coupled. The Marketing team can spin up a new service that listens to “Customer Signed Up” events without requiring the Sales team to change a single line of code. If the welcome email service crashes, new customers still get signed up; the events just pile up in the broker, waiting for the service to recover.
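    Loose coupling is easier to see with a toy broker. The sketch below is an in-memory stand-in written for this illustration only; ToyBroker and its methods are hypothetical and are not the API of Kafka, Event Hubs, or Eventstream.

```python
from collections import defaultdict, deque

class ToyBroker:
    """In-memory stand-in for an event broker; each subscriber gets its own queue."""

    def __init__(self):
        self.queues = defaultdict(dict)   # topic -> {subscriber name: deque of events}

    def subscribe(self, topic, subscriber):
        self.queues[topic][subscriber] = deque()

    def publish(self, topic, event):
        # The producer never learns who is listening.
        for queue in self.queues[topic].values():
            queue.append(event)

    def drain(self, topic, subscriber):
        # Hand over everything that accumulated for this subscriber.
        queue = self.queues[topic][subscriber]
        events = list(queue)
        queue.clear()
        return events

broker = ToyBroker()
broker.subscribe("customer_signed_up", "welcome_email")
broker.subscribe("customer_signed_up", "marketing_stats")   # added without touching the producer

# The sign-up service (producer) keeps working while welcome_email is "down".
broker.publish("customer_signed_up", {"customer_id": 1})
broker.publish("customer_signed_up", {"customer_id": 2})

stats_seen = broker.drain("customer_signed_up", "marketing_stats")
# Later, the welcome email service recovers and finds its backlog waiting.
backlog = broker.drain("customer_signed_up", "welcome_email")
print(stats_seen, backlog)
```

    Adding the marketing_stats subscriber required no change to the publishing code, and the events for the unavailable welcome_email subscriber simply waited in its queue.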

    But this power comes with trade-offs. You now have a new piece of infrastructure to manage. Debugging gets harder because when something goes wrong, tracing a single event across multiple decoupled systems can be a serious challenge. And the broker doesn’t always guarantee the order of delivery: you might get an “Order Shipped” event before the “Order Paid” event.
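    One common way for a consumer to cope with out-of-order delivery is to buffer events until their predecessors arrive. The sketch below assumes each event carries a per-order sequence number assigned by the producer (the seq field is an assumption of this example, not something every broker provides).

```python
def apply_in_order(events):
    """Apply events per order in sequence, buffering any that arrive early."""
    next_seq = {}    # order_id -> next sequence number that may be applied
    pending = {}     # (order_id, seq) -> event waiting for its predecessors
    applied = []
    for event in events:
        key = event["order_id"]
        pending[(key, event["seq"])] = event
        expected = next_seq.get(key, 1)
        # Release every buffered event that is now next in line.
        while (key, expected) in pending:
            applied.append(pending.pop((key, expected))["type"])
            expected += 1
        next_seq[key] = expected
    return applied

# "Order Shipped" arrives before "Order Paid", as in the scenario above.
arrived = [
    {"order_id": 7, "seq": 1, "type": "Order Placed"},
    {"order_id": 7, "seq": 3, "type": "Order Shipped"},
    {"order_id": 7, "seq": 2, "type": "Order Paid"},
]
processed = apply_in_order(arrived)
print(processed)
```

    The early "Order Shipped" event is held back until "Order Paid" shows up. The cost is extra state in the consumer, plus a decision about how long to wait for an event that never arrives, which this sketch does not address.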

    When to use it: Real-time analytics (IoT, clickstream, fraud detection), microservices integration, and asynchronous workflows.

    When NOT to use it: Simple CRUD apps, tightly coupled workflows requiring immediate guaranteed responses, and strictly transactional systems where multi-step processes must succeed or fail atomically.

    The Cheat Sheet

    There’s no magic bullet: each architecture has its place. Here’s a quick comparison to help you decide:

    The Key Takeaway

    Understanding when to use what is the crucial skill for any analytics engineer. Every single day, you make decisions about how to structure data, where to store it, how to transform it, and how to make it accessible. These decisions might seem minor in the moment (Should I create this as a view or a table? Where should I put this transformation logic?), but they add up to create the foundation your entire analytics ecosystem sits on.

    The data architecture landscape has evolved from normalized relational databases, through the “don’t touch the live system!” era of data warehouses, past the spectacular rise and fall (and redemption) of data lakes, into the lakehouse paradigm that gives us the best of both worlds. Modern approaches like data mesh push ownership to the people closest to the data, and event-driven architectures let systems react instantly rather than constantly polling for updates.

    Thanks for reading!


