
    The Machine Learning “Advent Calendar” Day 5: GMM in Excel

    By Editor Times Featured | December 5, 2025 | 7 Mins Read


    In the previous article, we explored distance-based clustering with k-Means.

    Going further: to improve how the distance is measured, we add variance, which gives us the Mahalanobis distance.

    So, if k-Means is the unsupervised version of the Nearest Centroid classifier, then the natural question is:

    What is the unsupervised version of QDA?

    This means that, like QDA, each cluster now has to be described not only by its mean, but also by its variance (and we also have to add covariance as soon as there are two or more features). But here everything is learned without labels.

    So you see the idea, right?

    And well, the name of this model is the Gaussian Mixture Model (GMM)…

    GMM and the names of these models…

    As is often the case, the names of the models come from historical reasons. They are not always designed to highlight the connections between models, especially when the models were not discovered together.

    Different researchers, different periods, different use cases… and we end up with names that sometimes hide the true structure behind the ideas.

    Here, the name “Gaussian Mixture Model” simply means that the data is represented as a mixture of several Gaussian distributions.

    If we followed the same naming logic as k-Means, it would have been clearer to call it something like k-Gaussian Mixture.

    Because, in practice, instead of only using the means, we add the variance. We could just use the Mahalanobis distance, or another weighted distance using both means and variances. But the Gaussian distribution gives us probabilities, which are easier to interpret.

    So we choose a number k of Gaussian components.

    And by the way, GMM is not the only one.

    In fact, the whole machine learning framework is actually much more recent than many of the models it contains. Most of these techniques were originally developed in statistics, signal processing, econometrics, or pattern recognition.

    Then, much later, the field we now call “machine learning” emerged and regrouped all these models under one umbrella. But the names did not change.

    So today we use a mixture of vocabularies coming from different eras, different communities, and different intentions.

    This is why the relationships between models are not always obvious when you look only at the names.

    If we had to rename everything in a modern, unified machine-learning style, the landscape would actually be much clearer:

    • GMM would become k-Gaussian Clustering
    • QDA would become Nearest Gaussian Classifier
    • LDA, well, Nearest Gaussian Classifier with the same variance across classes.

    And suddenly, all the links appear:

    • k-Means ↔ Nearest Centroid
    • GMM ↔ Nearest Gaussian (QDA)

    This is why GMM is so natural after k-Means. If k-Means groups points by their closest centroid, then GMM groups them by their closest Gaussian shape.

    Why this entire section to discuss the names?

    Well, the truth is that, since we already covered the k-means algorithm, and we already made the transition from the Nearest Centroid Classifier to QDA, we already know all about this model, and the training algorithm will not change…

    And what is the NAME of this training algorithm?

    Oh, Lloyd’s algorithm.

    Actually, before k-means was called so, it was simply known as Lloyd’s algorithm, proposed by Stuart Lloyd in 1957 (and formally published only in 1982). Only later did the name “k-means” become the standard one.

    And this algorithm manipulates only the means, so we need another name, right?

    You see where this is going: the Expectation-Maximization algorithm!

    EM is simply the general form of Lloyd’s idea. Lloyd updates the means, EM updates everything: means, variances, weights, and probabilities.
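
To make the comparison concrete, here is a minimal Python sketch of one Lloyd iteration in 1D. It is not taken from the article's spreadsheet: the function name `lloyd_step` and the starting means are illustrative assumptions. Notice that only the means are updated, and that every point is assigned to exactly one cluster.

```python
def lloyd_step(xs, means):
    """One Lloyd (k-means) iteration in 1D: hard-assign, then average."""
    clusters = [[] for _ in means]
    for x in xs:
        # Hard assignment: each point goes to its single nearest mean.
        nearest = min(range(len(means)), key=lambda j: abs(x - means[j]))
        clusters[nearest].append(x)
    # Update: each mean becomes the plain average of its points.
    # An empty cluster simply keeps its previous mean.
    return [sum(c) / len(c) if c else m for c, m in zip(clusters, means)]


data = [1, 2, 3, 11, 12, 13]               # dataset used in the 1D section
new_means = lloyd_step(data, [4.0, 10.0])  # hypothetical starting means
print(new_means)                           # [2.0, 12.0]
```

EM keeps the same two-phase rhythm, but replaces the hard assignment with probabilities and also updates variances and weights.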

    So, you already know everything about GMM!

    But since my article is called “GMM in Excel”, I can’t end my article here…

    GMM in 1 Dimension

    Let us start with this simple dataset, the same one we used for k-means: 1, 2, 3, 11, 12, 13

    Hmm, with this dataset the two Gaussians will end up with the same variance. So think about playing with other numbers in Excel!

    And we naturally want 2 clusters.

    Here are the different steps.

    Initialization

    We start with guesses for the means, variances, and weights.

    GMM in Excel – initialization step – image by author
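
As a rough Python counterpart to the initialization step, the sketch below builds one possible set of starting guesses. The rule used here (means spread over the data range, a shared variance, equal weights) and the name `init_params` are assumptions for illustration; the values in the screenshot may differ.

```python
import statistics

data = [1, 2, 3, 11, 12, 13]


def init_params(xs, k=2):
    """Hypothetical starting guesses for a 1D mixture with k components."""
    lo, hi = min(xs), max(xs)
    # Spread the initial means evenly inside the data range.
    means = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]
    # Start every component with the overall (population) variance.
    variances = [statistics.pvariance(xs)] * k
    # Equal mixing weights that sum to 1.
    weights = [1.0 / k] * k
    return means, variances, weights


means, variances, weights = init_params(data)
print(means, variances, weights)
```

Any reasonable guess works on this dataset, because the two groups are far apart.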

    Expectation step (E-step)

    For each point, we compute how likely it is to belong to each Gaussian.

    GMM in Excel – expectation step – image by author
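
The E-step can be sketched in a few lines of Python. This is a minimal illustration, assuming the starting values shown in the code (they are not read from the screenshot). In a spreadsheet the same density is available through `NORM.DIST(x, mean, standard_dev, FALSE)`; the function names `gaussian_pdf` and `e_step` are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def e_step(xs, means, variances, weights):
    """Return, for each point, its probability of belonging to each Gaussian."""
    resp = []
    for x in xs:
        # Weight times density for every component...
        weighted = [w * gaussian_pdf(x, m, v)
                    for m, v, w in zip(means, variances, weights)]
        # ...then normalize so the probabilities of one point sum to 1.
        total = sum(weighted)
        resp.append([p / total for p in weighted])
    return resp


data = [1, 2, 3, 11, 12, 13]
# Assumed starting guesses, for illustration only.
resp = e_step(data, means=[4.0, 10.0], variances=[4.0, 4.0], weights=[0.5, 0.5])
for x, row in zip(data, resp):
    print(x, [round(p, 4) for p in row])
```

Each row corresponds to one data point, and each column to one Gaussian, which mirrors a natural spreadsheet layout.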

    Maximization step (M-step)

    Using these probabilities, we update the means, variances, and weights.

    GMM in Excel – maximization step – image by author
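
Here is a matching sketch of the M-step: every update is a weighted average, with the E-step probabilities as weights. The function name `m_step` is an assumption, and the 0/1 probabilities in the example are chosen only to make the arithmetic easy to check by hand.

```python
def m_step(xs, resp, k):
    """Update means, variances and weights from the membership probabilities."""
    n = len(xs)
    means, variances, weights = [], [], []
    for j in range(k):
        # Effective number of points owned by component j.
        nj = sum(r[j] for r in resp)
        # Weighted mean, then weighted variance around that new mean.
        mean = sum(r[j] * x for r, x in zip(resp, xs)) / nj
        var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, xs)) / nj
        means.append(mean)
        variances.append(var)
        weights.append(nj / n)
    return means, variances, weights


data = [1, 2, 3, 11, 12, 13]
# Illustrative probabilities: the first three points fully in component 0,
# the last three fully in component 1.
resp = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3
means, variances, weights = m_step(data, resp, k=2)
print(means, variances, weights)
```

With these clean probabilities, the means are 2 and 12, both variances are 2/3, and both weights are 0.5.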

    Iteration

    We repeat the E-step and the M-step until the parameters stabilise.

    GMM in Excel – iterations – image by author
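
Putting the pieces together, a complete 1D EM loop might look like the sketch below. The stopping rule (largest parameter change below `tol`), the iteration cap, and the starting guesses are assumptions for illustration, not values taken from the spreadsheet.

```python
import math


def gaussian_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_1d(xs, means, variances, weights, max_iter=100, tol=1e-9):
    """Alternate E-step and M-step until means and variances stop moving."""
    k, n = len(means), len(xs)
    for iteration in range(1, max_iter + 1):
        # E-step: membership probabilities for every point.
        resp = []
        for x in xs:
            weighted = [w * gaussian_pdf(x, m, v)
                        for m, v, w in zip(means, variances, weights)]
            total = sum(weighted)
            resp.append([p / total for p in weighted])
        # M-step: weighted means, variances and mixing weights.
        new_means, new_vars, new_weights = [], [], []
        for j in range(k):
            nj = sum(r[j] for r in resp)
            mean = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, xs)) / nj
            new_means.append(mean)
            new_vars.append(var)
            new_weights.append(nj / n)
        # Largest change in any mean or variance during this iteration.
        shift = max(abs(a - b) for a, b in
                    zip(new_means + new_vars, means + variances))
        means, variances, weights = new_means, new_vars, new_weights
        if shift < tol:
            break
    return means, variances, weights, iteration


data = [1, 2, 3, 11, 12, 13]
# Assumed starting guesses, for illustration only.
means, variances, weights, n_iter = em_1d(
    data, means=[4.0, 10.0], variances=[4.0, 4.0], weights=[0.5, 0.5])
print(means, variances, weights, n_iter)
```

On this dataset the loop settles on means close to 2 and 12 after a handful of iterations. Note that this sketch has no safeguard against a variance collapsing to zero, which can happen on other data when a component ends up owning a single point.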

    Each step is very simple once the formulas are visible.
    You will see that EM is nothing more than updating averages, variances, and probabilities.

    We can also do some visualization to see how the Gaussian curves move during the iterations.

    At the beginning, the two Gaussian curves overlap heavily because the initial means and variances are just guesses.

    The curves slowly separate, adjust their widths, and finally settle exactly on the two groups of points.

    By plotting the Gaussian curves at each iteration, you can really watch the model learn:

    • the means slide toward the centers of the data
    • the variances shrink to match the spread of each group
    • the overlap disappears
    • the final shapes match the structure of the dataset

    This visual evolution is extremely helpful for intuition. Once you see the curves move, EM is no longer an abstract algorithm. It becomes a dynamic process you can follow step by step.

    GMM in Excel – image by author
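
One way to put a number on "the overlap disappears" is to measure the area shared by the two weighted curves. The sketch below is an illustration under assumptions: the "initial" parameters are a hypothetical poor first guess, the "fitted" parameters are the per-group mean and variance of {1, 2, 3} and {11, 12, 13}, and `overlap_area` is a made-up helper using a simple Riemann sum.

```python
import math


def gaussian_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def overlap_area(params, lo=-5.0, hi=20.0, step=0.01):
    """Approximate the area under min(curve_1, curve_2) on [lo, hi]."""
    (m1, v1, w1), (m2, v2, w2) = params
    n_steps = int(round((hi - lo) / step))
    area = 0.0
    for i in range(n_steps):
        x = lo + i * step
        area += min(w1 * gaussian_pdf(x, m1, v1),
                    w2 * gaussian_pdf(x, m2, v2)) * step
    return area


# Each tuple is (mean, variance, weight).
initial = [(6.0, 9.0, 0.5), (8.0, 9.0, 0.5)]      # hypothetical first guess
fitted = [(2.0, 2 / 3, 0.5), (12.0, 2 / 3, 0.5)]  # per-group mean and variance

print("overlap at start:", overlap_area(initial))
print("overlap at end:  ", overlap_area(fitted))
```

The same grid of x values can be used as the horizontal axis of a chart to draw both curves at each iteration.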

    GMM in 2 Dimensions

    The logic is exactly the same as in 1D. Nothing new conceptually. We simply extend the formulas…

    Instead of having one feature per point, we now have two.

    Each Gaussian must now learn:

    • a mean for x1
    • a mean for x2
    • a variance for x1
    • a variance for x2
    • AND a covariance term between the two features.
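
These five quantities can be computed for one Gaussian with the same weighted averages as in 1D. The sketch below is a minimal illustration: the three points, their membership probabilities, and the function name `m_step_2d` are all made up for the example.

```python
def m_step_2d(points, resp_j):
    """Weighted means, variances and covariance of one 2D Gaussian component.

    points : list of (x1, x2) pairs
    resp_j : probability that each point belongs to this component
    """
    nj = sum(resp_j)
    m1 = sum(r * p[0] for r, p in zip(resp_j, points)) / nj
    m2 = sum(r * p[1] for r, p in zip(resp_j, points)) / nj
    v1 = sum(r * (p[0] - m1) ** 2 for r, p in zip(resp_j, points)) / nj
    v2 = sum(r * (p[1] - m2) ** 2 for r, p in zip(resp_j, points)) / nj
    # The only new ingredient compared with 1D: the cross term.
    c12 = sum(r * (p[0] - m1) * (p[1] - m2) for r, p in zip(resp_j, points)) / nj
    return {"m1": m1, "m2": m2, "v1": v1, "v2": v2, "c12": c12}


# Hypothetical points that all belong fully to this component.
points = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0)]
params = m_step_2d(points, resp_j=[1.0, 1.0, 1.0])
print(params)
```

A positive `c12` means the cluster is tilted so that x1 and x2 grow together, which is what lets the Gaussian take an oblique elliptical shape.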

    Once you write the formulas in Excel, you will see that the process stays exactly the same:

    Well, the truth is that if you look at the screenshot, you might think: “Wow, the formula is so long!” And this is not even all of it.

    2D GMM in Excel – image by author

    But don’t be fooled. The formula is long only because we write out the 2-dimensional Gaussian density explicitly:

    • one part for the distance in x1
    • one part for the distance in x2
    • the covariance term
    • the normalization constant

    Nothing more.

    It is simply the density formula expanded cell by cell.
    Long to type, but perfectly understandable once you see the structure: a weighted distance, inside an exponential, divided by a normalization constant that involves the determinant of the covariance matrix.
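
Written as a small Python function, the expanded density has the same four ingredients. This is a sketch of the standard bivariate Gaussian density, not a transcription of the article's Excel formula; the name `gaussian_pdf_2d` and the argument order are assumptions.

```python
import math


def gaussian_pdf_2d(x1, x2, m1, m2, v1, v2, c12):
    """Bivariate Gaussian density written out term by term."""
    # Determinant of the 2x2 covariance matrix [[v1, c12], [c12, v2]].
    det = v1 * v2 - c12 ** 2
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    d1, d2 = x1 - m1, x2 - m2
    # Squared Mahalanobis distance: x1 part, covariance part, x2 part.
    maha = (v2 * d1 ** 2 - 2 * c12 * d1 * d2 + v1 * d2 ** 2) / det
    # Normalization constant: 2 * pi * sqrt(determinant).
    return math.exp(-0.5 * maha) / (2 * math.pi * math.sqrt(det))


# Density at the center of a standard, uncorrelated Gaussian.
peak = gaussian_pdf_2d(0.0, 0.0, m1=0.0, m2=0.0, v1=1.0, v2=1.0, c12=0.0)
print(peak)
```

When `c12` is zero, the expression reduces to the product of two 1D Gaussian densities, which is a convenient sanity check for a spreadsheet formula as well.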

    So yes, the formula looks big… but the idea behind it is extremely simple.

    Conclusion

    k-Means gives hard boundaries.

    GMM gives probabilities.
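
A short sketch makes the contrast visible on a single point. It assumes the fitted parameters of the 1D example (means 2 and 12, variances 2/3, equal weights) and a hypothetical new point x = 6.9 that lies almost halfway between the two groups.

```python
import math


def gaussian_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


means, variances, weights = [2.0, 12.0], [2 / 3, 2 / 3], [0.5, 0.5]
x = 6.9  # hypothetical new point, slightly closer to the first group

# k-Means style answer: a single label, the nearest mean.
hard_label = min(range(len(means)), key=lambda j: abs(x - means[j]))

# GMM style answer: one probability per Gaussian.
weighted = [w * gaussian_pdf(x, m, v)
            for m, v, w in zip(means, variances, weights)]
probs = [p / sum(weighted) for p in weighted]

print("hard label:", hard_label)
print("probabilities:", [round(p, 3) for p in probs])
```

The hard label is 0 with no nuance, while the mixture reports about 0.82 for the first Gaussian and about 0.18 for the second, so the uncertainty near the boundary stays visible.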

    Once the EM formulas are written in Excel, the model becomes simple to follow: the means move, the variances adjust, and the Gaussians naturally settle around the data.

    GMM is just the next logical step after k-Means, offering a more flexible way to represent clusters and their shapes.


