
    The Machine Learning “Advent Calendar” Day 4: k-Means in Excel

    By Editor Times Featured | December 4, 2025 | 7 Mins Read


    This is Day 4 of the Machine Learning Advent Calendar.

    During the first three days, we explored distance-based models for supervised learning.

    In all these models, the idea was the same: we measure distances, and we decide the output based on the nearest points or nearest centers.

    Today, we stay in this same family of ideas. But we use the distances in an unsupervised way: k-means.

    Now, one question for those who already know this algorithm: which model does k-means resemble more, the k-NN classifier or the Nearest Centroid classifier?

    And if you remember, for all the models we have seen so far, there was not really a “training” phase or hyperparameter tuning.

    • For k-NN, there is no training at all.
    • For LDA, QDA, or GNB, training is just computing means and variances. And there are also no real hyperparameters.

    Now, with k-means, we are going to implement a training algorithm that finally looks like “real” machine learning.

    We start with a tiny 1D example. Then we move to 2D.

    Goal of k-means

    In the training dataset, there are no initial labels.

    The goal of k-means is to create meaningful labels by grouping points that are close to each other.

    Let us look at the illustration below. You can clearly see two groups of points. Each centroid (the purple square and the green square) is in the middle of its cluster, and every point is assigned to the closest one.

    This gives a very intuitive picture of how k-means discovers structure using only distances.

    And here, k means the number of centers we try to find.

    k-means in Excel – image by author

    Now, let us answer the question: which algorithm is k-means closer to, the k-NN classifier or the Nearest Centroid classifier?

    Do not be fooled by the k in k-NN and k-means.
    They do not mean the same thing:

    • in k-NN, k is the number of neighbors, not the number of classes;
    • in k-means, k is the number of centroids.

    K-means is much closer to the Nearest Centroid classifier.

    Both models are represented by centroids, and for a new observation we simply compute the distance to each centroid to decide which one it belongs to.

    The difference, of course, is that in the Nearest Centroid classifier, we already know the centroids because they come from labeled classes.

    In k-means, we do not know the centroids. The whole goal of the algorithm is to discover suitable ones directly from the data.

    The business problem is completely different: instead of predicting labels, we are trying to create them.

    And in k-means, the value of k (the number of centroids) is unknown. So it becomes a hyperparameter that we can tune.

    k-means with Only One Feature

    We start with a tiny 1D example so that everything is visible on one axis. And we will choose the values in such a trivial way that we can instantly see the two centroids.

    1, 2, 3, 11, 12, 13

    Yes, 2 and 12.

    But how would the computer know? The machine will “learn” by guessing step by step.

    Here comes the algorithm called Lloyd’s algorithm.

    We will implement it in Excel with the following loop:

    1. choose initial centroids
    2. compute the distance from each point to each centroid
    3. assign each point to the nearest centroid
    4. recompute the centroids as the average of the points in each cluster
    5. repeat steps 2 to 4 until the centroids no longer move
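
The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not the Excel workbook itself: the function name `lloyd_1d`, the `max_iter` safeguard, the rule for empty clusters, and the starting centroids 1 and 2 are all assumptions made for this example.

```python
def lloyd_1d(points, centroids, max_iter=100):
    """Run Lloyd's algorithm on 1D data; returns (centroids, labels)."""
    centroids = list(centroids)
    labels = []
    for _ in range(max_iter):
        # Steps 2 and 3: assign each point to its nearest centroid
        # (ties go to the centroid with the lower index).
        labels = [
            min(range(len(centroids)), key=lambda j: abs(x - centroids[j]))
            for x in points
        ]
        # Step 4: recompute each centroid as the mean of its points.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [x for x, lab in zip(points, labels) if lab == j]
            # Assumption: an empty cluster keeps its previous centroid.
            new_centroids.append(sum(members) / len(members) if members else old)
        # Step 5: stop once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


points = [1, 2, 3, 11, 12, 13]
# Hypothetical starting centroids, chosen inside the data range.
final_centroids, final_labels = lloyd_1d(points, [1.0, 2.0])
print(final_centroids, final_labels)
```

With this hypothetical start, the centroids move from (1, 2) to (1, 8.2) and then settle at (2, 12), which matches the two centers we could see by eye.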

    1. Choose initial centroids

    Pick two initial centers, for example:

    They should be within the data range (between 1 and 13).

    k-means in Excel – image by author

    2. Compute distances

    For each data point x:

    • compute the distance to c_1,
    • compute the distance to c_2.

    Typically, we use absolute distance in 1D.

    We now have two distance values for every level.

    k-means in Excel – image by author

    3. Assign clusters

    For each point:

    • compare the two distances,
    • assign the cluster of the smallest one (1 or 2).

    In Excel, this is simple IF- or MIN-based logic.
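
Steps 2 and 3 can be mirrored outside Excel as two distance columns and one cluster column. The sketch below assumes hypothetical starting centroids c_1 = 1 and c_2 = 2 (not the values used in the article's sheet) and sends ties to cluster 1, which is a choice made here for simplicity.

```python
points = [1, 2, 3, 11, 12, 13]
c_1, c_2 = 1.0, 2.0  # hypothetical initial centroids, not the article's values

rows = []
for x in points:
    d1 = abs(x - c_1)  # absolute distance to c_1
    d2 = abs(x - c_2)  # absolute distance to c_2
    cluster = 1 if d1 <= d2 else 2  # same idea as an IF on the two distances
    rows.append((x, d1, d2, cluster))

# One row per point: value, distance to c_1, distance to c_2, cluster.
for row in rows:
    print(row)
```

Each tuple plays the role of one spreadsheet row, so the result can be compared line by line with the distance and cluster columns in the sheet.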

    k-means in Excel – image by author

    4. Compute the new centroids

    For each cluster:

    • take the points assigned to that cluster,
    • compute their average,
    • this average becomes the new centroid.

    k-means in Excel – image by author
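
The update step is a conditional average: only the points carrying a given cluster label contribute. Here is a small Python sketch of that idea. The helper name `cluster_mean` and the example assignment are assumptions for illustration; in a spreadsheet, a conditional-average function such as AVERAGEIF would be one way to get the same result, though the article does not say which formula its sheet uses.

```python
points = [1, 2, 3, 11, 12, 13]
clusters = [1, 2, 2, 2, 2, 2]  # hypothetical assignment from a previous step


def cluster_mean(points, clusters, label):
    """Average of the points whose cluster equals `label`."""
    members = [x for x, c in zip(points, clusters) if c == label]
    if not members:
        raise ValueError(f"cluster {label} is empty")
    return sum(members) / len(members)


new_c_1 = cluster_mean(points, clusters, 1)
new_c_2 = cluster_mean(points, clusters, 2)
print(new_c_1, new_c_2)
```

With this assignment, the first centroid stays on its single point while the second moves toward the middle of the five points assigned to it.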

    5. Iterate till reaching convergence

    Now in Excel, thanks to the formulas, we can simply paste the new centroid values into the cells of the initial centroids.

    The update is immediate, and after doing this a few times, you will see that the values stop changing. That is when the algorithm has converged.

    k-means in Excel – image by author

    We can also record each step in Excel, so we can see how the centroids and clusters evolve over time.

    k-means in Excel – image by author

    k-means with Two Features

    Now let us use two features. The process is exactly the same; we simply use the Euclidean distance in 2D.
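
To make the 2D case concrete, here is a hedged Python sketch of one assignment and update pass with Euclidean distance. The six points and the two starting centroids are invented for this example, since the article's 2D data is not reproduced in the text. `math.dist` is the standard-library Euclidean distance (Python 3.8 and later).

```python
import math

# Hypothetical 2D points (x1, x2) and starting centroids.
points = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids = [(2.0, 2.0), (7.0, 7.0)]


def nearest_centroid(point, centroids):
    """Index of the centroid with the smallest Euclidean distance to `point`."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))


labels = [nearest_centroid(p, centroids) for p in points]

# Update step: average each coordinate separately within a cluster.
# (Both clusters are non-empty for this data, so no division by zero.)
new_centroids = []
for j in range(len(centroids)):
    members = [p for p, lab in zip(points, labels) if lab == j]
    new_centroids.append(
        (sum(p[0] for p in members) / len(members),
         sum(p[1] for p in members) / len(members))
    )
print(labels, new_centroids)
```

The only change from the 1D version is the distance function and the fact that each centroid now has two coordinates to average.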

    You can either copy and paste the new centroids as values (with just a few cells to update),

    k-means in Excel – image by author

    or you can display all the intermediate steps to see the full evolution of the algorithm.

    k-means in Excel – image by author

    Visualizing the Moving Centroids in Excel

    To make the process more intuitive, it is helpful to create plots that show how the centroids move.

    Unfortunately, Excel and Google Sheets are not ideal for this kind of visualization, and the data tables quickly become a bit complex to organize.

    If you want to see a full example with detailed plots, you can read this article I wrote almost three years ago, where each step of the centroid movement is shown clearly.

    k-means in Excel – image by author

    As you can see in this picture, the worksheet became quite unorganized, especially compared to the earlier table, which was very simple.

    k-means in Excel – image by author

    Choosing the optimal k: The Elbow Method

    So now, it is possible to try k = 2 and k = 3 in our case, and compute the inertia for each. Then we simply compare the values.

    We can even begin with k = 1.

    For each value of k:

    • we run k-means until convergence,
    • compute the inertia, which is the sum of squared distances between each point and its assigned centroid.

    In Excel:

    • For each point, take the distance to its centroid and square it.
    • Sum all these squared distances.
    • This gives the inertia for this k.
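
The inertia computation is short enough to check by hand. The sketch below uses the 1D data from earlier rather than the 2D sheet (an assumption made to keep the numbers easy to follow). It relies on the fact that, once k-means has converged, each point's assigned centroid is also its nearest one, so taking the minimum squared distance gives the same result.

```python
points = [1, 2, 3, 11, 12, 13]


def inertia(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min((x - c) ** 2 for c in centroids) for x in points)


overall_mean = sum(points) / len(points)  # the single centroid for k = 1
inertia_k1 = inertia(points, [overall_mean])
inertia_k2 = inertia(points, [2.0, 12.0])  # converged centroids of the 1D example
print(inertia_k1, inertia_k2)
```

For k = 1 the centroid is 7 and the squared distances are 36, 25, 16, 16, 25, 36. For k = 2 every point is at most 1 away from its centroid, so the sum collapses.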

    For instance:

    • for k = 1, the centroid is simply the overall mean of x1 and x2,
    • for k = 2 and k = 3, we take the converged centroids from the sheets where you ran the algorithm.

    Then we can plot inertia as a function of k, for example for k = 1, 2, 3.

    For this dataset:

    • from 1 to 2, the inertia drops a lot,
    • from 2 to 3, the improvement is much smaller.

    The “elbow” is the value of k after which the decrease in inertia becomes marginal. In the example, it suggests that k = 2 is sufficient.
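
The same comparison can be sketched numerically. The example below again uses the small 1D dataset as a stand-in for the 2D sheet, which is an assumption. The k = 3 centroids are one possible converged solution (reached from the hypothetical start 1, 11, 13); a different start could converge elsewhere and give a slightly different inertia.

```python
points = [1, 2, 3, 11, 12, 13]

# Converged centroids per k for the 1D data (k = 3 is one possible solution).
centroids_by_k = {
    1: [7.0],
    2: [2.0, 12.0],
    3: [2.0, 11.5, 13.0],
}


def inertia(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    return sum(min((x - c) ** 2 for c in centroids) for x in points)


inertias = {k: inertia(points, cs) for k, cs in centroids_by_k.items()}

# Drop in inertia when going from k - 1 to k.
drops = {
    k: inertias[k - 1] - inertias[k]
    for k in sorted(inertias)
    if k - 1 in inertias
}
print(inertias)
print(drops)
```

Here the drop from k = 1 to k = 2 is 150, while the drop from k = 2 to k = 3 is only 1.5, which is the kind of flattening the elbow method looks for.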

    k-means in Excel – image by author

    Conclusion

    K-means is a very intuitive algorithm once you see it step by step in Excel.

    We start with simple centroids, compute distances, assign points, update the centroids, and repeat. Now, we can see how “machines learn”, right?

    Well, this is only the beginning; we will see that different models “learn” in really different ways.

    And here is the transition for tomorrow’s article: the unsupervised version of the Nearest Centroid classifier is indeed k-means.

    So what would be the unsupervised version of LDA or QDA? We will answer this in the next article.

    k-means – image by author


