
    The Machine Learning “Advent Calendar” Day 23: CNN in Excel

    By Editor Times Featured | December 24, 2025 | 9 Mins Read


    Convolutional neural networks (CNNs) were first introduced for images, and for images they are often easy to understand.

    A filter slides over pixels and detects edges, shapes, or textures. An article I wrote earlier explains how CNNs work for images with Excel.

    For text, the idea is the same.

    Instead of pixels, we slide filters over words.

    Instead of visual patterns, we detect linguistic patterns.

    And many important patterns in text are very local. Let’s take these very simple examples:

    • “good” is positive
    • “bad” is negative
    • “not good” is negative
    • “not bad” is often positive

    In my previous article, we saw how to represent words as numbers using embeddings.

    We also saw a key limitation: when we used a global average, word order was completely ignored.

    From the model’s point of view, “not good” and “good not” looked exactly the same.

    So the next challenge is clear: we want the model to take word order into account.

    A 1D Convolutional Neural Network is a natural tool for this, because it scans a sentence with small sliding windows and reacts when it recognizes familiar local patterns.

    1. Understanding a 1D CNN for Text: Architecture and Depth

    1.1. Building a 1D CNN for text in Excel

    In this article, we build a 1D CNN architecture in Excel with the following components:

    • Embedding dictionary
      We use a 2-dimensional embedding, because one dimension is not enough for this task.
      One dimension encodes sentiment, and the second dimension encodes negation.
    • Conv1D layer
      This is the core component of a CNN architecture.
      It consists of filters that slide across the sentence with a window length of 2 words. We choose 2 words to keep things simple.
    • ReLU and global max pooling
      These steps keep only the strongest matches detected by the filters.
      We will also discuss the fact that ReLU is optional.
    • Logistic regression
      This is the final classification layer, which combines the detected patterns into a probability.

    Figure: 1D CNN in Excel (all images by author)

    This pipeline corresponds to a standard CNN text classifier.
    The only difference here is that we explicitly write and visualize the forward pass in Excel.

    1.2. What “deep learning” means in this architecture

    Before going further, let us take a step back.
    Yes, I know, I do this often, but having a global view of models really helps to understand them.

    The definition of deep learning is often blurred.
    For many people, deep learning simply means “many layers”.

    Here, I will take a slightly different point of view.

    What really characterizes deep learning is not the number of layers, but the depth of the transformation applied to the input data.

    With this definition:

    • Even a model with a single convolution layer can be considered deep learning,
    • because the input is transformed into a more structured and abstract representation.

    On the other hand, taking raw input data, applying one-hot encoding, and stacking many fully connected layers does not necessarily make a model deep in a meaningful sense.
    In theory, if we don’t have any transformation, one layer is enough.

    In CNNs, the presence of multiple layers has a very concrete motivation.

    Consider a sentence like:

    This movie is not very good

    With a single convolution layer and a small window, we can detect simple local patterns such as: “very + good”

    But we cannot yet detect higher-level patterns such as: “not + (very good)”

    This is why CNNs are often stacked:

    • the first layer detects simple local patterns,
    • the second layer combines them into more complex ones.

    In this article, we deliberately focus on one convolution layer.
    This makes every step visible and easy to understand in Excel, while keeping the logic identical to deeper CNN architectures.

    2. Turning words into embeddings

    Let us start with some simple words. We will try to detect negation, so we will use these expressions, along with other words (that we will not model):

    • “good”
    • “bad”
    • “not good”
    • “not bad”

    We keep the representation intentionally small so that every step is visible.

    We will only use a dictionary of three words: good, bad and not.

    All other words will have 0 as embeddings.

    2.1 Why one dimension is not enough

    In a previous article on sentiment detection, we used a single dimension.
    That worked for “good” versus “bad”.

    But now we want to handle negation.

    One dimension can only represent one concept well.
    So we need two dimensions:

    • senti: sentiment polarity
    • neg: negation marker

    2.2 The embedding dictionary

    Each word becomes a 2D vector:

    • good → (senti = +1, neg = 0)
    • bad → (senti = -1, neg = 0)
    • not → (senti = 0, neg = +1)
    • any other word → (0, 0)

    This is not how real embeddings look. Real embeddings are learned, high-dimensional, and not directly interpretable.

    But for understanding how Conv1D works, this toy embedding is perfect.

    In Excel, this is just a lookup table.
    In a real neural network, this embedding matrix would be trainable.
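The lookup table can be sketched in a few lines of Python. This is only an illustration of the toy dictionary above: the names `EMBEDDING`, `NEUTRAL` and `embed` are made up for this example, and splitting on whitespace is an assumed, very naive tokenizer.

```python
# Toy embedding dictionary from the article: word -> (senti, neg).
EMBEDDING = {
    "good": (1, 0),
    "bad": (-1, 0),
    "not": (0, 1),
}
NEUTRAL = (0, 0)  # every word outside the dictionary


def embed(sentence):
    """Look up each word; unknown words fall back to the neutral vector."""
    return [EMBEDDING.get(word, NEUTRAL) for word in sentence.lower().split()]


vectors = embed("it is not bad at all")
for word, vector in zip("it is not bad at all".split(), vectors):
    print(f"{word:>4} -> {vector}")
```

Only “not” and “bad” get non-zero vectors here; the four other words are all (0, 0).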

    3. Conv1D filters as sliding pattern detectors

    Now we arrive at the core idea of a 1D CNN.

    A Conv1D filter is nothing mysterious. It is just a small set of weights plus a bias that slides over the sentence.

    Because:

    • each word embedding has 2 values (senti, neg)
    • our window contains 2 words

    each filter has:

    • 4 weights (2 dimensions × 2 positions)
    • 1 bias

    That’s all.

    You can think of a filter as repeatedly asking the same question at every position:

    “Do these two neighboring words match a pattern I care about?”

    3.1 Sliding windows: how Conv1D sees a sentence

    Consider this sentence:

    it is not bad at all

    We choose a window size of 2 words.

    That means the model looks at every adjacent pair:

    • (it, is)
    • (is, not)
    • (not, bad)
    • (bad, at)
    • (at, all)

    Important point:
    The filters slide everywhere, even when both words are neutral (all zeros).
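As a small sketch of the windowing step, the pairs above can be generated like this. The helper name `sliding_windows` is hypothetical, and the sketch assumes a stride of one word and no padding, which is what the list of pairs suggests.

```python
def sliding_windows(words, size=2):
    """Return every run of `size` adjacent words, moving one word at a time."""
    return [tuple(words[i:i + size]) for i in range(len(words) - size + 1)]


words = "it is not bad at all".split()
pairs = sliding_windows(words)
for pair in pairs:
    print(pair)
```

A sentence of n words gives n - 1 windows of size 2, so these six words produce five pairs.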

    3.2 Four intuitive filters

    To make the behavior easy to understand, we use 4 filters.

    Filter 1 – “I see GOOD”

    This filter looks only at the sentiment of the current word.

    Plain-text equation for one window:

    z = senti(current_word)

    If the word is “good”, z = 1
    If the word is “bad”, z = -1
    If the word is neutral, z = 0

    After ReLU, negative values become 0. This step is optional, as section 4.3 discusses.

    Filter 2 – “I see BAD”

    This one is symmetric.

    z = -senti(current_word)

    So:

    • “bad” → z = 1
    • “good” → z = -1 → ReLU → 0

    Filter 3 – “I see NOT GOOD”

    This filter looks at two things at the same time:

    • neg(previous_word)
    • senti(current_word)

    Equation:

    z = neg(previous_word) + senti(current_word) – 1

    Why the “-1”?
    It acts like a threshold so that both conditions must be true.

    Results:

    • “not good” → 1 + 1 – 1 = 1 → activated
    • “is good” → 0 + 1 – 1 = 0 → not activated
    • “not bad” → 1 – 1 – 1 = -1 → ReLU → 0

    Filter 4 – “I see NOT BAD”

    Same idea, slightly different sign:

    z = neg(previous_word) + (-senti(current_word)) – 1

    Results:

    • “not bad” → 1 + 1 – 1 = 1
    • “not good” → 1 – 1 – 1 = -1 → 0

    This is a very important intuition:

    A CNN filter can behave like a local logical rule, learned from data.
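To connect these four equations with the “4 weights plus 1 bias” description, here is a minimal sketch that writes each filter as a weight vector. The weight ordering [prev_senti, prev_neg, cur_senti, cur_neg] and the names `FILTERS` and `conv_window` are assumptions made for this illustration; the numbers are read directly off the equations above.

```python
EMBEDDING = {"good": (1, 0), "bad": (-1, 0), "not": (0, 1)}

# name: (weights, bias), with weights ordered as
# [prev_senti, prev_neg, cur_senti, cur_neg]  (assumed layout)
FILTERS = {
    "good":     ([0, 0,  1, 0],  0),   # z = senti(cur)
    "bad":      ([0, 0, -1, 0],  0),   # z = -senti(cur)
    "not_good": ([0, 1,  1, 0], -1),   # z = neg(prev) + senti(cur) - 1
    "not_bad":  ([0, 1, -1, 0], -1),   # z = neg(prev) - senti(cur) - 1
}


def conv_window(prev_word, cur_word, weights, bias):
    """Weighted sum over the two word vectors of one window, plus the bias."""
    inputs = EMBEDDING.get(prev_word, (0, 0)) + EMBEDDING.get(cur_word, (0, 0))
    return sum(w * x for w, x in zip(weights, inputs)) + bias


for prev_word, cur_word in [("not", "good"), ("is", "good"), ("not", "bad")]:
    z = conv_window(prev_word, cur_word, *FILTERS["not_good"])
    print(f"{prev_word} {cur_word}: z = {z}")
```

The three printed values are 1, 0 and -1, the same results as listed for Filter 3.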

    3.3 Final result of sliding windows

    Sliding these 4 filters over every window of the sentence produces one z value per filter and per window.
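As a rough stand-in for the spreadsheet grid, the sketch below slides the four rules over “it is not bad at all” and prints one row per window. It assumes that “current word” means the second word of each pair and that no padding is used; the helper names `senti`, `neg` and `FILTERS` are only illustrative.

```python
EMBEDDING = {"good": (1, 0), "bad": (-1, 0), "not": (0, 1)}


def senti(word):
    return EMBEDDING.get(word, (0, 0))[0]


def neg(word):
    return EMBEDDING.get(word, (0, 0))[1]


# The four rules from section 3.2, written as functions of (previous, current).
FILTERS = {
    "good":     lambda prev, cur: senti(cur),
    "bad":      lambda prev, cur: -senti(cur),
    "not_good": lambda prev, cur: neg(prev) + senti(cur) - 1,
    "not_bad":  lambda prev, cur: neg(prev) - senti(cur) - 1,
}

words = "it is not bad at all".split()
rows = []
for prev, cur in zip(words, words[1:]):
    rows.append(((prev, cur), [rule(prev, cur) for rule in FILTERS.values()]))

print("window".ljust(14), *[name.rjust(9) for name in FILTERS])
for (prev, cur), zs in rows:
    print(f"({prev}, {cur})".ljust(14), *[str(z).rjust(9) for z in zs])
```

Only the (not, bad) window makes the “bad” and “not bad” rules positive. The two threshold rules sit at -1 in every neutral window, because of their bias.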

    4. ReLU and max pooling: from local to global

    4.1 ReLU

    After computing z for every window, we apply ReLU:

    ReLU(z) = max(0, z)

    Meaning:

    • negative evidence is ignored
    • positive evidence is kept

    Each filter becomes a presence detector.

    By the way, ReLU is an activation function in the neural network. So a neural network is not that difficult after all.

    4.2 Global max pooling

    Then comes global max pooling.

    For each filter, we keep only:

    max activation over all windows

    Interpretation:
    “I don’t care where the pattern appears, only whether it appears strongly somewhere.”

    At this point, the whole sentence is summarized by 4 numbers:

    • strongest “good” signal
    • strongest “bad” signal
    • strongest “not good” signal
    • strongest “not bad” signal
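Here is a minimal sketch of ReLU followed by global max pooling. The z values are hard-coded: they are what the four filter equations give for the five windows of “it is not bad at all”, taking the second word of each pair as the current word. The names `relu`, `global_max_pool` and `Z` are illustrative only.

```python
def relu(z):
    """ReLU(z) = max(0, z)."""
    return max(0, z)


def global_max_pool(activations):
    """Keep only the strongest activation, wherever it occurs."""
    return max(activations)


# One list per filter, one z value per window:
# (it, is), (is, not), (not, bad), (bad, at), (at, all)
Z = {
    "good":     [0, 0, -1, 0, 0],
    "bad":      [0, 0, 1, 0, 0],
    "not_good": [-1, -1, -1, -1, -1],
    "not_bad":  [-1, -1, 1, -1, -1],
}

features = {
    name: global_max_pool([relu(z) for z in zs]) for name, zs in Z.items()
}
print(features)
```

The five windows collapse into four numbers, and the position of the (not, bad) window no longer matters.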

    4.3 What happens if we remove ReLU?

    Without ReLU:

    • negative values stay negative
    • max pooling may select negative values

    This mixes two ideas:

    • absence of a pattern
    • opposite of a pattern

    The filter stops being a clean detector and becomes a signed score.

    The model might still work mathematically, but interpretation becomes harder.
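To make the difference concrete, this small sketch pools the same z values with and without ReLU. It uses the two-word input “not good”, which has a single window, so the z values below come straight from the four filter equations. The `pool` helper and its `use_relu` flag are hypothetical names for this illustration.

```python
def pool(zs, use_relu):
    """Global max pooling, optionally preceded by ReLU."""
    return max(max(0, z) if use_relu else z for z in zs)


# z for the single window (not, good), one entry per filter
Z = {"good": [1], "bad": [-1], "not_good": [1], "not_bad": [-1]}

with_relu = {name: pool(zs, use_relu=True) for name, zs in Z.items()}
without_relu = {name: pool(zs, use_relu=False) for name, zs in Z.items()}

print("with ReLU:   ", with_relu)
print("without ReLU:", without_relu)
```

With ReLU, the “bad” and “not bad” features read 0, meaning “pattern not found”. Without it they read -1, a signed value that the final layer has to interpret as evidence for the opposite pattern.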

    5. The final layer is logistic regression

    Now we combine these signals.

    We compute a score using a linear combination:

    score = 2 × F_good – 2 × F_bad – 3 × F_not_good + 3 × F_not_bad – bias

    Then we convert the score into a probability:

    probability = 1 / (1 + exp(-score))

    That is exactly logistic regression.

    So yes:

    • the CNN extracts features: this step can be considered as feature engineering, right?
    • logistic regression makes the final decision; it is a classic machine learning model we know well
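The final layer can be sketched as follows. Two things are assumptions of this sketch rather than values stated in the article: the bias is set to 0 because its value is not given, and the weight on F_not_bad is taken as +3, the sign under which “not bad” can outweigh “bad”. The names `WEIGHTS`, `BIAS` and `predict` are illustrative.

```python
import math

# Weights of the final linear layer (the +3 on "not_bad" is an assumption).
WEIGHTS = {"good": 2, "bad": -2, "not_good": -3, "not_bad": 3}
BIAS = 0  # assumed: the article does not give a value


def predict(features):
    """Linear score followed by the logistic (sigmoid) function."""
    score = sum(WEIGHTS[name] * features[name] for name in WEIGHTS) - BIAS
    return 1 / (1 + math.exp(-score))


# Pooled features for a sentence containing "good", "bad" and "not bad".
probability = predict({"good": 1, "bad": 1, "not_good": 0, "not_bad": 1})
print(f"probability of positive sentiment: {probability:.3f}")
```

Here the score is 2 – 2 – 0 + 3 = 3, so the probability is well above 0.5.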

    6. Full examples with sliding filters

    Example 1

    “it is bad, so it is not good at all”

    The sentence contains “bad”, “good”, and the pair “not good”.

    After max pooling:

    • F_good = 1 (because “good” exists)
    • F_bad = 1
    • F_not_good = 1
    • F_not_bad = 0

    Final score becomes strongly negative.
    Prediction: negative sentiment.

    Example 2

    “it is good. yes, not bad.”

    The sentence contains “good”, “bad”, and the pair “not bad”.

    After max pooling:

    • F_good = 1
    • F_bad = 1 (because the word “bad” appears)
    • F_not_good = 0
    • F_not_bad = 1

    The final linear layer learns that “not bad” should outweigh “bad”.

    Prediction: positive sentiment.

    This also shows something important: max pooling keeps all strong signals.
    The final layer decides how to combine them.

    Example 3: a limitation that explains why CNNs get deeper

    Try this sentence:

    “it is not very bad”

    With a window of size 2, the model sees:

    • (it, is)
    • (is, not)
    • (not, very)
    • (very, bad)

    It never sees (not, bad), so the “not bad” filter never fires.

    This explains why real models use:

    • larger windows
    • multiple convolution layers
    • or other architectures for longer dependencies
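The three examples can be replayed end to end with a compact sketch of the whole forward pass. Several choices here are assumptions for illustration: punctuation is dropped with a simple regular expression, the filter weights use the ordering [prev_senti, prev_neg, cur_senti, cur_neg], the output bias is 0, and the weight on the “not bad” feature is +3. The sketch also expects at least two words per sentence.

```python
import math
import re

EMBEDDING = {"good": (1, 0), "bad": (-1, 0), "not": (0, 1)}

# name: (weights over [prev_senti, prev_neg, cur_senti, cur_neg], bias)
FILTERS = {
    "good":     ([0, 0, 1, 0], 0),
    "bad":      ([0, 0, -1, 0], 0),
    "not_good": ([0, 1, 1, 0], -1),
    "not_bad":  ([0, 1, -1, 0], -1),
}
OUTPUT_WEIGHTS = {"good": 2, "bad": -2, "not_good": -3, "not_bad": 3}
OUTPUT_BIAS = 0  # assumed value


def forward(sentence):
    """Embedding -> Conv1D (window 2) -> ReLU -> global max pool -> logistic."""
    words = re.findall(r"[a-z]+", sentence.lower())
    vectors = [EMBEDDING.get(word, (0, 0)) for word in words]
    features = {}
    for name, (weights, bias) in FILTERS.items():
        zs = [
            sum(w * x for w, x in zip(weights, prev + cur)) + bias
            for prev, cur in zip(vectors, vectors[1:])
        ]
        features[name] = max(max(0, z) for z in zs)
    score = sum(OUTPUT_WEIGHTS[n] * f for n, f in features.items()) - OUTPUT_BIAS
    return features, 1 / (1 + math.exp(-score))


for text in [
    "it is bad, so it is not good at all",
    "it is good. yes, not bad.",
    "it is not very bad",
]:
    features, probability = forward(text)
    print(f"{text!r}: {features} -> p(positive) = {probability:.3f}")
```

Under these assumed weights, the first sentence scores -3 and the second +3. In the third, only the “bad” feature fires, so the score is -2 and the sentence is pushed toward the negative class: the window-size limitation in action.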

    Conclusion

    The strength of Excel is visibility.

    You can see:

    • the embedding dictionary
    • all filter weights and biases
    • every sliding window
    • every ReLU activation
    • the max pooling result
    • the logistic regression parameters

    Training is simply the process of adjusting these numbers.

    Once you see that, CNNs stop being mysterious.

    They become what they really are: structured, trainable pattern detectors that slide over data.


