
    When Shapley Values Break: A Guide to Robust Model Explainability

    By Editor Times Featured | January 15, 2026 | 10 Mins Read


    Explainability in AI is essential for gaining trust in model predictions and is highly important for improving model robustness. Good explainability often acts as a debugging tool, revealing flaws in the model training process. While Shapley values have become the industry standard for this task, we must ask: do they always work? And critically, where do they fail?

    To understand where Shapley values fail, the best approach is to control the ground truth. We’ll start with a simple linear model and then systematically break the explanation. By observing how Shapley values react to these controlled changes, we can identify exactly where they yield misleading results and how to fix them.

    The Toy Model

    We’ll start with a linear model over 100 uniform random variables.

    import numpy as np
    from sklearn.linear_model import LinearRegression
    import shap

    def get_shapley_values_linear_independent_variables(
        weights: np.ndarray, data: np.ndarray
    ) -> np.ndarray:
        # Closed form for a linear model with independent features and a zero baseline
        return weights * data

    # Double-check the theoretical results with the shap package
    def get_shap(weights: np.ndarray, data: np.ndarray):
        model = LinearRegression()
        model.coef_ = weights  # Inject your weights
        model.intercept_ = 0
        background = np.zeros((1, weights.shape[0]))
        explainer = shap.LinearExplainer(model, background)  # Assumes independence between all features
        results = explainer.shap_values(data)
        return results

    DIM_SPACE = 100

    np.random.seed(42)
    # Generate random weights and data
    weights = np.random.rand(DIM_SPACE)
    data = np.random.rand(1, DIM_SPACE)

    # Set specific values to test our intuition
    # Feature 0: High weight (10), Feature 1: Zero weight
    weights[0] = 10
    weights[1] = 0
    # Set maximal value for the first two features
    data[0, 0:2] = 1

    shap_res = get_shapley_values_linear_independent_variables(weights, data)
    shap_res_package = get_shap(weights, data)
    idx_max = shap_res.argmax()
    idx_min = shap_res.argmin()

    print(
        f"Expected: idx_max 0, idx_min 1\nActual: idx_max {idx_max},  idx_min: {idx_min}"
    )

    print(abs(shap_res_package - shap_res).max())  # No difference

    In this simple example, where all variables are independent, the calculation simplifies dramatically.

    Recall that the Shapley formula is based on the marginal contribution of each feature: the difference in the model’s output when a variable is added to a coalition of known features versus when it’s absent.

    \[ V(S \cup \{i\}) - V(S) \]
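For reference, the classical Shapley value averages this marginal contribution over every coalition S that excludes feature i. The symbols \phi_i (the attribution of feature i) and N (the full feature set) are notation introduced here for illustration and do not appear in the original text:

```latex
\[
\phi_i \;=\; \sum_{S \subseteq N \setminus \{i\}}
\frac{|S|!\,\bigl(|N| - |S| - 1\bigr)!}{|N|!}\,
\bigl[\, V(S \cup \{i\}) - V(S) \,\bigr]
\]
```

The weights sum to 1 over all coalitions, so whenever the bracketed difference is the same for every S, the average collapses to that single difference.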

    Since the variables are independent, the specific combination of pre-selected features (S) doesn’t influence the contribution of feature i. The effects of the pre-selected and non-selected features cancel each other out in the subtraction, so they have no impact on the influence of feature i. Thus, the calculation reduces to measuring the marginal effect of feature i directly on the model output (relative to the all-zeros background used in the code above):

    \[ W_i \cdot X_i \]

    The result is intuitive and works as expected. Because there is no interference from other features, the contribution depends only on the feature’s weight and its current value. Consequently, the feature with the largest product of weight and value is the top contributor. In our case, that is feature index 0, which has a weight of 10 and a value of 1.
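The closed form can also be checked by brute force on a smaller instance. The sketch below enumerates every coalition for a hypothetical 5-feature linear model, assuming (as in the code above) that features outside the coalition are replaced by a zero baseline. The names `exact_shapley`, `linear_value`, `toy_weights` and `toy_x` are illustrative and are not part of the original article or the shap package.

```python
from itertools import combinations
from math import factorial

import numpy as np


def exact_shapley(value_fn, n_features: int) -> np.ndarray:
    """Brute-force Shapley values by enumerating every coalition."""
    phi = np.zeros(n_features)
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(n_features):
            # Classical Shapley weight for coalitions of this size
            weight = (
                factorial(size)
                * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


rng = np.random.default_rng(0)
toy_weights = rng.random(5)
toy_x = rng.random(5)
toy_weights[0], toy_x[0] = 10.0, 1.0  # mirror the article: feature 0 dominates


def linear_value(coalition: frozenset) -> float:
    # Assumption: features outside the coalition take the zero baseline
    masked = np.zeros_like(toy_x)
    idx = np.array(sorted(coalition), dtype=int)
    masked[idx] = toy_x[idx]
    return float(toy_weights @ masked)


phi = exact_shapley(linear_value, 5)
print(np.allclose(phi, toy_weights * toy_x))  # True
```

Enumeration costs on the order of n * 2^(n-1) model evaluations, which is why this check is only practical for a handful of features.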

    Let’s Break Things

    Now we will introduce dependencies to see where Shapley values start to fail.

    In this scenario, we will artificially induce perfect correlation by duplicating the most influential feature (index 0) 100 times. This results in a new model with 200 features, where 100 features are identical copies of our original top contributor and are independent of the other 99 features. To complete the setup, we assign a zero weight to all of the added duplicate features. This ensures the model’s predictions remain unchanged: we are only altering the structure of the input data, not the output. While this setup seems extreme, it mirrors a common real-world scenario: taking a known important signal and creating multiple derived features (such as rolling averages, lags, or mathematical transformations) to better capture its information.

    However, because the original Feature 0 and its new copies are perfectly dependent, the Shapley calculation changes.

    This follows from the Symmetry Axiom: if two features contribute equally to the model (in this case, by carrying the same information), they must receive equal credit.

    Intuitively, knowing the value of any one clone reveals the full information of the group. As a result, the large contribution we previously saw for the single feature is now split equally across it and its 100 clones, so each of the 101 features receives 1/101 of the original credit. The “signal” gets diluted, making the primary driver of the model appear much less important than it actually is.
    Here is the corresponding code:

    import numpy as np
    from sklearn.linear_model import LinearRegression
    import shap

    def get_shapley_values_linear_correlated(
        weights: np.ndarray, data: np.ndarray
    ) -> np.ndarray:
        res = weights * data
        # Feature 0 plus its copies, which occupy the last DUPLICATE_FACTOR columns
        duplicated_indices = np.array(
            [0] + list(range(data.shape[1] - DUPLICATE_FACTOR, data.shape[1]))
        )
        # We sum these contributions and split the total equally among them
        full_contrib = np.sum(res[:, duplicated_indices], axis=1)
        duplicate_feature_factor = np.ones(data.shape[1])
        duplicate_feature_factor[duplicated_indices] = 1 / (DUPLICATE_FACTOR + 1)
        full_contrib = np.tile(full_contrib, (DUPLICATE_FACTOR + 1, 1)).T
        res[:, duplicated_indices] = full_contrib
        res *= duplicate_feature_factor
        return res

    # Optional cross-check with the shap package (not called below)
    def get_shap(weights: np.ndarray, data: np.ndarray):
        model = LinearRegression()
        model.coef_ = weights  # Inject your weights
        model.intercept_ = 0
        explainer = shap.LinearExplainer(model, data, feature_perturbation="correlation_dependent")
        results = explainer.shap_values(data)
        return results

    DIM_SPACE = 100
    DUPLICATE_FACTOR = 100

    np.random.seed(42)
    weights = np.random.rand(DIM_SPACE)
    weights[0] = 10
    weights[1] = 0
    data = np.random.rand(10000, DIM_SPACE)
    data[0, 0:2] = 1

    # Duplicate feature 0, 100 times:
    dup_data = np.tile(data[:, 0], (DUPLICATE_FACTOR, 1)).T
    data = np.concatenate((data, dup_data), axis=1)
    # We put zero weight on all of the added features:
    weights = np.concatenate((weights, np.zeros(DUPLICATE_FACTOR)))


    shap_res = get_shapley_values_linear_correlated(weights, data)

    shap_res = shap_res[0, :]  # Take the first record to test the results
    idx_max = shap_res.argmax()
    idx_min = shap_res.argmin()

    print(f"Expected: idx_max 0, idx_min 1\nActual: idx_max {idx_max},  idx_min: {idx_min}")

    This is clearly not what we intended, and it fails to explain the model’s behavior. Ideally, we want the explanation to reflect the ground truth: Feature 0 is the primary driver (with a weight of 10), while the duplicated features (zero-based indices 100–199) are merely redundant copies with zero weight. Instead of diluting the signal across all copies, we would clearly prefer an attribution that highlights the true source of the signal.

    Note: If you run this using the Python shap package, you might notice the results are similar but not identical to our manual calculation. This is because computing exact Shapley values is computationally infeasible for large feature sets, so libraries like shap rely on approximation methods, which introduce slight variance.

    Image by author (generated with Google Gemini).
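The dilution can also be reproduced exactly on a small instance by enumerating coalitions. This is a minimal sketch under two stated assumptions: unknown features contribute zero (the same zero baseline as the manual calculation), and knowing feature 0 or any one of its clones reveals the full signal. The sizes, weights and function names are hypothetical and chosen only to keep the enumeration cheap.

```python
from itertools import combinations
from math import factorial

import numpy as np

N_BASE, N_CLONES = 3, 3  # hypothetical: 3 original features, 3 copies of feature 0
base_w = np.array([10.0, 0.0, 0.5])
base_x = np.array([1.0, 1.0, 0.8])
n_total = N_BASE + N_CLONES
clone_group = frozenset({0} | set(range(N_BASE, n_total)))  # feature 0 and its copies


def value(coalition: frozenset) -> float:
    # Independent features (indices 1 and 2) contribute only when present
    total = sum(base_w[j] * base_x[j] for j in coalition if 0 < j < N_BASE)
    # Assumption: any member of the clone group reveals the whole signal
    if coalition & clone_group:
        total += base_w[0] * base_x[0]
    return float(total)


def exact_shapley(value_fn, n_features: int) -> np.ndarray:
    """Brute-force Shapley values by enumerating every coalition."""
    phi = np.zeros(n_features)
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(n_features):
            weight = (
                factorial(size)
                * factorial(n_features - size - 1)
                / factorial(n_features)
            )
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


phi = exact_shapley(value, n_total)
print(np.round(phi, 3))
```

With one original and three clones, the contribution of 10 is split four ways, so feature 0 and each copy receive 2.5, while the independent features keep their own weight times value (0 and 0.4). The attributions still sum to the full model output of 10.4, but the top driver no longer stands out by the margin its weight suggests.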

    Can We Fix This?

    Since correlation and dependencies between features are extremely common, we cannot ignore this issue.

    On the one hand, Shapley values do account for these dependencies. A feature with a coefficient of 0 in a linear model, and therefore no direct effect on the output, receives a non-zero contribution because it contains information shared with other features. However, this behavior, driven by the Symmetry Axiom, is not always what we want for practical explainability. While “fairly” splitting the credit among correlated features is mathematically sound, it often hides the true drivers of the model.

    Several methods can address this, and we will explore them.

    Grouping Features

    This approach is particularly important for models with high-dimensional feature spaces, where feature correlation is inevitable. In these settings, attempting to attribute a specific contribution to every single variable is often noisy and computationally unstable. Instead, we can aggregate similar features that represent the same concept into a single group. A helpful analogy comes from image classification: if we want to explain why a model predicts “cat” instead of “dog”, inspecting individual pixels is not meaningful. However, if we group pixels into “patches” (e.g., ears, tail), the explanation becomes immediately interpretable. By applying the same logic to tabular data, we can calculate the contribution of the group rather than splitting it arbitrarily among its components.

    This can be achieved in two ways: by simply summing the Shapley values within each group, or by directly calculating the group’s contribution. In the direct method, we treat the group as a single entity. Instead of toggling individual features, we treat the presence or absence of the group as the simultaneous presence or absence of all features within it. This reduces the dimensionality of the problem, making the estimation faster, more accurate, and more stable.

    Image by author (generated with Google Gemini).
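Both grouping options can be sketched on a small toy game with one signal feature and three clones. The value function, group names and helper functions below are hypothetical illustrations under a zero-baseline assumption. They are not the medpython or shap API.

```python
from itertools import combinations
from math import factorial

import numpy as np


def exact_shapley(value_fn, n_players: int) -> np.ndarray:
    """Brute-force Shapley values over an arbitrary set of players."""
    phi = np.zeros(n_players)
    for i in range(n_players):
        others = [j for j in range(n_players) if j != i]
        for size in range(n_players):
            weight = (
                factorial(size)
                * factorial(n_players - size - 1)
                / factorial(n_players)
            )
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


# Hypothetical toy setup: feature 0 carries the signal, features 3-5 are its clones
w = np.array([10.0, 0.0, 0.5])
x = np.array([1.0, 1.0, 0.8])
clones = frozenset({0, 3, 4, 5})
groups = {"signal": [0, 3, 4, 5], "feature_1": [1], "feature_2": [2]}
names = list(groups)


def feature_value(coalition: frozenset) -> float:
    total = sum(w[j] * x[j] for j in coalition if j in (1, 2))
    if coalition & clones:  # any clone reveals the full signal
        total += w[0] * x[0]
    return float(total)


# Option 1: sum the per-feature Shapley values inside each group
per_feature = exact_shapley(feature_value, 6)
summed = {name: float(per_feature[idx].sum()) for name, idx in groups.items()}


# Option 2: treat each group as one player that is fully present or fully absent
def group_value(group_coalition: frozenset) -> float:
    features = frozenset(f for g in group_coalition for f in groups[names[g]])
    return feature_value(features)


direct = {
    name: float(v) for name, v in zip(names, exact_shapley(group_value, len(names)))
}

print(summed)
print(direct)
```

In this toy game the two options agree and the "signal" group recovers the full contribution of 10. That agreement is a property of this particular setup, not a general guarantee. The direct option enumerates coalitions of 3 groups instead of 6 features, which illustrates the dimensionality reduction described above.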

    The Winner Takes It All

    While grouping is effective, it has limitations. It requires defining the groups beforehand, and it often ignores correlations between those groups.

    This leads to “explanation redundancy”. Returning to our example, if the 101 cloned features are not pre-grouped, the output will list all 101 of them, each with the same contribution. This is overwhelming, repetitive, and functionally useless. Effective explainability should reduce redundancy and show the user something new each time.

    To achieve this, we can create a greedy iterative process. Instead of calculating all values at once, we select features step by step:

    1. Select the “Winner”: Identify the single feature (or group) with the highest individual contribution.
    2. Condition the Next Step: Re-evaluate the remaining features, assuming the features from the previous steps are already known. We incorporate them into the subset of pre-selected features S in the Shapley value each time.
    3. Repeat: Ask the model: “Given that the user already knows about Features A, B, and C, which remaining feature contributes the most information?”

    By recalculating Shapley values (or marginal contributions) conditioned on the pre-selected features, we ensure that redundant features effectively drop to zero. If Feature A and Feature B are identical and Feature A is selected first, Feature B no longer provides new information. It is automatically filtered out, leaving a clean, concise list of distinct drivers.

    Image by author (generated with Google Gemini).
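The greedy loop above can be sketched with plain marginal contributions, the simpler of the two variants mentioned. This is an illustrative sketch, not the medpython implementation: the toy value function, the zero-baseline assumption, the `tol` stopping threshold and the tie-breaking by lowest index are all assumptions made here.

```python
import numpy as np

# Hypothetical toy setup: 4 original features plus 3 clones of feature 0
N_BASE, N_CLONES = 4, 3
w = np.array([10.0, 0.0, 0.5, 0.9])
x = np.array([1.0, 1.0, 0.8, 1.0])
n_total = N_BASE + N_CLONES
clones = frozenset({0} | set(range(N_BASE, n_total)))


def value(known: frozenset) -> float:
    # Independent features contribute only once they are known
    total = sum(w[j] * x[j] for j in known if 0 < j < N_BASE)
    # Assumption: knowing feature 0 or any clone reveals the full signal
    if known & clones:
        total += w[0] * x[0]
    return float(total)


def greedy_select(value_fn, n_features: int, top_k: int, tol: float = 1e-12):
    """Pick features one at a time by marginal gain given what is already known."""
    selected = []
    known = frozenset()
    for _ in range(top_k):
        gains = {
            i: value_fn(known | {i}) - value_fn(known)
            for i in range(n_features)
            if i not in known
        }
        best = max(gains, key=gains.get)  # ties resolve to the lowest index
        if gains[best] <= tol:
            break  # nothing left adds new information
        selected.append((best, gains[best]))
        known = known | {best}
    return selected


selected = greedy_select(value, n_total, top_k=n_total)
for idx, gain in selected:
    print(f"feature {idx}: gain {gain:.3f}")
```

Once feature 0 is chosen, every clone's gain falls to zero, so the list contains only features 0, 3 and 2 and then stops. The zero-weight feature and the three clones never appear.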

    Note: You can find an implementation of this direct group calculation and the greedy iterative calculation in our Python package medpython.
    Full disclosure: I’m a co-author of this open-source package.

    Real World Validation

    While this toy model demonstrates mathematical flaws in the Shapley values method, how does it work in real-life scenarios?

    We applied these methods, Grouped Shapley combined with Winner Takes It All, along with additional methods (which are out of scope for this post, maybe next time), in complex clinical settings in healthcare. Our models use hundreds of strongly correlated features that were grouped into dozens of concepts.

    This method was validated across multiple models in a blinded setting, in which our clinicians were not aware of which method they were inspecting, and it outperformed vanilla Shapley values according to their rankings. In a multi-step experiment, each technique improved on the previous one. Additionally, our team used these explainability enhancements as part of our submission to the CMS Health AI Challenge, where we were selected as award winners.

    Image by the Centers for Medicare & Medicaid Services (CMS)

    Conclusion

    Shapley values are the gold standard for model explainability, providing a mathematically rigorous way to attribute credit.
    However, as we have seen, mathematical “correctness” does not always translate into effective explainability.

    When features are highly correlated, the signal can be diluted, hiding the true drivers of your model behind a wall of redundancy.

    We explored two ways to fix this:

    1. Grouping: Aggregate features into a single concept.
    2. Iterative Selection: Condition on already presented concepts to squeeze out only new information, effectively stripping away redundancy.

    By acknowledging these limitations, we can ensure our explanations are meaningful and helpful.

    If you found this useful, let’s connect on LinkedIn


