Introduction
Shapley-based methods are among the most popular tools for explaining Machine Learning (ML) and Deep Learning (DL) models. However, for time-series data, these methods often fall short because they do not account for the temporal dependencies inherent in such datasets. In a recent article, we (Ángel Luis Perales Gómez, Lorenzo Fernández Maimó and I) introduced ShaTS, a novel Shapley-based explainability method specifically designed for time-series models. ShaTS addresses the limitations of traditional Shapley methods by incorporating grouping strategies that enhance both computational efficiency and explainability.
Shapley values: The foundation
Shapley values originate in cooperative game theory and fairly distribute the total gain among players based on their individual contributions to a collaborative effort. The Shapley value for a player is calculated by considering all possible coalitions of players and determining the marginal contribution of that player to each coalition.
Formally, the Shapley value φi for player i is:
\[ \varphi_i(v) = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|! \, (|N| - |S| - 1)!}{|N|!} \left( v(S \cup \{i\}) - v(S) \right) \]
where:
- N is the set of all players.
- S is a coalition of players not including i.
- v(S) is the value function that assigns a value to each coalition (i.e., the total gain that coalition S can achieve).
This formula averages the marginal contributions of player i across all possible coalitions, weighted by the probability of each coalition forming.
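To make the formula concrete, here is a minimal sketch that evaluates it by brute force for a tiny game. The three players and the value function toy_value are hypothetical, chosen only so the result is easy to check by hand.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, v):
    """Exact Shapley values; v maps a frozenset of players to a number."""
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for size in range(n):  # coalition sizes 0 .. n-1
            # Weight |S|! (|N| - |S| - 1)! / |N|! from the formula
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for members in combinations(others, size):
                S = frozenset(members)
                total += weight * (v(S | {i}) - v(S))
        phi[i] = total
    return phi


# Hypothetical game: the coalition earns 100 only if "A" is present
# together with at least one of "B" or "C".
def toy_value(S):
    return 100.0 if "A" in S and ("B" in S or "C" in S) else 0.0


phi = shapley_values(["A", "B", "C"], toy_value)
print(phi)
```

In this toy game "A" receives two thirds of the total gain, while "B" and "C" receive one sixth each, and the three values add up to the value of the full coalition.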
From Game Theory to xAI: Shapley values in Machine Learning
In the context of explainable AI (xAI), Shapley values attribute a model’s output to its input features. This is particularly useful for understanding complex models, such as deep neural networks, where the relationship between input and output is not always clear.
Shapley-based methods can be computationally expensive, especially as the number of features increases, because the number of possible coalitions grows exponentially. However, approximation methods, particularly those implemented in the popular SHAP library, have made them feasible in practice. These methods estimate the Shapley values by sampling a subset of coalitions rather than evaluating all possible combinations, significantly reducing the computational burden.
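The sampling idea can be illustrated with a generic permutation-sampling estimator. This is only a sketch of the principle and not the algorithm used inside the SHAP library; the feature names and the value function toy_value are hypothetical.

```python
import random


def sampled_shapley(players, v, n_permutations=2000, seed=0):
    """Monte Carlo estimate: average marginal contributions over random orderings."""
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    for _ in range(n_permutations):
        order = list(players)
        rng.shuffle(order)
        coalition = frozenset()
        previous = v(coalition)
        for p in order:
            coalition = coalition | {p}
            current = v(coalition)
            totals[p] += current - previous  # marginal contribution of p
            previous = current
    return {p: total / n_permutations for p, total in totals.items()}


# Hypothetical value function: additive weights plus an interaction
# bonus when "f1" and "f3" are both present.
weights = {"f1": 1.0, "f2": 2.0, "f3": 3.0}


def toy_value(S):
    bonus = 1.5 if {"f1", "f3"} <= S else 0.0
    return sum(weights[p] for p in S) + bonus


estimates = sampled_shapley(list(weights), toy_value)
print(estimates)
```

Each sampled ordering costs one model evaluation per player, so the total cost is controlled by n_permutations instead of growing exponentially with the number of features.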
Consider an industrial scenario with three components: a water tank, a thermometer, and an engine. Suppose we have an Anomaly Detection (AD) ML/DL model that detects malicious activity based on the readings from these components. Using SHAP, we can determine how much each component contributes to the model’s prediction of whether the activity is malicious or benign.
However, in more realistic scenarios the model uses not only the current reading from each sensor but also previous readings (a temporal window) to make predictions. This approach allows the model to capture temporal patterns and trends, thereby improving its performance. Applying SHAP in this scenario to assign responsibility to each physical component becomes more challenging because there is no longer a one-to-one mapping between features and sensors. Each sensor now contributes multiple features associated with different time steps. The common approach here is to calculate the Shapley value of each feature at each time step and then aggregate these values post hoc.

This approach has two main drawbacks:
- Computational Complexity: The computational cost increases exponentially with the number of features, making it impractical for large time-series datasets.
- Ignoring Temporal Dependencies: SHAP explainers are designed for tabular data without temporal dependencies. Post hoc aggregation can lead to inaccurate explanations because it fails to capture temporal relationships between features.
The ShaTS Approach: Grouping Before Computing Importance
In the Shapley framework, a player’s value is determined solely by comparing the performance of a coalition with and without that player. Although the method is defined at the individual level, nothing prevents applying it to groups of players rather than to single individuals. Thus, if we consider a set of players N divided into p groups G = {G1, … , Gp}, we can compute the Shapley value for each group Gi by evaluating the marginal contribution of the entire group to all possible coalitions of the remaining groups. Formally, the Shapley value for group Gi can be expressed as:
\[ \varphi(G_i) = \sum_{T \subseteq G \setminus \{G_i\}} \frac{|T|! \, (|G| - |T| - 1)!}{|G|!} \left( v(T \cup \{G_i\}) - v(T) \right) \]
where:
- G is the set of all groups.
- T is a coalition of groups not including Gi.
- v(T) is the value function that assigns a value to each coalition of groups.
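The group-level formula can be sketched in the same brute-force style, with whole groups entering and leaving coalitions together. This is an illustration under assumed inputs, not the ShaTS implementation: the players are hypothetical (feature, time step) cells and toy_value is a made-up value function.

```python
from itertools import combinations
from math import factorial


def group_shapley(groups, v):
    """groups: name -> frozenset of players; v: value function on frozensets of players."""
    names = list(groups)
    p = len(names)
    phi = {}
    for g in names:
        others = [name for name in names if name != g]
        total = 0.0
        for size in range(p):  # sizes of coalitions of the other groups
            # Weight |T|! (|G| - |T| - 1)! / |G|! from the formula
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for T in combinations(others, size):
                # Players covered by the coalition of groups T
                members = frozenset().union(*(groups[name] for name in T))
                total += weight * (v(members | groups[g]) - v(members))
        phi[g] = total
    return phi


# Hypothetical window of length 2 over features X and Y.
groups = {
    "X": frozenset({("X", 0), ("X", 1)}),
    "Y": frozenset({("Y", 0), ("Y", 1)}),
}


# Hypothetical value: one unit per cell, plus a bonus of 2 when both
# time steps of X are present together.
def toy_value(cells):
    bonus = 2.0 if {("X", 0), ("X", 1)} <= cells else 0.0
    return len(cells) + bonus


phi = group_shapley(groups, toy_value)
print(phi)
```

With two groups there are only four coalitions to evaluate, whereas treating the four cells as individual players would require sixteen.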
Building on this idea, ShaTS operates on time windows and provides three distinct levels of grouping, depending on the explanatory goal:
Temporal
Each group contains all measurements recorded at a specific instant within the time window. This strategy is useful for identifying critical instants that significantly influence the model’s prediction.

Feature
Each group represents the measurements of an individual feature over the time window. This strategy isolates the impact of specific features on the model’s decisions.

Multi-Feature
Each group consists of the combined measurements over the time window of features that share a logical relationship or represent a cohesive functional unit. This approach analyzes the collective impact of interdependent features, ensuring their combined influence is captured.

Once groups are defined, Shapley values are computed exactly as in the individual case, but using group-level marginal contributions instead of per-feature contributions.
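As a rough sketch of what the three groupings mean in terms of indices, the helpers below partition the cells of a window of shape [window_len, n_features] into groups of (time step, feature index) pairs. The function names and the units mapping are hypothetical and are not the grouping classes of the ShaTS library.

```python
def temporal_groups(window_len, feature_names):
    """One group per instant: every feature at time step t."""
    n = len(feature_names)
    return {f"t{t}": [(t, f) for f in range(n)] for t in range(window_len)}


def feature_groups(window_len, feature_names):
    """One group per feature: that feature at every time step."""
    return {
        name: [(t, f) for t in range(window_len)]
        for f, name in enumerate(feature_names)
    }


def multifeature_groups(window_len, feature_names, units):
    """One group per functional unit: all its features at every time step."""
    index = {name: f for f, name in enumerate(feature_names)}
    return {
        unit: [(t, index[name]) for name in members for t in range(window_len)]
        for unit, members in units.items()
    }


# Hypothetical window: 3 time steps, features X, Y, Z, two functional units.
names = ["X", "Y", "Z"]
by_time = temporal_groups(3, names)
by_feature = feature_groups(3, names)
by_unit = multifeature_groups(3, names, {"unit_1": ["X", "Y"], "unit_2": ["Z"]})
print(by_time["t0"], by_feature["X"], by_unit["unit_1"], sep="\n")
```

In each case the groups cover every cell of the window exactly once, so they form a valid partition of the players.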

ShaTS custom visualization
ShaTS includes a visualization designed specifically for sequential data and for the three grouping strategies above. The horizontal axis shows consecutive windows. The left vertical axis lists the groups, and the right vertical axis overlays the model’s anomaly score for each window. Each heatmap cell at (i, Gj) represents the importance of group Gj for window i. Warmer reds indicate a stronger positive contribution to the anomaly, cooler blues indicate a stronger negative contribution, and near-white means negligible influence. A purple dashed line traces the anomaly score across windows, and a horizontal dashed line at 0.5 marks the decision threshold between anomalous and normal windows.
For example, imagine a model that processes windows of length 10 built from three features, X, Y, and Z. When an operator receives an alert and wants to know which signal triggered it, they inspect the feature grouping results. In the next figure, around windows 10–11 the anomaly score rises above the threshold, while the attribution for X intensifies. This pattern indicates that the decision is being driven primarily by X.

If the next question is when, within each window, the anomaly occurs, the operator switches to the temporal grouping view. The next figure shows that the final instant of each window (t9) consistently carries the strongest positive attribution, revealing that the model has learned to rely on the last time step to classify the window as anomalous.

Experimental Results: Testing ShaTS on the SWaT Dataset
In our recent publication, we validated ShaTS on the Secure Water Treatment (SWaT) testbed, an industrial water facility with 51 sensors/actuators organized into six plant stages (P1–P6). A stacked Bi-LSTM trained on windowed signals served as the detector, and we compared ShaTS with post hoc KernelSHAP using three viewpoints: Temporal (which instant in the window matters), Sensor/Actuator (which device), and Process (which of the six stages).
During attacks, ShaTS yielded tight, interpretable bands that pinpointed the true source, down to the sensor/actuator or plant stage, whereas post hoc SHAP tended to diffuse importance across many groups, complicating root-cause analysis. ShaTS was also faster and more scalable: grouping shrinks the player set, so the coalition space drops dramatically; run time stays nearly constant as the window length grows because the number of groups does not change; and GPU execution further accelerates the method, making near-real-time use practical.
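The size of the coalition space under each viewpoint can be checked with simple arithmetic. The sketch below uses the 51 sensors/actuators and six stages mentioned above; the window length of 10 is an assumption made only for illustration, and in practice coalitions are sampled rather than enumerated.

```python
def coalition_counts(n_features, window_len, n_processes):
    """Number of possible coalitions (2 ** number_of_players) per viewpoint."""
    return {
        # Every (time step, feature) cell is its own player
        "per_element": 2 ** (n_features * window_len),
        # One group per instant in the window
        "temporal": 2 ** window_len,
        # One group per sensor/actuator
        "feature": 2 ** n_features,
        # One group per plant stage
        "process": 2 ** n_processes,
    }


counts = coalition_counts(n_features=51, window_len=10, n_processes=6)
for view, count in counts.items():
    # Report the exponent so the very large numbers stay readable
    print(f"{view}: 2^{count.bit_length() - 1} coalitions")
```

For the sensor/actuator and process viewpoints the number of groups does not depend on window_len, which is why the coalition space stays the same when the window grows; for the temporal viewpoint the number of groups equals the window length.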
Hands-on Example: Integrating ShaTS into Your Workflow
This walkthrough shows how to plug ShaTS into a typical Python workflow: import the library, choose a grouping strategy, initialize the explainer with your trained model and background data, compute group-wise Shapley values on a test set, and visualize the results. The example assumes a PyTorch time-series model and that your data is windowed (e.g., shape [window_len, n_features] per sample).
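If your data is still a flat sequence of readings, a sliding-window helper along these lines can produce samples of shape [window_len, n_features]. This is a plain-Python sketch with a hypothetical function name and made-up readings; it is not part of ShaTS, and for a PyTorch model each window would typically be converted to a tensor afterwards.

```python
def sliding_windows(rows, window_len, stride=1):
    """Split a sequence of per-instant feature rows into overlapping windows."""
    last_start = len(rows) - window_len
    return [
        rows[start:start + window_len]
        for start in range(0, last_start + 1, stride)
    ]


# Hypothetical series: 5 instants, 2 features per instant.
series = [[0.1, 1.0], [0.2, 1.1], [0.3, 1.2], [0.4, 1.3], [0.5, 1.4]]
windows = sliding_windows(series, window_len=3)
print(len(windows), len(windows[0]), len(windows[0][0]))
```

Each element of windows has window_len rows and n_features columns, matching the per-sample shape assumed in the walkthrough.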
1. Import ShaTS and configure the Explainer
In your Python script or notebook, begin by importing the necessary components from the ShaTS library. While the repository exposes the abstract ShaTS class, you will usually instantiate one of its concrete implementations (e.g., FastShaTS).
import shats
from shats.grouping import TimeGroupingStrategy
from shats.grouping import FeaturesGroupingStrategy
from shats.grouping import MultifeaturesGroupingStrategy
2. Initialize the Model and Data
Assume you have a pre-trained time-series PyTorch model and a background dataset, which should be a list of tensors representing typical data samples that the model has seen during training. If you want to better understand the background dataset, check this blog post by Christoph Molnar.
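import random

# MyTrainedModel is a placeholder for your own trained PyTorch model
model = MyTrainedModel()
# Draw 100 random training windows to use as the background dataset
random_samples = random.sample(range(len(trainDataset)), 100)
background = [trainDataset[idx] for idx in random_samples]
# variable_names is assumed to be the list of feature names used by the model
shaTS = shats.FastShaTS(model,
    support_dataset=background,
    grouping_strategy=FeaturesGroupingStrategy(names=variable_names))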
3. Compute Shapley Values
Once the explainer is initialized, compute the ShaTS values for your test dataset. The test dataset should be formatted similarly to the background dataset.
shats_values = shaTS.compute(testDataset)
4. Visualize Results
Finally, use the built-in visualization function to plot the ShaTS values. You can specify which class (e.g., anomalous or normal) you want to explain.
shaTS.plot(shats_values, test_dataset=testDataset, class_to_explain=1)
Key Takeaways
- Focused Attribution: ShaTS provides more focused attributions than post hoc SHAP, making it easier to identify the root cause in time-series models.
- Efficiency: By reducing the number of players to groups, ShaTS significantly decreases the coalitions to evaluate, leading to faster computation times.
- Scalability: ShaTS maintains consistent performance even as window size increases, thanks to its fixed group structure.
- GPU Acceleration: ShaTS can leverage GPU resources, further improving its speed and efficiency.
Try it yourself
Interactive demo
Compare ShaTS with post hoc SHAP on synthetic time-series here. You can find a tutorial in the following video.
Open source
The ShaTS module is fully documented and ready to plug into your ML/DL pipeline. Find the code on GitHub.
I hope you liked it! You’re welcome to contact me if you have questions, want to share feedback, or simply feel like showcasing your own projects.

