Introduction:
Due to the widespread adoption of machine learning and the increasingly complex mathematical representations of its models, feature importance and model explainability algorithms have become a major topic within the field of Artificial Intelligence (AI) and data science in general.
Most of the commonly used feature importance techniques — such as mean decrease impurity (MDI), mean decrease accuracy (MDA), and single feature importance (SFI) — aim at generating a feature score that is directly proportional to the feature’s effect on the overall predictive quality of the machine learning model.
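As a point of reference, the sketch below shows how such global, model-level scores can be obtained in practice, using scikit-learn's permutation_importance (an implementation of the MDA idea) alongside the impurity-based importances (MDI) of a tree ensemble. The dataset and model are placeholders chosen purely for illustration.

```python
# Minimal sketch: global (model-level) feature importance scores.
# The dataset and classifier are placeholders for illustration only.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

# Impurity-based importance (MDI) comes for free with tree ensembles ...
mdi = model.feature_importances_

# ... while permutation importance (MDA) measures the drop in accuracy
# when a single feature's values are shuffled.
mda = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)

for name, score in sorted(zip(X.columns, mda.importances_mean), key=lambda t: -t[1])[:5]:
    print(f"{name}: {score:.3f}")
```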
However, in most cases, data scientists and machine learning practitioners are more interested in how the value of a specific feature influenced the outcome of an individual prediction, rather than the overall model. In the past couple of years, this has resulted in the development of machine learning techniques that make it possible to detect the contribution of a feature to an individual prediction for any given machine learning model.
The insights provided by such sample-level feature importance techniques allow machine learning practitioners and data scientists to add an additional level of interpretability on top of their predictions. Naturally, this has proven to be very valuable in cases where businesses or people are obligated to disclose the criteria used during the different steps of their decision-making procedure, which turns out to happen more often than you would think.
A common example is one from the financial and insurance industries. Within these industries, it has become common practice to rely on advanced algorithms and machine learning models to determine the credit risk scores of customers based on a variety of parameters. Needless to say, it is impossible to refuse a customer's loan request without providing an in-depth explanation of why his or her request was denied.
Alternatively, machine learning applications are being used extensively within the fields of pharmaceuticals and medicine, for drug discovery and for detecting diseases in patients, respectively. As was the case in the previous example, patients would not be pleased with the thought that a machine determined that they were seriously ill without being able to explain why the algorithm arrived at this decision.
In addition, from the perspective of accelerating the pace of machine learning adoption, it is vital that machine learning models and individual predictions are trusted by their users. Whereas the former was already supported by previously developed feature importance methodologies, the latter is now possible thanks to sample-level feature importance techniques. In what follows, two of the most commonly used sample-level explainability algorithms, the LIME and SHAP algorithms, will be discussed in greater depth.
LIME explainability algorithm:
LIME, short for Local Interpretable Model-agnostic Explanations, is a model explanation algorithm which provides insights into how much each feature contributed to the outcome of a machine learning model for an individual prediction. As its name states, the LIME algorithm operates in a model-agnostic way, meaning that it can be applied to any black-box machine learning model, including different neural network architectures and a wide range of kernel methods such as support vector machines.
The LIME explainability algorithm belongs to what is often referred to as 'surrogate models'. Within engineering, it is common practice to use surrogate models to approximate the outcome of a process whenever that outcome is too complex to model directly. The same holds true for the LIME explainability algorithm: instead of trying to explain the black-box machine learning model directly, a simpler, more intuitive, local surrogate model is used to provide explanations.
The creation of this surrogate model is done in a stepwise process. First, the LIME algorithm creates a new proxy dataset by applying slight perturbations to the feature values of the available dataset (that is, the dataset that was used to train the black-box model). Next, each of these samples is assigned a weight that is proportional to its similarity to the instance we are trying to explain. Finally, a surrogate machine learning model, which is an explainable model such as a decision tree classifier/regressor or a logistic regression model, is trained on the weighted proxy dataset.
The final result of this process is an explainable machine learning model which behaves very similarly to the black-box machine learning model in the local region around the instance we want to explain. By using this explainable model to classify the new instance, and by interpreting its internal decision-making structure, one is able to provide a very granular explanation on top of the model's final prediction. A conceptual sketch of this procedure is given below.
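The following is a minimal, from-scratch sketch of the three steps described above (perturb, weight, fit a weighted surrogate). It is not the lime package's internal implementation; the black-box model, training data, and function names are assumptions made purely for illustration, and a binary classifier exposing a predict_proba method is assumed.

```python
import numpy as np
from sklearn.linear_model import Ridge

def lime_explain(black_box, X_train, instance, num_samples=5000, kernel_width=0.75):
    """Illustrative LIME-style explanation for one instance of a binary classifier.

    black_box : fitted model exposing predict_proba
    X_train   : 2D numpy array used to train the black-box model
    instance  : 1D numpy array, the sample we want to explain
    """
    # Step 1: create a proxy dataset by perturbing the instance's feature values,
    # scaling the noise to each feature's standard deviation in the training data.
    std = X_train.std(axis=0)
    perturbed = instance + np.random.randn(num_samples, len(instance)) * std

    # Step 2: weight each perturbed sample by its similarity to the instance
    # (an exponential kernel over the normalized distance).
    distances = np.linalg.norm((perturbed - instance) / std, axis=1)
    weights = np.exp(-(distances ** 2) / (kernel_width ** 2))

    # Step 3: train a simple, interpretable surrogate model on the weighted
    # proxy dataset, using the black-box predictions as targets.
    targets = black_box.predict_proba(perturbed)[:, 1]
    surrogate = Ridge(alpha=1.0).fit(perturbed, targets, sample_weight=weights)

    # The surrogate's coefficients serve as the local, per-feature explanation.
    return surrogate.coef_
```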
This methodology for providing model explanations has some inherent advantages. First, since one of the by-products of the LIME algorithm is a local surrogate model, one can keep using this model to create model explanations, even when the more complex black-box model itself changes. In addition, the local surrogate model does not necessarily have to be trained on the same features that were used to train the black-box model. For example, the black-box model could be trained on complex, aggregated features (e.g., the components resulting from a Principal Component Analysis (PCA)), which are well suited for achieving high predictive accuracy, whereas the local surrogate model could be trained on the original features, which are much better suited for explaining the results.
However, as is always the case within data science and machine learning, these advantages come with a trade-off. As previously discussed, the creation of a local surrogate model requires one to define a similarity measure with respect to the sample we wish to explain. Determining this similarity metric is often difficult, and the appropriate metric may differ from case to case. Another downside of the LIME algorithm, which is partly caused by the selection of an inappropriate similarity metric, is the instability of the explanations for instances that are very close to each other in the feature space. Concretely, this means that the explanations of nearly identical samples can differ substantially, even though in theory they should not.
Now that we’ve covered the theoretical part, let’s switch to the fun stuff: the practical implementation of the LIME algorithm. Thankfully, the power of the LIME algorithm can be accessed easily by utilizing the LIME Python package.
As previously mentioned, the LIME algorithm is model-agnostic, meaning it is able to interpret the predictions of any machine learning classifier. However, the LIME Python package requires one to specify both the type of data that is being explained as well as the type of machine learning model that was used during training, mainly to improve the model explanation and to reduce computational time.
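As a quick illustration, below is a minimal sketch of how the lime package is typically used on tabular data. The dataset and classifier are placeholders chosen purely for illustration; only the LimeTabularExplainer calls are the point.

```python
# Minimal sketch of the lime package on tabular data.
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

data = load_iris()
model = RandomForestClassifier(random_state=0).fit(data.data, data.target)

# Tell LIME what kind of data and task we are dealing with.
explainer = LimeTabularExplainer(
    training_data=data.data,
    feature_names=data.feature_names,
    class_names=data.target_names,
    mode="classification",
)

# Explain a single prediction: which features pushed it towards its class?
explanation = explainer.explain_instance(
    data.data[0], model.predict_proba, num_features=4
)
print(explanation.as_list())
```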
An example of what can be achieved with the LIME Python package (or, alternatively, any programming language which implements the LIME algorithm) is shown below. For this example, a machine learning classifier, more specifically a convolutional neural network, was trained to detect a series of animals within digital images.
The original input image, shown on the left of Figure 1, depicts a Bernese mountain dog along with a regular house cat. When looking at a few of the probability scores (Table 1), the convolutional neural network predicted that a Bernese mountain dog is present in this image with a probability of 82.92%.
Since a Bernese mountain dog is indeed present in the image, one can now use the power of the LIME algorithm to detect which features caused the convolutional neural network to assign such a high probability to the Bernese mountain dog class.
This is done by running the instance (which is an image, in this case) through the LIME explainer function. The result of this operation is a masked image, shown on the right of Figure 1, which indicates the groups of pixels within the image that contributed positively (green pixels) or negatively (red pixels) to the prediction. In the masked explanation image, one can see that the convolutional neural network indeed relied on the right pixels to assign such a high probability to a Bernese mountain dog being present within the image, allowing one to conclude that it has learned to correctly identify the features needed to recognise Bernese mountain dogs.
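A sketch of how such a masked explanation image could be produced with the lime package's image explainer is shown below. The `model` and `image` variables are assumptions: any image classifier whose prediction function returns class probabilities (for example a Keras CNN) and an H x W x 3 image array with values in the 0-255 range.

```python
from lime import lime_image
from skimage.segmentation import mark_boundaries

explainer = lime_image.LimeImageExplainer()

# Perturb the image (by switching superpixels on and off), query the black-box
# model on the perturbed copies, and fit the local surrogate model.
explanation = explainer.explain_instance(
    image.astype("double"),   # instance to explain (assumed H x W x 3 array)
    model.predict,            # black-box prediction function (assumed Keras-style)
    top_labels=5,
    hide_color=0,
    num_samples=1000,
)

# Retrieve the superpixels that contributed positively (green) or negatively
# (red) to the top predicted class, and overlay their boundaries on the image.
temp, mask = explanation.get_image_and_mask(
    explanation.top_labels[0], positive_only=False, num_features=10, hide_rest=False
)
masked_image = mark_boundaries(temp / 255.0, mask)
```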
SHAP explainability algorithm:
Similar to LIME, SHAP, short for SHapley Additive exPlanations, is a model explainability algorithm which operates on the single-prediction level rather than the global machine learning model level. The algorithm derives its name from the 'Shapley values', a concept that is commonly used within the field of cooperative game theory to determine the payout for each player within a cooperative coalition.
Here, the payout that each player receives is determined by the magnitude of the Shapley value associated with that player, which in turn is determined by the player's contribution to the coalition's payout.
Scott Lundberg, a Microsoft researcher with a PhD in Computer Science, cleverly combined the mathematical foundations of the Shapley value with the theoretical principles of machine learning models to develop the SHAP explainability algorithm. Instead of calculating the Shapley value for each player within a cooperative coalition, Lundberg reformulated the algorithm to calculate the Shapley value for each feature of a single instance. Instead of representing the payout received by a player within a coalition, the Shapley value now represents the feature's contribution to the final prediction. Thus, whenever the SHAP algorithm is applied to a particular instance, one ends up with a series of Shapley values, each of which is associated with one of the instance's features.
One of the strengths of the SHAP algorithm is that the final prediction can be calculated by adding a baseline value, which is constant for a given machine learning classifier/regressor, to the sum of all Shapley values:
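In formula form, with f(x) the model's prediction for instance x, \phi_0 the baseline (the model's expected output), \phi_i the Shapley value of feature i, and M the number of features:

f(x) = \phi_0 + \sum_{i=1}^{M} \phi_i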
By looking at the magnitude and the sign of a Shapley value, one can determine the contribution of a feature to the final prediction. The magnitude of the Shapley value represents the importance of the feature with which it is associated, with low absolute values representing less important features and high absolute values representing more important features. The sign of the Shapley value, in turn, represents the direction in which the feature pushed the prediction. For a binary classifier, for example, a negative Shapley value refers to a feature which pushed the prediction towards the 0 label, whereas a positive Shapley value refers to a feature which pushed the prediction towards the 1 label.
In addition, the algorithm is not limited to determining the importance of individual features, but is also capable of detecting important clusters and subsets of features. This is especially relevant when dealing with imagery data, where a cluster of nearby pixels, rather than the value of a single pixel, is usually what drives the classification output.
As is the case with the LIME algorithm, the SHAP algorithm is model-agnostic, meaning it allows data scientists to change their machine learning models without running the risk of losing their model explanations.
However, one of the key disadvantages of the SHAP algorithm is its computational time. Due to the time-consuming operations needed to calculate the Shapley values, arriving at the final explanation for a particular instance might require some patience. This is especially true when dealing with rather complex machine learning models and instances with many (read: more than 20) features.
Thankfully, the power of the SHAP algorithm has been implemented in a clever way in the SHAP Python package[1]. This package allows one to specify the type of machine learning model that was used, which can drastically reduce the computation time needed to arrive at an instance explanation.
One of the explainers implemented in the SHAP package is the TreeExplainer. As the name suggests, this explainer provides explanations for tree-based algorithms, including gradient boosting algorithms, random forest classifiers, and regular decision trees.
The image below shows a visual representation of a prediction made by a tree-based binary classifier. The classifier was trained on the Census Income Data Set[2], with the aim of predicting whether a person's yearly income exceeds $50,000. For this instance, which represents a single person, the final prediction turned out to be 0.03, meaning that the classifier assigns a probability of 3% to this person having a yearly income that exceeds $50,000. In addition, the explanation shows that the person's age, which is equal to 51, causes this probability to increase, whereas the relationship class 0 (representing a married person) and the occupation class 6 (representing a job in the farming industry) cause the probability to decrease drastically.
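A minimal sketch of how such a force-plot explanation can be generated with the SHAP package's TreeExplainer is given below. The XGBoost classifier and the variables `X` (a pandas DataFrame with the census features) and `y` (the income labels) are assumptions standing in for the actual setup described above.

```python
import shap
import xgboost

# Assumed: X is a pandas DataFrame with the Census Income features,
# y the binary income labels (> $50,000 or not).
model = xgboost.XGBClassifier().fit(X, y)

# TreeExplainer computes Shapley values efficiently for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)   # one row of Shapley values per instance

# Force plot for a single person: the baseline (expected value) plus that
# person's Shapley values add up to the model output for this instance.
shap.initjs()
shap.force_plot(explainer.expected_value, shap_values[0, :], X.iloc[0, :])
```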
Alternatively, one can use the DeepExplainer from the SHAP package to explain more complex machine learning models such as Deep Learning models. An example of such an explanation is provided in the image below. This example shows the explanation of a Deep Learning Image classification model that was trained to recognise a series of animals. As was the case in the example from the LIME algorithm, the SHAP explainability algorithm is capable of detecting which pixels caused the algorithm to lean towards a certain animal class. In this case, pixels with an intense pink color make the deep learning model lean towards a positive classification, whereas pixels with an intense blue color make the deep learning model lean towards a negative classification.
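As a rough sketch, an explanation like this could be produced as follows, assuming `model` is a trained Keras/TensorFlow image classifier, `background` a small batch of images used as the reference distribution, and `test_images` the images to explain (all three names are assumptions).

```python
import shap

# Approximate Shapley values for a deep network, using a background set of
# images as the reference distribution.
explainer = shap.DeepExplainer(model, background)
shap_values = explainer.shap_values(test_images)

# Overlay the Shapley values on the images: pink/red pixels push the model
# towards a class, blue pixels push it away from that class.
shap.image_plot(shap_values, test_images)
```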
Finally, the SHAP algorithm can be used to provide explanations for Natural Language Processing (NLP) problems as well. This can be done by using the package's KernelExplainer, which is completely model-agnostic and can be used with any type of machine learning algorithm. The example below shows an instance explanation based on a prediction that was made by a logistic regression model. The model itself was trained on a large database of tweet messages, with the purpose of determining the probability that a tweet's content referred to the occurrence of a disaster, such as bombings, tsunamis, or earthquakes.
The explanation shows that for this particular instance, which was a tweet that included the words 'debris', 'Malaysia', 'ocean', 'airlines' and 'south', the logistic regression model assigned a probability of 100% to the tweet's content being disaster-related. More specifically, the explanation indicates which words within the tweet contributed to this prediction, showing that the words 'Ocean', 'Indian', and 'Airlines' are the biggest contributors.
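Below is a minimal sketch of how such an explanation could be set up with the KernelExplainer, assuming a TF-IDF vectorizer feeding a logistic regression model. The variables `train_texts`, `train_labels`, and `tweet` are placeholders standing in for the actual tweet dataset and instance described above.

```python
import shap
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Assumed: train_texts is a list of tweets, train_labels the disaster labels,
# and tweet the single message we want to explain.
vectorizer = TfidfVectorizer(max_features=1000)
X_train = vectorizer.fit_transform(train_texts).toarray()
model = LogisticRegression(max_iter=1000).fit(X_train, train_labels)

# KernelExplainer is fully model-agnostic: it only needs a prediction function
# and a (small) background dataset, since the method is computationally heavy.
background = shap.sample(X_train, 100)
explainer = shap.KernelExplainer(model.predict_proba, background)

x = vectorizer.transform([tweet]).toarray()
shap_values = explainer.shap_values(x)   # one array of Shapley values per class

# Map the Shapley values back to the individual words of the vocabulary.
feature_names = vectorizer.get_feature_names_out()
```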
Conclusion:
From the examples above, it becomes clear that single-sample feature importance techniques like LIME and SHAP are a great new addition to the toolkit of any machine learning practitioner or data scientist. In addition, academic research performed during recent years indicates that the interest in single-sample feature importance techniques is rapidly rising. It is widely accepted that this increasing interest is driven by a realization within the machine learning community that model explainability is a crucial factor in accelerating AI and Machine Learning adoption. In this way, innovations like the SHAP and LIME algorithms are making the world of AI and Machine Learning a little less complex, and a little more intuitive, one step at a time.
References:
https://github.com/marcotcr/lime