Explainable AI Cheatsheet

Before you start

import xgboost
import shap
import as px
import dalex as dx

X, y =
model = xgboost.XGBRegressor().fit(X, y)

explainer = shap.Explainer(model)
boston_rf_exp = dx.Explainer(model, X, y, 
                label="Boston houses RF Pipeline")
shap_values = explainer(X)

Waterfall chart

Shows the contributions of individual attributes to a single prediction. These SHAP values are valid for this observation only; for other data points they will be different.
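
To make the idea of per-observation contributions concrete, here is a from-scratch sketch that computes exact Shapley values for a tiny model by averaging over every feature ordering. The model, the background rows, and the helper names (`toy_model`, `expected_prediction`, `shapley_values`) are illustrative assumptions, not part of the shap library, and this brute-force approach is only feasible for a handful of features. The base value plus the contributions adds up to the prediction, which is the sum a waterfall chart draws bar by bar.

```python
from itertools import permutations
from statistics import mean


def toy_model(x):
    # Hypothetical model with one interaction term (not the XGBoost model above).
    return 2.0 * x[0] + x[1] * x[2]


# Hypothetical background rows standing in for the training data.
BACKGROUND = [(1.0, 2.0, 3.0), (0.0, 1.0, 1.0), (2.0, 0.0, 4.0), (1.0, 1.0, 0.0)]


def expected_prediction(x, known):
    """Mean prediction when features in `known` take x's values
    and the remaining features come from the background rows."""
    return mean(
        toy_model([x[j] if j in known else row[j] for j in range(len(x))])
        for row in BACKGROUND
    )


def shapley_values(x):
    """Average each feature's marginal contribution over all feature orderings."""
    n = len(x)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        known = set()
        previous = expected_prediction(x, known)
        for j in order:
            known.add(j)
            current = expected_prediction(x, known)
            totals[j] += current - previous
            previous = current
    return [t / len(orderings) for t in totals]


x = (3.0, 1.0, 2.0)
base_value = expected_prediction(x, set())
contributions = shapley_values(x)
print(base_value, contributions, toy_model(x))
```

Calling `shapley_values` on a different point gives different contributions, which is why the values are tied to one observation.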


Force plot

Serves exactly the same purpose as the waterfall chart, but is much more compact.


SHAP Summaries

If you take the force plots for all observations, rotate them by 90 degrees, and put them next to each other, you obtain a SHAP summary plot. This is very useful if you want to see explanations for the entire dataset.


SHAP Beeswarm

Useful for seeing which attributes are the most important. For every feature and every sample we plot a dot. Color denotes the value of the feature: high (red) or low (blue). The X-axis shows the SHAP value, i.e., the feature's impact on the prediction for that sample. From this plot we see that LSTAT is probably the most important attribute. Also, high values of RM increase the model prediction.
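
A common way to turn per-sample SHAP values into a global ranking is the mean absolute SHAP value per feature. The sketch below shows that aggregation in plain Python. The matrix and feature names are made-up illustrative values, not results for the Boston data, and `mean_abs_importance` is a hypothetical helper rather than a shap function.

```python
from statistics import mean

# Hypothetical SHAP values: one row per sample, one column per feature.
feature_names = ["feature_a", "feature_b", "feature_c"]
shap_matrix = [
    [0.5, -2.0, 0.1],
    [-0.4, 1.5, 0.0],
    [0.6, -1.0, -0.2],
    [-0.5, 2.5, 0.1],
]


def mean_abs_importance(matrix, names):
    """Rank features by the mean absolute SHAP value across samples."""
    columns = zip(*matrix)  # transpose rows into per-feature columns
    scores = {name: mean(abs(v) for v in col) for name, col in zip(names, columns)}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = mean_abs_importance(shap_matrix, feature_names)
print(ranking)
```

Note that the sign is discarded here: a feature that pushes predictions strongly in both directions still ranks as important.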


Feature interaction

This plot helps capture feature interactions and how they influence the SHAP value of a given feature. The X-axis shows the value of the attribute we are interested in and the Y-axis shows its SHAP value. Color represents the value of another feature that interacts with it. From here we see that if RAD is small, RM has quite a big impact on the prediction, whereas when RAD is big this impact is much smaller.

shap.plots.scatter(shap_values[:, "RM"], color=shap_values)

SHAP for text

We can extend this idea to text and see how particular words influence the prediction.

SHAP for images

This can also be used for images to see the influence of individual pixels.

Breakdown plot

This plot shows the decomposition of the model's prediction into contributions of different attributes.

bd = boston_rf_exp.predict_parts(house, type='break_down')
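
The idea behind a break-down decomposition can be sketched without any library: start from the mean prediction, then fix the observation's features one at a time and record how the mean prediction moves. This is only an illustration under assumptions: `toy_model`, `DATA`, and the tuple `house` below are hypothetical, and dalex's own implementation also decides the ordering of features, which this sketch takes as given.

```python
from statistics import mean


def toy_model(row):
    # Hypothetical model with an interaction between features 1 and 2.
    return 2.0 * row[0] + row[1] * row[2]


# Hypothetical reference data.
DATA = [(1.0, 2.0, 3.0), (0.0, 1.0, 1.0), (2.0, 0.0, 4.0), (1.0, 1.0, 0.0)]


def break_down(observation, order):
    """Return the intercept and (feature index, contribution) pairs for one ordering."""
    rows = [list(r) for r in DATA]
    intercept = mean(toy_model(r) for r in rows)
    previous = intercept
    steps = []
    for j in order:
        for r in rows:
            r[j] = observation[j]  # fix feature j to the observation's value
        current = mean(toy_model(r) for r in rows)
        steps.append((j, current - previous))
        previous = current
    return intercept, steps


house = (3.0, 1.0, 2.0)
intercept, steps = break_down(house, order=[0, 1, 2])
_, steps_reversed = break_down(house, order=[2, 1, 0])
print(intercept, steps, steps_reversed)
```

The contributions always sum to the prediction minus the intercept, but when features interact the split between them depends on the order in which they are fixed.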


Permutation importance

Each attribute is scrambled (randomly shuffled) in turn, and the resulting drop in some evaluation metric (e.g., MSE or accuracy) becomes its importance score. The scores can be visualized on a bar chart.

from sklearn.inspection import permutation_importance

r = permutation_importance(model, X, y)
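
The scrambling procedure itself is short enough to sketch by hand. The version below is a minimal illustration, not scikit-learn's implementation: `toy_model`, the generated data, and `permutation_importance_sketch` are hypothetical, and the score is the increase in MSE after shuffling one column.

```python
import random


def toy_model(row):
    # Hypothetical fitted model: uses feature 0 and ignores feature 1.
    return 3.0 * row[0]


def mse(rows, targets):
    return sum((toy_model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance_sketch(rows, targets, n_repeats=5, seed=0):
    """Mean increase in MSE when one feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(rows, targets)
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # scramble feature j only
            shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mse(shuffled, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


rows = [(float(i), float(i % 3)) for i in range(20)]
targets = [3.0 * r[0] for r in rows]
importances = permutation_importance_sketch(rows, targets)
print(importances)
```

A feature the model ignores gets an importance of zero. With the scikit-learn result `r` above, the per-feature averages are available as `r.importances_mean`.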

Tree models feature importance

Tree algorithms offer importance scores based on how much each feature reduces the splitting criterion, such as Gini impurity or entropy. They are available for both regression and classification problems, in decision trees, random forests, and boosting methods.

(x=X.columns, y=model.feature_importances_)
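
As a rough illustration of "reduction in the criterion", the sketch below computes the Gini impurity decrease of a single split. The labels and the helper names `gini` and `impurity_decrease` are hypothetical. A tree-based importance score is, roughly, such decreases accumulated over all splits that use a feature; the exact weighting and normalization depend on the library.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted_children = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(parent) - weighted_children


parent = ["a", "a", "b", "b"]
perfect = impurity_decrease(parent, ["a", "a"], ["b", "b"])
useless = impurity_decrease(parent, ["a", "b"], ["a", "b"])
print(perfect, useless)
```

A split that separates the classes completely earns the full parent impurity, while a split that leaves both children as mixed as the parent earns nothing.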

Ceteris paribus profiles (partial dependence plot)

This figure shows how changing individual attributes of a new instance changes the model's prediction.
In a nutshell, we hold all explanatory variables but one constant (more than one can be varied, but the computational cost grows quickly). Then we change the values of the selected variable and see how the response changes.

cp = boston_rf_exp.predict_profile(house)

cp.plot(variables=['NOX', 'RM', 'DIS', 'LSTAT'])
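
The "all else equal" procedure can be sketched in a few lines: copy the observation, overwrite one feature with each value from a grid, and record the prediction. The model, observation, and `ceteris_paribus` helper below are hypothetical stand-ins, not dalex code.

```python
def toy_model(row):
    # Hypothetical model with an interaction between the two features.
    return 2.0 * row[0] + row[0] * row[1]


def ceteris_paribus(observation, feature, grid):
    """Predictions along `grid` for one feature, with all other
    features held at the observation's values."""
    profile = []
    for value in grid:
        modified = list(observation)
        modified[feature] = value
        profile.append((value, toy_model(modified)))
    return profile


observation = (1.0, 3.0)
profile = ceteris_paribus(observation, feature=0, grid=[0.0, 1.0, 2.0])
print(profile)
```

Averaging such profiles over many observations gives a partial dependence curve, which is why the two plots are often mentioned together.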

Linear model feature importance

After scaling the features, the size of each coefficient measures how important that attribute is for the model.

from sklearn.linear_model import LinearRegression

from sklearn.preprocessing import scale

linearModel = LinearRegression().fit(scale(X), y)

(y=linearModel.coef_, x=X.columns)
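
Why scaling matters can be shown with a small calculation. For least squares with an intercept, the coefficient on a standardized feature equals the raw coefficient multiplied by that feature's standard deviation. The sketch below applies that relation to made-up numbers; the feature names, coefficients, and `standardized_coefficients` helper are hypothetical.

```python
from statistics import pstdev

# Hypothetical raw-scale coefficients of a linear model.
raw_coefficients = {"area": 0.5, "rate": 100.0}

# Hypothetical feature columns measured in very different units.
columns = {
    "area": [50.0, 100.0, 150.0, 200.0],  # large units
    "rate": [0.03, 0.01, 0.04, 0.02],     # small units
}


def standardized_coefficients(coefs, data):
    # Population standard deviation, matching what sklearn's scale() divides by.
    return {name: coefs[name] * pstdev(data[name]) for name in coefs}


std_coefs = standardized_coefficients(raw_coefficients, columns)
print(std_coefs)
```

Here `rate` has the larger raw coefficient only because its values are tiny; after standardization `area` turns out to move the prediction more per standard deviation.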


