Plot Feature Importance Objects in D3 with r2d3 Package. — plotD3.feature_importance

The function plotD3.feature_importance_explainer plots dropout losses for the variables used in the model. It uses the output of the feature_importance function, which computes a permutation-based measure of feature importance. Variables are sorted in the same order in all panels, and that order depends on the average dropout loss. As a result, variables within a single panel may not appear sorted when variable importance differs between models.
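
To make the idea of a permutation-based dropout loss concrete, the following is a minimal base R sketch. It is not the code used by feature_importance(); the mtcars data, the rmse and permuted_loss helpers, and the choice of B = 10 are all assumptions made for illustration.

```r
# Minimal sketch of permutation-based feature importance (base R only).
# Illustration of the idea, NOT the ingredients implementation.
set.seed(1)

model <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Hypothetical helper: root mean square error
rmse <- function(observed, predicted) sqrt(mean((observed - predicted)^2))

# Loss of the unmodified model, the analogue of the _full_model_ row
full_loss <- rmse(mtcars$mpg, predict(model, mtcars))

# Hypothetical helper: loss after shuffling one variable,
# averaged over B permutations
permuted_loss <- function(variable, B = 10) {
  losses <- replicate(B, {
    shuffled <- mtcars
    shuffled[[variable]] <- sample(shuffled[[variable]])
    rmse(mtcars$mpg, predict(model, shuffled))
  })
  mean(losses)
}

variables <- c("wt", "hp", "qsec")
dropout_loss <- sapply(variables, permuted_loss)

# Sort by loss after permutation; a larger loss suggests a more important variable
importance <- sort(dropout_loss, decreasing = TRUE)
print(full_loss)
print(importance)
```

Comparing each entry of importance with full_loss shows how much the loss grows when the link between one variable and the response is broken by shuffling.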

# S3 method for feature_importance_explainer
plotD3(
  x,
  ...,
  max_vars = NULL,
  show_boxplots = TRUE,
  bar_width = 12,
  split = "model",
  scale_height = FALSE,
  margin = 0.15,
  chart_title = "Feature importance"
)

Arguments

x: a feature importance explainer produced with the feature_importance() function
...: other explainers that shall be plotted together
max_vars: maximum number of variables to present for each model. By default NULL, which means all variables
show_boxplots: logical. If TRUE (default), boxplots are drawn to show the permutation data
bar_width: width of bars in px. By default 12px
split: either "model" or "feature"; determines the plot layout
scale_height: a logical. If TRUE, the height of the plot scales with the window size. By default FALSE
margin: extends the x-axis domain range to adjust the plot. Usually a value between 0.1 and 0.3; by default 0.15
chart_title: a character. Sets a custom title
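
The shared variable order and the effect of max_vars can be sketched in base R. This is only one plausible reading of "the order depends on the average dropout loss", not the package's actual code: the averaging across models and the truncation step are assumptions, and the loss values are copied from the example output on this page.

```r
# Sketch: derive one variable order shared by all panels (assumed logic).
fi <- data.frame(
  variable = rep(c("construction.year", "no.rooms", "floor",
                   "surface", "district"), 2),
  mean_dropout_loss = c(279.2987, 289.6181, 503.1583, 620.3488, 990.7506,
                        378.9817, 300.9969, 426.2261, 521.2871, 774.9314),
  label = rep(c("lm", "ranger forest"), each = 5)
)

# Average dropout loss per variable across both models
avg_loss <- vapply(split(fi$mean_dropout_loss, fi$variable), mean, numeric(1))
common_order <- names(sort(avg_loss, decreasing = TRUE))
print(common_order)

# Keep only the top max_vars variables of the shared order (assumption)
max_vars <- 3
kept <- head(common_order, max_vars)
fi_top <- fi[fi$variable %in% kept, ]
fi_top$variable <- factor(fi_top$variable, levels = kept)
result <- fi_top[order(fi_top$label, fi_top$variable), ]
print(result)
```

In the shared order, construction.year comes before no.rooms because its average loss is higher, even though the lm model alone ranks no.rooms higher. This is the situation in which a single panel may not look sorted.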

Value

An r2d3 object.

References

Explanatory Model Analysis. Explore, Explain, and Examine Predictive Models. https://ema.drwhy.ai/

Examples

library("DALEX")
library("ingredients")

lm_model <- lm(m2.price ~ ., data = apartments)
explainer_lm <- explain(lm_model,
                        data = apartments[,-1],
                        y = apartments[,1],
                        verbose = FALSE)

fi_lm <- feature_importance(explainer_lm,
                            loss_function = DALEX::loss_root_mean_square,
                            B = 1)

head(fi_lm)
#>            variable mean_dropout_loss label
#> 1      _full_model_          279.3262    lm
#> 2 construction.year          279.2987    lm
#> 3          no.rooms          289.6181    lm
#> 4             floor          503.1583    lm
#> 5           surface          620.3488    lm
#> 6          district          990.7506    lm
plotD3(fi_lm)


# \donttest{
library("ranger")

rf_model <- ranger(m2.price ~ ., data = apartments)

explainer_rf <- explain(rf_model,
                        data = apartments[,-1],
                        y = apartments[,1],
                        label = "ranger forest",
                        verbose = FALSE)

fi_rf <- feature_importance(explainer_rf, loss_function = DALEX::loss_root_mean_square)

head(fi_rf)
#>            variable mean_dropout_loss         label
#> 1      _full_model_          144.1809 ranger forest
#> 2          no.rooms          300.9969 ranger forest
#> 3 construction.year          378.9817 ranger forest
#> 4             floor          426.2261 ranger forest
#> 5           surface          521.2871 ranger forest
#> 6          district          774.9314 ranger forest
plotD3(fi_lm, fi_rf)


plotD3(fi_lm, fi_rf, split = "feature")


plotD3(fi_lm, fi_rf, max_vars = 3, bar_width = 16, scale_height = TRUE)

plotD3(fi_lm, fi_rf, max_vars = 3, bar_width = 16, split = "feature", scale_height = TRUE)

plotD3(fi_lm, margin = 0.2)

# }