Here we will use the dragons data from DALEX package to present the iBreakDown for regression models.

#>   year_of_birth   height   weight scars colour year_of_discovery
#> 1         -1291 59.40365 15.32391     7    red              1700
#> 2          1589 46.21374 11.80819     5    red              1700
#> 3          1528 49.17233 13.34482     6    red              1700
#> 4          1645 48.29177 13.27427     5  green              1700
#> 5            -8 49.99679 13.08757     1    red              1700
#> 6           915 45.40876 11.48717     2    red              1700
#>   number_of_lost_teeth life_length
#> 1                   25   1368.4331
#> 2                   28   1377.0474
#> 3                   38   1603.9632
#> 4                   33   1434.4222
#> 5                   18    985.4905
#> 6                   20    969.5682
#>   year_of_birth   height   weight scars colour year_of_discovery
#> 1          -938 39.18619 10.02391     4  black              1800
#>   number_of_lost_teeth life_length
#> 1                   30     1375.38

Linear regression

First, we fit a model.

m_lm <- lm(life_length ~ . , data = dragons)

To understand the factors that drive predictions for a single observation we use the iBreakDown package.

Now, we create an object of the break_down class. If we want to plot distributions of partial predictions, set keep_distributions = TRUE.

We can simply print the result.

#>                               contribution
#> lm: intercept                     1356.562
#> lm: scars = 4                     -235.221
#> lm: number_of_lost_teeth = 30      205.037
#> lm: year_of_birth = -940            22.193
#> lm: height = 39                     11.296
#> lm: colour = black                  10.856
#> lm: weight = 10                     -9.217
#> lm: year_of_discovery = 1800         4.668
#> lm: life_length = 1400               0.000
#> lm: prediction                    1366.174

Or plot the result which is more clear.

plot(bd_lm)

Use the baseline parameter to set the origin of plots.

plot(bd_lm, baseline = 0)

Use the plot_distributions parameter to see distributions of partial predictions.

plot(bd_lm, plot_distributions = TRUE)

For another types of models we proceed analogously. However, sometimes we need to create custom predict function (see nnet example).

randomForest

#>                               contribution
#> lm: intercept                     1356.562
#> lm: scars = 4                     -235.221
#> lm: number_of_lost_teeth = 30      205.037
#> lm: year_of_birth = -940            22.193
#> lm: height = 39                     11.296
#> lm: colour = black                  10.856
plot(bd_rf)

nnet

When you use nnet package for regression, remember to normalize the resposne variable, in such a way that it is from interval \((0,1)\).

In this case, creating custom predict function is also needed.

library(nnet)

x <- max(abs(dragons$life_length))
digits <- floor(log10(x))
normalizing_factor <- round(x, -digits)
m_nnet <- nnet(life_length/normalizing_factor ~ . , data = dragons, size = 10, linout = TRUE)
#> # weights:  111
#> initial  value 2781.620407 
#> iter  10 value 34.610693
#> iter  20 value 28.731518
#> iter  30 value 27.555372
#> iter  40 value 26.413933
#> iter  50 value 24.183990
#> iter  60 value 14.086360
#> iter  70 value 10.594963
#> iter  80 value 7.731425
#> iter  90 value 2.982037
#> iter 100 value 1.533715
#> final  value 1.533715 
#> stopped after 100 iterations