`vignettes/vignette_yardstick.Rmd`

`yardstick` is a package that offers many measures for evaluating model performance. It follows the `tidymodels`/`tidyverse` philosophy: performance is calculated by functions working on a data.frame with the results of the model.

DALEX uses model performance measures to assess the importance of variables (in the `model_parts` function). These are typically calculated with loss functions (functions with the prefix `loss_`) that work on two vectors: the score from the model and the true target variable.
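To illustrate this two-vector convention, here is a minimal loss written from scratch in base R (a sketch in the style of DALEX's built-in losses, not code taken from the package): the observed target comes first, the model scores second.

```r
# a DALEX-style loss: observed target first, model scores second
loss_rms <- function(observed, predicted) {
  sqrt(mean((observed - predicted)^2))
}

loss_rms(c(1, 0, 1, 1), c(0.9, 0.2, 0.8, 0.6))
# [1] 0.25
```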

Although these packages have slightly different philosophies of operation, you can use the measures available in `yardstick` when working with `DALEX`. Below is information on how to use the `loss_yardstick` function to do this.

The `yardstick` package supports both classification models and regression models. We will start our example with a classification model for the titanic data: the probability of surviving the disaster.

The following instruction trains a classification model.

```
library("DALEX")
library("yardstick")
titanic_glm <- glm(survived~., data = titanic_imputed, family = "binomial")
```

The Class Probability Metrics in the `yardstick` package assume that the true value is a `factor` and the model returns a numerical score. So let's prepare an `explainer` that has a `factor` as `y` and whose `predict_function` returns the probability of the target class (the default behaviour).
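To see why this pairing of a `factor` truth with a numeric score works, here is a rank-based AUC computed in plain base R (an illustrative sketch, not the `yardstick` implementation):

```r
# truth as a factor, estimate as a numeric score
truth <- factor(c(0, 0, 1, 1, 0, 1))
score <- c(0.1, 0.4, 0.35, 0.8, 0.2, 0.9)

# rank-based AUC: probability that a random positive outranks a random negative
r     <- rank(score)
n_pos <- sum(truth == "1")
n_neg <- sum(truth == "0")
auc   <- (sum(r[truth == "1"]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
auc
# [1] 0.8888889
```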

**NOTE**: Performance measures will be calculated on the data supplied in the explainer. Put the test data here!

```
explainer_glm <- DALEX::explain(titanic_glm,
                                data = titanic_imputed[,-8],
                                y = factor(titanic_imputed$survived))
```

To make functions from `yardstick` compatible with `DALEX`, we must use the `loss_yardstick` adapter. In the example below we use the `roc_auc` function (area under the receiver operating characteristic curve). The `yardstick::` prefix is not necessary, but we put it here to show explicitly where the functions we use are located.

**NOTE**: we set `yardstick.event_first = FALSE` because the model predicts the probability of `survived = 1`.

```
options(yardstick.event_first = FALSE)
glm_auc <- model_parts(explainer_glm, type = "raw",
                       loss_function = loss_yardstick(yardstick::roc_auc))
glm_auc
```

```
#> variable mean_dropout_loss label
#> 1 _full_model_ 0.8176487 lm
#> 2 gender 0.6300156 lm
#> 3 class 0.7233351 lm
#> 4 age 0.7887847 lm
#> 5 sibsp 0.8087978 lm
#> 6 embarked 0.8104637 lm
#> 7 fare 0.8174282 lm
#> 8 parch 0.8177252 lm
#> 9 _baseline_ 0.5004211 lm
```

```
plot(glm_auc)
```

In a similar way, we can use the `pr_auc` function (area under the precision-recall curve).

```
glm_prauc <- model_parts(explainer_glm, type = "raw",
                         loss_function = loss_yardstick(yardstick::pr_auc))
glm_prauc
```

```
#> variable mean_dropout_loss label
#> 1 _full_model_ 0.7343162 lm
#> 2 gender 0.4267867 lm
#> 3 class 0.5912337 lm
#> 4 age 0.7119210 lm
#> 5 sibsp 0.7202304 lm
#> 6 embarked 0.7232801 lm
#> 7 parch 0.7340958 lm
#> 8 fare 0.7342991 lm
#> 9 _baseline_ 0.3216100 lm
```

```
plot(glm_prauc)
```

The Classification Metrics in the `yardstick` package assume that the true value is a `factor` and that the model also returns a `factor`.

This is different behavior from most explanations in DALEX, because when explaining predictions we typically operate on class membership probabilities. If we want to use Classification Metrics, we need to provide a predict function that returns classes instead of probabilities.

So let's prepare an `explainer` that has a `factor` as `y` and whose `predict_function` returns classes.

```
explainer_glm <- DALEX::explain(titanic_glm,
                                data = titanic_imputed[,-8],
                                y = factor(titanic_imputed$survived),
                                predict_function = function(m, x) {
                                  factor(as.numeric(predict(m, x, type = "response") > 0.5),
                                         levels = c("0", "1"))
                                })
```

Again, let's use the `loss_yardstick` adapter. In the example below we use the `accuracy` function.

```
glm_accuracy <- model_parts(explainer_glm, type = "raw",
                            loss_function = loss_yardstick(yardstick::accuracy))
glm_accuracy
```

```
#> variable mean_dropout_loss label
#> 1 _full_model_ 0.7995 lm
#> 2 gender 0.6425 lm
#> 3 class 0.7338 lm
#> 4 age 0.7826 lm
#> 5 sibsp 0.7932 lm
#> 6 embarked 0.7974 lm
#> 7 parch 0.7989 lm
#> 8 fare 0.8000 lm
#> 9 _baseline_ 0.5867 lm
```

```
plot(glm_accuracy)
```

In a similar way, we can use the `bal_accuracy` function (balanced accuracy).

```
glm_bal_accuracy <- model_parts(explainer_glm, type = "raw",
                                loss_function = loss_yardstick(yardstick::bal_accuracy))
glm_bal_accuracy
```

```
#> variable mean_dropout_loss label
#> 1 _full_model_ 0.7428542 lm
#> 2 gender 0.5688662 lm
#> 3 class 0.6798010 lm
#> 4 age 0.7276367 lm
#> 5 sibsp 0.7376837 lm
#> 6 embarked 0.7391225 lm
#> 7 fare 0.7420058 lm
#> 8 parch 0.7427821 lm
#> 9 _baseline_ 0.5035898 lm
```

```
plot(glm_bal_accuracy)
```

For a loss function, the smaller the value, the better the model. Therefore, the importance of variables is often calculated as `loss(perturbed) - loss(original)`.

But many model performance functions have the opposite characteristic: the higher they are, the better (e.g. `AUC`, `accuracy`). To maintain a consistent analysis pipeline, it is convenient to invert such functions, e.g. by converting them to `1 - AUC` or `1 - accuracy`.

To do this, just add the `reverse = TRUE` argument.
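Conceptually, reversing just subtracts the metric from one. A hypothetical base-R sketch of such an adapter (`as_loss` and `accuracy_vec` are illustrative names, not DALEX or yardstick functions):

```r
# hypothetical adapter: turn a "higher is better" metric into a loss
as_loss <- function(metric, reverse = TRUE) {
  function(observed, predicted) {
    value <- metric(observed, predicted)
    if (reverse) 1 - value else value
  }
}

accuracy_vec <- function(truth, estimate) mean(truth == estimate)
loss_acc <- as_loss(accuracy_vec)
loss_acc(c(1, 0, 1, 1), c(1, 0, 0, 1))
# [1] 0.25
```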

```
glm_1accuracy <- model_parts(explainer_glm,
                             loss_function = loss_yardstick(accuracy, reverse = TRUE))
glm_1accuracy
```

```
#> variable mean_dropout_loss label
#> 1 _full_model_ 0.1965 lm
#> 2 fare 0.1954 lm
#> 3 parch 0.1969 lm
#> 4 embarked 0.1979 lm
#> 5 sibsp 0.2024 lm
#> 6 age 0.2143 lm
#> 7 class 0.2518 lm
#> 8 gender 0.3579 lm
#> 9 _baseline_ 0.4081 lm
```

```
plot(glm_1accuracy)
```

By default the performance is calculated on `N = 1000` randomly selected observations (to speed up the calculations). Set `N = NULL` to use the whole dataset.
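The sampling amounts to something like the sketch below (hypothetical code illustrating the meaning of the `N` argument, not the internals of `model_parts`):

```r
# score at most N randomly selected rows; N = NULL means the whole dataset
subset_rows <- function(df, N = 1000) {
  if (is.null(N)) return(df)
  df[sample(nrow(df), min(N, nrow(df))), , drop = FALSE]
}

df <- data.frame(x = rnorm(5000))
nrow(subset_rows(df))        # 1000
nrow(subset_rows(df, NULL))  # 5000
```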

```
glm_1accuracy <- model_parts(explainer_glm,
loss_function = loss_yardstick(accuracy, reverse = TRUE),
N = NULL)
plot(glm_1accuracy)
```

The Regression Metrics in the `yardstick` package assume that the true value is a `numeric` variable and the model returns a `numeric` score. The following instruction trains a regression model and prepares an explainer for it.
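As a reminder of what the two metrics used below compute, here are base-R sketches of root mean squared error and R squared (illustrative only; the examples below use the `yardstick` versions):

```r
# both metrics take two numeric vectors: the truth and the model estimate
rmse_sketch <- function(truth, estimate) sqrt(mean((truth - estimate)^2))
rsq_sketch  <- function(truth, estimate) cor(truth, estimate)^2

truth    <- c(3, 5, 7, 9)
estimate <- c(2.8, 5.4, 6.9, 9.3)
rmse_sketch(truth, estimate)
rsq_sketch(truth, estimate)
```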

```
library("ranger")
# train a ranger model for the apartments data
apartments_ranger <- ranger(m2.price ~ ., data = apartments)
explainer_ranger <- DALEX::explain(apartments_ranger, data = apartments[,-1],
                                   y = apartments$m2.price, label = "Ranger Apartments")
```

```
#> Preparation of a new explainer is initiated
#> -> model label : Ranger Apartments
#> -> data : 1000 rows 5 cols
#> -> target variable : 1000 values
#> -> predict function : yhat.ranger will be used ( default )
#> -> predicted values : No value for predict function target column. ( default )
#> -> model_info : package ranger , ver. 0.14.1 , task regression ( default )
#> -> predicted values : numerical, min = 1880.33 , mean = 3489.547 , max = 6152.823
#> -> residual function : difference between y and yhat ( default )
#> -> residuals : numerical, min = -504.575 , mean = -2.527879 , max = 631.0517
#> A new explainer has been created!
```

To make functions from `yardstick` compatible with `DALEX`, we again use the `loss_yardstick` adapter. In the example below we use the `rmse` function (root mean squared error).

```
ranger_rmse <- model_parts(explainer_ranger, type = "raw",
                           loss_function = loss_yardstick(rmse))
ranger_rmse
```

```
#> variable mean_dropout_loss label
#> 1 _full_model_ 155.9598 Ranger Apartments
#> 2 no.rooms 329.3016 Ranger Apartments
#> 3 construction.year 386.5747 Ranger Apartments
#> 4 floor 447.6254 Ranger Apartments
#> 5 surface 526.6621 Ranger Apartments
#> 6 district 759.7785 Ranger Apartments
#> 7 _baseline_ 1201.4958 Ranger Apartments
```

```
plot(ranger_rmse)
```

And one more example, for the `rsq` function (R squared).

```
ranger_rsq <- model_parts(explainer_ranger, type = "raw",
                          loss_function = loss_yardstick(rsq))
ranger_rsq
```

```
#> variable mean_dropout_loss label
#> 1 _full_model_ 0.9822143560 Ranger Apartments
#> 2 district 0.3283019847 Ranger Apartments
#> 3 surface 0.6832458690 Ranger Apartments
#> 4 floor 0.7655788254 Ranger Apartments
#> 5 construction.year 0.8268200544 Ranger Apartments
#> 6 no.rooms 0.9125441548 Ranger Apartments
#> 7 _baseline_ 0.0007561657 Ranger Apartments
```

```
plot(ranger_rsq)
```

I hope that using the `yardstick` package with `DALEX` will now be easy and enjoyable. If you would like to share your experience with these packages, please create an issue at https://github.com/ModelOriented/DALEX/issues.

```
#> R version 4.2.2 (2022-10-31)
#> Platform: x86_64-apple-darwin17.0 (64-bit)
#> Running under: macOS Big Sur ... 10.16
#>
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
#>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] ranger_0.14.1 yardstick_1.1.0 DALEX_2.5.0
#>
#> loaded via a namespace (and not attached):
#> [1] tidyselect_1.2.0 xfun_0.36 bslib_0.4.2 purrr_1.0.1
#> [5] lattice_0.20-45 colorspace_2.1-0 vctrs_0.5.2 generics_0.1.3
#> [9] htmltools_0.5.4 yaml_2.3.7 utf8_1.2.2 rlang_1.0.6
#> [13] pkgdown_2.0.7 jquerylib_0.1.4 pillar_1.8.1 glue_1.6.2
#> [17] withr_2.5.0 lifecycle_1.0.3 stringr_1.5.0 munsell_0.5.0
#> [21] gtable_0.3.1 ragg_1.2.5 memoise_2.0.1 evaluate_0.20
#> [25] labeling_0.4.2 knitr_1.42 fastmap_1.1.0 fansi_1.0.4
#> [29] highr_0.10 Rcpp_1.0.10 scales_1.2.1 cachem_1.0.6
#> [33] desc_1.4.2 jsonlite_1.8.4 ingredients_2.3.0 farver_2.1.1
#> [37] systemfonts_1.0.4 fs_1.6.0 textshaping_0.3.6 ggplot2_3.4.0
#> [41] digest_0.6.31 stringi_1.7.12 dplyr_1.0.10 grid_4.2.2
#> [45] rprojroot_2.0.3 cli_3.6.0 tools_4.2.2 magrittr_2.0.3
#> [49] sass_0.4.5 tibble_3.1.8 crayon_1.5.2 pkgconfig_2.0.3
#> [53] Matrix_1.5-1 rmarkdown_2.20 R6_2.5.1 compiler_4.2.2
```