`vignettes/vignette_yardstick.Rmd`

yardstick is a package that offers many measures for evaluating model performance. It follows the `tidymodels`/`tidyverse` philosophy: performance is calculated by functions that work on a data.frame containing the model's results.

DALEX uses model performance measures to assess the importance of variables (in the `model_parts` function). These are typically calculated with loss functions (functions with the prefix `loss_`) that work on two vectors: the score from the model and the true target variable.
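For illustration, a loss function in this two-vector style can be sketched as follows (a simplified version of DALEX's built-in `loss_root_mean_square`):

```r
# A DALEX-style loss: two vectors in, a single number out.
# Simplified sketch of DALEX::loss_root_mean_square.
loss_rmse <- function(observed, predicted) {
  sqrt(mean((observed - predicted)^2))
}

loss_rmse(c(1, 2, 3), c(1.5, 2, 2.5))  # 0.4082483
```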

Although these packages have slightly different philosophies of operation, you can use the measures available in `yardstick` when working with `DALEX`. Below is information on how to use the `loss_yardstick` function to do this.
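Under the hood, such an adapter has to bridge two calling conventions: `yardstick` metrics operate on a data.frame, while `DALEX` expects a loss function of two vectors. A rough, hypothetical sketch of the idea (the real `loss_yardstick` additionally handles details such as the `reverse` argument):

```r
# Hypothetical sketch of the adapter idea: wrap a data.frame-based
# yardstick metric into a two-vector loss function for DALEX.
loss_yardstick_sketch <- function(measure) {
  function(observed, predicted) {
    df <- data.frame(observed = observed, predicted = predicted)
    # yardstick metrics return a tibble; extract the numeric estimate
    measure(df, truth = observed, estimate = predicted)$.estimate
  }
}
```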

The `yardstick` package supports both classification and regression models. We will start our example with a classification model for the titanic data: the probability of surviving the disaster.

The following instruction trains a classification model.

```
library("DALEX")
library("yardstick")

titanic_glm <- glm(survived ~ ., data = titanic_imputed, family = "binomial")
```

The Class Probability Metrics in the `yardstick` package assume that the true value is a `factor` and that the model returns a numerical score. So let's prepare an `explainer` that has a `factor` as `y` and whose `predict_function` returns the probability of the target class (the default behaviour).

**NOTE**: Performance measures will be calculated on the data supplied in the explainer. Put the test data here!

```
explainer_glm <- DALEX::explain(titanic_glm,
                                data = titanic_imputed[, -8],
                                y = factor(titanic_imputed$survived))
```

To make functions from `yardstick` compatible with `DALEX`, we must use the `loss_yardstick` adapter. In the example below we use the `roc_auc` function (area under the receiver operating characteristic curve). The `yardstick::` prefix is not necessary, but we include it here to show explicitly where the functions we use are located.

**NOTE**: we set `yardstick.event_first = FALSE` because the model predicts the probability of `survived = 1`.

```
options(yardstick.event_first = FALSE)

glm_auc <- model_parts(explainer_glm, type = "raw",
                       loss_function = loss_yardstick(yardstick::roc_auc))
glm_auc
```

```
#>       variable mean_dropout_loss label
#> 1 _full_model_         0.8137302    lm
#> 2       gender         0.6174230    lm
#> 3        class         0.7166700    lm
#> 4          age         0.7863548    lm
#> 5        sibsp         0.8033765    lm
#> 6     embarked         0.8074437    lm
#> 7         fare         0.8133961    lm
#> 8        parch         0.8137905    lm
#> 9   _baseline_         0.4871857    lm
```

```
plot(glm_auc)
```

In a similar way, we can use the `pr_auc` function (area under the precision-recall curve).

```
glm_prauc <- model_parts(explainer_glm, type = "raw",
                         loss_function = loss_yardstick(yardstick::pr_auc))
glm_prauc
```

```
#>       variable mean_dropout_loss label
#> 1 _full_model_         0.7415507    lm
#> 2       gender         0.4459565    lm
#> 3        class         0.5865294    lm
#> 4          age         0.7162901    lm
#> 5     embarked         0.7273457    lm
#> 6        sibsp         0.7277256    lm
#> 7        parch         0.7416492    lm
#> 8         fare         0.7418198    lm
#> 9   _baseline_         0.3273634    lm
```

```
plot(glm_prauc)
```

The Classification Metrics in the `yardstick` package assume that the true value is a `factor` and that the model also returns a `factor` variable.

This behaviour differs from most explanations in DALEX, which typically operate on class-membership probabilities. If we want to use the Classification Metrics, we need to provide a predict function that returns classes instead of probabilities.

So let's prepare an `explainer` that has a `factor` as `y` and whose `predict_function` returns classes.

```
explainer_glm <- DALEX::explain(titanic_glm,
                                data = titanic_imputed[, -8],
                                y = factor(titanic_imputed$survived),
                                predict_function = function(m, x) {
                                  factor(as.numeric(predict(m, x, type = "response") > 0.5),
                                         levels = c("0", "1"))
                                })
```

Again, let's use the `loss_yardstick` adapter. In the example below we use the `accuracy` function.

```
glm_accuracy <- model_parts(explainer_glm, type = "raw",
                            loss_function = loss_yardstick(yardstick::accuracy))
glm_accuracy
```

```
#>       variable mean_dropout_loss label
#> 1 _full_model_            0.7977    lm
#> 2       gender            0.6334    lm
#> 3        class            0.7399    lm
#> 4          age            0.7875    lm
#> 5        sibsp            0.7923    lm
#> 6     embarked            0.7960    lm
#> 7        parch            0.7976    lm
#> 8         fare            0.7986    lm
#> 9   _baseline_            0.5867    lm
```

```
plot(glm_accuracy)
```

In a similar way, we can use the `bal_accuracy` function (balanced accuracy).

```
glm_bal_accuracy <- model_parts(explainer_glm, type = "raw",
                                loss_function = loss_yardstick(yardstick::bal_accuracy))
glm_bal_accuracy
```

```
#>       variable mean_dropout_loss label
#> 1 _full_model_         0.7397031    lm
#> 2       gender         0.5619733    lm
#> 3        class         0.6788103    lm
#> 4          age         0.7268739    lm
#> 5     embarked         0.7346292    lm
#> 6        sibsp         0.7377564    lm
#> 7        parch         0.7393319    lm
#> 8         fare         0.7396152    lm
#> 9   _baseline_         0.5051107    lm
```

```
plot(glm_bal_accuracy)
```

For a loss function, the smaller the value, the better the model. Therefore, the importance of variables is often calculated as `loss(perturbed) - loss(original)`.

But many model performance measures have the opposite characteristic: the higher they are, the better (e.g. `AUC`, `accuracy`). To maintain a consistent analysis pipeline, it is convenient to invert such measures, e.g. by converting them to `1 - AUC` or `1 - accuracy`.

To do this, just add the `reverse = TRUE` argument.
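Conceptually, reversing just flips a higher-is-better score so that smaller values are better. A minimal, hypothetical sketch of the idea (`simple_accuracy` is an illustrative helper, not part of either package):

```r
# Hypothetical illustration: wrap a higher-is-better score
# so that it behaves like a loss (smaller is better).
reverse_loss <- function(score_fun) {
  function(observed, predicted) 1 - score_fun(observed, predicted)
}

# Illustrative accuracy on two vectors of class labels
simple_accuracy <- function(observed, predicted) mean(observed == predicted)

loss_one_minus_acc <- reverse_loss(simple_accuracy)
loss_one_minus_acc(c(1, 0, 1, 1), c(1, 0, 0, 1))  # 0.25
```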

```
glm_1accuracy <- model_parts(explainer_glm,
                             loss_function = loss_yardstick(accuracy, reverse = TRUE))
glm_1accuracy
```

```
#>       variable mean_dropout_loss label
#> 1 _full_model_            0.1980    lm
#> 2         fare            0.1974    lm
#> 3        parch            0.1984    lm
#> 4     embarked            0.1986    lm
#> 5        sibsp            0.2035    lm
#> 6          age            0.2152    lm
#> 7        class            0.2525    lm
#> 8       gender            0.3658    lm
#> 9   _baseline_            0.4144    lm
```

```
plot(glm_1accuracy)
```

By default, the performance is calculated on `N = 1000` randomly selected observations (to speed up the calculations). Set `N = NULL` to use the whole dataset.

```
glm_1accuracy <- model_parts(explainer_glm,
                             loss_function = loss_yardstick(accuracy, reverse = TRUE),
                             N = NULL)
plot(glm_1accuracy)
```

The following instruction trains a regression model.
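The original training chunk is not shown here; a minimal reconstruction, assuming the `ranger` package and the `apartments` data shipped with DALEX, could look like this:

```r
# Hypothetical reconstruction: a random forest predicting
# the price per square meter of apartments.
library("ranger")

apartments_ranger <- ranger(m2.price ~ ., data = apartments)
```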

The Regression Metrics in the `yardstick` package assume that the true value is a `numeric` variable and that the model returns a `numeric` score.

```
explainer_ranger <- DALEX::explain(apartments_ranger, data = apartments[, -1],
                                   y = apartments$m2.price, label = "Ranger Apartments")
```

```
#> Preparation of a new explainer is initiated
#> -> model label : Ranger Apartments
#> -> data : 1000 rows 5 cols
#> -> target variable : 1000 values
#> -> predict function : yhat.ranger will be used ( default )
#> -> predicted values : No value for predict function target column. ( default )
#> -> model_info : package ranger , ver. 0.14.1 , task regression ( default )
#> -> predicted values : numerical, min = 1899.414 , mean = 3488.861 , max = 6046.561
#> -> residual function : difference between y and yhat ( default )
#> -> residuals : numerical, min = -445.4107 , mean = -1.842099 , max = 736.702
#> A new explainer has been created!
```

To make functions from `yardstick` compatible with `DALEX`, we must use the `loss_yardstick` adapter. In the example below we use the `rmse` function (root mean squared error).

```
ranger_rmse <- model_parts(explainer_ranger, type = "raw",
                           loss_function = loss_yardstick(rmse))
ranger_rmse
```

```
#>            variable mean_dropout_loss             label
#> 1      _full_model_          155.6184 Ranger Apartments
#> 2          no.rooms          316.5036 Ranger Apartments
#> 3 construction.year          392.5758 Ranger Apartments
#> 4             floor          431.6225 Ranger Apartments
#> 5           surface          524.9066 Ranger Apartments
#> 6          district          763.5492 Ranger Apartments
#> 7        _baseline_         1206.6411 Ranger Apartments
```

```
plot(ranger_rmse)
```

And one more example, for the `rsq` function (R squared).

```
ranger_rsq <- model_parts(explainer_ranger, type = "raw",
                          loss_function = loss_yardstick(rsq))
ranger_rsq
```

```
#>            variable mean_dropout_loss             label
#> 1      _full_model_       0.983377709 Ranger Apartments
#> 2          district       0.326764600 Ranger Apartments
#> 3           surface       0.667962198 Ranger Apartments
#> 4             floor       0.775455846 Ranger Apartments
#> 5 construction.year       0.816108921 Ranger Apartments
#> 6          no.rooms       0.920990517 Ranger Apartments
#> 7        _baseline_       0.001389768 Ranger Apartments
```

```
plot(ranger_rsq)
```

I hope that using the `yardstick` package with `DALEX` will now be easy and enjoyable. If you would like to share your experience with these packages, please create an issue at https://github.com/ModelOriented/DALEX/issues.

```
#> R version 4.2.1 (2022-06-23)
#> Platform: x86_64-apple-darwin17.0 (64-bit)
#> Running under: macOS Big Sur ... 10.16
#>
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
#>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] ranger_0.14.1 yardstick_1.0.0 DALEX_2.4.2
#>
#> loaded via a namespace (and not attached):
#> [1] tidyselect_1.1.2 xfun_0.32 bslib_0.4.0 purrr_0.3.4
#> [5] lattice_0.20-45 colorspace_2.0-3 vctrs_0.4.1 generics_0.1.3
#> [9] htmltools_0.5.3 yaml_2.3.5 utf8_1.2.2 rlang_1.0.5
#> [13] pkgdown_2.0.6 jquerylib_0.1.4 pillar_1.8.1 glue_1.6.2
#> [17] lifecycle_1.0.1 stringr_1.4.1 munsell_0.5.0 gtable_0.3.1
#> [21] ragg_1.2.2 memoise_2.0.1 evaluate_0.16 labeling_0.4.2
#> [25] knitr_1.40 fastmap_1.1.0 fansi_1.0.3 highr_0.9
#> [29] Rcpp_1.0.9 scales_1.2.1 cachem_1.0.6 desc_1.4.1
#> [33] jsonlite_1.8.0 ingredients_2.2.0 farver_2.1.1 systemfonts_1.0.4
#> [37] fs_1.5.2 textshaping_0.3.6 ggplot2_3.3.6 digest_0.6.29
#> [41] stringi_1.7.8 dplyr_1.0.10 grid_4.2.1 rprojroot_2.0.3
#> [45] cli_3.3.0 tools_4.2.1 magrittr_2.0.3 sass_0.4.2
#> [49] tibble_3.1.8 crayon_1.5.1 pkgconfig_2.0.3 Matrix_1.4-1
#> [53] rmarkdown_2.16 R6_2.5.1 compiler_4.2.1
```