vignettes/vignette_yardstick.Rmd
yardstick is a package that offers many measures for evaluating model performance. It follows the tidymodels/tidyverse philosophy: performance is calculated by functions that operate on a data.frame with the model's results.
DALEX uses model performance measures to assess the importance of variables (in the model_parts function). These are typically calculated with loss functions (functions with the loss_ prefix) that work on two vectors: the score from the model and the true target variable. Although the two packages have slightly different philosophies of operation, you can use the measures available in yardstick when working with DALEX. Below we show how to do this with the loss_yardstick function.
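For intuition, a DALEX-style loss function is just a function of two vectors that returns a single number. The sketch below mirrors the shape of built-in losses such as loss_root_mean_square (the name loss_rmse_sketch is ours, for illustration only):

```r
# A DALEX-style loss function: two vectors in, one number out.
# Mirrors the shape of DALEX's built-in loss_* functions.
loss_rmse_sketch <- function(observed, predicted) {
  sqrt(mean((observed - predicted)^2))
}

loss_rmse_sketch(c(1, 2, 3), c(1, 2, 5))  # sqrt(4/3) ~ 1.1547
```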
The yardstick package supports both classification and regression models. We will start our example with a classification model for the titanic data: predicting the probability of surviving the disaster. The following instruction trains a classification model.
```r
library("DALEX")
library("yardstick")

titanic_glm <- glm(survived ~ ., data = titanic_imputed, family = "binomial")
```
The Class Probability Metrics in the yardstick package assume that the true value is a factor and that the model returns a numerical score. So let's prepare an explainer that has a factor as y and whose predict_function returns the probability of the target class (the default behaviour).

NOTE: performance measures will be calculated on the data supplied in the explainer. Put the test data here!
```r
explainer_glm <- DALEX::explain(titanic_glm,
                                data = titanic_imputed[, -8],
                                y = factor(titanic_imputed$survived))
```
To make functions from yardstick compatible with DALEX, we must use the loss_yardstick adapter. In the example below we use the roc_auc function (area under the receiver operating characteristic curve). The yardstick:: prefix is not necessary, but we include it to show explicitly where the functions come from.
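Under the hood, the adapter must bridge two calling conventions: yardstick metrics are called as metric(data, truth, estimate) and return a tibble with an .estimate column, while DALEX loss functions take two plain vectors. A hand-rolled sketch of that idea (not the actual loss_yardstick implementation; as_dalex_loss and mae_metric are our illustrative names):

```r
# Sketch of the adapter concept: wrap a data.frame-based metric into a
# DALEX-style two-vector loss. Not the real loss_yardstick code.
as_dalex_loss <- function(metric) {
  function(observed, predicted) {
    df <- data.frame(observed = observed, predicted = predicted)
    metric(df, observed, predicted)$.estimate
  }
}

# Dummy "metric" with a yardstick-like signature, for illustration only:
mae_metric <- function(data, truth, estimate) {
  data.frame(.estimate = mean(abs(data$observed - data$predicted)))
}

loss_mae <- as_dalex_loss(mae_metric)
loss_mae(c(1, 2, 3), c(2, 2, 5))  # mean(1, 0, 2) = 1
```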
NOTE: we set yardstick.event_first = FALSE because the model predicts the probability of survived = 1.
```r
options(yardstick.event_first = FALSE)

glm_auc <- model_parts(explainer_glm, type = "raw",
                       loss_function = loss_yardstick(yardstick::roc_auc))
glm_auc
#>       variable mean_dropout_loss label
#> 1 _full_model_         0.8176487    lm
#> 2       gender         0.6300156    lm
#> 3        class         0.7233351    lm
#> 4          age         0.7887847    lm
#> 5        sibsp         0.8087978    lm
#> 6     embarked         0.8104637    lm
#> 7         fare         0.8174282    lm
#> 8        parch         0.8177252    lm
#> 9   _baseline_         0.5004211    lm

plot(glm_auc)
```
In a similar way, we can use the pr_auc function (area under the precision-recall curve).
```r
glm_prauc <- model_parts(explainer_glm, type = "raw",
                         loss_function = loss_yardstick(yardstick::pr_auc))
glm_prauc
#>       variable mean_dropout_loss label
#> 1 _full_model_         0.7343162    lm
#> 2       gender         0.4267867    lm
#> 3        class         0.5912337    lm
#> 4          age         0.7119210    lm
#> 5        sibsp         0.7202304    lm
#> 6     embarked         0.7232801    lm
#> 7        parch         0.7340958    lm
#> 8         fare         0.7342991    lm
#> 9   _baseline_         0.3216100    lm

plot(glm_prauc)
```
The Classification Metrics in the yardstick package assume that the true value is a factor and that the model also returns a factor. This is different from most explanations in DALEX, which typically operate on class-membership probabilities. If we want to use Classification Metrics, we need to provide a predict function that returns classes instead of probabilities. So let's prepare an explainer that has a factor as y and a predict_function that returns classes.
```r
explainer_glm <- DALEX::explain(titanic_glm,
                                data = titanic_imputed[, -8],
                                y = factor(titanic_imputed$survived),
                                predict_function = function(m, x) {
                                  factor(as.numeric(predict(m, x, type = "response") > 0.5),
                                         levels = c("0", "1"))
                                })
```
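The thresholding step inside that predict function can be looked at in isolation: probabilities above 0.5 map to class "1", the rest to class "0", returned as a factor with fixed levels so both classes are always represented. (prob_to_class is our illustrative name, not a DALEX function.)

```r
# Convert a vector of probabilities to a two-level factor at a given
# threshold -- the same conversion the predict_function above performs.
prob_to_class <- function(p, threshold = 0.5) {
  factor(as.numeric(p > threshold), levels = c("0", "1"))
}

prob_to_class(c(0.1, 0.7, 0.5))  # exactly 0.5 is NOT above the threshold
```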
Again, let's use the loss_yardstick adapter. In the example below we use the accuracy function.
```r
glm_accuracy <- model_parts(explainer_glm, type = "raw",
                            loss_function = loss_yardstick(yardstick::accuracy))
glm_accuracy
#>       variable mean_dropout_loss label
#> 1 _full_model_            0.7995    lm
#> 2       gender            0.6425    lm
#> 3        class            0.7338    lm
#> 4          age            0.7826    lm
#> 5        sibsp            0.7932    lm
#> 6     embarked            0.7974    lm
#> 7        parch            0.7989    lm
#> 8         fare            0.8000    lm
#> 9   _baseline_            0.5867    lm

plot(glm_accuracy)
```
In a similar way, we can use the bal_accuracy function (balanced accuracy).
```r
glm_bal_accuracy <- model_parts(explainer_glm, type = "raw",
                                loss_function = loss_yardstick(yardstick::bal_accuracy))
glm_bal_accuracy
#>       variable mean_dropout_loss label
#> 1 _full_model_         0.7428542    lm
#> 2       gender         0.5688662    lm
#> 3        class         0.6798010    lm
#> 4          age         0.7276367    lm
#> 5        sibsp         0.7376837    lm
#> 6     embarked         0.7391225    lm
#> 7         fare         0.7420058    lm
#> 8        parch         0.7427821    lm
#> 9   _baseline_         0.5035898    lm

plot(glm_bal_accuracy)
```
For a loss function, the smaller the value, the better the model. Therefore, the importance of variables is often calculated as loss(perturbed) - loss(original). But many model performance measures have the opposite characteristic: the higher they are, the better (e.g. AUC, accuracy). To maintain a consistent analysis pipeline, it is convenient to invert such measures, e.g. by converting them to 1 - AUC or 1 - accuracy. To do this, just add the reverse = TRUE argument.
```r
glm_1accuracy <- model_parts(explainer_glm,
                             loss_function = loss_yardstick(accuracy, reverse = TRUE))
glm_1accuracy
#>       variable mean_dropout_loss label
#> 1 _full_model_            0.1965    lm
#> 2         fare            0.1954    lm
#> 3        parch            0.1969    lm
#> 4     embarked            0.1979    lm
#> 5        sibsp            0.2024    lm
#> 6          age            0.2143    lm
#> 7        class            0.2518    lm
#> 8       gender            0.3579    lm
#> 9   _baseline_            0.4081    lm

plot(glm_1accuracy)
```
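The reverse option can be pictured as wrapping the metric in 1 - value, turning a higher-is-better measure into a lower-is-better loss. A minimal base-R sketch of that idea (reverse_metric and accuracy_vec_sketch are our names, not part of DALEX or yardstick):

```r
# Turn a higher-is-better metric into a lower-is-better loss by
# taking 1 - value. A sketch, not the actual loss_yardstick code.
reverse_metric <- function(metric) {
  function(observed, predicted) 1 - metric(observed, predicted)
}

# Simple accuracy on two vectors, for illustration:
accuracy_vec_sketch <- function(observed, predicted) mean(observed == predicted)

loss_one_minus_acc <- reverse_metric(accuracy_vec_sketch)
loss_one_minus_acc(c(1, 0, 1, 1), c(1, 0, 0, 1))  # 1 - 0.75 = 0.25
```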
By default, performance is calculated on N = 1000 randomly selected observations (to speed up the calculations). Set N = NULL to use the whole dataset.
```r
glm_1accuracy <- model_parts(explainer_glm,
                             loss_function = loss_yardstick(accuracy, reverse = TRUE),
                             N = NULL)
plot(glm_1accuracy)
```
Now let's move to a regression model for the apartments data, trained with the ranger package. The Regression Metrics in the yardstick package assume that the true value is a numeric variable and that the model returns a numeric score.
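The explainer below references an apartments_ranger model whose training chunk is not shown above; judging by the explainer label and the attached ranger package, it was presumably created along these lines (a sketch under that assumption, not the original code):

```r
# Assumed training step for the model used below: a ranger random
# forest predicting m2.price from the DALEX apartments data.
library("DALEX")
library("ranger")

apartments_ranger <- ranger(m2.price ~ ., data = apartments)
```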
```r
explainer_ranger <- DALEX::explain(apartments_ranger, data = apartments[, -1],
                                   y = apartments$m2.price, label = "Ranger Apartments")
#> Preparation of a new explainer is initiated
#>   -> model label       :  Ranger Apartments
#>   -> data              :  1000  rows  5  cols
#>   -> target variable   :  1000  values
#>   -> predict function  :  yhat.ranger  will be used ( default )
#>   -> predicted values  :  No value for predict function target column. ( default )
#>   -> model_info        :  package ranger , ver. 0.14.1 , task regression ( default )
#>   -> predicted values  :  numerical, min = 1880.33 , mean = 3489.547 , max = 6152.823
#>   -> residual function :  difference between y and yhat ( default )
#>   -> residuals         :  numerical, min = -504.575 , mean = -2.527879 , max = 631.0517
#> A new explainer has been created!
```
To make functions from yardstick compatible with DALEX, we again use the loss_yardstick adapter. In the example below we use the rmse function (root mean squared error).
```r
ranger_rmse <- model_parts(explainer_ranger, type = "raw",
                           loss_function = loss_yardstick(rmse))
ranger_rmse
#>            variable mean_dropout_loss             label
#> 1      _full_model_          155.9598 Ranger Apartments
#> 2          no.rooms          329.3016 Ranger Apartments
#> 3 construction.year          386.5747 Ranger Apartments
#> 4             floor          447.6254 Ranger Apartments
#> 5           surface          526.6621 Ranger Apartments
#> 6          district          759.7785 Ranger Apartments
#> 7        _baseline_         1201.4958 Ranger Apartments

plot(ranger_rmse)
```
And one more example, for the rsq function (R squared).
```r
ranger_rsq <- model_parts(explainer_ranger, type = "raw",
                          loss_function = loss_yardstick(rsq))
ranger_rsq
#>            variable mean_dropout_loss             label
#> 1      _full_model_      0.9822143560 Ranger Apartments
#> 2          district      0.3283019847 Ranger Apartments
#> 3           surface      0.6832458690 Ranger Apartments
#> 4             floor      0.7655788254 Ranger Apartments
#> 5 construction.year      0.8268200544 Ranger Apartments
#> 6          no.rooms      0.9125441548 Ranger Apartments
#> 7        _baseline_      0.0007561657 Ranger Apartments

plot(ranger_rsq)
```
I hope that using the yardstick package with DALEX will now be easy and enjoyable. If you would like to share your experience with these packages, please create an issue at https://github.com/ModelOriented/DALEX/issues.
```r
#> R version 4.2.2 (2022-10-31)
#> Platform: x86_64-apple-darwin17.0 (64-bit)
#> Running under: macOS Big Sur ... 10.16
#>
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
#>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base
#>
#> other attached packages:
#> [1] ranger_0.14.1   yardstick_1.1.0 DALEX_2.5.0
#>
#> loaded via a namespace (and not attached):
#>  [1] tidyselect_1.2.0  xfun_0.36         bslib_0.4.2       purrr_1.0.1
#>  [5] lattice_0.20-45   colorspace_2.1-0  vctrs_0.5.2       generics_0.1.3
#>  [9] htmltools_0.5.4   yaml_2.3.7        utf8_1.2.2        rlang_1.0.6
#> [13] pkgdown_2.0.7     jquerylib_0.1.4   pillar_1.8.1      glue_1.6.2
#> [17] withr_2.5.0       lifecycle_1.0.3   stringr_1.5.0     munsell_0.5.0
#> [21] gtable_0.3.1      ragg_1.2.5        memoise_2.0.1     evaluate_0.20
#> [25] labeling_0.4.2    knitr_1.42        fastmap_1.1.0     fansi_1.0.4
#> [29] highr_0.10        Rcpp_1.0.10       scales_1.2.1      cachem_1.0.6
#> [33] desc_1.4.2        jsonlite_1.8.4    ingredients_2.3.0 farver_2.1.1
#> [37] systemfonts_1.0.4 fs_1.6.0          textshaping_0.3.6 ggplot2_3.4.0
#> [41] digest_0.6.31     stringi_1.7.12    dplyr_1.0.10      grid_4.2.2
#> [45] rprojroot_2.0.3   cli_3.6.0         tools_4.2.2       magrittr_2.0.3
#> [49] sass_0.4.5        tibble_3.1.8      crayon_1.5.2      pkgconfig_2.0.3
#> [53] Matrix_1.5-1      rmarkdown_2.20    R6_2.5.1          compiler_4.2.2
```