DALEX is designed to work with various black-box models such as tree ensembles, linear models, and neural networks. Unfortunately, R packages that create such models are very inconsistent: different tools use different interfaces to train, validate, and use models. One of the most popular of these tools is the mlr3 package. This page presents a dedicated explain function for it.

- model
object - a model to be explained

- data
data.frame or matrix - data which will be used to calculate the explanations. If not provided, then it will be extracted from the model. Data should be passed without a target column (this shall be provided as the `y` argument). NOTE: If the target variable is present in the `data`, some of the functionalities may not work properly.

- y
numeric vector with outputs/scores. If provided, then it shall have the same size as `data`

- weights
numeric vector with sampling weights. By default it's `NULL`. If provided, then it shall have the same length as `data`
- predict_function
function that takes two arguments: model and new data and returns a numeric vector with predictions. By default it is `yhat`.

- predict_function_target_column
character or numeric containing either the column name or the column number in the model prediction object of the class that should be considered as positive (i.e. the class that is associated with probability 1). If NULL, the second column of the output will be taken for binary classification. In a multiclass classification setting, this parameter switches the explainer to binary classification mode with one-vs-others probabilities.

- residual_function
function that takes four arguments: model, data, target vector y and predict function (optionally). It should return a numeric vector with model residuals for given data. If not provided, response residuals (\(y-\hat{y}\)) are calculated. By default it is `residual_function_default`.

- ...
other parameters

- label
character - the name of the model. By default it's extracted from the 'class' attribute of the model

- verbose
logical. If TRUE (default) then diagnostic messages will be printed

- precalculate
logical. If TRUE (default) then `predicted_values` and `residual` are calculated when the explainer is created. This will also happen if `verbose` is TRUE. Set both `verbose` and `precalculate` to FALSE to omit the calculations.

- colorize
logical. If TRUE (default) then `WARNINGS`, `ERRORS` and `NOTES` are colorized. Works only in the R console. By default it is `FALSE` while knitting and `TRUE` otherwise.

- model_info
a named list (`package`, `version`, `type`) containing information about the model. If `NULL`, `DALEX` will look for the information on its own.

- type
type of a model, either `classification` or `regression`. If not specified then `type` will be extracted from `model_info`.
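The `predict_function` contract described above (a function of the model and new data that returns a numeric vector) can be sketched as follows. This is a minimal illustration rather than the package's own implementation: `predict_survival` is a hypothetical helper name, and it assumes the learner was trained with `predict_type = "prob"` so that mlr3's `predict_newdata()` returns a `$prob` matrix with a column named `"1"`.

```r
library(DALEXtra)  # attaches DALEX, which provides titanic_imputed
library(mlr3)

# Work on a copy so the original data set stays untouched
titanic <- titanic_imputed
titanic$survived <- as.factor(titanic$survived)

task <- TaskClassif$new(id = "titanic", backend = titanic, target = "survived")
learner <- lrn("classif.rpart", predict_type = "prob")
learner$train(task)

# Hypothetical custom predict function (not part of DALEXtra):
# takes the model and new data, returns the probability of class "1"
predict_survival <- function(model, newdata) {
  as.numeric(model$predict_newdata(newdata)$prob[, "1"])
}

explainer <- explain_mlr3(
  learner,
  data = titanic[, colnames(titanic) != "survived"],  # no target column
  y = as.numeric(as.character(titanic$survived)),     # numeric 0/1 target
  predict_function = predict_survival,
  label = "rpart (mlr3)",
  verbose = FALSE
)
```

When the default `yhat` is sufficient, passing `predict_function_target_column = "1"` is likely the simpler way to choose the positive class, as the argument list describes.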

explainer object (`explain`) ready to work with DALEX

```r
library("DALEXtra")
library(mlr3)
#> Warning: Packages 'paradox' and 'ParamHelpers' are conflicting and should not be loaded in the same session
#> Warning: Packages 'mlr3' and 'mlr' are conflicting and should not be loaded in the same session
#>
#> Attaching package: ‘mlr3’
#> The following objects are masked from ‘package:mlr’:
#>
#> benchmark, resample
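# mlr3 classification tasks need a factor target
titanic_imputed$survived <- as.factor(titanic_imputed$survived)
task_classif <- TaskClassif$new(id = "1", backend = titanic_imputed, target = "survived")
# predict_type = "prob" makes the learner return class probabilities
learner_classif <- lrn("classif.rpart", predict_type = "prob")
learner_classif$train(task_classif)
# y is converted back to a numeric 0/1 vector for the explainer
explain_mlr3(learner_classif, data = titanic_imputed,
             y = as.numeric(as.character(titanic_imputed$survived)))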
#> Preparation of a new explainer is initiated
#> -> model label : R6 ( default )
#> -> data : 2207 rows 8 cols
#> -> target variable : 2207 values
#> -> predict function : yhat.LearnerClassif will be used ( default )
#> -> predicted values : No value for predict function target column. ( default )
#> -> model_info : package mlr3 , ver. 0.13.3 , task classification ( default )
#> -> predicted values : numerical, min = 0.05555556 , mean = 0.3221568 , max = 0.9267399
#> -> residual function : difference between y and yhat ( default )
#> -> residuals : numerical, min = -0.9267399 , mean = 1.858834e-17 , max = 0.9444444
#> A new explainer has been created!
#> Model label: R6
#> Model class: LearnerClassifRpart,LearnerClassif,Learner,R6
#> Data head :
#>   gender age class    embarked  fare sibsp parch survived
#> 1   male  42   3rd Southampton  7.11     0     0        0
#> 2   male  13   3rd Southampton 20.05     0     2        0
# Regression example on the apartments data
task_regr <- TaskRegr$new(id = "2", backend = apartments, target = "m2.price")
learner_regr <- lrn("regr.rpart")
learner_regr$train(task_regr)
explain_mlr3(learner_regr, data = apartments, y = apartments$m2.price)
#> Preparation of a new explainer is initiated
#> -> model label : R6 ( default )
#> -> data : 1000 rows 6 cols
#> -> target variable : 1000 values
#> -> predict function : yhat.LearnerRegr will be used ( default )
#> -> predicted values : No value for predict function target column. ( default )
#> -> model_info : package mlr3 , ver. 0.13.3 , task regression ( default )
#> -> predicted values : numerical, min = 2289.664 , mean = 3487.019 , max = 5737.175
#> -> residual function : difference between y and yhat ( default )
#> -> residuals : numerical, min = -1044.133 , mean = 4.321552e-14 , max = 1080.867
#> A new explainer has been created!
#> Model label: R6
#> Model class: LearnerRegrRpart,LearnerRegr,Learner,R6
#> Data head :
#>   m2.price construction.year surface floor no.rooms    district
#> 1     5897              1953      25     3        1 Srodmiescie
#> 2     1818              1992     143     9        5     Bielany
```
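Once created, the explainer can be handed to the usual DALEX functions. The sketch below is one possible follow-up, not a prescribed workflow: it stores the regression explainer in a variable (the name `explainer_regr` is an assumption made for illustration), passes `data` without the target column as the argument description recommends, and then calls `model_performance()` and `model_parts()` from DALEX.

```r
library(DALEXtra)  # attaches DALEX, which provides the apartments data
library(mlr3)

task_regr <- TaskRegr$new(id = "apartments", backend = apartments, target = "m2.price")
learner_regr <- lrn("regr.rpart")
learner_regr$train(task_regr)

# Keep the explainer in a variable so it can be reused
explainer_regr <- explain_mlr3(
  learner_regr,
  data = apartments[, colnames(apartments) != "m2.price"],  # no target column
  y = apartments$m2.price,
  label = "rpart (mlr3)",
  verbose = FALSE
)

# Residual-based performance measures for the regression model
perf <- model_performance(explainer_regr)
perf$measures$rmse

# Permutation-based variable importance
importance <- model_parts(explainer_regr)
head(importance)
```

Setting `label` explicitly avoids the default `R6` label seen in the output above, which makes plots that compare several models easier to read.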