DALEX is designed to work with various black-box models, such as tree ensembles, linear models, and neural networks. Unfortunately, the R packages that create such models are very inconsistent: different tools use different interfaces to train, validate, and use models. One of the most popular of these tools is the mlr3 package. This page presents a dedicated explain function for it.

explain_mlr3(
  model,
  data = NULL,
  y = NULL,
  weights = NULL,
  predict_function = NULL,
  residual_function = NULL,
  ...,
  label = NULL,
  verbose = TRUE,
  precalculate = TRUE,
  colorize = TRUE,
  model_info = NULL,
  type = NULL
)

Arguments

model

object - a fitted learner created with mlr3.

data

data.frame or matrix - data that was used for fitting. If not provided, it will be extracted from the model. Data should be passed without the target column (which should be provided as the y argument). NOTE: If the target variable is present in the data, some functionalities may not work properly.

y

numeric vector with outputs / scores. If provided, it should have the same size as data.

weights

numeric vector with sampling weights. By default it's NULL. If provided, it should have the same length as data.

predict_function

function that takes two arguments, model and new data, and returns a numeric vector with predictions.

residual_function

function that takes three arguments: model, data and response vector y. It should return a numeric vector with model residuals for given data. If not provided, response residuals (\(y-\hat{y}\)) are calculated.

...

other parameters

label

character - the name of the model. By default it's extracted from the 'class' attribute of the model.

verbose

if TRUE (default) then diagnostic messages will be printed.

precalculate

if TRUE (default) then 'predicted_values' and 'residuals' are calculated when explainer is created.

colorize

if TRUE (default) then WARNINGS, ERRORS and NOTES are colorized. Will work only in the R console.

model_info

a named list (package, version, type) containing information about the model. If NULL, DALEX will look for this information on its own.

type

type of a model, either classification or regression. If not specified then type will be extracted from model_info.
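
The predict_function and residual_function contracts described above can be illustrated with a minimal sketch. It assumes the mlr3 and rpart packages are installed and uses the built-in mtcars data purely as a stand-in; the names prob_of_positive and abs_residuals are hypothetical helpers, not part of any package. The trailing `...` in abs_residuals is an assumption that lets the function tolerate extra arguments a caller might pass.

```r
# Sketch only: custom prediction and residual functions for an mlr3 learner.
# Assumes mlr3 and rpart are installed; mtcars is a stand-in dataset.
library(mlr3)

# A classification task needs a factor target.
df <- mtcars
df$am <- factor(df$am)

task <- TaskClassif$new(id = "mtcars_am", backend = df, target = "am")
learner <- lrn("classif.rpart", predict_type = "prob")
learner$train(task)

# predict_function contract: (model, new data) -> numeric vector.
# Here: predicted probability of the class labelled "1".
prob_of_positive <- function(model, newdata) {
  as.numeric(model$predict_newdata(newdata)$prob[, "1"])
}

# residual_function contract: (model, data, y) -> numeric vector.
# Here: absolute residuals instead of the default y - y_hat.
# `...` (hypothetical safeguard) absorbs any additional arguments.
abs_residuals <- function(model, data, y, ...) {
  abs(y - prob_of_positive(model, data))
}

# Feature columns only (no target), plus a numeric 0/1 response.
X <- task$data(cols = task$feature_names)
y <- as.numeric(as.character(df$am))

p <- prob_of_positive(learner, X)
r <- abs_residuals(learner, X, y)
```

Functions like these would then be supplied through the predict_function and residual_function arguments of explain_mlr3, together with data = X and y = y.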

Value

explainer object (explain) ready to work with DALEX

Examples

library("DALEXtra")
library("mlr3")
#> 
#> Attaching package: ‘mlr3’
#> The following objects are masked from ‘package:mlr’:
#> 
#>     benchmark, resample
# Classification: the target must be a factor for TaskClassif
titanic_imputed$survived <- as.factor(titanic_imputed$survived)
task_classif <- TaskClassif$new(id = "1", backend = titanic_imputed, target = "survived")
learner_classif <- lrn("classif.rpart", predict_type = "prob")
learner_classif$train(task_classif)
explain_mlr3(learner_classif,
             data = titanic_imputed,
             y = as.numeric(as.character(titanic_imputed$survived)))
#> Preparation of a new explainer is initiated
#>   -> model label       :  R6 ( default )
#>   -> data              :  2207 rows 8 cols
#>   -> target variable   :  2207 values
#>   -> predict function  :  yhat.LearnerClassif will be used ( default )
#>   -> predicted values  :  numerical, min = 0.05555556 , mean = 0.3221568 , max = 0.9267399
#>   -> model_info        :  package mlr3 , ver. 0.5.0 , task classification ( default )
#>   -> residual function :  difference between y and yhat ( default )
#>   -> residuals         :  numerical, min = -0.9267399 , mean = 1.858834e-17 , max = 0.9444444
#>   A new explainer has been created!
#> Model label:  R6
#> Model class:  LearnerClassifRpart,LearnerClassif,Learner,R6
#> Data head  :
#>   gender age class    embarked  fare sibsp parch survived
#> 1   male  42   3rd Southampton  7.11     0     0        0
#> 2   male  13   3rd Southampton 20.05     0     2        0
# Regression
task_regr <- TaskRegr$new(id = "2", backend = apartments, target = "m2.price")
learner_regr <- lrn("regr.rpart")
learner_regr$train(task_regr)
explain_mlr3(learner_regr, data = apartments, apartments$m2.price)
#> Preparation of a new explainer is initiated
#>   -> model label       :  R6 ( default )
#>   -> data              :  1000 rows 6 cols
#>   -> target variable   :  1000 values
#>   -> predict function  :  yhat.LearnerRegr will be used ( default )
#>   -> predicted values  :  numerical, min = 2289.664 , mean = 3487.019 , max = 5737.175
#>   -> model_info        :  package mlr3 , ver. 0.5.0 , task regression ( default )
#>   -> residual function :  difference between y and yhat ( default )
#>   -> residuals         :  numerical, min = -1044.133 , mean = 4.321552e-14 , max = 1080.867
#>   A new explainer has been created!
#> Model label:  R6
#> Model class:  LearnerRegrRpart,LearnerRegr,Learner,R6
#> Data head  :
#>   m2.price construction.year surface floor no.rooms    district
#> 1     5897              1953      25     3        1 Srodmiescie
#> 2     1818              1992     143     9        5     Bielany
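
The description of the data argument recommends passing data without the target column, whereas the calls above pass the full data frame (the target appears in the printed data head). A minimal base-R sketch of separating predictors from the target is given here. The data frame toy and its values are made up for illustration and stand in for a dataset such as apartments.

```r
# Hypothetical stand-in data frame; all values are arbitrary.
toy <- data.frame(
  price   = c(100, 200, 150, 175),
  surface = c(25, 143, 56, 93),
  rooms   = c(1, 5, 2, 3)
)
target <- "price"

# Predictors only: drop the target column, keep a data.frame.
X <- toy[, setdiff(names(toy), target), drop = FALSE]

# Response as a plain numeric vector, one value per row of X.
y <- toy[[target]]
```

Under this approach, X would be passed as the data argument and y as the y argument.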