`R/explain_tidymodels.R`

`explain_tidymodels.Rd`

DALEX is designed to work with various black-box models such as tree ensembles, linear models, and neural networks. Unfortunately, the R packages that create such models are very inconsistent: different tools use different interfaces to train, validate, and use models. One of the most popular of these tools is the tidymodels package. We would like to present a dedicated explain function for it.

    explain_tidymodels(
      model,
      data = NULL,
      y = NULL,
      weights = NULL,
      predict_function = NULL,
      residual_function = NULL,
      ...,
      label = NULL,
      verbose = TRUE,
      precalculate = TRUE,
      colorize = TRUE,
      model_info = NULL,
      type = NULL
    )

Argument | Description |
---|---|
model | object - a fitted workflow created with |
data | data.frame or matrix - data that was used for fitting. Data should be passed without target column (this shall be provided as the |
y | numeric vector with outputs / scores. If provided then it shall have the same size as |
weights | numeric vector with sampling weights. By default it's |
predict_function | function that takes two arguments, model and new data, and returns a numeric vector with predictions |
residual_function | function that takes three arguments: model, data and response vector y. It should return a numeric vector with model residuals for the given data. If not provided, response residuals (\(y-\hat{y}\)) are calculated. |
... | other parameters |
label | character - the name of the model. By default it's extracted from the 'class' attribute of the model |
verbose | if TRUE (default), diagnostic messages will be printed. |
precalculate | if TRUE (default), 'predicted_values' and 'residuals' are calculated when the explainer is created. |
colorize | if TRUE (default) then |
model_info | a named list ( |
type | type of a model, either |

explainer object (`explain`) ready to work with DALEX

    library("DALEX")
    library("tidymodels")
    #> ✔ broom     0.7.0      ✔ recipes   0.1.13
    #> ✔ dials     0.0.8      ✔ rsample   0.0.7
    #> ✔ dplyr     1.0.2      ✔ tibble    3.0.3
    #> ✔ infer     0.5.3      ✔ tidyr     1.1.2
    #> ✔ modeldata 0.0.2      ✔ tune      0.1.1
    #> ✔ parsnip   0.1.3      ✔ workflows 0.1.3
    #> ✔ purrr     0.3.4      ✔ yardstick 0.0.7
    #> Conflicts ───────────────────────────────────────── tidymodels_conflicts() ──
    #> ✖ purrr::discard() masks scales::discard()
    #> ✖ dplyr::explain() masks DALEX::explain()
    #> ✖ dplyr::filter()  masks stats::filter()
    #> ✖ dplyr::lag()     masks stats::lag()
    #> ✖ recipes::step()  masks stats::step()
    library("recipes")

    # The target must be a factor for a classification model
    data <- titanic_imputed
    data$survived <- as.factor(data$survived)

    # Preprocessing recipe: normalize the fare column
    rec <- recipe(survived ~ ., data = data) %>%
      step_normalize(fare)

    # Model specification: decision tree fitted with rpart
    model <- decision_tree(tree_depth = 25) %>%
      set_engine("rpart") %>%
      set_mode("classification")

    # Combine the recipe and the model into a workflow and fit it
    wflow <- workflow() %>%
      add_recipe(rec) %>%
      add_model(model)

    model_fitted <- wflow %>%
      fit(data = data)

    explain_tidymodels(model_fitted, data = titanic_imputed, y = titanic_imputed$survived)
    #> Preparation of a new explainer is initiated
    #>   -> model label       :  workflow  (  default  )
    #>   -> data              :  2207  rows  8  cols
    #>   -> target variable   :  2207  values
    #>   -> predict function  :  yhat.workflow  will be used (  default  )
    #>   -> predicted values  :  numerical, min =  0.05555556 , mean =  0.3221568 , max =  0.9267399
    #>   -> model_info        :  package tidymodels , ver. 0.1.1 , task classification (  default  )
    #>   -> residual function :  difference between y and yhat (  default  )
    #>   -> residuals         :  numerical, min =  -0.9267399 , mean =  1.858834e-17 , max =  0.9444444
    #>   A new explainer has been created!
    #> Model label:  workflow
    #> Model class:  workflow
    #> Data head  :
    #>   gender age class    embarked  fare sibsp parch survived
    #> 1   male  42   3rd Southampton  7.11     0     0        0
    #> 2   male  13   3rd Southampton 20.05     0     2        0
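The example above relies on the default `yhat.workflow` predict function. As a minimal sketch of the `predict_function` argument, the code below passes an explicit function instead and then hands the resulting explainer to `model_performance()` from DALEX. The helper name `predict_survival` and the label `"decision tree"` are hypothetical. The sketch assumes that `predict()` on a fitted classification workflow with `type = "prob"` returns a column named `.pred_1` for the factor level `"1"`, and it drops the target column from `data`, as the `data` argument description asks.

```r
library("DALEX")
library("tidymodels")

# Same preparation as in the example above: the target must be a factor
data <- titanic_imputed
data$survived <- as.factor(data$survived)

# Recipe, model specification and workflow, fitted on the full data
rec <- recipe(survived ~ ., data = data) %>%
  step_normalize(fare)

spec <- decision_tree(tree_depth = 25) %>%
  set_engine("rpart") %>%
  set_mode("classification")

model_fitted <- workflow() %>%
  add_recipe(rec) %>%
  add_model(spec) %>%
  fit(data = data)

# Hypothetical custom predict function: takes the model and new data and
# returns a numeric vector with the predicted probability of class "1".
# Assumption: the probability column is named `.pred_1`.
predict_survival <- function(model, newdata) {
  predict(model, new_data = newdata, type = "prob")$.pred_1
}

# Pass the predictors only; the target goes in through `y` as a 0/1 vector
predictors <- titanic_imputed[, colnames(titanic_imputed) != "survived"]

explainer <- explain_tidymodels(
  model_fitted,
  data = predictors,
  y = titanic_imputed$survived,
  predict_function = predict_survival,
  label = "decision tree",
  verbose = FALSE
)

# The explainer can now be used with other DALEX functions
perf <- model_performance(explainer)
perf
```

Selecting a different `.pred_*` column in `predict_survival` would make the explainer describe the probability of another class, which is the main reason to override the default here.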