Determining whether one model is better than another is a difficult task, mostly because many aspects have to be
covered to make such a judgement. Overall performance, performance on a crucial subset, and the distribution of residuals are only
a few of the ideas related to that issue. This function allows the user to create a report composed of various sections. Each says something different
about the relation between the champion and the challengers. The DALEXtra package provides 3 base sections, funnel_measure,
overall_comparison and training_test_comparison, but any object that has a plot method can
be included in the report.
- List of sections to be attached to the report. These can be the sections available in DALEXtra, which are funnel_measure,
training_test_comparison and overall_comparison, or any other explanation that works with the plot function. Please
provide names for non-standard sections; they will be used as section titles. Otherwise the class of the object will be used.
- dot_size argument passed to plot.funnel_measure if a funnel_measure section is present.
- Path to the directory where the report should be created. By default it is the current working directory.
- Name of the report. By default it is "Report".
- If TRUE and an overall_comparison section is present, a table of scores will be displayed.
- Title of the report. By default it is "ChampionChallenger".
- Author of the report. By default it is the current user name.
- Other parameters passed to rmarkdown::render.
An rmarkdown report.
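The naming rule for sections (a supplied name becomes the section title, otherwise the class of the object is used) can be sketched in base R. This is only an illustration of the rule as described here, not DALEXtra's actual implementation: the helper `section_titles` and the class `my_explanation` are hypothetical names, and the stand-in objects are empty lists carrying a class attribute.

```r
# Hypothetical helper, not part of DALEXtra: derive a title for each section
# from its name, falling back to the first class of the object.
section_titles <- function(sections) {
  titles <- names(sections)
  if (is.null(titles)) {
    # A completely unnamed list has no names attribute at all.
    titles <- rep("", length(sections))
  }
  titles[is.na(titles)] <- ""
  missing_title <- titles == ""
  titles[missing_title] <- vapply(
    sections[missing_title],
    function(section) class(section)[1],
    character(1),
    USE.NAMES = FALSE
  )
  titles
}

# Stand-in objects; real sections would come from funnel_measure() and similar.
funnel <- structure(list(), class = "funnel_measure")
custom <- structure(list(), class = "my_explanation")

# The unnamed standard section falls back to its class,
# the named custom section keeps the supplied name.
section_titles(list(funnel, "Residual diagnostics" = custom))
```

Under this reading, a call such as `champion_challenger(list(plot_data, "Residual diagnostics" = my_plot_object))` would title the second section "Residual diagnostics", where `my_plot_object` stands for any object with a plot method.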
# \donttest{
library("mlr")
#> Loading required package: ParamHelpers
#> Warning message: 'mlr' is in 'maintenance-only' mode since July 2019.
#> Future development will only happen in 'mlr3'
#> (<https://mlr3.mlr-org.com>). Due to the focus on 'mlr3' there might be
#> uncaught bugs meanwhile in {mlr} - please consider switching.
library("DALEXtra")
task <- mlr::makeRegrTask(
id = "R",
data = apartments,
target = "m2.price"
)
learner_lm <- mlr::makeLearner(
"regr.lm"
)
model_lm <- mlr::train(learner_lm, task)
explainer_lm <- explain_mlr(model_lm, apartmentsTest, apartmentsTest$m2.price, label = "LM")
#> Preparation of a new explainer is initiated
#> -> model label : LM
#> -> data : 9000 rows 6 cols
#> -> target variable : 9000 values
#> -> predict function : yhat.WrappedModel will be used ( default )
#> -> predicted values : No value for predict function target column. ( default )
#> -> model_info : package mlr , ver. 2.19.0 , task regression ( default )
#> -> predicted values : numerical, min = 1792.597 , mean = 3506.836 , max = 6241.447
#> -> residual function : difference between y and yhat ( default )
#> -> residuals : numerical, min = -257.2555 , mean = 4.687686 , max = 472.356
#> A new explainer has been created!
learner_rf <- mlr::makeLearner(
"regr.ranger"
)
model_rf <- mlr::train(learner_rf, task)
explainer_rf <- explain_mlr(model_rf, apartmentsTest, apartmentsTest$m2.price, label = "RF")
#> Preparation of a new explainer is initiated
#> -> model label : RF
#> -> data : 9000 rows 6 cols
#> -> target variable : 9000 values
#> -> predict function : yhat.WrappedModel will be used ( default )
#> -> predicted values : No value for predict function target column. ( default )
#> -> model_info : package mlr , ver. 2.19.0 , task regression ( default )
#> -> predicted values : numerical, min = 1795.43 , mean = 3504.914 , max = 6237.863
#> -> residual function : difference between y and yhat ( default )
#> -> residuals : numerical, min = -573.2127 , mean = 6.609821 , max = 726.0512
#> A new explainer has been created!
learner_gbm <- mlr::makeLearner(
"regr.gbm"
)
model_gbm <- mlr::train(learner_gbm, task)
explainer_gbm <- explain_mlr(model_gbm, apartmentsTest, apartmentsTest$m2.price, label = "GBM")
#> Preparation of a new explainer is initiated
#> -> model label : GBM
#> -> data : 9000 rows 6 cols
#> -> target variable : 9000 values
#> -> predict function : yhat.WrappedModel will be used ( default )
#> -> predicted values : No value for predict function target column. ( default )
#> -> model_info : package mlr , ver. 2.19.0 , task regression ( default )
#> -> predicted values : numerical, min = 2118.951 , mean = 3502.078 , max = 6059.426
#> -> residual function : difference between y and yhat ( default )
#> -> residuals : numerical, min = -518.8356 , mean = 9.445597 , max = 735.4289
#> A new explainer has been created!
plot_data <- funnel_measure(explainer_lm, list(explainer_rf, explainer_gbm),
nbins = 5, measure_function = DALEX::loss_root_mean_square)
champion_challenger(list(plot_data), dot_size = 3, output_dir_path = tempdir())
# }