Deciding whether one model is better than another is a difficult task, mainly because many aspects have to be
considered before making such a judgement. Overall performance, performance on a crucial subset, and the distribution of residuals are only
a few of them. This function lets the user create a report built from various sections. Each section says something different
about the relation between the champion and the challengers. The DALEXtra package provides three base sections, funnel_measure,
overall_comparison and training_test_comparison, but any object that has a generic plot function can be included in the report.
- list of sections to be attached to the report. These can be the sections available in DALEXtra (funnel_measure, training_test_comparison, overall_comparison) or any other explanation that works with the plot function. Please
provide names for non-standard sections; they will be used as section titles. Otherwise the class of the object will be used.
- dot_size argument passed to plot.funnel_measure if a funnel_measure section is present.
- path to the directory where the report should be created. By default it is the current working directory.
- name of the report. By default it is "Report".
- if TRUE and an overall_comparison section is present, a table of scores will be displayed.
- title of the report. By default it is "ChampionChallenger".
- author of the report. By default it is the current user name.
- other parameters passed to rmarkdown::render.
rmarkdown report
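The description above says that any object with a generic plot function can serve as a section, and that non-standard sections should be named. A minimal base R sketch of such an object is given below. The class name residual_summary, its fields, and the toy numbers are hypothetical and not part of DALEXtra; the final commented call only illustrates the assumed way of naming a list element and is not run here.

```r
# Hypothetical custom section: an S3 object that carries residuals and a label.
# The class name "residual_summary" and its fields are illustrative assumptions.
residual_summary <- function(y, y_hat, label) {
  structure(
    list(residuals = y - y_hat, label = label),
    class = "residual_summary"
  )
}

# A plot method for the class; this is what lets the generic plot()
# dispatch on objects of this class.
plot.residual_summary <- function(x, ...) {
  graphics::hist(x$residuals, main = x$label, xlab = "Residual", ...)
  invisible(x)
}

custom_section <- residual_summary(
  y = c(10, 12, 15, 11),
  y_hat = c(9, 13, 14, 11),
  label = "Toy model"
)

# Naming the list element would supply the section title (assumed usage, not run):
# champion_challenger(list("Residual summary" = custom_section),
#                     output_dir_path = tempdir())
```

Without a name, the text above suggests the class of the object (here residual_summary) would be used as the section title.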
# \donttest{
library("mlr")
#> Loading required package: ParamHelpers
#> Warning message: 'mlr' is in 'maintenance-only' mode since July 2019.
#> Future development will only happen in 'mlr3'
#> (<https://mlr3.mlr-org.com>). Due to the focus on 'mlr3' there might be
#> uncaught bugs meanwhile in {mlr} - please consider switching.
library("DALEXtra")
task <- mlr::makeRegrTask(
  id = "R",
  data = apartments,
  target = "m2.price"
)
learner_lm <- mlr::makeLearner("regr.lm")
model_lm <- mlr::train(learner_lm, task)
explainer_lm <- explain_mlr(model_lm, apartmentsTest, apartmentsTest$m2.price, label = "LM")
#> Preparation of a new explainer is initiated
#> -> model label : LM
#> -> data : 9000 rows 6 cols
#> -> target variable : 9000 values
#> -> predict function : yhat.WrappedModel will be used ( default )
#> -> predicted values : No value for predict function target column. ( default )
#> -> model_info : package mlr , ver. 2.19.0 , task regression ( default )
#> -> predicted values : numerical, min = 1792.597 , mean = 3506.836 , max = 6241.447
#> -> residual function : difference between y and yhat ( default )
#> -> residuals : numerical, min = -257.2555 , mean = 4.687686 , max = 472.356
#> A new explainer has been created!
learner_rf <- mlr::makeLearner("regr.ranger")
model_rf <- mlr::train(learner_rf, task)
explainer_rf <- explain_mlr(model_rf, apartmentsTest, apartmentsTest$m2.price, label = "RF")
#> Preparation of a new explainer is initiated
#> -> model label : RF
#> -> data : 9000 rows 6 cols
#> -> target variable : 9000 values
#> -> predict function : yhat.WrappedModel will be used ( default )
#> -> predicted values : No value for predict function target column. ( default )
#> -> model_info : package mlr , ver. 2.19.0 , task regression ( default )
#> -> predicted values : numerical, min = 1795.43 , mean = 3504.914 , max = 6237.863
#> -> residual function : difference between y and yhat ( default )
#> -> residuals : numerical, min = -573.2127 , mean = 6.609821 , max = 726.0512
#> A new explainer has been created!
learner_gbm <- mlr::makeLearner("regr.gbm")
model_gbm <- mlr::train(learner_gbm, task)
explainer_gbm <- explain_mlr(model_gbm, apartmentsTest, apartmentsTest$m2.price, label = "GBM")
#> Preparation of a new explainer is initiated
#> -> model label : GBM
#> -> data : 9000 rows 6 cols
#> -> target variable : 9000 values
#> -> predict function : yhat.WrappedModel will be used ( default )
#> -> predicted values : No value for predict function target column. ( default )
#> -> model_info : package mlr , ver. 2.19.0 , task regression ( default )
#> -> predicted values : numerical, min = 2118.951 , mean = 3502.078 , max = 6059.426
#> -> residual function : difference between y and yhat ( default )
#> -> residuals : numerical, min = -518.8356 , mean = 9.445597 , max = 735.4289
#> A new explainer has been created!
plot_data <- funnel_measure(
  explainer_lm,
  list(explainer_rf, explainer_gbm),
  nbins = 5,
  measure_function = DALEX::loss_root_mean_square
)
champion_challenger(list(plot_data), dot_size = 3, output_dir_path = tempdir())
# }
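The funnel_measure call above takes a loss function (DALEX::loss_root_mean_square) and a number of bins. As a rough illustration of why performance on a subset can tell a different story than overall performance, the base R sketch below computes the root mean square error of two sets of predictions overall and within bins of a variable. The simulated data, the helper rmse, and the quantile-based binning are assumptions made for this sketch; it is not the DALEXtra implementation of funnel_measure.

```r
# Root mean square error, written out explicitly (hypothetical helper)
rmse <- function(observed, predicted) sqrt(mean((observed - predicted)^2))

# Simulated data: y depends linearly on x plus noise
set.seed(1)
x <- runif(200, 0, 10)
y <- 2 * x + rnorm(200)

# Two made-up sets of predictions standing in for champion and challenger
pred_champion   <- 2 * x                         # same quality everywhere
pred_challenger <- 2 * x + ifelse(x > 8, 3, 0)   # shifted only for large x

# Overall comparison of the two sets of predictions
overall <- c(
  champion   = rmse(y, pred_champion),
  challenger = rmse(y, pred_challenger)
)

# Split x into 5 bins at its quantiles and compare within each bin
bins <- cut(
  x,
  breaks = quantile(x, probs = seq(0, 1, length.out = 6)),
  include.lowest = TRUE
)
per_bin <- sapply(levels(bins), function(b) {
  idx <- bins == b
  c(
    champion   = rmse(y[idx], pred_champion[idx]),
    challenger = rmse(y[idx], pred_challenger[idx])
  )
})

# Positive values mean the challenger has the larger error in that bin
bin_difference <- per_bin["challenger", ] - per_bin["champion", ]
print(overall)
print(bin_difference)
```

In this toy setup the two sets of predictions are identical for small x, so any gap in the overall score comes from the bins that cover large x; a per-bin view makes that visible where a single overall number would not.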