The function model_performance() calculates performance measures for classification and regression models. For classification models, the following measures are calculated: F1, accuracy, recall, precision, and AUC. For regression models, the following measures are calculated: mean squared error (MSE), root mean squared error (RMSE), R squared, and median absolute deviation (MAD).

model_performance(explainer, ..., cutoff = 0.5)

Arguments

explainer

a model to be explained, preprocessed by the explain function

...

other parameters

cutoff

a cutoff for classification models, needed for measures such as recall, precision, accuracy, and F1. Defaults to 0.5.
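As a rough illustration of why the cutoff matters, the sketch below thresholds predicted probabilities at a cutoff and derives recall, precision, F1, and accuracy from the resulting confusion counts. It uses base R only. The helper name classification_measures and the toy vectors are hypothetical, and treating a probability at or above the cutoff as the positive class is an assumption, not necessarily how model_performance() handles ties.

```r
# Hypothetical helper: not part of DALEX, shown only to illustrate the cutoff.
classification_measures <- function(observed, predicted, cutoff = 0.5) {
  # Assumption: probabilities at or above the cutoff are labelled positive (1).
  predicted_class <- as.integer(predicted >= cutoff)

  # Confusion-matrix counts.
  tp <- sum(predicted_class == 1 & observed == 1)
  fp <- sum(predicted_class == 1 & observed == 0)
  tn <- sum(predicted_class == 0 & observed == 0)
  fn <- sum(predicted_class == 0 & observed == 1)

  recall    <- tp / (tp + fn)
  precision <- tp / (tp + fp)

  list(
    recall    = recall,
    precision = precision,
    f1        = 2 * precision * recall / (precision + recall),
    accuracy  = (tp + tn) / length(observed)
  )
}

# Toy data, made up for illustration.
observed  <- c(1, 1, 1, 0, 0, 0, 1, 0)
predicted <- c(0.9, 0.6, 0.4, 0.3, 0.7, 0.1, 0.8, 0.45)

default_cutoff <- classification_measures(observed, predicted, cutoff = 0.5)
lower_cutoff   <- classification_measures(observed, predicted, cutoff = 0.35)
```

On this toy data, lowering the cutoff from 0.5 to 0.35 raises recall from 0.75 to 1 while precision drops from 0.75 to about 0.67; accuracy stays at 0.75.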

Value

An object of the class model_performance.

It is a list with the following fields:

  • residuals - a data frame that contains the residuals for each observation

  • measures - a list of calculated measures appropriate for the task: regression, binary classification, or multiclass classification

  • type - a character string that specifies the type of the task
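To make the measures field more concrete for a regression task, here is a minimal base R sketch that computes the four values printed in the regression example (mse, rmse, r2, mad) from observed and predicted values. The helper name regression_measures and the toy vectors are hypothetical. The formulas are the usual textbook definitions, and reading mad as the median of the absolute residuals is an assumption rather than a statement about the package's implementation.

```r
# Hypothetical helper: not part of DALEX, shown only to illustrate the measures.
regression_measures <- function(observed, predicted) {
  # Residuals as the difference between y and yhat.
  res <- observed - predicted
  mse <- mean(res^2)

  list(
    mse  = mse,
    rmse = sqrt(mse),
    # R squared: 1 minus residual sum of squares over total sum of squares.
    r2   = 1 - sum(res^2) / sum((observed - mean(observed))^2),
    # Assumption: mad is the median of the absolute residuals.
    mad  = median(abs(res))
  )
}

# Toy data, made up for illustration.
observed  <- c(10, 12, 14, 16, 18)
predicted <- c(11, 12, 13, 17, 18)

toy_measures <- regression_measures(observed, predicted)
```

For these toy vectors the residuals are -1, 0, 1, -1, 0, which gives mse = 0.6, r2 = 0.925, and mad = 1.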

References

Explanatory Model Analysis. Explore, Explain, and Examine Predictive Models. https://ema.drwhy.ai/

Examples

# \donttest{
# regression
library("DALEX")   # provides explain(), model_performance() and the example datasets
library("ranger")
apartments_ranger_model <- ranger(m2.price ~ ., data = apartments, num.trees = 50)
explainer_ranger_apartments <- explain(apartments_ranger_model,
                                       data = apartments[, -1],
                                       y = apartments$m2.price,
                                       label = "Ranger Apartments")
#> Preparation of a new explainer is initiated
#>   -> model label       : Ranger Apartments
#>   -> data              : 1000 rows 5 cols
#>   -> target variable   : 1000 values
#>   -> predict function  : yhat.ranger will be used ( default )
#>   -> predicted values  : No value for predict function target column. ( default )
#>   -> model_info        : package ranger , ver. 0.13.1 , task regression ( default )
#>   -> predicted values  : numerical, min = 1858.966 , mean = 3488.445 , max = 6107.714
#>   -> residual function : difference between y and yhat ( default )
#>   -> residuals         : numerical, min = -436.5377 , mean = -1.426037 , max = 743.8861
#> A new explainer has been created!
model_performance_ranger_aps <- model_performance(explainer_ranger_apartments)
model_performance_ranger_aps
#> Measures for: regression
#> mse        : 23783.55
#> rmse       : 154.2191
#> r2         : 0.9710404
#> mad        : 86.32667
#>
#> Residuals:
#>          0%         10%         20%         30%         40%         50%
#> -436.537667 -167.145158 -114.019537  -78.532267  -53.677333  -25.737000
#>         60%         70%         80%         90%        100%
#>    3.202057   39.277400  106.426046  198.733311  743.886111
plot(model_performance_ranger_aps)
plot(model_performance_ranger_aps, geom = "boxplot")
plot(model_performance_ranger_aps, geom = "histogram")
# binary classification
titanic_glm_model <- glm(survived ~ ., data = titanic_imputed, family = "binomial")
explainer_glm_titanic <- explain(titanic_glm_model,
                                 data = titanic_imputed[, -8],
                                 y = titanic_imputed$survived)
#> Preparation of a new explainer is initiated
#>   -> model label       : lm ( default )
#>   -> data              : 2207 rows 7 cols
#>   -> target variable   : 2207 values
#>   -> predict function  : yhat.glm will be used ( default )
#>   -> predicted values  : No value for predict function target column. ( default )
#>   -> model_info        : package stats , ver. 4.1.1 , task classification ( default )
#>   -> predicted values  : numerical, min = 0.008128381 , mean = 0.3221568 , max = 0.9731431
#>   -> residual function : difference between y and yhat ( default )
#>   -> residuals         : numerical, min = -0.9628583 , mean = -2.569729e-10 , max = 0.9663346
#> A new explainer has been created!
model_performance_glm_titanic <- model_performance(explainer_glm_titanic)
model_performance_glm_titanic
#> Measures for: classification
#> recall     : 0.5738397
#> precision  : 0.7472527
#> f1         : 0.6491647
#> accuracy   : 0.8001812
#> auc        : 0.8115462
#>
#> Residuals:
#>          0%         10%         20%         30%         40%         50%
#> -0.96285832 -0.32240247 -0.23986439 -0.19544185 -0.14842925 -0.11460334
#>         60%         70%         80%         90%        100%
#> -0.06940964  0.06185475  0.29607060  0.72120412  0.96633458
plot(model_performance_glm_titanic)
plot(model_performance_glm_titanic, geom = "boxplot")
plot(model_performance_glm_titanic, geom = "histogram")
# multiclass classification
HR_ranger_model <- ranger(status ~ ., data = HR, num.trees = 50, probability = TRUE)
explainer_ranger_HR <- explain(HR_ranger_model,
                               data = HR[, -6],
                               y = HR$status,
                               label = "Ranger HR")
#> Preparation of a new explainer is initiated
#>   -> model label       : Ranger HR
#>   -> data              : 7847 rows 5 cols
#>   -> target variable   : 7847 values
#>   -> predict function  : yhat.ranger will be used ( default )
#>   -> predicted values  : No value for predict function target column. ( default )
#>   -> model_info        : package ranger , ver. 0.13.1 , task multiclass ( default )
#>   -> predicted values  : predict function returns multiple columns: 3 ( default )
#>   -> residual function : difference between 1 and probability of true class ( default )
#>   -> residuals         : numerical, min = 0 , mean = 0.2769613 , max = 0.9160218
#> A new explainer has been created!
model_performance_ranger_HR <- model_performance(explainer_ranger_HR)
model_performance_ranger_HR
#> Measures for: multiclass
#> micro_F1   : 0.870014
#> macro_F1   : 0.8678788
#> w_macro_F1 : 0.8687785
#> accuracy   : 0.870014
#> w_macro_auc: 0.9771964
#>
#> Residuals:
#>         0%        10%        20%        30%        40%        50%        60%
#> 0.00000000 0.02555376 0.05683133 0.11707943 0.17633126 0.24091292 0.31364170
#>        70%        80%        90%       100%
#> 0.39212599 0.48521070 0.59293321 0.91602179
plot(model_performance_ranger_HR)
plot(model_performance_ranger_HR, geom = "boxplot")
plot(model_performance_ranger_HR, geom = "histogram")
# }