Currently three tests are performed:

- for outliers in residuals
- for autocorrelation in the target variable or in residuals
- for trend in residuals as a function of the target variable (detection of bias)
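To make the three checks more concrete, the sketch below computes rough base-R analogues on simulated data. It is an illustration only and is not the implementation used by `check_residuals()`. The simulated data, the cut-off of 2 for standardised residuals, the lag-1 correlation, and the default `loess()` smoother are all assumptions chosen for the example.

```r
# Illustrative only: base-R stand-ins for the three residual checks.
# NOT the package implementation; data and thresholds are assumptions.
set.seed(1)
n <- 100
x <- runif(n)
y <- 2 * x + rnorm(n, sd = 0.2)   # simulated target variable
fit <- lm(y ~ x)
res <- residuals(fit)

# 1. Outliers: standardise residuals and count those beyond an assumed cut-off of 2
std_res <- (res - mean(res)) / sd(res)
n_outliers <- sum(abs(std_res) > 2)

# 2. Autocorrelation: lag-1 correlation of residuals taken in row order
lag1_autocor <- cor(res[-1], res[-n])

# 3. Trend: smooth residuals against the target variable with loess;
#    a smooth that strays far from zero hints at bias
trend_fit <- loess(res ~ y)
trend_range <- range(fitted(trend_fit))

print(n_outliers)
print(lag1_autocor)
print(trend_range)
```

The printed values depend on the simulated data, so they are not comparable with the statistics reported by `check_residuals()`.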
check_residuals(object, ...)
Argument | Description |
---|---|
object | An object of class 'explainer' created with function |
... | other parameters that will be passed to further functions. |
A list with statistics for the individual checks.
    dragons <- DALEX::dragons[1:100, ]
    lm_model <- lm(life_length ~ ., data = dragons)
    lm_audit <- audit(lm_model, data = dragons, y = dragons$life_length)
    #> Preparation of a new explainer is initiated
    #> -> model label : lm ( default )
    #> -> data : 100 rows 8 cols
    #> -> target variable : 100 values
    #> -> predict function : yhat.lm will be used ( default )
    #> -> predicted values : No value for predict function target column. ( default )
    #> -> model_info : package stats , ver. 4.1.1 , task regression ( default )
    #> -> predicted values : numerical, min = 585.8311 , mean = 1347.787 , max = 2942.307
    #> -> residual function : difference between y and yhat ( default )
    #> -> residuals : numerical, min = -88.41755 , mean = -1.489291e-13 , max = 77.92805
    #> A new explainer has been created!
    check_residuals(lm_audit)
    #> -----------------------------------------------
    #> Checks for autocorrelation
    #> -----------------------------------------------
    #> Model name: lm
    #> Autocorrelation in target: -0.08
    #> Autocorrelation in residuals: +0.06
    #> -----------------------------------------------
    #> Checks for outliers
    #> -----------------------------------------------
    #> Model name: lm
    #> Shift > 1: 0 ( 0 %)
    #> Shift > 2: 0 ( 0 %)
    #> Top lowest standardised residuals:
    #> -2.2738 (48), -2.2542 (90), -2.0906 (63), -2.0201 (45), -1.8596 (67)
    #> Top highest standardised residuals:
    #> 2.004 (96), 1.9057 (2), 1.6531 (11), 1.4571 (3), 1.4286 (42)
    #> -----------------------------------------------
    #> Checks for trend in residuals
    #> -----------------------------------------------
    #> Model name: lm
    #> Standardised loess fit: +0.54
    # \dontrun{
    library("randomForest")
    rf_model <- randomForest(life_length ~ ., data = dragons)
    rf_audit <- audit(rf_model, data = dragons, y = dragons$life_length)
    #> Preparation of a new explainer is initiated
    #> -> model label : randomForest ( default )
    #> -> data : 100 rows 8 cols
    #> -> target variable : 100 values
    #> -> predict function : yhat.randomForest will be used ( default )
    #> -> predicted values : No value for predict function target column. ( default )
    #> -> model_info : package randomForest , ver. 4.6.14 , task regression ( default )
    #> -> predicted values : numerical, min = 759.8688 , mean = 1343.915 , max = 2480.71
    #> -> residual function : difference between y and yhat ( default )
    #> -> residuals : numerical, min = -176.2797 , mean = 3.871692 , max = 417.7102
    #> A new explainer has been created!
    check_residuals(rf_audit)
    #> -----------------------------------------------
    #> Checks for autocorrelation
    #> -----------------------------------------------
    #> Model name: randomForest
    #> Autocorrelation in target: -0.08
    #> Autocorrelation in residuals: -0.06
    #> -----------------------------------------------
    #> Checks for outliers
    #> -----------------------------------------------
    #> Model name: randomForest
    #> Shift > 1: 2 ( 2 %)
    #> Shift > 2: 0 ( 0 %)
    #> Top lowest standardised residuals:
    #> -1.8179 (93), -1.771 (85), -1.7295 (95), -1.499 (63), -1.4351 (53)
    #> Top highest standardised residuals:
    #> 4.1759 (55), 3.602 (26), 2.9708 (99), 2.3161 (75), 2.0519 (66)
    #> -----------------------------------------------
    #> Checks for trend in residuals
    #> -----------------------------------------------
    #> Model name: randomForest
    #> Standardised loess fit: +12.20 *
    # }