Automated tests for model residuals — check

Currently three tests are performed:
- for outliers in residuals,
- for autocorrelation in the target variable or in residuals,
- for trend in residuals as a function of the target variable (detection of bias).
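To make the three checks concrete, the following base R sketch computes rough analogues of each one on simulated data. It is an illustration under stated assumptions, not the package's implementation: the simulated data, the variable names, and the specific choices (z-score standardisation, lag-1 autocorrelation via `acf`, a default `loess` smooth) are assumptions made here and may differ from what `check_residuals` does internally.

```r
# Illustrative analogues of the three checks; NOT the auditor implementation.
set.seed(1)
n <- 100
x <- runif(n)
y <- 2 * x + rnorm(n, sd = 0.2)   # simulated target (assumption)
fit <- lm(y ~ x)
res <- residuals(fit)

# 1. Outliers: standardise residuals and list the most extreme ones
st_res  <- (res - mean(res)) / sd(res)
lowest  <- head(sort(st_res), 5)
highest <- head(sort(st_res, decreasing = TRUE), 5)

# 2. Autocorrelation: lag-1 autocorrelation of the target and of the residuals
acf_target <- acf(y,   plot = FALSE)$acf[2]
acf_res    <- acf(res, plot = FALSE)$acf[2]

# 3. Trend (bias): smooth the residuals as a function of the target
trend     <- loess(res ~ y)
trend_fit <- fitted(trend)

cat("Lowest standardised residuals: ", round(lowest, 2), "\n")
cat("Highest standardised residuals:", round(highest, 2), "\n")
cat("Lag-1 autocorrelation (target, residuals):",
    round(acf_target, 2), round(acf_res, 2), "\n")
cat("Range of the loess fit:", round(range(trend_fit), 3), "\n")
```

A loess fit that stays close to zero over the whole range of the target suggests no systematic bias, while a fit that drifts away from zero points to residuals that depend on the target value.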

check_residuals(object, ...)

Arguments

object	An object of class 'explainer' created with function `explain` from the DALEX package.
...	Other parameters passed on to further functions.

Value

A list with statistics for each check.
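Since the function returns a list, the result can be stored and inspected rather than only read from the printed report. The sketch below assumes the DALEX and auditor packages are installed and that the explainer is built directly with `DALEX::explain`; the names `lm_exp` and `checks` are hypothetical, and the element names of the returned list are not documented here, so `str()` is used to look at whatever structure comes back.

```r
# Sketch: capture and inspect the returned list (assumes DALEX and auditor are installed).
if (requireNamespace("DALEX", quietly = TRUE) &&
    requireNamespace("auditor", quietly = TRUE)) {
  dragons  <- DALEX::dragons[1:100, ]
  lm_model <- lm(life_length ~ ., data = dragons)

  # Build the explainer; verbose = FALSE suppresses the preparation log
  lm_exp <- DALEX::explain(lm_model, data = dragons,
                           y = dragons$life_length, verbose = FALSE)

  # The report is still printed; the statistics are kept in `checks`
  checks <- auditor::check_residuals(lm_exp)
  str(checks)
}
```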

Examples

library("auditor")  # provides audit() and check_residuals()
dragons <- DALEX::dragons[1:100, ]
lm_model <- lm(life_length ~ ., data = dragons)
lm_audit <- audit(lm_model, data = dragons, y = dragons$life_length)
#> Preparation of a new explainer is initiated
#>   -> model label       :  lm  (  default  )
#>   -> data              :  100  rows  8  cols 
#>   -> target variable   :  100  values 
#>   -> predict function  :  yhat.lm  will be used (  default  )
#>   -> predicted values  :  No value for predict function target column. (  default  )
#>   -> model_info        :  package stats , ver. 4.1.1 , task regression (  default  ) 
#>   -> predicted values  :  numerical, min =  585.8311 , mean =  1347.787 , max =  2942.307  
#>   -> residual function :  difference between y and yhat (  default  )
#>   -> residuals         :  numerical, min =  -88.41755 , mean =  -1.489291e-13 , max =  77.92805  
#>   A new explainer has been created!  
check_residuals(lm_audit)
#>   -----------------------------------------------
#>    Checks for autocorrelation
#>   -----------------------------------------------
#>     Model name:  lm 
#>     Autocorrelation in target:     -0.08      
#>     Autocorrelation in residuals:  +0.06      
#>   -----------------------------------------------
#>    Checks for outliers
#>   -----------------------------------------------
#>     Model name:  lm 
#>     Shift > 1:  0 ( 0 %) 
#>     Shift > 2:  0 ( 0 %) 
#>     Top lowest standardised residuals: 
#>      -2.2738 (48), -2.2542 (90), -2.0906 (63), -2.0201 (45), -1.8596 (67) 
#>     Top highest standardised residuals: 
#>      2.004 (96), 1.9057 (2), 1.6531 (11), 1.4571 (3), 1.4286 (42) 
#>   -----------------------------------------------
#>    Checks for trend in residuals
#>   -----------------------------------------------
#>     Model name:  lm 
#>     Standardised loess fit:  +0.54      
 # \dontrun{
 library("randomForest")
 rf_model <- randomForest(life_length ~ ., data = dragons)
 rf_audit <- audit(rf_model, data = dragons, y = dragons$life_length)
#> Preparation of a new explainer is initiated
#>   -> model label       :  randomForest  (  default  )
#>   -> data              :  100  rows  8  cols 
#>   -> target variable   :  100  values 
#>   -> predict function  :  yhat.randomForest  will be used (  default  )
#>   -> predicted values  :  No value for predict function target column. (  default  )
#>   -> model_info        :  package randomForest , ver. 4.6.14 , task regression (  default  ) 
#>   -> predicted values  :  numerical, min =  759.8688 , mean =  1343.915 , max =  2480.71  
#>   -> residual function :  difference between y and yhat (  default  )
#>   -> residuals         :  numerical, min =  -176.2797 , mean =  3.871692 , max =  417.7102  
#>   A new explainer has been created!  
 check_residuals(rf_audit)
#>   -----------------------------------------------
#>    Checks for autocorrelation
#>   -----------------------------------------------
#>     Model name:  randomForest 
#>     Autocorrelation in target:     -0.08      
#>     Autocorrelation in residuals:  -0.06      
#>   -----------------------------------------------
#>    Checks for outliers
#>   -----------------------------------------------
#>     Model name:  randomForest 
#>     Shift > 1:  2 ( 2 %) 
#>     Shift > 2:  0 ( 0 %) 
#>     Top lowest standardised residuals: 
#>      -1.8179 (93), -1.771 (85), -1.7295 (95), -1.499 (63), -1.4351 (53) 
#>     Top highest standardised residuals: 
#>      4.1759 (55), 3.602 (26), 2.9708 (99), 2.3161 (75), 2.0519 (66) 
#>   -----------------------------------------------
#>    Checks for trend in residuals
#>   -----------------------------------------------
#>     Model name:  randomForest 
#>     Standardised loess fit:  +12.20     * 
# }