SHAP aggregated values — shap_aggregated • DALEX

This function works in a similar way to shap function from iBreakDown but it calculates explanations for a set of observation and then aggregates them.

shap_aggregated(
  explainer,
  new_observations,
  order = NULL,
  B = 25,
  kernelshap = FALSE,
  ...
)

Arguments

explainer: a model to be explained, preprocessed by the explain function
new_observations: a set of new observations with columns that correspond to variables used in the model.
order: if not NULL, then it will be a fixed order of variables. It can be a numeric vector or vector with names of variables.
B: number of random paths; works only if kernelshap=FALSE
kernelshap: indicates whether the kernelshap method should be used
...: other parameters like label, predict_function, data, x

Value

an object of the shap_aggregated class.

References

Explanatory Model Analysis. Explore, Explain and Examine Predictive Models. https://ema.drwhy.ai

Examples

library("DALEX")
set.seed(1313)
model_titanic_glm <- glm(survived ~ gender + age + fare,
                       data = titanic_imputed, family = "binomial")
explain_titanic_glm <- explain(model_titanic_glm,
                           data = titanic_imputed,
                           y = titanic_imputed$survived,
                           label = "glm")
#> Preparation of a new explainer is initiated
#>   -> model label       :  glm 
#>   -> data              :  2207  rows  8  cols 
#>   -> target variable   :  2207  values 
#>   -> predict function  :  yhat.glm  will be used (  default  )
#>   -> predicted values  :  No value for predict function target column. (  default  )
#>   -> model_info        :  package stats , ver. 4.2.3 , task classification (  default  ) 
#>   -> predicted values  :  numerical, min =  0.1490412 , mean =  0.3221568 , max =  0.9878987  
#>   -> residual function :  difference between y and yhat (  default  )
#>   -> residuals         :  numerical, min =  -0.8898433 , mean =  4.198546e-13 , max =  0.8448637  
#>   A new explainer has been created!  

# \donttest{
bd_glm <- shap_aggregated(explain_titanic_glm, titanic_imputed[1:10, ])
bd_glm
#> $aggregated
#>      variable label contribution position sign cumulative variable_name
#> 9   intercept   glm  0.360768791       10    X  0.3607688     intercept
#> 1         age   glm  0.006195355        9    1  0.3669641           age
#> 2      gender   glm  0.038938300        8    1  0.4059024         class
#> 3       class   glm  0.000000000        7    1  0.4059024      embarked
#> 4    embarked   glm  0.000000000        6    1  0.4059024          fare
#> 5        fare   glm -0.006521638        5   -1  0.3993808        gender
#> 6       parch   glm  0.000000000        4    1  0.3993808         parch
#> 7       sibsp   glm  0.000000000        3    1  0.3993808         sibsp
#> 8    survived   glm  0.000000000        2    1  0.3993808      survived
#> 10 prediction   glm  0.322156774        1    X  0.3221568              
#>    variable_value
#> 9                
#> 1                
#> 2                
#> 3                
#> 4                
#> 5                
#> 6                
#> 7                
#> 8                
#> 10               
#> 
#> $raw
#>                         min            q1       median         mean          q3
#> glm: age        -0.01514903  0.0002353574  0.005275165  0.006195355 0.018497923
#> glm: class       0.00000000  0.0000000000  0.000000000  0.000000000 0.000000000
#> glm: embarked    0.00000000  0.0000000000  0.000000000  0.000000000 0.000000000
#> glm: fare       -0.01889917 -0.0170260518 -0.002977667 -0.006521638 0.001316718
#> glm: gender     -0.11039942 -0.1084437947 -0.107256514  0.038938300 0.379438843
#> glm: intercept   0.14904118  0.1887423364  0.201938260  0.322156774 0.280247495
#> glm: parch       0.00000000  0.0000000000  0.000000000  0.000000000 0.000000000
#> glm: prediction  0.18270403  0.2103113783  0.223698836  0.360768791 0.577815674
#> glm: sibsp       0.00000000  0.0000000000  0.000000000  0.000000000 0.000000000
#> glm: survived    0.00000000  0.0000000000  0.000000000  0.000000000 0.000000000
#>                        max
#> glm: age        0.02362156
#> glm: class      0.00000000
#> glm: embarked   0.00000000
#> glm: fare       0.01020851
#> glm: gender     0.38624848
#> glm: intercept  0.98789865
#> glm: parch      0.00000000
#> glm: prediction 0.71570558
#> glm: sibsp      0.00000000
#> glm: survived   0.00000000
#> 
#> attr(,"class")
#> [1] "shap_aggregated" "list"           
plot(bd_glm, max_features = 3)
#> Warning: Removed 260 rows containing missing values (`stat_boxplot()`).

# }