Function show_aggregated_profiles adds a layer to a plot created with plot.ceteris_paribus_explainer.

show_aggregated_profiles(
  x,
  ...,
  size = 0.5,
  alpha = 1,
  color = "#371ea3",
  variables = NULL
)

Arguments

x

a ceteris paribus explainer produced with function ceteris_paribus()

...

other explainers that shall be plotted together

size

a numeric. Size of lines to be plotted

alpha

a numeric between 0 and 1. Opacity of lines

color

a character. Either name of a color or name of a variable that should be used for coloring

variables

if not NULL then only variables will be presented

Value

a ggplot2 layer

References

Explanatory Model Analysis. Explore, Explain, and Examine Predictive Models. https://ema.drwhy.ai/

Examples

library("DALEX")
library("ingredients")

selected_passangers <- select_sample(titanic_imputed, n = 100)

model_titanic_glm <- glm(survived ~ gender + age + fare,
                         data = titanic_imputed, family = "binomial")

explain_titanic_glm <- explain(model_titanic_glm,
                               data = titanic_imputed[,-8],
                               y = titanic_imputed[,8])
#> Preparation of a new explainer is initiated
#>   -> model label       :  lm  (  default  )
#>   -> data              :  2207  rows  7  cols 
#>   -> target variable   :  2207  values 
#>   -> predict function  :  yhat.glm  will be used (  default  )
#>   -> predicted values  :  No value for predict function target column. (  default  )
#>   -> model_info        :  package stats , ver. 4.2.2 , task classification (  default  ) 
#>   -> predicted values  :  numerical, min =  0.1490412 , mean =  0.3221568 , max =  0.9878987  
#>   -> residual function :  difference between y and yhat (  default  )
#>   -> residuals         :  numerical, min =  -0.8898433 , mean =  4.198546e-13 , max =  0.8448637  
#>   A new explainer has been created!  

cp_rf <- ceteris_paribus(explain_titanic_glm, selected_passangers)

pdp_rf <- aggregate_profiles(cp_rf, type = "partial", variables = "age")

plot(cp_rf, variables = "age") +
  show_observations(cp_rf, variables = "age") +
  show_aggregated_profiles(pdp_rf, size = 3)


# \donttest{
library("ranger")

model_titanic_rf <- ranger(survived ~., data = titanic_imputed, probability = TRUE)

explain_titanic_rf <- explain(model_titanic_rf,
                              data = titanic_imputed[,-8],
                              y = titanic_imputed[,8],
                              label = "ranger forest",
                              verbose = FALSE)

cp_rf <- ceteris_paribus(explain_titanic_rf, selected_passangers)
cp_rf
#> Top profiles    : 
#>        gender age            class    embarked    fare sibsp parch    _yhat_
#> 515    female  45              2nd Southampton 10.1000     0     0 0.8125426
#> 515.1    male  45              2nd Southampton 10.1000     0     0 0.1098901
#> 604    female  17              3rd Southampton  7.1701     1     0 0.4618520
#> 604.1    male  17              3rd Southampton  7.1701     1     0 0.1108578
#> 1430   female  25 engineering crew Southampton  0.0000     0     0 0.7440738
#> 1430.1   male  25 engineering crew Southampton  0.0000     0     0 0.2346546
#>        _vname_ _ids_       _label_
#> 515     gender   515 ranger forest
#> 515.1   gender   515 ranger forest
#> 604     gender   604 ranger forest
#> 604.1   gender   604 ranger forest
#> 1430    gender  1430 ranger forest
#> 1430.1  gender  1430 ranger forest
#> 
#> 
#> Top observations:
#>      gender age            class    embarked    fare sibsp parch    _yhat_
#> 515    male  45              2nd Southampton 10.1000     0     0 0.1098901
#> 604    male  17              3rd Southampton  7.1701     1     0 0.1108578
#> 1430   male  25 engineering crew Southampton  0.0000     0     0 0.2346546
#> 865    male  20              3rd   Cherbourg  7.0406     0     0 0.1124048
#> 452  female  17              3rd  Queenstown  7.1408     0     0 0.6530898
#> 1534   male  38 victualling crew Southampton  0.0000     0     0 0.1729712
#>            _label_ _ids_
#> 515  ranger forest     1
#> 604  ranger forest     2
#> 1430 ranger forest     3
#> 865  ranger forest     4
#> 452  ranger forest     5
#> 1534 ranger forest     6

pdp_rf <- aggregate_profiles(cp_rf, type = "partial", variables = "age")
head(pdp_rf)
#> Top profiles    : 
#>   _vname_       _label_        _x_    _yhat_ _ids_
#> 1     age ranger forest  0.1666667 0.5218831     0
#> 2     age ranger forest  2.0000000 0.5590009     0
#> 3     age ranger forest  4.0000000 0.5760272     0
#> 4     age ranger forest  7.0000000 0.5150417     0
#> 5     age ranger forest  9.0000000 0.4988154     0
#> 6     age ranger forest 13.0000000 0.4224053     0

plot(cp_rf, variables = "age") +
  show_observations(cp_rf, variables = "age") +
  show_rugs(cp_rf, variables = "age", color = "red") +
  show_aggregated_profiles(pdp_rf, size = 3)

# }