Individual Variable Effect

    individual_variable_effect(x, ...)

    # S3 method for explainer
    individual_variable_effect(
      x,
      new_observation,
      method = "KernelSHAP",
      nsamples = "auto",
      ...
    )

    # S3 method for default
    individual_variable_effect(
      x,
      data,
      predict_function = predict,
      new_observation,
      label = tail(class(x), 1),
      method = "KernelSHAP",
      nsamples = "auto",
      ...
    )

    shap(x, ...)

x | a model to be explained, or an explainer created with function |
---|---|
... | other parameters. |
new_observation | an observation or observations to be explained. Required for local/instance level explainers. Columns in `new_observation` should correspond to columns in the `data` argument. The data set should not contain any additional columns. |
method | an estimation method of SHAP values. Currently the only available method is `KernelSHAP`. |
nsamples | number of samples or "auto". The number must be an integer; use `as.integer()`. |
data | validation dataset. Used to determine univariate distributions, calculation of quantiles, correlations and so on. It will be extracted from `x` if it’s an explainer. |
predict_function | predict function that operates on the model `x`. Since the model is a black box, the `predict_function` is the only interface to access values from the model. It should be a function that takes at least a model `x` and data and returns a vector of predictions. If the model response has more than a single number (as in multiclass models), then this function should return a matrix/data.frame of size `m` x `d`, where `m` is the number of observations and `d` is the dimensionality of the model response. It will be extracted from `x` if it’s an explainer. |
label | name of the model. By default it’s extracted from the class attribute of the model. |
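
The `predict_function` and `nsamples` descriptions above can be illustrated with a minimal sketch in base R. It does not call shapper; it only shows the shape of the inputs the arguments ask for. The logistic regression on `mtcars` and the names `model_glm`, `p_function`, `probs`, and `nsamples` are assumptions chosen for illustration.

```r
# Hypothetical two-class model fitted with base R (not part of shapper)
model_glm <- glm(am ~ wt, data = mtcars, family = binomial())

# A predict function takes a model and data and returns an m x d matrix:
# m rows (observations) and d columns (here d = 2 class probabilities)
p_function <- function(model, data) {
  p_manual <- predict(model, newdata = data, type = "response")
  cbind(automatic = 1 - p_manual, manual = p_manual)
}

probs <- p_function(model_glm, mtcars[1:3, "wt", drop = FALSE])
dim(probs)        # 3 observations, 2 response columns
rowSums(probs)    # class probabilities in each row sum to 1

# A bare 50 is a double in R; the argument description asks for an integer
is.integer(50)
nsamples <- as.integer(50)
is.integer(nsamples)
```

With a function of this shape, each column of the returned matrix corresponds to one response level, which is how the `_ylevel_` column of the result is populated in the multiclass case.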
An object of class `individual_variable_effect` with SHAP values of each variable for each new observation. Columns:
the first d columns contain variable values.
_id_ - id of the observation, i.e. the row number in `new_observation`.
_ylevel_ - level of y
_yhat_ - predicted value for the level of y
_yhat_mean_ - expected value of prediction, mean of all predictions
_vname_ - variable name
_attribution_ - attribution of variable
_sign_ - sign of the attribution
_label_ - a label (name of the model)
To use shapper with a different Python virtual environment, run reticulate::use_virtualenv("path_to_your_env") or, for conda, reticulate::use_condaenv("name_of_conda_env") before attaching shapper.

    have_shap <- reticulate::py_module_available("shap")

    if (have_shap) {
      library("shapper")
      library("DALEX")
      library("randomForest")

      Y_train <- HR$status
      x_train <- HR[, -6]

      set.seed(123)
      model_rf <- randomForest(x = x_train, y = Y_train, ntree = 50)
      p_function <- function(model, data) predict(model, newdata = data, type = "prob")

      ive_rf <- individual_variable_effect(model_rf, data = x_train, predict_function = p_function,
                                           new_observation = x_train[1:2, ], nsamples = 50)
      ive_rf
    } else {
      print('Python testing environment is required.')
    }
    #>     gender      age    hours evaluation salary _id_ _ylevel_ _yhat_ _yhat_mean_
    #> 1     male 32.58267 41.88626          3      1    1    fired    0.9   0.3787180
    #> 1.3   male 32.58267 41.88626          3      1    1    fired    0.9   0.3787180
    #> 1.4   male 32.58267 41.88626          3      1    1    fired    0.9   0.3787180
    #> 1.5   male 32.58267 41.88626          3      1    1    fired    0.9   0.3787180
    #> 1.6   male 32.58267 41.88626          3      1    1    fired    0.9   0.3787180
    #> 1.1   male 32.58267 41.88626          3      1    1       ok    0.1   0.2730292
    #>        _vname_ _attribution_ _sign_      _label_
    #> 1       gender   -0.02056803      - randomForest
    #> 1.3        age    0.03448379      + randomForest
    #> 1.4      hours    0.32636349      + randomForest
    #> 1.5 evaluation    0.11472954      + randomForest
    #> 1.6     salary    0.06627323      + randomForest
    #> 1.1     gender    0.01840784      + randomForest
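
The printed result is in long format: one row per variable, per observation, per response level. As a hedged sketch of how such rows might be read, the snippet below copies the five rows for observation 1 and level `fired` from the output above into a plain data frame (the name `ive_fired` is hypothetical) and checks that the baseline `_yhat_mean_` plus the attributions approximately reproduces `_yhat_`, which is the additivity property SHAP values are expected to satisfy. It uses base R only and does not require Python.

```r
# Values copied from the printed example output (observation 1, level "fired")
ive_fired <- data.frame(
  `_vname_`       = c("gender", "age", "hours", "evaluation", "salary"),
  `_attribution_` = c(-0.02056803, 0.03448379, 0.32636349, 0.11472954, 0.06627323),
  `_yhat_`        = 0.9,
  `_yhat_mean_`   = 0.3787180,
  check.names = FALSE,        # keep the underscore-delimited column names
  stringsAsFactors = FALSE
)

# Baseline prediction plus the sum of attributions should be close to _yhat_
reconstructed <- ive_fired[["_yhat_mean_"]][1] + sum(ive_fired[["_attribution_"]])
gap <- abs(reconstructed - ive_fired[["_yhat_"]][1])
gap

# Variables ordered by the absolute size of their attribution
ranked <- ive_fired[order(-abs(ive_fired[["_attribution_"]])), "_vname_"]
ranked
```

The same subsetting pattern (filter on `_id_` and `_ylevel_`, then work with `_attribution_`) should apply to the full `ive_rf` object, assuming it behaves like an ordinary data frame as the printed output suggests.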