Returns a vector of interaction strengths between variable v and all other variables. See Details.
potential_interactions(
  obj,
  v,
  nbins = NULL,
  color_num = TRUE,
  scale = FALSE,
  adjusted = FALSE
)
obj: An object of class "shapviz".
v: Name of the variable for which potential SHAP interactions are calculated.
nbins: Into how many quantile bins should a numeric v be binned? The default NULL equals the smaller of \(n/20\) and \(\sqrt n\) (rounded up), where \(n\) is the sample size. Ignored if obj contains SHAP interactions.
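As a quick illustration of the default bin count described above, the following base R sketch evaluates the rule for a few sample sizes. The helper name `default_nbins` is hypothetical and not part of the package; it only restates the documented formula.

```r
# Hypothetical helper (not a package function): restates the documented
# default, i.e. the smaller of n/20 and sqrt(n), rounded up.
default_nbins <- function(n) {
  ceiling(min(n / 20, sqrt(n)))
}

default_nbins(150)   # n/20 = 7.5 is the smaller value, rounded up to 8
default_nbins(1000)  # sqrt(1000) is about 31.6 and smaller than 50, rounded up to 32
```

For small samples the n/20 term dominates, so each bin holds roughly 20 observations; for n above 400 the square-root term takes over.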
color_num: Should other ("color") features v' be converted to numeric, even if they are factors/characters? Default is TRUE. Ignored if obj contains SHAP interactions.
scale: Should the R-squared be multiplied by the sample variance of within-bin SHAP values? If TRUE, bins with stronger vertical scatter will get higher weight. The default is FALSE. Ignored if obj contains SHAP interactions.
adjusted: Should adjusted R-squared be used? Default is FALSE.
A named vector of decreasing interaction strengths.
If SHAP interaction values are available, the interaction strength between feature v and another feature v' is measured by twice their mean absolute SHAP interaction values.
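The interaction-based measure can be sketched in base R as follows. This is an illustration only: the array `S_inter` is random synthetic data standing in for an n x p x p array of SHAP interaction values, and `interaction_strength` is a hypothetical helper, not the package's implementation.

```r
set.seed(1)
n <- 50
feats <- c("x1", "x2", "x3")

# Synthetic n x p x p array standing in for SHAP interaction values.
# (Real SHAP interaction arrays are symmetric in the last two dimensions;
# this random array is not, which does not matter for the illustration.)
S_inter <- array(
  rnorm(n * 3 * 3),
  dim = c(n, 3, 3),
  dimnames = list(NULL, feats, feats)
)

# Hypothetical helper: twice the mean absolute interaction value between
# v and every other feature, sorted in decreasing order.
interaction_strength <- function(S_inter, v) {
  strength <- 2 * colMeans(abs(S_inter[, v, ]))
  sort(strength[names(strength) != v], decreasing = TRUE)
}

interaction_strength(S_inter, "x1")
```

The result is a named vector with one entry per feature other than v, matching the shape of the documented return value.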
Otherwise, we use a heuristic calculated as follows:
If v is numeric, it is binned into nbins bins.
Per bin, the SHAP values of v are regressed onto v', and the R-squared is calculated. Rows with missing v' are discarded.
The R-squared values are averaged over bins, weighted by the number of non-missing v' values.
This measures how much variability in the SHAP values of v is explained by v', after accounting for v.
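The heuristic can be sketched in base R on synthetic data. Everything here is an assumption made for illustration: the data, the variable names (`vp` plays the role of v'), and the helper `heuristic_strength`, which is not the package's code. It assumes every bin keeps at least a few rows with non-missing v'.

```r
set.seed(1)
n <- 400
v <- runif(n)        # feature of interest
vp <- runif(n)       # candidate "color" feature v'
noise <- rnorm(n)    # unrelated feature for comparison

# Synthetic SHAP values of v: their dependence on v' changes with v,
# which is what an interaction looks like in a dependence plot.
shap_v <- (v - 0.5) * (vp - 0.5) + rnorm(n, sd = 0.02)
vp[sample(n, 20)] <- NA  # introduce some missing values in v'

# Hypothetical helper implementing the three documented steps.
heuristic_strength <- function(shap_v, v, vp, nbins) {
  # Step 1: quantile bins of v (unique() guards against duplicated breaks)
  breaks <- unique(quantile(v, probs = seq(0, 1, length.out = nbins + 1)))
  bin <- cut(v, breaks = breaks, include.lowest = TRUE)
  r2 <- n_ok <- numeric(nlevels(bin))
  for (i in seq_len(nlevels(bin))) {
    # Step 2: per bin, regress SHAP values on v', discarding missing v'
    ok <- bin == levels(bin)[i] & !is.na(vp)
    n_ok[i] <- sum(ok)
    r2[i] <- summary(lm(shap_v[ok] ~ vp[ok]))$r.squared
  }
  # Step 3: average R-squared, weighted by the number of non-missing v'
  weighted.mean(r2, w = n_ok)
}

s_vp <- heuristic_strength(shap_v, v, vp, nbins = 20)
s_noise <- heuristic_strength(shap_v, v, noise, nbins = 20)
c(vp = s_vp, noise = s_noise)
```

In this setup the interacting feature should receive a clearly larger value than the unrelated one, because within a narrow bin of v the remaining scatter in the SHAP values is largely driven by v'.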
Set scale = TRUE to multiply the R-squared by the within-bin variance of the SHAP values. This puts higher weight on bins with larger scatter.
Set color_num = FALSE to keep the values of the "color" feature v' as they are instead of converting them to numeric.
Finally, set adjusted = TRUE to use adjusted R-squared.
The algorithm does not consider observations with missing v' values.
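To make the effect of these three options concrete, here is a base R sketch of the statistic for a single bin. The helper `bin_stat` is hypothetical, and two details are assumptions rather than documented behavior: factors are converted to their integer codes when color_num = TRUE, and the variance is taken over the rows that remain after dropping missing v'.

```r
set.seed(1)
# One bin of synthetic data: a categorical "color" feature whose middle
# level shifts the SHAP values, so the effect is not monotone in the codes.
color <- factor(sample(c("a", "b", "c"), 60, replace = TRUE))
shap <- ifelse(color == "b", 1, 0) + rnorm(60, sd = 0.1)

# Hypothetical per-bin statistic with the three documented switches.
bin_stat <- function(shap, color, color_num = TRUE,
                     scale = FALSE, adjusted = FALSE) {
  keep <- !is.na(color)            # observations with missing v' are ignored
  shap <- shap[keep]
  color <- color[keep]
  if (color_num && !is.numeric(color)) {
    # Assumption: non-numeric features become integer codes
    color <- as.numeric(as.factor(color))
  }
  s <- summary(lm(shap ~ color))
  stat <- if (adjusted) s$adj.r.squared else s$r.squared
  if (scale) stat <- stat * var(shap)  # weight by within-bin scatter
  stat
}

r2_num <- bin_stat(shap, color)                     # linear in integer codes
r2_fac <- bin_stat(shap, color, color_num = FALSE)  # one effect per level
r2_adj <- bin_stat(shap, color, adjusted = TRUE)
r2_scaled <- bin_stat(shap, color, scale = TRUE)
c(num = r2_num, factor = r2_fac, adjusted = r2_adj, scaled = r2_scaled)
```

Keeping a factor as a factor lets the regression fit one effect per level, so its R-squared is never lower than with integer codes; adjusted R-squared partly compensates for that extra flexibility.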