Create a corrgrapher object

This is the main function of the corrgrapher package. It performs the necessary calculations and creates a corrgrapher object. The object can be passed to plot, included in a knitr report, or used to generate a simple HTML file.

corrgrapher(x, ...)

# S3 method for explainer
corrgrapher(
  x,
  cutoff = 0.2,
  values = NULL,
  cor_functions = list(),
  ...,
  feature_importance = NULL,
  partial_dependence = NULL
)

# S3 method for matrix
corrgrapher(x, cutoff = 0.2, values = NULL, cor_functions = list(), ...)

# S3 method for default
corrgrapher(x, cutoff = 0.2, values = NULL, cor_functions = list(), ...)

Arguments

x	an object used to select the method. It must satisfy the following conditions. If a `data.frame` (default), columns of `numeric` type must contain numerical variables and columns of `factor` class must contain categorical variables; columns of other types are ignored. If an `explainer`, the functions `feature_importance` and `partial_dependence` must run on it without error (see also the arguments `feature_importance` and `partial_dependence`). If a `matrix`, it is converted with `as.data.frame`.
...	other arguments.
cutoff	a number. Correlations below this are treated as no correlation. Edges corresponding to them will not be included in the graph.
values	a `data.frame` describing the size of the nodes, containing columns `value` and `label` (labels consistent with the colnames of `x`). By default, all nodes have equal size or, for an `explainer`, size reflects the importance of the variables.
cor_functions	a named `list` of functions to pass to `calculate_cors`. It must contain the necessary functions from `num_num_f`, `num_cat_f` and `cat_cat_f`, and must also contain `max_cor`.
feature_importance	either an object of `feature_importance_explainer` class, created by the `feature_importance` function, or a named `list` of parameters to pass to the `feature_importance` function.
partial_dependence	a named `list` with 2 elements: `numerical` and `categorical`. Each of them should be either an object of `aggregated_profiles_explainer` class, created by the `partial_dependence` function, or a named `list` of parameters to pass to `partial_dependence`. If only one kind of data was used, use a list with 1 element.

Value

A corrgrapher object. Essentially a list consisting of the following fields:

nodes - a data.frame to pass as argument nodes to visNetwork function
edges - a data.frame to pass as argument edges to visNetwork function
pds (only if x is of explainer class) - a list with 2 elements: numerical and categorical. Each of them contains an object of aggregated_profiles_explainer class used to create partial dependence plots.
data - data used to create the object.

Details

Data analysis (and building ML models) involves many stages. During early exploration, it is useful to understand not only the individual series (i.e. variables) available, but also the relations between them. Unfortunately, understanding correlations between many variables at once is difficult. The corrgrapher package plots correlations between variables in the form of a graph. Each node is associated with a single variable. Variables that are strongly correlated with each other (positively or negatively alike) are placed close together, and weakly correlated ones far apart.
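
The idea of dropping weak correlations before drawing the graph can be illustrated with base R alone. The sketch below is not corrgrapher's internal code. It assumes that the `cutoff` is compared against the absolute Pearson correlation and that values equal to the cutoff are kept; both are assumptions made for illustration, and the column names `from`, `to` and `cor` are hypothetical.

```r
# Sketch only: base-R illustration of thresholding correlations into edges.
# Assumption: the cutoff applies to the absolute value of the correlation.
df <- as.data.frame(datasets::Seatbelts)
num <- df[, sapply(df, is.numeric)]    # all columns are numeric at this point
cors <- cor(num)                       # Pearson correlation matrix
cutoff <- 0.2

# Keep each unordered pair once (upper triangle) when |cor| reaches the cutoff
idx <- which(upper.tri(cors) & abs(cors) >= cutoff, arr.ind = TRUE)
edges <- data.frame(
  from = rownames(cors)[idx[, "row"]],
  to   = colnames(cors)[idx[, "col"]],
  cor  = cors[idx],
  stringsAsFactors = FALSE
)
head(edges)
```

Pairs missing from `edges` correspond to what the `cutoff` argument describes as "no correlation"; raising `cutoff` leaves fewer edges.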

Examples

# Seatbelts is a time-series matrix: convert it to a data.frame
# and mark the categorical variable `law` as a factor
df <- as.data.frame(datasets::Seatbelts)
df$law <- factor(df$law)
cgr <- corrgrapher(df)
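
A further hedged example, using only the arguments and fields documented on this page: it sets a stricter `cutoff` and then inspects the `nodes` and `edges` fields. The value 0.5 is an arbitrary choice for illustration, and the call is guarded so the snippet also runs where the package is not installed.

```r
df <- as.data.frame(datasets::Seatbelts)
df$law <- factor(df$law)

# Guarded so the snippet does not fail when corrgrapher is unavailable
if (requireNamespace("corrgrapher", quietly = TRUE)) {
  # A higher cutoff than the default 0.2 drops more edges from the graph
  cgr <- corrgrapher::corrgrapher(df, cutoff = 0.5)

  # Fields listed in the Value section
  str(cgr$nodes)
  str(cgr$edges)
}
```

The resulting object can then be passed to plot, as noted in the description above.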
