Internal Function for Split Points for Selected Variables — calculate_variable_split

This function calculates candidate splits for each selected variable. For numerical variables, splits are calculated as percentiles (in general, as quantiles taken at grid_points uniformly spaced probabilities). For all other variables, splits are the unique values.
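The rule described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the helper name sketch_splits and the toy data frame are assumptions made for the example, and dropping missing values and collapsing duplicated quantiles are choices of this sketch.

```r
# Hypothetical helper (not part of any package): candidate splits for one column.
sketch_splits <- function(column, grid_points = 101) {
  column <- column[!is.na(column)]  # assumption: missing values are ignored
  if (is.numeric(column)) {
    # grid_points evenly spaced probabilities between 0 and 1
    probs <- seq(0, 1, length.out = grid_points)
    # duplicated quantiles are collapsed, so fewer than grid_points values may remain
    unique(unname(quantile(column, probs = probs)))
  } else {
    # non-numeric columns: every distinct value is a candidate split
    sort(unique(column))
  }
}

# Toy data, made up for the example
df <- data.frame(
  age   = c(21, 35, 35, 48, 62),
  class = c("1st", "3rd", "2nd", "3rd", "1st"),
  stringsAsFactors = FALSE
)

# A data frame is a list of columns, so lapply() yields a named list of splits
splits <- lapply(df, sketch_splits, grid_points = 5)
# splits$age   is 21 35 48 62 (the quartile 35 appears twice and is collapsed)
# splits$class is "1st" "2nd" "3rd"
```

With grid_points = 5 the probabilities are 0, 0.25, 0.5, 0.75 and 1, so the numeric column yields at most five candidate values.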

calculate_variable_split(
  data,
  variables = colnames(data),
  grid_points = 101,
  variable_splits_type = "quantiles",
  new_observation = NA
)

# S3 method for default
calculate_variable_split(
  data,
  variables = colnames(data),
  grid_points = 101,
  variable_splits_type = "quantiles",
  new_observation = NA
)

Arguments

data: validation dataset, used to determine the distribution of observations
variables: names of variables for which splits shall be calculated
grid_points: number of points used for the response path
variable_splits_type: how variable grids are calculated. Use "quantiles" (default) for percentiles or "uniform" for a uniform grid of points
new_observation: if specified (not NA), all values in new_observation are included in variable_splits
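The difference between the two grid types is easiest to see on a skewed variable. The following base R sketch is only an illustration under stated assumptions: the vector x, the value new_value and all variable names are made up, and the "uniform" grid is assumed to span the observed minimum and maximum.

```r
x <- c(1, 2, 3, 4, 100)  # skewed toy variable, made up for the example
grid_points <- 5

# "quantiles": the grid follows the empirical distribution of x
quantile_grid <- unname(quantile(x, probs = seq(0, 1, length.out = grid_points)))
# quantile_grid is 1 2 3 4 100

# "uniform": evenly spaced points between the minimum and the maximum
# (assumption of this sketch: the grid spans the observed range)
uniform_grid <- seq(min(x), max(x), length.out = grid_points)
# uniform_grid is 1 25.75 50.5 75.25 100

# Adding the value of a new observation to an existing grid
new_value <- 2.5
grid_with_obs <- sort(unique(c(quantile_grid, new_value)))
# grid_with_obs is 1 2 2.5 3 4 100
```

The quantile grid places most points where the data are dense, while the uniform grid spends most of its points in the sparse region between 4 and 100.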

Value

A named list with one element per selected variable, each holding the splits for that variable.

Details

Note that calculate_variable_split is an S3 generic. If you want to work with non-standard data sources (such as H2O ddf or external databases), you should provide your own method for it.
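As a rough sketch of what such a method might look like, the example below uses a stand-in generic so that it runs without any package installed. Everything in it is hypothetical: the generic sketch_variable_split, the class remote_table and its fields columns and distinct_values are assumptions made for illustration. For the real function, the method would be named after the generic and the class of the data object, for example calculate_variable_split.remote_table.

```r
# Stand-in generic (hypothetical), named differently from the real function
# so that it does not mask it.
sketch_variable_split <- function(data, variables, ...) {
  UseMethod("sketch_variable_split")
}

# Simplified default method: unique values of each column of a data frame.
sketch_variable_split.default <- function(data, variables = colnames(data), ...) {
  splits <- lapply(variables, function(v) sort(unique(data[[v]])))
  names(splits) <- variables
  splits
}

# Method for a hypothetical "remote_table" class: distinct values come from a
# user-supplied function, so the full data never has to be loaded into memory.
sketch_variable_split.remote_table <- function(data, variables = data$columns, ...) {
  splits <- lapply(variables, function(v) sort(data$distinct_values(v)))
  names(splits) <- variables
  splits
}

# Fake remote source backed by a local data frame, for demonstration only.
local_df <- data.frame(x = c(3, 1, 2, 1), y = c("b", "a", "b", "a"))
remote <- structure(
  list(
    columns = colnames(local_df),
    distinct_values = function(v) unique(local_df[[v]])
  ),
  class = "remote_table"
)

remote_splits <- sketch_variable_split(remote)    # dispatches to the remote_table method
local_splits  <- sketch_variable_split(local_df)  # falls back to the default method
```

Both calls return a named list of the same shape, so code that consumes the splits does not need to know which data source produced them.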