Balance and split the dataset
train_test_balance(
  data,
  y,
  balance = TRUE,
  fractions = c(0.6, 0.2, 0.2),
  seed = NULL
)
data: A data source in one of the major R formats: data.table, data.frame, matrix, and so on.
y: A string giving the name of the target column.
balance: A logical value that determines whether the dataset is balanced. Default: TRUE.
fractions: A numeric vector of 3 values that sum to 1, giving the relative sizes of the train, test and validation datasets, in that order. Default: c(0.6, 0.2, 0.2).
seed: An integer random seed. Setting it makes results comparable between runs. If NULL, no seed is fixed and the split differs from run to run. Default: NULL.
A list of train, test and validation datasets.
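
To illustrate how fractions and seed interact, here is a minimal base R sketch of a seeded three-way split. It is not the implementation of train_test_balance: the helper split_by_fractions and the list names train, test and valid are hypothetical, the input is assumed to be a data.frame or matrix, and the balancing step is not reproduced.

```r
# Hypothetical helper for illustration only; not part of the documented package.
# Performs a plain random three-way split (no balancing).
split_by_fractions <- function(data, fractions = c(0.6, 0.2, 0.2), seed = NULL) {
  stopifnot(length(fractions) == 3, abs(sum(fractions) - 1) < 1e-8)

  # A fixed seed makes the shuffle repeatable; NULL leaves the RNG state as is.
  if (!is.null(seed)) set.seed(seed)

  n <- nrow(data)
  idx <- sample.int(n)  # random permutation of the row indices

  n_train <- round(fractions[1] * n)
  n_test <- round(fractions[2] * n)

  # Position of each shuffled index decides which part it belongs to;
  # the validation part takes whatever remains.
  pos <- seq_along(idx)
  train_idx <- idx[pos <= n_train]
  test_idx <- idx[pos > n_train & pos <= n_train + n_test]
  valid_idx <- idx[pos > n_train + n_test]

  list(
    train = data[train_idx, , drop = FALSE],
    test = data[test_idx, , drop = FALSE],
    valid = data[valid_idx, , drop = FALSE]
  )
}

# Usage with the built-in iris data (150 rows).
parts <- split_by_fractions(iris, fractions = c(0.6, 0.2, 0.2), seed = 123)
sapply(parts, nrow)
```

Calling the helper twice with the same seed returns the same rows in each part, which is the sense in which a seed makes results comparable.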