All functions |
|
---|---|
Adult dataset |
|
Provide basic dataset information |
|
Predict a single model |
|
Binarize the target column |
|
Perform Boruta algorithm for selecting most important features |
|
Search for strongly correlated values (Spearman for numerical, Cramér's V for categorical) |
|
Run the data check pipeline to look for potential problems with the data |
|
Search for dimensionality problems in the dataset |
|
Search for duplicates between columns |
|
Search for missing values in the target column and predictors |
|
Search for outliers via mean and standard deviation, median absolute deviation, and interquartile range |
|
Search for columns dominated by a single value |
|
Check whether the target column is unbalanced (for regression it bins values via quantiles) |
|
Choose the best models according to the score data frame |
|
Modified COMPAS dataset |
|
Create final ranked_list |
|
Delete correlated values |
|
Delete ID-like columns |
|
Detect ID-like columns |
|
Draw boxplot of residuals (for regression) |
|
Draw confusion matrix for the model |
|
Draw Feature Importance plot |
|
Plot radar chart of one metric |
|
Draw train vs test RMSE plot for models |
|
Draw AUC ROC curve for the best model |
|
Draw scatterplot of true vs predicted values of target for training and test data for one model |
|
Explain forester model |
|
Fertility dataset |
|
Return colors from palette |
|
Format info about models |
|
Guess task type by the target value from the dataset |
|
Lisbon dataset |
|
Lymph dataset |
|
Manage missing values |
|
Predict models depending on the engine |
|
Predictions for a list of models with multiple occurrences of the same types of models |
|
Perform predictions on new data |
|
Prepare data into format correct for the selected model engine |
|
Conduct preprocessing processes |
|
Remove columns with one value for all rows |
|
Random optimization of hyperparameters |
|
Generate report after training |
|
Save elements from forester |
|
Save column names deleted during preprocessing process |
|
Score models by suitable metrics |
|
Testing dataset |
|
Train models with forester |
|
Train models from given engines |
|
Train models with Bayesian Optimization algorithm |
|
Balance and split the dataset |
|
Print the provided cat-like input if verbose is TRUE |
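
The outlier check listed above names three rules: mean and standard deviation, median absolute deviation, and interquartile range. As a rough illustration of how such rules differ, here is a minimal Python sketch. It is not the package's implementation: the function name `flag_outliers` and the thresholds (3 standard deviations, 3 MADs, 1.5 IQRs) are assumptions chosen for the example.

```python
import statistics


def flag_outliers(values, sd_k=3.0, mad_k=3.0, iqr_k=1.5):
    """Return indices flagged by three common outlier rules.

    Hypothetical helper; the multipliers sd_k, mad_k and iqr_k are
    assumed defaults, not values taken from any particular package.
    """
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)

    median = statistics.median(values)
    # Median absolute deviation: median of |x - median(x)| (unscaled).
    mad = statistics.median([abs(v - median) for v in values])

    # Quartiles via the standard library's default ("exclusive") method.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1

    return {
        "mean_sd": [i for i, v in enumerate(values)
                    if abs(v - mean) > sd_k * sd],
        "median_mad": [i for i, v in enumerate(values)
                       if abs(v - median) > mad_k * mad],
        "iqr": [i for i, v in enumerate(values)
                if v < q1 - iqr_k * iqr or v > q3 + iqr_k * iqr],
    }


sample = [10, 11, 12, 11, 10, 12, 11, 100]
result = flag_outliers(sample)
print(result)
```

On this small sample the two robust rules flag the last value, while the mean and standard deviation rule does not, because the extreme value inflates the standard deviation it is measured against. That kind of disagreement is one reason to run several rules side by side.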