Currently three checks are implemented: covariate drift, residual drift, and model drift.

check_drift(model_old, model_new, data_old, data_new, y_old, y_new,
  predict_function = predict, max_obs = 100, bins = 20,
  scale = sd(y_new, na.rm = TRUE))



`model_old`: model created on historical / `old` data

`model_new`: model created on current / `new` data

`data_old`: data frame with historical / `old` data

`data_new`: data frame with current / `new` data

`y_old`: true values of the target variable for historical / `old` data

`y_new`: true values of the target variable for current / `new` data

`predict_function`: function that takes two arguments, a model and new data, and returns a numeric vector of predictions; defaults to `predict`

`max_obs`: if negative, all observations are used to calculate the PDP; if positive, only `max_obs` observations are used

`bins`: continuous variables are discretized into `bins` intervals of equal size

`scale`: scale parameter used to calculate the scaled drift; defaults to `sd(y_new, na.rm = TRUE)`
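The roles of `max_obs` and `bins` can be illustrated with a small base R sketch. The helpers `subsample()` and `discretize()` are hypothetical names introduced here for illustration; they are not the package's internals. The sketch also assumes that "equal size" means equal width and that the subset is drawn at random.

```r
# Illustrative sketch only: subsample() and discretize() are hypothetical
# helpers, not functions exported by the package.
set.seed(1)
x <- runif(500, min = 0, max = 10)  # a made-up continuous variable

# max_obs: keep everything when negative, otherwise a random subset
# of at most max_obs observations (random sampling is an assumption)
subsample <- function(x, max_obs = 100) {
  if (max_obs < 0 || length(x) <= max_obs) return(x)
  x[sample(length(x), max_obs)]
}

# bins: split the observed range into `bins` intervals of equal width
# (equal width is an assumption about what "equal size" means)
discretize <- function(x, bins = 20) {
  breaks <- seq(min(x), max(x), length.out = bins + 1)
  cut(x, breaks = breaks, include.lowest = TRUE)
}

x_sub <- subsample(x, max_obs = 100)
x_binned <- discretize(x_sub, bins = 20)
table(x_binned)  # number of observations per interval
```

A smaller `max_obs` trades precision for speed, while a larger `bins` gives a finer view of a continuous variable at the cost of fewer observations per interval.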


This function is executed for its side effects: all checks are printed to the screen. Additionally, it returns a list with the particular checks.
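Because the checks are also returned as a list, the result can be stored and inspected after printing. The following is a minimal sketch, assuming that `check_drift()` is exported by the `drifter` package (the package name is an assumption, as it is not stated here) and that the `apartments` data from `DALEX` are available. The element names of the returned list are not documented in this section, so the sketch only looks at its top-level structure.

```r
library("DALEX")    # provides the apartments and apartments_test data
library("drifter")  # assumption: the package that exports check_drift()

model_old <- lm(m2.price ~ ., data = apartments)
model_new <- lm(m2.price ~ ., data = apartments_test[1:1000, ])

# The checks are still printed; the returned list is kept for later use
checks <- check_drift(model_old, model_new,
                      apartments, apartments_test,
                      apartments$m2.price, apartments_test$m2.price)

str(checks, max.level = 1)  # top-level structure of the returned list
```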


library("DALEX")
model_old <- lm(m2.price ~ ., data = apartments)
model_new <- lm(m2.price ~ ., data = apartments_test[1:1000,])
check_drift(model_old, model_new,
            apartments, apartments_test,
            apartments$m2.price, apartments_test$m2.price)
#> -------------------------------------
#>           Variable    Shift
#> -------------------------------------
#>           m2.price      4.9
#>  construction.year      6.0
#>            surface      6.8
#>              floor      4.9
#>           no.rooms      2.8
#>           district      2.8
#> -------------------------------------
#>           Variable    Shift
#> -------------------------------------
#>          Residuals      8.3
#> -----------------------------------------------
#>           Variable    Shift    Scaled
#> -----------------------------------------------
#>              floor    22.10       2.5
#>           no.rooms    27.44       3.0
#>            surface    30.12       3.3
#>           m2.price    26.41       2.9
#>  construction.year    29.49       3.3
library("ranger")
predict_function <- function(m, x, ...) predict(m, x, ...)$predictions
model_old <- ranger(m2.price ~ ., data = apartments)
model_new <- ranger(m2.price ~ ., data = apartments_test)
check_drift(model_old, model_new,
            apartments, apartments_test,
            apartments$m2.price, apartments_test$m2.price,
            predict_function = predict_function)
#> -------------------------------------
#>           Variable    Shift
#> -------------------------------------
#>           m2.price      4.9
#>  construction.year      6.0
#>            surface      6.8
#>              floor      4.9
#>           no.rooms      2.8
#>           district      2.8
#> -------------------------------------
#>           Variable    Shift
#> -------------------------------------
#>          Residuals     34.1 **
#> -----------------------------------------------
#>           Variable    Shift    Scaled
#> -----------------------------------------------
#>              floor    83.14       9.2
#>           no.rooms   160.79      17.9 .
#>            surface   164.33      18.2 .
#>           m2.price   166.15      18.5 .
#>  construction.year   201.95      22.4 *