Calculate Residual Drift for old model and new vs. old data

calculate_residuals_drift(model_old, data_old, data_new, y_old, y_new,
  predict_function = predict, bins = 20)

Arguments

model_old

model created on historical / `old` data

data_old

data frame with historical / `old` data

data_new

data frame with current / `new` data

y_old

true values of target variable for historical / `old` data

y_new

true values of target variable for current / `new` data

predict_function

function that takes two arguments, a model and new data, and returns a numeric vector of predictions; defaults to `predict`

bins

continuous variables are discretized into `bins` intervals of equal size
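The `predict_function` contract described above (model and data in, numeric vector out) can be illustrated with a small wrapper. This is only a sketch: the name `predict_prob` and the use of `glm` on the built-in `mtcars` data are assumptions chosen for illustration and do not come from the package.

```r
# Hypothetical example: adapt a model so that predictions come back
# as a plain numeric vector, which is what predict_function must return.
model <- glm(am ~ wt + hp, data = mtcars, family = binomial)

# Takes a model and new data, returns a numeric vector of probabilities.
predict_prob <- function(m, x, ...) {
  as.numeric(predict(m, newdata = x, type = "response", ...))
}

preds <- predict_prob(model, mtcars)
```

A wrapper of this shape could then be passed as `predict_function = predict_prob`.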

Value

an object of class `covariate_drift` (a data.frame) with Non-Intersection Distances calculated for residuals
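To make the returned quantity more concrete, the sketch below computes a non-intersection distance between two residual vectors, taken here as one minus the overlap of their binned distributions. It is an illustration under stated assumptions, not the package's implementation: the helper name `non_intersection_distance`, the equal-width bins over the pooled range, and the `lm` model on `mtcars` are all hypothetical.

```r
# Hypothetical helper, not part of the package API.
non_intersection_distance <- function(x_old, x_new, bins = 20) {
  # Assumption: shared equal-width breaks over the pooled range.
  pooled <- c(x_old, x_new)
  breaks <- seq(min(pooled), max(pooled), length.out = bins + 1)
  # Proportion of observations falling into each bin, per sample.
  p_old <- as.numeric(prop.table(table(cut(x_old, breaks, include.lowest = TRUE))))
  p_new <- as.numeric(prop.table(table(cut(x_new, breaks, include.lowest = TRUE))))
  # One minus the overlap of the two histograms.
  1 - sum(pmin(p_old, p_new))
}

# Residuals of the old model on old and on new data (true value minus prediction).
model_old <- lm(mpg ~ wt + hp, data = mtcars[1:16, ])
res_old <- mtcars$mpg[1:16] - predict(model_old, mtcars[1:16, ])
res_new <- mtcars$mpg[17:32] - predict(model_old, mtcars[17:32, ])

drift <- non_intersection_distance(res_old, res_new, bins = 5)
```

Under this definition the value is 0 when both samples have identical binned distributions and 1 when they share no bin.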

Examples

library("DALEX")
model_old <- lm(m2.price ~ ., data = apartments)
model_new <- lm(m2.price ~ ., data = apartments_test[1:1000,])
calculate_model_drift(model_old, model_new,
                      apartments_test[1:1000,],
                      apartments_test[1:1000,]$m2.price)
#>            Variable   Shift  Scaled
#> -----------------------------------------------
#>               floor   18.98     2.1
#>            no.rooms   27.33     3.1
#>             surface   37.96     4.3
#>            m2.price   23.35     2.6
#>   construction.year   25.11     2.8
library("ranger")
predict_function <- function(m,x,...) predict(m, x, ...)$predictions
model_old <- ranger(m2.price ~ ., data = apartments)
calculate_residuals_drift(model_old,
                          apartments_test[1:4000,],
                          apartments_test[4001:8000,],
                          apartments_test$m2.price[1:4000],
                          apartments_test$m2.price[4001:8000],
                          predict_function = predict_function)
#>    Variable   Shift
#> -------------------------------------
#>   Residuals     3.7
calculate_residuals_drift(model_old,
                          apartments,
                          apartments_test,
                          apartments$m2.price,
                          apartments_test$m2.price,
                          predict_function = predict_function)
#>    Variable   Shift
#> -------------------------------------
#>   Residuals    35.8  **