Calculate Residual Drift for old model and new vs. old data
calculate_residuals_drift(model_old, data_old, data_new, y_old, y_new, predict_function = predict, bins = 20)
Argument | Description |
---|---|
model_old | model created on historical / `old` data |
data_old | data frame with historical / `old` data |
data_new | data frame with current / `new` data |
y_old | true values of target variable for historical / `old` data |
y_new | true values of target variable for current / `new` data |
predict_function | function that takes two arguments, a model and new data, and returns a numeric vector of predictions; defaults to `predict` |
bins | continuous variables are discretized to `bins` intervals of equal sizes |
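
A custom `predict_function` is needed whenever `predict()` for a given model class does not return a plain numeric vector of predictions on the scale of the target. The sketch below uses base R only; `predict_response` is a hypothetical name, and the use of `cut()` for equal-width binning is an assumption for illustration, not a statement about the package internals.

```r
# Toy model on a dataset that ships with base R.
model <- glm(am ~ wt, data = mtcars, family = binomial)

# Hypothetical wrapper: takes (model, new data) and returns a numeric vector.
# Without type = "response", predict.glm would return link-scale values.
predict_response <- function(m, x, ...) {
  as.numeric(predict(m, newdata = x, type = "response", ...))
}
preds <- predict_response(model, mtcars)

# Equal-width discretization into 20 intervals, analogous to `bins = 20`.
binned <- cut(preds, breaks = 20)
n_levels <- nlevels(binned)
```

A wrapper of this shape can be passed as `predict_function = predict_response`.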
An object of class `covariate_drift` (a data.frame) with the Non-Intersection Distance calculated for the residuals.
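
The idea behind the returned value can be sketched in base R: compute residuals of the old model on both datasets, bin them on a shared grid, and measure how much the two histograms fail to overlap. This is a minimal illustration under stated assumptions, not the package's implementation: `non_intersection_distance` is a hypothetical helper, the distance is taken to be one minus the histogram overlap, and it is returned on a 0 to 1 scale (the example output of the package appears to be on a percentage scale).

```r
# Hypothetical helper, not part of the package API.
non_intersection_distance <- function(x_old, x_new, bins = 20) {
  # Shared grid of `bins` equal-width intervals covering both samples.
  breaks <- seq(min(x_old, x_new), max(x_old, x_new), length.out = bins + 1)
  p_old <- as.numeric(table(cut(x_old, breaks, include.lowest = TRUE))) / length(x_old)
  p_new <- as.numeric(table(cut(x_new, breaks, include.lowest = TRUE))) / length(x_new)
  # Overlap of the two histograms; the distance is what is left over.
  1 - sum(pmin(p_old, p_new))
}

set.seed(1)
old <- data.frame(x = runif(500))
old$y <- 2 * old$x + rnorm(500, sd = 0.1)
new <- data.frame(x = runif(500))
new$y <- 2 * new$x + 1 + rnorm(500, sd = 0.1)  # the relationship has shifted

# Residuals of the OLD model on old and on new data.
model_old <- lm(y ~ x, data = old)
res_old <- old$y - predict(model_old, old)
res_new <- new$y - predict(model_old, new)

d_same  <- non_intersection_distance(res_old, res_old)  # identical samples
d_shift <- non_intersection_distance(res_old, res_new)  # shifted residuals
```

Identical residual distributions give a distance near zero, while residuals that no longer overlap push the distance toward one.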
library("DALEX")
library("drifter")  # provides calculate_model_drift() and calculate_residuals_drift()
model_old <- lm(m2.price ~ ., data = apartments)
model_new <- lm(m2.price ~ ., data = apartments_test[1:1000,])
calculate_model_drift(model_old, model_new,
                      apartments_test[1:1000,],
                      apartments_test[1:1000,]$m2.price)
#>            Variable    Shift  Scaled
#> -----------------------------------------------
#>               floor    18.98     2.1
#>            no.rooms    27.33     3.1
#>             surface    37.96     4.3
#>            m2.price    23.35     2.6
#>   construction.year    25.11     2.8

library("ranger")
# predict() on a ranger model returns a list; extract the numeric predictions
predict_function <- function(m, x, ...) predict(m, x, ...)$predictions
model_old <- ranger(m2.price ~ ., data = apartments)

# two halves of the same test set: small residual drift expected
calculate_residuals_drift(model_old,
                          apartments_test[1:4000,], apartments_test[4001:8000,],
                          apartments_test$m2.price[1:4000],
                          apartments_test$m2.price[4001:8000],
                          predict_function = predict_function)
#>    Variable    Shift
#> -------------------------------------
#>   Residuals     3.7

# training data vs. test data: larger residual drift
calculate_residuals_drift(model_old,
                          apartments, apartments_test,
                          apartments$m2.price, apartments_test$m2.price,
                          predict_function = predict_function)
#>    Variable    Shift
#> -------------------------------------
#>   Residuals    35.8