The function select_sample() selects a subset of rows from a data set.
This is useful when the data is large and only a sample is needed to calculate profiles.
select_sample(data, n = 100, seed = 1313)
data: a set of observations. A profile will be calculated for every observation (every row).
n: the number of observations to select.
seed: the seed for the random number generator.
Returns a data frame with the selected rows.
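The arguments above describe seeded row sampling. As a rough illustration of that idea, here is a minimal base R sketch. It is not the package's implementation: the helper name sample_rows_sketch is hypothetical, and both sampling without replacement and capping n at the number of rows are assumptions made for the example.

```r
# Hypothetical helper, NOT the package's implementation: draw n rows
# at random from a data frame, reproducibly for a given seed.
sample_rows_sketch <- function(data, n = 100, seed = 1313) {
  set.seed(seed)                     # fix the RNG state so the draw is repeatable
  n <- min(n, nrow(data))            # assumption: never request more rows than exist
  ids <- sample.int(nrow(data), n)   # assumption: sampling without replacement
  data[ids, , drop = FALSE]          # keep the result a data frame
}

# Usage on a built-in data set: the same seed yields the same rows.
small_mtcars <- sample_rows_sketch(mtcars, n = 10)
same_again <- sample_rows_sketch(mtcars, n = 10)
nrow(small_mtcars)
identical(small_mtcars, same_again)
```

Calling set.seed() inside a function also resets the global random number generator state, which may matter if the surrounding code relies on its own random stream.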
Note that select_sample() is an S3 generic.
To work with non-standard data sources (such as H2O ddf or external databases),
define a method for it.
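The note on S3 dispatch can be illustrated with a self-contained sketch. Everything here is an assumption made for the example: the generic select_rows_sketch, the class remote_table, and its n_rows and fetch fields are hypothetical and are not part of the package. The sketch only shows how a method for a custom data source could sit next to a default method.

```r
# Hypothetical generic, for illustration only; it mirrors the documented
# signature but is not the package's own function.
select_rows_sketch <- function(data, n = 100, seed = 1313) {
  UseMethod("select_rows_sketch")
}

# Default method for ordinary data frames.
select_rows_sketch.default <- function(data, n = 100, seed = 1313) {
  set.seed(seed)
  ids <- sample.int(nrow(data), min(n, nrow(data)))
  data[ids, , drop = FALSE]
}

# Method for a hypothetical class "remote_table": the object knows its
# row count and how to fetch rows by index, so only the sampled rows
# need to be pulled from the external source.
select_rows_sketch.remote_table <- function(data, n = 100, seed = 1313) {
  set.seed(seed)
  ids <- sample.int(data$n_rows, min(n, data$n_rows))
  data$fetch(ids)
}

# A stand-in "remote" source backed by a built-in data set.
remote <- structure(
  list(n_rows = nrow(iris),
       fetch = function(ids) iris[ids, , drop = FALSE]),
  class = "remote_table"
)
remote_sample <- select_rows_sketch(remote, n = 5)
```

The point of the design is that the row indices are chosen before any data is read, so a method for a database or a distributed frame can avoid loading the full table into memory.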
library("ingredients")
small_apartments <- select_sample(DALEX::apartments_test)
head(small_apartments)
#> m2.price construction.year surface floor no.rooms district
#> 9707 5670 2008 98 3 3 Srodmiescie
#> 9796 2696 1932 110 10 4 Ursus
#> 9644 3466 1980 73 10 2 Mokotow
#> 7567 2818 1940 63 8 3 Praga
#> 4090 3803 1955 105 3 3 Ochota
#> 8594 3643 1999 36 9 2 Ursus