`R/select_neighbours.R`

`select_neighbours.Rd`

The function `select_neighbours()` selects a subset of rows from a data set.
This is useful when the data set is large and only a sample is needed to calculate profiles.

    select_neighbours(
      observation,
      data,
      variables = NULL,
      distance = gower::gower_dist,
      n = 20,
      frac = NULL
    )

observation | single observation |
---|---|
data | set of observations |
variables | names of variables that shall be used for calculation of distance. By default these are all variables present in |
distance | the distance function, by default the |
n | number of neighbors to select |
frac | if |

Returns a data frame with the selected rows.
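The arguments above suggest a simple nearest-neighbour selection: compute the distance from the observation to every row of the data, then keep the `n` closest rows. The following is a minimal base-R sketch of that idea, not the package's implementation. The names `simple_dist()` and `nearest_rows()` are hypothetical, the distance is a simplified Gower-style measure written out so the example has no dependencies, and defaulting `variables` to the columns shared by both inputs is an assumption.

```r
# Hypothetical Gower-style distance: numeric columns contribute
# |x - y| / range, other columns contribute a 0/1 mismatch; the result
# is the mean contribution per row of `data`.
simple_dist <- function(observation, data) {
  parts <- lapply(names(data), function(v) {
    x <- data[[v]]
    y <- observation[[v]]
    if (is.numeric(x)) {
      rng <- diff(range(c(x, y)))
      if (rng == 0) rep(0, length(x)) else abs(x - y) / rng
    } else {
      as.numeric(as.character(x) != as.character(y))
    }
  })
  Reduce(`+`, parts) / length(parts)
}

# Hypothetical helper: keep the n rows of `data` closest to `observation`.
nearest_rows <- function(observation, data, variables = NULL,
                         distance = simple_dist, n = 20) {
  if (is.null(variables)) {
    # Assumption: default to the columns shared by both inputs.
    variables <- intersect(colnames(observation), colnames(data))
  }
  d <- distance(observation[, variables, drop = FALSE],
                data[, variables, drop = FALSE])
  data[head(order(d), n), , drop = FALSE]
}

# Toy data, invented for illustration only.
toy <- data.frame(
  surface  = c(25, 30, 80, 120, 27),
  rooms    = c(1, 1, 3, 5, 1),
  district = c("A", "A", "B", "B", "A"),
  stringsAsFactors = FALSE
)
obs <- data.frame(surface = 26, rooms = 1, district = "A",
                  stringsAsFactors = FALSE)

close_rows <- nearest_rows(obs, toy, n = 3)
print(close_rows)
```

Passing the distance as a function argument, as the real signature does with `distance = gower::gower_dist`, keeps the selection step independent of how similarity is measured.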

Note that the `select_neighbours()` function is an S3 generic.
If you want to work with non-standard data sources (like H2O ddf, external databases),
you should overload it.
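As a rough illustration of what overloading an S3 generic for a custom data source can look like, the sketch below defines its own hypothetical generic, `pick_neighbours()`, so that it runs without the package installed. The class `lazy_source`, its `fetch` element, and the choice to dispatch on `data` are all assumptions made for the example; the source does not state which argument the package dispatches on. For the real function, a method would presumably be named `select_neighbours.<class>`.

```r
# Hypothetical generic, defined here only to illustrate S3 dispatch.
# Assumption: dispatch happens on the `data` argument.
pick_neighbours <- function(observation, data, ...) {
  UseMethod("pick_neighbours", data)
}

# Default method for plain data frames: squared Euclidean distance on
# numeric columns (a stand-in for a real distance function).
pick_neighbours.default <- function(observation, data, n = 20, ...) {
  num <- names(data)[vapply(data, is.numeric, logical(1))]
  diffs <- sweep(as.matrix(data[, num, drop = FALSE]), 2,
                 unlist(observation[1, num]))
  data[head(order(rowSums(diffs^2)), n), , drop = FALSE]
}

# Method for a hypothetical "lazy_source" class that wraps a function
# fetching rows on demand (for example from an external database).
pick_neighbours.lazy_source <- function(observation, data, n = 20, ...) {
  rows <- data$fetch()
  pick_neighbours.default(observation, rows, n = n, ...)
}

# Toy source, invented for illustration only.
src <- structure(
  list(fetch = function() data.frame(x = c(1, 5, 9), y = c(2, 6, 10))),
  class = "lazy_source"
)
obs <- data.frame(x = 4, y = 5)

picked <- pick_neighbours(obs, src, n = 1)
print(picked)
```

The method materialises the rows and then reuses the default logic; a real backend might instead push the distance computation to the data source.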

    library("ingredients")

    new_apartment <- DALEX::apartments[1, ]
    small_apartments <- select_neighbours(new_apartment, DALEX::apartments_test, n = 10)

    new_apartment
    #>   m2.price construction.year surface floor no.rooms    district
    #> 1     5897              1953      25     3        1 Srodmiescie

    small_apartments
    #>      m2.price construction.year surface floor no.rooms    district
    #> 2285     5875              1970      27     3        1 Srodmiescie
    #> 1073     5886              1960      36     2        1 Srodmiescie
    #> 3261     5859              1945      39     2        1 Srodmiescie
    #> 6647     5952              1938      30     2        1 Srodmiescie
    #> 1198     5821              1947      43     2        1 Srodmiescie
    #> 4309     5794              1947      31     3        2 Srodmiescie
    #> 9527     6080              1947      27     1        1 Srodmiescie
    #> 8110     5614              1957      44     4        1 Srodmiescie
    #> 9510     5860              1937      39     2        1 Srodmiescie
    #> 2408     5912              1989      24     3        1 Srodmiescie