module search_rank.search_engine_vectors
Short summary
module mlinsights.search_rank.search_engine_vectors
Implements a way to get close examples based on the output of a machine learned model.
Classes
class | truncated documentation |
---|---|
SearchEngineVectors | Implements a kind of local search engine which looks for similar results assuming they are vectors. The class is … |
Static Methods
staticmethod | truncated documentation |
---|---|
read_zip | Restores the features and the metadata into a SearchEngineVectors. |
Methods
method | truncated documentation |
---|---|
__repr__ | usual |
_first_pass | Finds the closest n_neighbors. |
_fit_knn | Fits the nearest neighbors. |
_is_iterable | Tells if an object is an iterator or not. |
_prepare_fit | Stores data in the class itself. |
_second_pass | Reorders the closest n_neighbors. |
fit | Every vector comes with a list of metadata. |
kneighbors | Searches for neighbors close to X. |
to_zip | Saves the features and the metadata into a zipfile. The function does not save the k-nn. |
Documentation
Implements a way to get close examples based on the output of a machine learned model.
- class mlinsights.search_rank.search_engine_vectors.SearchEngineVectors(**pknn)
Bases: object
Implements a kind of local search engine which looks for similar results assuming they are vectors. The class uses sklearn.neighbors.NearestNeighbors to find the nearest neighbors of a vector and follows the same API. The class populates the following members:
features_: vectors used to compute the neighbors
knn_: parameters for the sklearn.neighbors.NearestNeighbors
metadata_: metadata, can be None
- Parameters
pknn – list of parameters, see sklearn.neighbors.NearestNeighbors
- __init__(**pknn)
- Parameters
pknn – list of parameters, see sklearn.neighbors.NearestNeighbors
- __repr__()
usual
- _first_pass(X, n_neighbors=None)
Finds the closest n_neighbors.
- Parameters
X – features
n_neighbors – number of neighbors to get (default is the value passed to the constructor)
- Returns
dist, ind
dist is an array of distances to the nearest points; ind contains the indices of those points in the population matrix.
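The return contract (distances plus indices into the fitted matrix) can be illustrated with a brute-force search. This is a minimal sketch in plain Python, not the class's implementation, which delegates to sklearn.neighbors.NearestNeighbors; the name `brute_force_first_pass` and the Euclidean metric are assumptions made for the example.

```python
import math

def brute_force_first_pass(population, X, n_neighbors):
    """Hypothetical helper: for each query in X, return the distances and
    indices of its n_neighbors closest rows in population (Euclidean)."""
    dist, ind = [], []
    for query in X:
        # Distance from the query to every stored vector, paired with its row index.
        scored = sorted((math.dist(query, row), i) for i, row in enumerate(population))
        best = scored[:n_neighbors]
        dist.append([d for d, _ in best])
        ind.append([i for _, i in best])
    return dist, ind

population = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [5.0, 5.0]]
dist, ind = brute_force_first_pass(population, [[0.0, 1.0]], n_neighbors=2)
```

Here `ind` holds one list of row indices per query and `dist` the matching distances, sorted from closest to farthest.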
- _fit_knn()
Fits the nearest neighbors.
- _is_iterable(data)
Tells if an object is an iterator or not.
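Python distinguishes iterables (objects accepted by iter()) from iterators (objects that also support next()). The sketch below shows one common way to test for each with collections.abc; it is an assumption about what such a check can look like, not the body of _is_iterable.

```python
from collections.abc import Iterable, Iterator

def is_iterable(data):
    """True if data defines __iter__ (lists, generators, ...)."""
    return isinstance(data, Iterable)

def is_iterator(data):
    """True only if data also defines __next__."""
    return isinstance(data, Iterator)

rows = [[1.0, 2.0], [3.0, 4.0]]
stream = (row for row in rows)  # a generator is both iterable and an iterator
```

A list passes the first check only, while a generator passes both, which matters when data can be consumed only once.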
- _prepare_fit(data=None, features=None, metadata=None, transform=None)
Stores data in the class itself.
- Parameters
data – a dataframe or None if the features and the metadata are specified with an array and a dictionary
features – features columns or an array
metadata – data
transform – transform each vector before using it
transform is a function with the following signature:
    def transform(vec, many):
        # many tells whether the function receives many vectors
        # or just one (many=False)
        ...
The function transform is applied only if data is not None.
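As an illustration of that signature, the sketch below normalizes vectors to unit length and handles both the single-vector and the many-vector case. The function body is an assumption chosen for the example (plain Python lists rather than arrays, and a hypothetical helper `_unit`); only the (vec, many) signature comes from the documentation.

```python
import math

def _unit(vec):
    # Scale one vector to unit Euclidean norm; leave all-zero vectors unchanged.
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else list(vec)

def transform(vec, many):
    # many=True: vec is a sequence of vectors; many=False: vec is a single vector.
    if many:
        return [_unit(v) for v in vec]
    return _unit(vec)

one = transform([3.0, 4.0], many=False)
batch = transform([[3.0, 4.0], [0.0, 2.0]], many=True)
```

Branching on many lets the same function serve both fitting, where all vectors are available at once, and querying, where a single vector may be passed.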
- _second_pass(X, dist, ind)
Reorders the closest n_neighbors.
- Parameters
X – features
dist – array representing the lengths to points
ind – indices of the nearest points in the population matrix
- Returns
score, ind
score is an array representing the distances to the points; ind contains the indices of the nearest points in the population matrix.
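One way to read this reordering is as a two-stage search: a cheap first pass selects candidates, and a second pass rescores only those candidates with a different measure. The sketch below reranks candidates by cosine distance; the choice of cosine distance and the name `rerank_by_cosine` are assumptions for illustration, and the criterion actually used by _second_pass may differ.

```python
import math

def cosine_distance(a, b):
    # 1 - cos(a, b); assumes neither vector is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.hypot(*a) * math.hypot(*b))

def rerank_by_cosine(population, X, ind):
    """Hypothetical second pass: rescore the candidates in ind for each query."""
    score, new_ind = [], []
    for query, candidates in zip(X, ind):
        rescored = sorted((cosine_distance(query, population[i]), i) for i in candidates)
        score.append([s for s, _ in rescored])
        new_ind.append([i for _, i in rescored])
    return score, new_ind

population = [[1.0, 0.0], [4.0, 4.0], [0.0, 1.0]]
# Candidates in Euclidean order for the query (1, 1), as a first pass might return them.
first_pass_ind = [[0, 2, 1]]
score, new_ind = rerank_by_cosine(population, [[1.0, 1.0]], first_pass_ind)
```

The vector (4, 4) is the farthest candidate in Euclidean terms but points in the same direction as the query, so the second pass moves it to the front.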
- fit(data=None, features=None, metadata=None)
Every vector comes with a list of metadata.
- Parameters
data – a dataframe or None if the features and the metadata are specified with an array and a dictionary
features – features columns or an array
metadata – data
- kneighbors(X, n_neighbors=None)
Searches for neighbors close to X.
- Parameters
X – features
n_neighbors – number of neighbors to get (default is the value passed to the constructor)
- Returns
score, ind, meta
score is an array representing the distances to the points, ind contains the indices of the nearest points in the population matrix, and meta is the metadata of those points.
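The third return value pairs each neighbor with its metadata. A minimal sketch, assuming the metadata is stored as one record per fitted vector (here a list of dictionaries, a hypothetical layout), shows how indices can be mapped back to records.

```python
def metadata_for(ind, metadata):
    """Hypothetical helper: for each query, collect the metadata records
    of its neighbors, in the same order as the indices."""
    return [[metadata[i] for i in row] for row in ind]

metadata = [
    {"name": "pear"},
    {"name": "apple"},
    {"name": "plum"},
]
# Indices as a neighbor search might return them for two queries.
ind = [[2, 0], [1, 2]]
meta = metadata_for(ind, metadata)
```

Keeping the metadata aligned with the rows of the feature matrix is what makes this lookup a plain index operation.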
- static read_zip(zipfilename, **kwargs)
Restores the features and the metadata into a SearchEngineVectors.
- Parameters
zipfilename – a zipfile.ZipFile or a filename
zname – a filename in the zipfile
kwargs – parameters for pandas.read_csv
- Returns
SearchEngineVectors
It only works for Python 3.6+.
- to_zip(zipfilename, **kwargs)
Saves the features and the metadata into a zipfile. The function does not save the k-nn.
- Parameters
zipfilename – a zipfile.ZipFile or a filename
kwargs – parameters for pandas.DataFrame.to_csv (for the metadata)
- Returns
zipfilename
The function relies on function to_zip. It only works for Python 3.6+.
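The idea of storing features and metadata side by side in one archive can be sketched with the standard library alone. The entry names features.csv and metadata.csv, the CSV layout, and the helper names `save_zip` and `load_zip` are assumptions made for this example; the class itself relies on pandas for the metadata and may use a different layout. As with to_zip, the fitted neighbor index is not stored and has to be rebuilt after loading.

```python
import csv
import io
import zipfile

def save_zip(target, features, metadata):
    # Write two CSV entries into one archive; target is a filename or file-like object.
    with zipfile.ZipFile(target, "w") as zf:
        buf = io.StringIO()
        csv.writer(buf).writerows(features)
        zf.writestr("features.csv", buf.getvalue())
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=list(metadata[0]))
        writer.writeheader()
        writer.writerows(metadata)
        zf.writestr("metadata.csv", buf.getvalue())

def load_zip(source):
    # Read both entries back; feature values are parsed as floats, metadata stays text.
    with zipfile.ZipFile(source) as zf:
        text = zf.read("features.csv").decode("utf-8")
        features = [[float(v) for v in row] for row in csv.reader(io.StringIO(text))]
        text = zf.read("metadata.csv").decode("utf-8")
        metadata = list(csv.DictReader(io.StringIO(text)))
    return features, metadata

archive = io.BytesIO()  # in-memory archive so the example leaves no file behind
save_zip(archive, [[0.5, 1.0], [2.0, 3.0]], [{"name": "a"}, {"name": "b"}])
features, metadata = load_zip(archive)
```

Saving only the raw data keeps the archive independent of the neighbor structure, at the cost of refitting after a load.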