module search_rank.search_engine_vectors#

Inheritance diagram of mlinsights.search_rank.search_engine_vectors

Short summary#

module mlinsights.search_rank.search_engine_vectors

Implements a way to get close examples based on the output of a machine learned model.

source on GitHub

Classes#

class – truncated documentation

SearchEngineVectors – Implements a kind of local search engine which looks for similar results assuming they are vectors. The class is …

Static Methods#

staticmethod – truncated documentation

read_zip – Restore the features, the metadata to a SearchEngineVectors.

Methods#

method – truncated documentation

__init__

__repr__ – usual

_first_pass – Finds the closest n_neighbors.

_fit_knn – Fits the nearest neighbors.

_is_iterable – Tells if an object is an iterator or not.

_prepare_fit – Stores data in the class itself.

_second_pass – Reorders the closest n_neighbors.

fit – Every vector comes with a list of metadata.

kneighbors – Searches for neighbors close to X.

to_zip – Saves the features and the metadata into a zipfile. The function does not save the k-nn.

Documentation#

Implements a way to get close examples based on the output of a machine learned model.


class mlinsights.search_rank.search_engine_vectors.SearchEngineVectors(**pknn)#

Bases: object

Implements a kind of local search engine which looks for similar results assuming they are vectors. The class uses sklearn.neighbors.NearestNeighbors to find the nearest neighbors of a vector and follows the same API. The class populates members:


Parameters:

pknn – keyword parameters, see sklearn.neighbors.NearestNeighbors


__init__(**pknn)#
Parameters:

pknn – keyword parameters, see sklearn.neighbors.NearestNeighbors


__repr__()#

usual


_first_pass(X, n_neighbors=None)#

Finds the closest n_neighbors.

Parameters:
  • X – features

  • n_neighbors – number of neighbors to get (default is the value passed to the constructor)

Returns:

dist, ind

dist is an array representing the lengths to points, ind contains the indices of the nearest points in the population matrix.
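
The shape of the returned pair can be illustrated with a brute-force search written with the standard library only. This is a minimal sketch, not the class's implementation (which delegates to sklearn.neighbors.NearestNeighbors); the function name first_pass, the Euclidean metric and the sample population are assumptions made for the example.

```python
import heapq
import math


def first_pass(population, x, n_neighbors):
    """Return (dist, ind) for the n_neighbors rows of population closest to x."""
    # Euclidean distance from x to every row, paired with the row index.
    scored = [(math.dist(row, x), i) for i, row in enumerate(population)]
    # Keep the n_neighbors smallest distances, sorted in ascending order.
    closest = heapq.nsmallest(n_neighbors, scored)
    dist = [d for d, _ in closest]
    ind = [i for _, i in closest]
    return dist, ind


# Hypothetical population matrix: one vector per row.
population = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [5.0, 5.0]]
dist, ind = first_pass(population, [0.0, 0.0], n_neighbors=2)
print(dist, ind)
```

Here ind indexes rows of the population, so the same indices can later be used to look up the metadata attached to each vector.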


_fit_knn()#

Fits the nearest neighbors.


_is_iterable(data)#

Tells if an object is an iterator or not.
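
The page does not show how the check is performed. One common way to test whether an object can be iterated over is to call iter on it and catch the TypeError; the sketch below assumes that approach, and the name is_iterable is chosen for the example only.

```python
def is_iterable(data):
    """Return True if iter(data) succeeds, False otherwise."""
    try:
        iter(data)
    except TypeError:
        # Objects without __iter__ or __getitem__ raise TypeError.
        return False
    return True


print(is_iterable([1, 2, 3]))  # a list can be iterated over
print(is_iterable(3))          # an int cannot
```

Note that this test accepts any iterable (lists, strings, generators), which is broader than a strict iterator, i.e. an object that also implements __next__.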


_prepare_fit(data=None, features=None, metadata=None, transform=None)#

Stores data in the class itself.

Parameters:
  • data – a dataframe or None if the features and the metadata are specified with an array and a dictionary

  • features – features columns or an array

  • metadata – data

  • transform – transform each vector before using it

transform is a function with the following signature:

def transform(vec, many):
    # many tells whether the function receives many vectors (many=True)
    # or just one (many=False).
    ...

Function transform is applied only if data is not None.
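
As an illustration of that signature, a transform could scale every vector to unit length. This is a sketch under assumptions: vectors are represented here as plain lists of floats, and the normalization itself is an example choice, not something the library prescribes.

```python
import math


def transform(vec, many):
    """Scale vectors to unit Euclidean length."""

    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v))
        # Leave the zero vector unchanged to avoid dividing by zero.
        return [x / norm for x in v] if norm > 0 else list(v)

    # many=True: vec is a sequence of vectors; many=False: vec is one vector.
    return [normalize(v) for v in vec] if many else normalize(vec)


print(transform([3.0, 4.0], many=False))
print(transform([[3.0, 4.0], [0.0, 2.0]], many=True))
```

Handling both cases in one function lets the same transformation be applied to the whole population when fitting and to a single query vector when searching.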


_second_pass(X, dist, ind)#

Reorders the closest n_neighbors.

Parameters:
  • X – features

  • dist – array representing the lengths to points

  • ind – indices of the nearest points in the population matrix

Returns:

score, ind

score is an array representing the lengths to points, ind contains the indices of the nearest points in the population matrix.
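
The page does not say which criterion the second pass uses. The sketch below only illustrates the general idea of re-ranking a candidate list: each candidate gets a new score and the indices are sorted accordingly. The function second_pass, the rescore callback and the penalty table are all hypothetical.

```python
def second_pass(dist, ind, rescore):
    """Reorder candidate indices by a new score (smaller is better)."""
    # Compute the new score of every candidate, then sort by that score.
    rescored = sorted((rescore(i, d), i) for d, i in zip(dist, ind))
    score = [s for s, _ in rescored]
    new_ind = [i for _, i in rescored]
    return score, new_ind


# Hypothetical penalty per population row, for illustration only.
penalty = {0: 2.0, 1: 0.0, 2: 0.5}
score, ind = second_pass(
    [0.0, 1.0, 3.0], [0, 1, 2], lambda i, d: d + penalty[i]
)
print(score, ind)
```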


fit(data=None, features=None, metadata=None)#

Every vector comes with a list of metadata.

Parameters:
  • data – a dataframe or None if the features and the metadata are specified with an array and a dictionary

  • features – features columns or an array

  • metadata – data


kneighbors(X, n_neighbors=None)#

Searches for neighbors close to X.

Parameters:
  • X – features

  • n_neighbors – number of neighbors to get (default is the value passed to the constructor)

Returns:

score, ind, meta

score is an array representing the lengths to points, ind contains the indices of the nearest points in the population matrix, meta is the metadata.
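
The third returned value pairs each neighbor with its metadata. A minimal sketch of that lookup, assuming the metadata is stored as one record per population row (the helper lookup_metadata and the sample records are hypothetical):

```python
def lookup_metadata(ind, metadata):
    """Return the metadata records of the rows listed in ind, in the same order."""
    return [metadata[i] for i in ind]


# One hypothetical metadata record per row of the population matrix.
metadata = [{"name": "a"}, {"name": "b"}, {"name": "c"}]
neighbor_indices = [2, 0]
meta = lookup_metadata(neighbor_indices, metadata)
print(meta)
```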


static read_zip(zipfilename, **kwargs)#

Restore the features, the metadata to a SearchEngineVectors.

Parameters:
Returns:

SearchEngineVectors

It only works for Python 3.6+.


to_zip(zipfilename, **kwargs)#

Saves the features and the metadata into a zipfile. The function does not save the k-nn.

Parameters:
Returns:

zipfilename

The function relies on function to_zip. It only works for Python 3.6+.
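
The save and restore cycle can be sketched with the standard zipfile module. This is not the archive layout used by the library: the member names features.json and metadata.json, the JSON encoding and the helper names are assumptions made for the example.

```python
import io
import json
import zipfile


def save_to_zip(zipfilename, features, metadata):
    """Write features and metadata as two members of a zip archive."""
    with zipfile.ZipFile(zipfilename, "w") as zf:
        zf.writestr("features.json", json.dumps(features))
        zf.writestr("metadata.json", json.dumps(metadata))
    return zipfilename


def load_from_zip(zipfilename):
    """Read back the two members written by save_to_zip."""
    with zipfile.ZipFile(zipfilename, "r") as zf:
        features = json.loads(zf.read("features.json"))
        metadata = json.loads(zf.read("metadata.json"))
    return features, metadata


# An in-memory buffer stands in for a file on disk.
buffer = io.BytesIO()
save_to_zip(buffer, [[0.0, 1.0], [2.0, 3.0]], [{"name": "a"}, {"name": "b"}])
features, metadata = load_from_zip(buffer)
print(features, metadata)
```

Only the data is stored in this sketch, which mirrors the note above that the k-nn is not saved: a nearest-neighbors index would have to be fitted again after loading.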
