module nlp.completion_simple

Inheritance diagram of mlstatpy.nlp.completion_simple

Short summary

module mlstatpy.nlp.completion_simple

About completion, simple algorithm

source on GitHub

Classes

class              truncated documentation
CompletionElement  Definition of an element in a completion system, it contains the following members:
CompletionSystem   Defines a completion system.

Static Methods

staticmethod  truncated documentation
empty_prefix  Returns an instance filled with an empty prefix.

Methods

method                 truncated documentation
__getitem__            Returns elements[i].
__init__               Constructor.
__init__               Fills the completion system.
__iter__               Iterates over elements.
__len__                Number of elements.
__repr__               Usual string representation.
compare_with_trie      Compares the results with the other implementation.
compute_metrics        Computes the metric for the completion itself.
enumerate_test_metric  Evaluates the completion set on a set of queries, the function returns a list of CompletionElement
find                   Not very efficient, finds an item in the list.
init_metrics           Initializes the metrics.
items                  Iterates on (e.value, e).
sort_values            Sorts the elements by value.
sort_weight            Sorts the elements by weight.
str_all_completions    Builds a string with all completions for all prefixes along the paths, this is only available if parameter …
str_mks                Returns a string with metric information.
str_mks0               Returns a string with metric information.
test_metric            Evaluates the completion set on a set of queries, the function returns a dictionary with the aggregated metrics …
to_dict                Returns a dictionary.
tuples                 Iterates on (e.weight, e.value).
update_metrics         Updates the metrics.

Documentation

About completion, simple algorithm


class mlstatpy.nlp.completion_simple.CompletionElement(value: str, weight=1.0, disp=None)

Bases: object

Definition of an element in a completion system. It contains the following members:

  • value: the completion

  • weight: a weight or a position; a completion with a lower weight is assumed to be shown at a lower position

  • disp: display string (no impact on the algorithm)

  • mks0: value of the minimum keystroke

  • mks0_: length of the prefix needed to obtain mks0

  • mks1: value of the dynamic minimum keystroke

  • mks1_: length of the prefix needed to obtain mks1

  • mks2: value of the modified dynamic minimum keystroke

  • mks2_: length of the prefix needed to obtain mks2
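To make the mks0 and mks0_ members concrete, here is a minimal sketch of a minimum keystroke computation. It is not the library's implementation. It assumes that selecting the completion shown at 0-based position p after typing k characters costs k + p + 1 keystrokes, and that the cost never exceeds typing the whole string. The function name `minimum_keystrokes` and the sample data are hypothetical.

```python
from typing import List, Tuple


def minimum_keystrokes(query: str,
                       completions: List[Tuple[float, str]]) -> Tuple[int, int]:
    """Return (cost, prefix_length) for `query` (hypothetical helper).

    `completions` is a list of (weight, value); a lower weight is shown first.
    Assumption: picking the completion at 0-based position p after typing
    k characters costs k + p + 1 keystrokes.
    """
    ordered = [value for _, value in sorted(completions)]
    # Fallback: type the whole query.
    best, best_k = len(query), len(query)
    for k in range(len(query) + 1):
        prefix = query[:k]
        shown = [value for value in ordered if value.startswith(prefix)]
        if query in shown:
            cost = k + shown.index(query) + 1
            if cost < best:
                best, best_k = cost, k
    return best, best_k


completions = [(0, "paris"), (1, "pari"), (2, "parking"), (3, "python")]
print(minimum_keystrokes("python", completions))  # (3, 2)
```

In this sketch the first value plays the role of mks0 and the second the role of mks0_: "python" is reached fastest by typing "py" and picking the first suggestion.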


__init__(value: str, weight=1.0, disp=None)

Constructor.

Parameters:
  • value – the completion string

  • weight – ordering (the lower the weight, the earlier the element is shown)

  • disp – original string, used to identify the element

__repr__()

Usual string representation.

__slots__ = ('value', 'weight', 'disp', 'mks0', 'mks0_', 'mks1', 'mks1_', 'mks2', 'mks2_', 'prefix', '_info')

_info

static empty_prefix()

Returns an instance filled with an empty prefix.

init_metrics(position: int, completions: List[CompletionElement] | None = None)

Initializes the metrics.

Parameters:
  • position – position in the completion system when the prefix is empty, starting from 0

  • completions – displayed completions; if not None, the method stores them in member _completions

Returns:

boolean indicating whether there was an update

str_all_completions(maxn=10, use_precompute=True) -> str

Builds a string with all completions for all prefixes along the paths. This is only available if parameter completions was used when calling method update_metrics.

Parameters:
  • maxn – maximum number of completions to show

  • use_precompute – use intermediate results built by precompute_stat

Returns:

str

str_mks() -> str

Returns a string with metric information.

str_mks0() -> str

Returns a string with metric information.

update_metrics(prefix: str, position: int, improved: dict, delta: float, completions: List[CompletionElement] | None = None, iteration=-1)

Updates the metrics.

Parameters:
  • prefix – prefix

  • position – position in the completion system when the prefix has length k, starting from 0

  • improved – if one of the metrics is lower than the completion length, the completion can be used to improve other queries

  • delta – delta in the dynamic modified mks

  • completions – displayed completions; if not None, the method stores them in member _completions

  • iteration – for debugging purposes, indicates when this improvement was detected

Returns:

boolean indicating whether there was an update
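The initialize-then-update pattern documented for init_metrics and update_metrics can be sketched as follows for the plain minimum keystroke only. This is an illustration under assumptions, not the library's code: the class `ElementSketch` is hypothetical, it ignores the improved, delta, completions and iteration parameters, and it assumes a cost of len(prefix) + position + 1.

```python
from dataclasses import dataclass


@dataclass
class ElementSketch:
    """Hypothetical stand-in that tracks only mks0 and mks0_."""
    value: str
    mks0: int = 0
    mks0_: int = 0

    def init_metrics(self, position: int) -> bool:
        # Start from the worst case (typing the whole string), then try
        # the empty prefix with the given 0-based position.
        self.mks0 = len(self.value)
        self.mks0_ = len(self.value)
        return self.update_metrics("", position)

    def update_metrics(self, prefix: str, position: int) -> bool:
        # Assumed cost: characters typed so far plus the 1-based rank
        # of the completion among those displayed for this prefix.
        cost = len(prefix) + position + 1
        if cost < self.mks0:
            self.mks0 = cost
            self.mks0_ = len(prefix)
            return True
        return False


element = ElementSketch("python")
print(element.init_metrics(3))          # True
print(element.update_metrics("py", 0))  # True
print(element.mks0, element.mks0_)      # 3 2
```

Both methods return a boolean so that a caller can keep iterating until no element reports an update.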

class mlstatpy.nlp.completion_simple.CompletionSystem(elements: List[CompletionElement])

Bases: object

Defines a completion system.

__getitem__(i)

Returns elements[i].

__init__(elements: List[CompletionElement])

Fills the completion system.

__iter__() -> Iterator[CompletionElement]

Iterates over elements.

__len__() -> int

Number of elements.

compare_with_trie(delta=0.8, fLOG=<function noLOG>)

Compares the results with the other implementation.

Parameters:
  • delta – parameter delta in the dynamic modified mks

  • fLOG – logging function

Returns:

None or differences

compute_metrics(ffilter=None, delta=0.8, details=False, fLOG=<function noLOG>) -> int

Computes the metric for the completion itself.

Parameters:
  • ffilter – filter function

  • delta – parameter delta in the dynamic modified mks

  • details – log more details about displayed completions

  • fLOG – logging function

Returns:

number of iterations

The function ends by sorting the set of completions in alphabetical order.

enumerate_test_metric(qset: Iterator[Tuple[str, float]]) -> Iterator[Tuple[CompletionElement, CompletionElement]]

Evaluates the completion set on a set of queries. The function returns a list of CompletionElement with the three metrics M, M', M" for these particular queries.

Parameters:

qset – list of tuple(str, float) = (query, weight)

Returns:

list of tuples of CompletionElement; the first element is the query, the second is the matching completion or None

The method compute_metrics() needs to be called first.

find(value: str, is_sorted=False) -> CompletionElement

Not very efficient, finds an item in the list.

Parameters:
  • value – string to find

  • is_sorted – the function will assume the elements are sorted in alphabetical order

Returns:

element or None
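The is_sorted flag suggests two lookup strategies. The sketch below shows one way they could differ, a linear scan versus a binary search with the standard bisect module. It works on plain strings and returns an index. The function `find_value` is hypothetical, and whether the library actually uses a binary search when is_sorted is True is an assumption.

```python
from bisect import bisect_left
from typing import List, Optional


def find_value(values: List[str], value: str,
               is_sorted: bool = False) -> Optional[int]:
    """Return the index of `value` in `values`, or None (hypothetical helper)."""
    if is_sorted:
        # Binary search, valid only if `values` is in alphabetical order.
        i = bisect_left(values, value)
        if i < len(values) and values[i] == value:
            return i
        return None
    # Linear scan: no ordering assumption, O(n) comparisons.
    for i, candidate in enumerate(values):
        if candidate == value:
            return i
    return None


print(find_value(["acte", "pari", "paris"], "pari", is_sorted=True))  # 1
print(find_value(["paris", "acte", "pari"], "pari"))                  # 2
```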

items() -> Iterator[Tuple[str, CompletionElement]]

Iterates on (e.value, e).

sort_values()

Sorts the elements by value.

sort_weight()

Sorts the elements by weight.
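As an illustration of the two orderings, the sketch below sorts (value, weight) pairs once alphabetically and once by weight. Breaking ties with the other field is an assumption, and the helper names are hypothetical. The library sorts its own elements in place, which this sketch does not reproduce.

```python
from typing import List, Tuple

Pair = Tuple[str, float]  # (value, weight), a stand-in for CompletionElement


def sort_by_value(elements: List[Pair]) -> List[Pair]:
    # Alphabetical order, ties broken by weight (assumption).
    return sorted(elements, key=lambda e: (e[0], e[1]))


def sort_by_weight(elements: List[Pair]) -> List[Pair]:
    # Lower weight first, ties broken alphabetically (assumption).
    return sorted(elements, key=lambda e: (e[1], e[0]))


elements = [("paris", 2.0), ("acte", 3.0), ("python", 1.0)]
print([v for v, _ in sort_by_value(elements)])   # ['acte', 'paris', 'python']
print([v for v, _ in sort_by_weight(elements)])  # ['python', 'paris', 'acte']
```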

test_metric(qset: Iterator[Tuple[str, float]]) -> Dict[str, float]

Evaluates the completion set on a set of queries. The function returns a dictionary with the aggregated metrics and some statistics about them.

Parameters:

qset – list of tuple(str, float) = (query, weight)

Returns:

dictionary of aggregated metrics

The method compute_metrics() needs to be called first. It then calls enumerate_test_metric().
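The aggregation over a weighted query set can be pictured with the sketch below. It is a simplified illustration, not the library's code: the function names and dictionary keys are hypothetical, per-completion keystroke counts are passed in as a plain dict, and a query absent from the completion set is assumed to cost its full length.

```python
from typing import Dict, Iterable, Iterator, Tuple


def enumerate_sketch(mks_by_completion: Dict[str, int],
                     qset: Iterable[Tuple[str, float]]
                     ) -> Iterator[Tuple[str, float, int]]:
    """Yield (query, weight, keystrokes) for each query (hypothetical)."""
    for query, weight in qset:
        mks = mks_by_completion.get(query)
        # Assumption: an unknown query must be typed entirely.
        yield query, weight, len(query) if mks is None else mks


def aggregate_sketch(mks_by_completion: Dict[str, int],
                     qset: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Aggregate weighted lengths and keystrokes (hypothetical keys)."""
    res = {"n": 0, "sum_weight": 0.0, "sum_wlen": 0.0, "sum_wmks": 0.0}
    for query, weight, mks in enumerate_sketch(mks_by_completion, qset):
        res["n"] += 1
        res["sum_weight"] += weight
        res["sum_wlen"] += weight * len(query)
        res["sum_wmks"] += weight * mks
    # Share of keystrokes saved compared with typing every query fully.
    res["saved_ratio"] = (1 - res["sum_wmks"] / res["sum_wlen"]
                          if res["sum_wlen"] else 0.0)
    return res


metrics = aggregate_sketch({"python": 3, "paris": 1},
                           [("python", 2.0), ("paris", 1.0), ("perl", 1.0)])
print(metrics["sum_wlen"], metrics["sum_wmks"])  # 21.0 11.0
```

Weighting each query lets frequent queries dominate the aggregate, which matches the (query, weight) shape of qset.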

to_dict() -> Dict[str, CompletionElement]

Returns a dictionary.

tuples() -> Iterator[Tuple[float, str]]

Iterates on (e.weight, e.value).
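The container methods documented for this class (__getitem__, __iter__, __len__, items, tuples, to_dict) follow the usual Python container protocol. The sketch below mirrors their documented signatures on a stand-in class. `Item` and `SystemSketch` are hypothetical names, and keying to_dict by value is an assumption based on its return type.

```python
from typing import Dict, Iterator, List, NamedTuple, Tuple


class Item(NamedTuple):
    """Hypothetical stand-in for CompletionElement (value and weight only)."""
    value: str
    weight: float = 1.0


class SystemSketch:
    """Hypothetical container mirroring the documented accessors."""

    def __init__(self, elements: List[Item]):
        self._elements = list(elements)

    def __getitem__(self, i: int) -> Item:
        return self._elements[i]

    def __iter__(self) -> Iterator[Item]:
        return iter(self._elements)

    def __len__(self) -> int:
        return len(self._elements)

    def items(self) -> Iterator[Tuple[str, Item]]:
        for e in self:
            yield e.value, e

    def tuples(self) -> Iterator[Tuple[float, str]]:
        for e in self:
            yield e.weight, e.value

    def to_dict(self) -> Dict[str, Item]:
        # Assumption: the dictionary is keyed by the completion value.
        return dict(self.items())


system = SystemSketch([Item("paris", 2.0), Item("acte", 1.0)])
print(len(system), system[0].value)  # 2 paris
print(sorted(system.tuples()))       # [(1.0, 'acte'), (2.0, 'paris')]
```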