module nlp.completion_simple

Inheritance diagram of mlstatpy.nlp.completion_simple

Short summary

module mlstatpy.nlp.completion_simple

About completion, simple algorithm

source on GitHub

Classes

class              truncated documentation
CompletionElement  Definition of an element in a completion system, it contains the following members:
CompletionSystem   Defines a completion system.

Static Methods

staticmethod  truncated documentation
empty_prefix  Returns an instance filled with an empty prefix.

Methods

method                 truncated documentation
__getitem__            Returns elements[i].
__init__               Constructor.
__init__               Fills the completion system.
__iter__               Iterates over elements.
__len__                Number of elements.
__repr__               Usual string representation.
compare_with_trie      Compares the results with the other implementation.
compute_metrics        Computes the metric for the completion itself.
enumerate_test_metric  Evaluates the completion set on a set of queries, the function returns a list of CompletionElement …
find                   Not very efficient, finds an item in the list.
init_metrics           Initializes the metrics.
items                  Iterates on (e.value, e).
sort_values            Sorts the elements by value.
sort_weight            Sorts the elements by weight.
str_all_completions    Builds a string with all completions for all prefixes along the paths, this is only available if parameter …
str_mks                Returns a string with metric information.
str_mks0               Returns a string with metric information.
test_metric            Evaluates the completion set on a set of queries, the function returns a dictionary with the aggregated metrics …
to_dict                Returns a dictionary.
tuples                 Iterates on (e.weight, e.value).
update_metrics         Updates the metrics.

Documentation


class mlstatpy.nlp.completion_simple.CompletionElement(value: str, weight=1.0, disp=None)[source]

Bases: object

Definition of an element in a completion system, it contains the following members:

  • value: the completion

  • weight: a weight or a position; a completion with a lower weight is assumed to be shown at a lower position (earlier in the list)

  • disp: display string (no impact on the algorithm)

  • mks0: value of minimum keystroke

  • mks0_: length of the prefix to obtain mks0

  • mks1: value of dynamic minimum keystroke

  • mks1_: length of the prefix to obtain mks1

  • mks2: value of modified dynamic minimum keystroke

  • mks2_: length of the prefix to obtain mks2
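As an illustration of the members listed above, here is a minimal sketch. Element is a hypothetical stand-in, not the library class, and the initial values given to mks0 and mks0_ are an assumption rather than documented behaviour.

```python
class Element:
    """Hypothetical stand-in for CompletionElement, reduced to a few members."""

    # Declaring __slots__ fixes the set of attributes and avoids a per-instance dict.
    __slots__ = ("value", "weight", "disp", "mks0", "mks0_")

    def __init__(self, value, weight=1.0, disp=None):
        self.value = value    # the completion
        self.weight = weight  # the lower, the earlier it is shown
        self.disp = disp      # display string, no impact on the algorithm
        # Assumption: before any metric is computed, typing the whole
        # completion is the only known way to obtain it.
        self.mks0 = len(value)
        self.mks0_ = len(value)


e = Element("paris", weight=1.0, disp="Paris")
print(e.value, e.weight, e.disp, e.mks0, e.mks0_)

# __slots__ rejects attributes that were not declared.
try:
    e.unknown = 1
except AttributeError as exc:
    print("rejected:", exc)
```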


__init__(value: str, weight=1.0, disp=None)[source]

Constructor.

Parameters
  • value – value (a string, the completion)

  • weight – ordering (the lower, the first)

  • disp – original string, use this to identify the node


__repr__()[source]

Usual string representation.


__slots__ = ('value', 'weight', 'disp', 'mks0', 'mks0_', 'mks1', 'mks1_', 'mks2', 'mks2_', 'prefix', '_info')

_info

static empty_prefix()[source]

Returns an instance filled with an empty prefix.


init_metrics(position: int, completions: List[CompletionElement] = None)[source]

Initializes the metrics.

Parameters
  • position – position in the completion system when the prefix is empty, starting from 0

  • completions – displayed completions; if not None, the method stores them in member _completions

Returns

boolean indicating whether there was an update


str_all_completions(maxn=10, use_precompute=True) → str[source]

Builds a string with all completions for all prefixes along the paths. This is only available if parameter completions was used when calling method update_metrics.

Parameters
  • maxn – maximum number of completions to show

  • use_precompute – use intermediate results built by precompute_stat

Returns

str


str_mks() → str[source]

Returns a string with metric information.


str_mks0() → str[source]

Returns a string with metric information.


update_metrics(prefix: str, position: int, improved: dict, delta: float, completions: List[CompletionElement] = None, iteration=-1)[source]

Updates the metrics.

Parameters
  • prefix – prefix

  • position – position in the completion system when the prefix has length k, starting from 0

  • improved – if one of the metrics is lower than the completion length, the element can be used to improve other queries

  • delta – delta in the dynamic modified mks

  • completions – displayed completions; if not None, the method stores them in member _completions

  • iteration – for debugging purposes, indicates when this improvement was detected

Returns

boolean indicating whether there was an update
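The role of prefix and position can be illustrated with a brute-force sketch of the minimum keystroke (mks0) only. It assumes that picking a completion shown at 0-based position p after typing k characters costs k + p + 1 keystrokes; this cost model, the function name minimum_keystrokes and the tie-breaking on value are assumptions made for the example, not the library's implementation.

```python
def minimum_keystrokes(completions):
    """Brute-force sketch; completions is a list of (value, weight) pairs.

    Returns {value: (mks0, mks0_)} where mks0_ is the prefix length
    that reaches mks0.
    """
    # Lower weight first; ties broken by value (an assumption).
    ranked = sorted(completions, key=lambda c: (c[1], c[0]))
    result = {}
    for value, _ in ranked:
        # Typing the whole completion always works.
        best, best_k = len(value), len(value)
        for k in range(len(value)):
            prefix = value[:k]
            shown = [v for v, _ in ranked if v.startswith(prefix)]
            position = shown.index(value)  # 0-based position for this prefix
            cost = k + position + 1        # assumed cost model
            if cost < best:                # keep the shortest prefix on ties
                best, best_k = cost, k
        result[value] = (best, best_k)
    return result


data = [("paris", 1.0), ("parking", 2.0), ("pizza", 3.0), ("zoo", 4.0)]
for value, (mks0, k) in minimum_keystrokes(data).items():
    print(value, mks0, k)
```

With this data, "zoo" is ranked fourth for the empty prefix (cost 4, more than its length), but first after typing "z" (cost 2), so the update happens at prefix length 1.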


class mlstatpy.nlp.completion_simple.CompletionSystem(elements: List[mlstatpy.nlp.completion_simple.CompletionElement])[source]

Bases: object

Defines a completion system.


__getitem__(i)[source]

Returns elements[i].


__init__(elements: List[mlstatpy.nlp.completion_simple.CompletionElement])[source]

Fills the completion system.


__iter__() → Iterator[mlstatpy.nlp.completion_simple.CompletionElement][source]

Iterates over elements.


__len__() → int[source]

Number of elements.


compare_with_trie(delta=0.8, fLOG=<function noLOG>)[source]

Compares the results with the other implementation.

Parameters
  • delta – parameter delta in the dynamic modified mks

  • fLOG – logging function

Returns

None, or the differences


compute_metrics(ffilter=None, delta=0.8, details=False, fLOG=<function noLOG>) → int[source]

Computes the metric for the completion itself.

Parameters
  • ffilter – filter function

  • delta – parameter delta in the dynamic modified mks

  • details – log more details about displayed completions

  • fLOG – logging function

Returns

number of iterations

The function ends by sorting the set of completions in alphabetical order.


enumerate_test_metric(qset: Iterator[Tuple[str, float]]) → Iterator[Tuple[mlstatpy.nlp.completion_simple.CompletionElement, mlstatpy.nlp.completion_simple.CompletionElement]][source]

Evaluates the completion set on a set of queries. The function returns a list of CompletionElement with the three metrics M, M', M" for these particular queries.

Parameters

qset – list of tuple(str, float) = (query, weight)

Returns

list of tuples of CompletionElement; the first one is the query, the second one is None or the matching completion

The method compute_metrics() needs to be called first.


find(value: str, is_sorted=False) → mlstatpy.nlp.completion_simple.CompletionElement[source]

Not very efficient, finds an item in the list.

Parameters
  • value – string to find

  • is_sorted – the function will assume the elements are sorted in alphabetical order

Returns

element or None
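To show why is_sorted matters, the sketch below contrasts a linear scan with a binary search over a list of strings. find_value is a hypothetical helper written for this example (it returns an index rather than an element); whether the library uses bisect internally is not stated here.

```python
from bisect import bisect_left


def find_value(values, value, is_sorted=False):
    """Return the index of value in values, or None if it is absent."""
    if is_sorted:
        # Binary search, valid only if values is sorted alphabetically.
        i = bisect_left(values, value)
        return i if i < len(values) and values[i] == value else None
    # Fallback: linear scan, works on any order.
    for i, v in enumerate(values):
        if v == value:
            return i
    return None


values = sorted(["pizza", "paris", "parking"])
print(find_value(values, "paris", is_sorted=True))
print(find_value(values, "rome", is_sorted=True))
```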


items() → Iterator[Tuple[str, mlstatpy.nlp.completion_simple.CompletionElement]][source]

Iterates on (e.value, e).


sort_values()[source]

Sorts the elements by value.


sort_weight()[source]

Sorts the elements by weight.
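The two orderings can be compared on plain (value, weight) pairs, as in the sketch below. Breaking weight ties by value is an assumption made for the example.

```python
pairs = [("pizza", 3.0), ("paris", 1.0), ("parking", 2.0), ("acme", 2.0)]

# Alphabetical order of the completions, as sort_values is described.
by_value = sorted(pairs, key=lambda p: p[0])

# Lower weight first; equal weights fall back to the value (assumption).
by_weight = sorted(pairs, key=lambda p: (p[1], p[0]))

print([v for v, _ in by_value])
print([v for v, _ in by_weight])
```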


test_metric(qset: Iterator[Tuple[str, float]]) → Dict[str, float][source]

Evaluates the completion set on a set of queries. The function returns a dictionary with the aggregated metrics and some statistics about them.

Parameters

qset – list of tuple(str, float) = (query, weight)

Returns

dictionary of aggregated metrics

The method compute_metrics() needs to be called first. It then calls enumerate_test_metric().


to_dict() → Dict[str, mlstatpy.nlp.completion_simple.CompletionElement][source]

Returns a dictionary.


tuples() → Iterator[Tuple[float, str]][source]

Iterates on (e.weight, e.value).
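The container methods documented for CompletionSystem (__getitem__, __iter__, __len__, items, tuples, to_dict) can be pictured with the sketch below. Element and System are hypothetical stand-ins, and keying the dictionary by value is an assumption drawn from the Dict[str, CompletionElement] return type.

```python
from collections import namedtuple

# Hypothetical stand-in for CompletionElement.
Element = namedtuple("Element", ["value", "weight"])


class System:
    """Hypothetical stand-in for CompletionSystem."""

    def __init__(self, elements):
        self._elements = list(elements)

    def __getitem__(self, i):
        return self._elements[i]

    def __iter__(self):
        return iter(self._elements)

    def __len__(self):
        return len(self._elements)

    def items(self):
        # Iterates on (e.value, e).
        for e in self:
            yield e.value, e

    def tuples(self):
        # Iterates on (e.weight, e.value).
        for e in self:
            yield e.weight, e.value

    def to_dict(self):
        # Assumption: the dictionary maps each value to its element.
        return dict(self.items())


system = System([Element("paris", 1.0), Element("pizza", 3.0)])
print(len(system), system[0].value)
print(list(system.tuples()))
print(sorted(system.to_dict()))
```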
