module nlp.completion_simple
Short summary
module mlstatpy.nlp.completion_simple
About completion, simple algorithm
Classes
class | truncated documentation |
---|---|
CompletionElement | Definition of an element in a completion system, it contains the following members: |
CompletionSystem | Defines a completion system. |
Static Methods
staticmethod | truncated documentation |
---|---|
empty_prefix | Returns an instance filled with an empty prefix. |
Methods
method | truncated documentation |
---|---|
__getitem__ | Returns elements[i]. |
__init__ | constructor |
__init__ | Fills the completion system. |
__iter__ | Iterates over elements. |
… | Number of elements. |
__repr__ | usual |
compare_with_trie | Compares the results with the other implementation. |
compute_metrics | Computes the metric for the completion itself. |
enumerate_test_metric | Evaluates the completion set on a set of queries, the function returns a list of … |
find | Not very efficient, finds an item in the list. |
init_metrics | Initializes the metrics. |
items | Iterates on (e.value, e). |
sort_values | Sorts the elements by value. |
sort_weight | Sorts the elements by weight. |
str_all_completions | Builds a string with all completions for all prefixes along the paths, this is only available if parameter … |
… | Returns a string with metric information. |
… | Returns a string with metric information. |
test_metric | Evaluates the completion set on a set of queries, the function returns a dictionary with the aggregated metrics … |
to_dict | Returns a dictionary. |
… | Iterates on |
update_metrics | Updates the metrics. |
Documentation
About completion, simple algorithm
- class mlstatpy.nlp.completion_simple.CompletionElement(value: str, weight=1.0, disp=None)
Bases: object
Definition of an element in a completion system, it contains the following members:
value: the completion
weight: a weight or a position, we assume a completion with a lower weight is shown at a lower position
disp: display string (no impact on the algorithm)
mks0: value of minimum keystroke
mks0_: length of the prefix to obtain mks0
mks1: value of dynamic minimum keystroke
mks1_: length of the prefix to obtain mks1
mks2: value of modified dynamic minimum keystroke
mks2_: length of the prefix to obtain mks2
constructor
- Parameters:
value – value (a character)
weight – ordering (the lower, the first)
disp – original string, use this to identify the node
- __init__(value: str, weight=1.0, disp=None)
constructor
- Parameters:
value – value (a character)
weight – ordering (the lower, the first)
disp – original string, use this to identify the node
- __repr__()
Usual string representation.
- __slots__ = ('value', 'weight', 'disp', 'mks0', 'mks0_', 'mks1', 'mks1_', 'mks2', 'mks2_', 'prefix', '_info')
- _info
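Declaring __slots__ means an instance accepts only the listed attributes and carries no per-instance __dict__. The following minimal sketch illustrates that behaviour with a hypothetical stand-in class (Element is not the library's class, and its slot list is shortened for the example):

```python
class Element:
    """Hypothetical stand-in, not mlstatpy's CompletionElement."""

    __slots__ = ("value", "weight", "disp")

    def __init__(self, value, weight=1.0, disp=None):
        self.value = value
        self.weight = weight
        self.disp = disp


e = Element("query", weight=2.0)
try:
    e.note = "x"  # "note" is not declared in __slots__
    rejected = False
except AttributeError:
    rejected = True

print(rejected)                # the assignment was refused
print(hasattr(e, "__dict__"))  # no per-instance dictionary
```

This is why every metric member (mks0, mks1, ...) has to appear in the slot tuple before a method can assign it.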
- static empty_prefix()
Returns an instance filled with an empty prefix.
- init_metrics(position: int, completions: List[CompletionElement] | None = None)
Initializes the metrics.
- Parameters:
position – position in the completion system when the prefix is empty, positions start from 0
completions – displayed completions, if not None, the method stores them in member _completions
- Returns:
boolean which indicates whether there was an update
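To make mks0 and mks0_ concrete, here is a minimal sketch of a minimum-keystroke computation. It is an illustration, not the library's code: the function min_keystrokes, the cost model (typing k characters, then position + 1 key presses to pick the completion, never more than the length of the full string) and the sample data are all assumptions.

```python
def min_keystrokes(target, completions):
    """Return (mks0, prefix_length) for `target`.

    `completions` is a list of (value, weight) pairs that must contain
    `target`; a lower weight is shown first. The cost model is an
    assumption made for this sketch.
    """
    ranked = [v for v, _ in sorted(completions, key=lambda c: (c[1], c[0]))]
    best, best_k = len(target), len(target)  # typing everything always works
    for k in range(len(target)):
        # Completions still displayed after typing the first k characters.
        shown = [v for v in ranked if v.startswith(target[:k])]
        position = shown.index(target)  # 0-based position for this prefix
        cost = k + position + 1
        if cost < best:
            best, best_k = cost, k
    return best, best_k


completions = [("alpha", 1.0), ("alpine", 2.0), ("altitude", 3.0), ("beta", 4.0)]
for value, _ in completions:
    print(value, min_keystrokes(value, completions))
```

With this made-up data, "beta" is the only entry that benefits from typing a character first: with the empty prefix it sits at position 3, which costs as much as typing the whole word.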
- str_all_completions(maxn=10, use_precompute=True) -> str
Builds a string with all completions for all prefixes along the paths. This is only available if parameter completions was used when calling method update_metrics.
- Parameters:
maxn – maximum number of completions to show
use_precompute – use intermediate results built by precompute_stat
- Returns:
str
- update_metrics(prefix: str, position: int, improved: dict, delta: float, completions: List[CompletionElement] | None = None, iteration=-1)
Updates the metrics.
- Parameters:
prefix – prefix
position – position in the completion system when prefix has length k, positions start from 0
improved – if one metric is lower than the completion length, it can be used to improve other queries
delta – delta in the dynamic modified mks
completions – displayed completions, if not None, the method stores them in member _completions
iteration – for debugging purposes, indicates when this improvement was detected
- Returns:
boolean which indicates whether there was an update
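The boolean return value suggests an "update only if better" pattern. The sketch below shows that pattern on a plain dictionary. The helper update_if_better and the cost formula are assumptions for illustration; the library's method also handles the dynamic metrics, delta and improved, which are left out here.

```python
def update_if_better(metrics, prefix, position):
    """Update `metrics` in place; return True if anything changed."""
    cost = len(prefix) + position + 1  # assumed cost model
    if cost < metrics["mks0"]:
        metrics["mks0"] = cost
        metrics["mks0_"] = len(prefix)
        return True
    return False


# Start from the worst case: typing the whole string.
metrics = {"mks0": len("altitude"), "mks0_": len("altitude")}
first = update_if_better(metrics, "", 2)      # position 2 with no prefix
second = update_if_better(metrics, "alt", 0)  # position 0 after "alt"
print(first, second, metrics)
```

A caller can loop over prefixes and stop iterating once no call reports an update.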
- class mlstatpy.nlp.completion_simple.CompletionSystem(elements: List[CompletionElement])
Bases: object
Defines a completion system.
Fills the completion system.
- __getitem__(i)
Returns elements[i].
- __init__(elements: List[CompletionElement])
Fills the completion system.
- __iter__() -> Iterator[CompletionElement]
Iterates over elements.
- compare_with_trie(delta=0.8, fLOG=<function noLOG>)
Compares the results with the other implementation.
- Parameters:
delta – parameter delta in the dynamic modified mks
fLOG – logging function
- Returns:
None or differences
- compute_metrics(ffilter=None, delta=0.8, details=False, fLOG=<function noLOG>) -> int
Computes the metric for the completion itself.
- Parameters:
ffilter – filter function
delta – parameter delta in the dynamic modified mks
details – log more details about displayed completions
fLOG – logging function
- Returns:
number of iterations
The function ends by sorting the set of completions in alphabetical order.
- enumerate_test_metric(qset: Iterator[Tuple[str, float]]) -> Iterator[Tuple[CompletionElement, CompletionElement]]
Evaluates the completion set on a set of queries, the function returns a list of CompletionElement with the three metrics (minimum keystroke, dynamic minimum keystroke, modified dynamic minimum keystroke) for these particular queries.
- Parameters:
qset – list of tuple(str, float) = (query, weight)
- Returns:
list of tuples of CompletionElement, the first one is the query, the second one is None or the matching completion
The method compute_metrics() needs to be called first.
- find(value: str, is_sorted=False) -> CompletionElement
Not very efficient, finds an item in the list.
- Parameters:
value – string to find
is_sorted – the function will assume the elements are sorted in alphabetical order
- Returns:
element or None
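The is_sorted flag matters because a sorted list allows a binary search instead of a full scan. The sketch below contrasts the two strategies on plain strings. The helper names are hypothetical, and whether the library's find uses bisect internally is an assumption, not something stated here.

```python
from bisect import bisect_left


def find_linear(values, value):
    """Scan the whole list: O(n), works in any order."""
    for i, v in enumerate(values):
        if v == value:
            return i
    return None


def find_sorted(values, value):
    """Binary search: O(log n), valid only if `values` is sorted."""
    i = bisect_left(values, value)
    if i < len(values) and values[i] == value:
        return i
    return None


values = sorted(["matrix", "machine", "model", "metric"])
print(find_linear(values, "metric"), find_sorted(values, "metric"))
print(find_sorted(values, "missing"))
```

Both helpers return None for an absent value, matching the documented "element or None" contract.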
- items() -> Iterator[Tuple[str, CompletionElement]]
Iterates on (e.value, e).
- sort_values()
Sorts the elements by value.
- sort_weight()
Sorts the elements by weight.
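The two orderings serve different purposes: alphabetical order supports lookups by value, while weight order reflects display order (a lower weight is shown first). A minimal sketch with a hypothetical stand-in record follows. The tie-break on value when weights are equal is an assumption.

```python
from collections import namedtuple

Element = namedtuple("Element", ["value", "weight"])  # hypothetical stand-in

elements = [Element("model", 2.0), Element("matrix", 1.0), Element("machine", 2.0)]

by_value = sorted(elements, key=lambda e: e.value)
# Ties on weight are broken by value here; that tie-break is an assumption.
by_weight = sorted(elements, key=lambda e: (e.weight, e.value))

print([e.value for e in by_value])
print([e.value for e in by_weight])
```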
- test_metric(qset: Iterator[Tuple[str, float]]) -> Dict[str, float]
Evaluates the completion set on a set of queries, the function returns a dictionary with the aggregated metrics and some statistics about them.
- Parameters:
qset – list of tuple(str, float) = (query, weight)
- Returns:
dictionary of aggregated metrics
The method compute_metrics() needs to be called first. It then calls enumerate_test_metric().
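As a rough picture of what evaluating a query set involves, the sketch below pairs each weighted query with a precomputed metric (or None when the query has no matching completion) and then aggregates weighted sums. Everything in it is an assumption for illustration: the helper names, the dictionary keys, the rule that an unmatched query costs its full length, and the numbers.

```python
def pair_queries(qset, completions):
    """Yield (query, weight, mks0) where mks0 is None if there is no match."""
    for query, weight in qset:
        yield query, weight, completions.get(query)


def aggregate(qset, completions):
    """Weighted totals over the query set (hypothetical key names)."""
    stats = {"n": 0, "hits": 0, "sum_weights": 0.0,
             "sum_length": 0.0, "sum_mks0": 0.0}
    for query, weight, mks0 in pair_queries(qset, completions):
        stats["n"] += 1
        stats["sum_weights"] += weight
        stats["sum_length"] += weight * len(query)
        if mks0 is None:
            # Assumption: an unmatched query costs its full length.
            stats["sum_mks0"] += weight * len(query)
        else:
            stats["hits"] += 1
            stats["sum_mks0"] += weight * mks0
    return stats


# completion value -> precomputed mks0 (made-up numbers)
completions = {"alpha": 1, "beta": 2}
qset = [("alpha", 2.0), ("beta", 1.0), ("gamma", 1.0)]
stats = aggregate(qset, completions)
print(stats)
```

The ratio sum_mks0 / sum_length then gives the weighted share of keystrokes still needed.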
- to_dict() -> Dict[str, CompletionElement]
Returns a dictionary.
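Given the return type, the dictionary maps each completion value to its element, the same pairs that items() yields. A minimal sketch of that relationship, using a hypothetical stand-in record and a free function in place of the method (whether to_dict is implemented this way is an assumption):

```python
from collections import namedtuple

Element = namedtuple("Element", ["value", "weight"])  # hypothetical stand-in


def items(elements):
    """Yield (value, element) pairs, mirroring the documented items()."""
    for e in elements:
        yield e.value, e


elements = [Element("matrix", 1.0), Element("machine", 2.0)]
index = dict(items(elements))  # value -> element

print(index["machine"].weight)
```

A dictionary gives constant-time average lookup by value, an alternative to find when many lookups are needed.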