module nlp.completion
#
Short summary#
module mlstatpy.nlp.completion
About completion
Classes#
class | truncated documentation |
---|---|
CompletionTrieNode | Node definition in a trie used to do completion, see Complétion. This implementation is not very efficient … |
Properties#
property | truncated documentation |
---|---|
root | Returns the initial node with no parent. |
Static Methods#
staticmethod | truncated documentation |
---|---|
build | Builds a trie. |
Methods#
method | truncated documentation |
---|---|
__iter__ | Iterates on all nodes (sorted). |
__str__ | usual |
_add | Adds a child. |
all_completions | Retrieves all completions for a node, the method does not need precompute_stat to be run first. |
all_mks_completions | Retrieves all completions for a node, the method assumes precompute_stat was run. |
find | Returns the node which holds all completions starting with a given prefix. |
items | Iterates on children, iterates on weight, key, child. |
items_list | All children nodes including itself in a list. |
iter_leaves | Iterators on leaves sorted per weight, yield weight, value. |
leaves | Iterators on leaves. |
min_dynamic_keystroke | Returns the dynamic minimum keystrokes for a word. |
min_dynamic_keystroke2 | Returns the modified dynamic minimum keystrokes for a word. |
min_keystroke | Returns the minimum keystrokes for a word without optimisation, this function should be used if you only have a … |
min_keystroke0 | Returns the minimum keystrokes for a word. |
precompute_stat | Computes and stores list of completions for each node, computes mks. |
str_all_completions | Builds a string with all completions for all prefixes along the paths. |
unsorted_iter | Iterates on all nodes. |
update_stat_dynamic | Must be called after precompute_stat and computes dynamic mks (see Dynamic Minimum Keystroke). |
Documentation#
About completion
- class mlstatpy.nlp.completion.CompletionTrieNode(value, leave, weight=1.0, disp=None)#
Bases: object
Node definition in a trie used to do completion, see Complétion. This implementation is not memory-efficient and does not scale beyond about 200,000 words. It should be reimplemented another way (Cython, C++).
- Parameters:
value – value (a character)
leave – boolean (is it a completion)
weight – ordering (the lower, the first)
disp – original string, use this to identify the node
- class _Stat#
Bases: object
Stores statistics and intermediate data about the computation of the metrics.
It contains the following members:
mks0: value of minimum keystroke
mks0_: length of the prefix to obtain mks0
mks_iter: current iteration during the computation of mks
mks1: value of dynamic minimum keystroke
mks1_: length of the prefix to obtain mks1
mks1i_: iteration when it was obtained
mks2: value of modified dynamic minimum keystroke
mks2_: length of the prefix to obtain mks2
mks2i: iteration when it converged
- init_dynamic_minimum_keystroke(lw)#
Initializes mks and mks2 from mks0.
- Parameters:
lw – length of the prefix
- merge_completions(prefix: int, nodes: [CompletionTrieNode])#
Merges lists of completions and cuts the merged list; the given lists are assumed to be sorted.
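The merge-and-cut step can be illustrated with the standard library. This is a minimal sketch, not the library's implementation: the function name `merge_sorted_completions`, the `max_items` parameter and the `(weight, word)` tuple layout are assumptions made for the example.

```python
import heapq
from itertools import islice


def merge_sorted_completions(lists, max_items=10):
    """Merges already-sorted lists of (weight, word) pairs and keeps the best ones.

    Hypothetical helper: name, signature and tuple layout are assumptions.
    """
    merged = heapq.merge(*lists)            # lazy k-way merge, inputs must be sorted
    return list(islice(merged, max_items))  # cut the merged stream to max_items entries


a = [(1, "car"), (4, "cart")]
b = [(2, "cat"), (3, "cab")]
print(merge_sorted_completions([a, b], max_items=3))
# [(1, 'car'), (2, 'cat'), (3, 'cab')]
```

Because `heapq.merge` is lazy, cutting with `islice` avoids materialising the full merged list when only the first few completions are needed.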
- update_dynamic_minimum_keystroke(lw, delta)#
Updates dynamic minimum keystroke for the completions.
- Parameters:
lw – prefix length
delta – parameter in the definition of Modified Dynamic KeyStroke
- Returns:
number of updates
- update_minimum_keystroke(lw)#
Updates minimum keystroke for the completions.
- Parameters:
lw – prefix length
- __init__(value, leave, weight=1.0, disp=None)#
- Parameters:
value – value (a character)
leave – boolean (is it a completion)
weight – ordering (the lower, the first)
disp – original string, use this to identify the node
- __iter__()#
Iterates on all nodes (sorted).
- __slots__ = ('value', 'children', 'weight', 'leave', 'stat', 'parent', 'disp')#
- __str__()#
usual
- _add(key, child)#
Adds a child.
- Parameters:
key – one letter of the word
child – child
- Returns:
self
- all_completions() → List[Tuple[CompletionTrieNode, List[str]]] #
Retrieves all completions for a node; the method does not need precompute_stat to be run first.
- all_mks_completions() → List[Tuple[CompletionTrieNode, List[CompletionTrieNode]]] #
Retrieves all completions for a node; the method assumes precompute_stat was run.
- static build(words) → CompletionTrieNode #
Builds a trie.
- Parameters:
words – list of (word), (weight, word) or (weight, word, display string)
- Returns:
root of the trie (CompletionTrieNode)
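To make the idea of building a trie from weighted words concrete, here is a simplified, self-contained sketch. It is not the code of CompletionTrieNode: the class `MiniTrieNode`, the function `build_mini_trie`, and the choice of using the list position as weight for bare strings are all assumptions made for illustration.

```python
class MiniTrieNode:
    """Hypothetical, simplified stand-in for a completion trie node."""

    __slots__ = ("value", "leave", "weight", "children")

    def __init__(self, value, leave, weight=1.0):
        self.value = value      # one character ("" for the root)
        self.leave = leave      # True when the path from the root spells a completion
        self.weight = weight    # ordering key, the lower comes first
        self.children = {}      # next character -> child node


def build_mini_trie(words):
    """Builds a trie from items given as str or (weight, word, ...) tuples."""
    root = MiniTrieNode("", False)
    for position, item in enumerate(words):
        if isinstance(item, str):
            # Assumption: a bare word is weighted by its position in the list.
            weight, word = float(position), item
        else:
            weight, word = item[0], item[1]
        node = root
        for char in word:
            child = node.children.get(char)
            if child is None:
                child = MiniTrieNode(char, False, weight)
                node.children[char] = child
            node = child
        node.leave = True       # the last node of the path is a completion
        node.weight = weight
    return root


root = build_mini_trie([(1, "cat"), (2, "car"), "dog"])
print(sorted(root.children))  # ['c', 'd']
```

Words sharing a prefix share the nodes of that prefix, which is why "cat" and "car" produce a single child "c" under the root.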
- find(prefix: str) → CompletionTrieNode #
Returns the node which holds all completions starting with a given prefix.
- Parameters:
prefix – prefix
- Returns:
node or None for no result
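A prefix lookup of this kind amounts to walking down the trie one character at a time. The sketch below shows the idea on a nested-dict trie; the dict layout (with the key `None` marking the end of a word) and the helper names `build_dict_trie` and `find_prefix` are assumptions for the example, not the library's data structure.

```python
def build_dict_trie(words):
    """Nested-dict trie; the key None marks the end of a word (hypothetical layout)."""
    root = {}
    for word in words:
        node = root
        for char in word:
            node = node.setdefault(char, {})
        node[None] = word
    return root


def find_prefix(root, prefix):
    """Returns the sub-trie reached by following prefix, or None if there is none."""
    node = root
    for char in prefix:
        node = node.get(char)
        if node is None:
            return None  # no word starts with this prefix
    return node


trie = build_dict_trie(["cat", "car", "dog"])
print(sorted(k for k in find_prefix(trie, "ca") if k is not None))  # ['r', 't']
print(find_prefix(trie, "cow"))  # None
```

The cost of the lookup depends only on the length of the prefix, not on the number of stored words.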
- items() → Iterator[Tuple[float, str, CompletionTrieNode]] #
Iterates on children, yields (weight, key, child).
- items_list() → List[CompletionTrieNode] #
All children nodes including itself in a list.
- Returns:
list[
- iter_leaves(max_weight=None) → Iterator[Tuple[float, str]] #
Iterates on leaves sorted by weight, yields (weight, value).
- Parameters:
max_weight – keeps all values under this threshold, or None to keep all of them
- leaves() → Iterator[CompletionTrieNode] #
Iterates on leaves.
- min_dynamic_keystroke(word: str) → Tuple[int, int] #
Returns the dynamic minimum keystrokes for a word.
- Parameters:
word – word
- Returns:
number, length of best prefix, iteration it stops moving
This function must be called after precompute_stat and update_stat_dynamic. See Dynamic Minimum Keystroke.
- min_dynamic_keystroke2(word: str) → Tuple[int, int] #
Returns the modified dynamic minimum keystrokes for a word.
- Parameters:
word – word
- Returns:
number, length of best prefix, iteration it stops moving
This function must be called after precompute_stat and update_stat_dynamic. See Modified Dynamic Minimum Keystroke.
- min_keystroke(word: str) → Tuple[int, int] #
Returns the minimum keystrokes for a word without optimisation. This function should be used if you only have a couple of values to compute. You should use min_keystroke0 to compute all of them.
- Parameters:
word – word
- Returns:
number, length of best prefix
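As an illustration of what an unoptimised minimum keystroke computation can look like, the sketch below tries every prefix of the word and counts the typed characters plus the rank of the word in the list of completions shown for that prefix. It rests on several assumptions that may differ from the library: the rank is counted from 1, the list is cut at `maxn` entries, typing the whole word is always possible, and a flat list of `(weight, word)` pairs replaces the trie. The names `completions` and `min_keystroke_sketch` are hypothetical.

```python
def completions(weighted_words, prefix, maxn=10):
    """Words starting with prefix, lowest weight first, cut to maxn entries."""
    matching = sorted((w, word) for w, word in weighted_words if word.startswith(prefix))
    return [word for _, word in matching[:maxn]]


def min_keystroke_sketch(weighted_words, word, maxn=10):
    """Returns (keystrokes, prefix length) under the assumptions stated above."""
    best = (len(word), len(word))  # fallback: type the whole word
    for k in range(len(word)):
        shown = completions(weighted_words, word[:k], maxn)
        if word in shown:
            # k typed characters + rank of the word in the list (assumed 1-based)
            cost = k + shown.index(word) + 1
            if cost < best[0]:
                best = (cost, k)
    return best


words = [(1, "cat"), (2, "car"), (3, "cartoon"), (4, "dog")]
print(min_keystroke_sketch(words, "cartoon"))  # (3, 0)
print(min_keystroke_sketch(words, "dog"))      # (2, 1)
```

Each call rescans the whole word list for every prefix, which is acceptable for a handful of words but is the reason a precomputed variant is preferable when all words must be scored.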
- min_keystroke0(word: str) → Tuple[int, int] #
Returns the minimum keystrokes for a word.
- Parameters:
word – word
- Returns:
number, length of best prefix, iteration it stops moving
This function must be called after precompute_stat and update_stat_dynamic.
- precompute_stat()#
Computes and stores list of completions for each node, computes mks.
- Parameters:
clean – clean stat
- property root#
Returns the initial node with no parent.
- str_all_completions(maxn=10, use_precompute=True) → str #
Builds a string with all completions for all prefixes along the paths.
- Parameters:
maxn – maximum number of completions to show
use_precompute – use intermediate results built by precompute_stat
- Returns:
str
- unsorted_iter()#
Iterates on all nodes.
- update_stat_dynamic(delta=0.8)#
Must be called after precompute_stat and computes dynamic mks (see Dynamic Minimum Keystroke).
- Parameters:
delta – parameter in the definition of Modified Dynamic KeyStroke
- Returns:
number of iterations to converge