module mlmodel._kmeans_constraint_
Functions
function – truncated documentation
_adjust_weights – Changes weights mapped to every cluster. weights < 1 are used for big clusters, weights > 1 are used for small …
_compute_balance – Computes weights difference.
_compute_strategy_coefficient – Creates a matrix.
_constraint_association – Completes the constraint kmeans.
_constraint_association_distance – Completes the constraint kmeans, the function sorts points by distance to the closest cluster and associates …
_constraint_association_gain – Completes the constraint kmeans. Follows the method described in Samesize kMeans Variation. …
_constraint_association_weights – Associates points to clusters.
_constraint_kmeans_weights – Runs the KMeans iterations while weighting clusters relative to one another.
_inertia – Computes total weighted inertia.
_labels_inertia_weights – Computes weighted inertia. It also adds a fraction of the whole inertia depending on how balanced the clusters are. …
_randomize_index – Randomizes index depending on the value. Swaps indexes. Modifies index.
_switch_clusters – Tries to switch clusters. Modifies labels in place.
constraint_kmeans – Completes the constraint kmeans.
constraint_predictions – Computes the predictions but tries to assign the same number of points to each cluster.
linearize_matrix – Linearizes a matrix into a new one with 3 columns value, row, column. The output format is similar to csr_matrix …
Documentation
Implements the class ConstraintKMeans.

mlinsights.mlmodel._kmeans_constraint_._adjust_weights(X, sw, weights, labels, lr)
Changes weights mapped to every cluster. weights < 1 are used for big clusters, weights > 1 are used for small clusters.
 Parameters
X – features
sw – sample weights
weights – cluster weights
labels – known labels
lr – learning rate
 Returns
labels

mlinsights.mlmodel._kmeans_constraint_._compute_balance(X, sw, labels, nbc=None)
Computes weights difference.
 Parameters
X – features
sw – sample weights
labels – known labels
nbc – number of clusters
 Returns
(weights per cluster, expected weight, total weight)
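
The returned triple can be illustrated with a minimal sketch. It assumes that the weight of a cluster is the sum of the sample weights of its points and that the expected weight is the total weight divided by the number of clusters; `compute_balance_sketch` is a hypothetical name, not the library function, and it leaves out X, which this assumed computation does not need.

```python
import numpy as np


def compute_balance_sketch(sw, labels, nbc=None):
    """Hypothetical helper: (weights per cluster, expected weight, total weight)."""
    labels = np.asarray(labels, dtype=int)
    sw = np.asarray(sw, dtype=float)
    if nbc is None:
        # Assumption: labels are 0..nbc-1, so the largest label gives nbc.
        nbc = int(labels.max()) + 1
    # Sum the sample weights falling into each cluster.
    per_cluster = np.bincount(labels, weights=sw, minlength=nbc)
    total = float(sw.sum())
    # A perfectly balanced clustering gives every cluster this weight.
    expected = total / nbc
    return per_cluster, expected, total


per_cluster, expected, total = compute_balance_sketch(
    sw=[1.0, 1.0, 2.0, 1.0, 1.0], labels=[0, 0, 0, 1, 2])
```

Here cluster 0 carries a weight of 4 against an expected weight of 2, so it is the cluster a balancing step would try to shrink.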

mlinsights.mlmodel._kmeans_constraint_._compute_strategy_coefficient(distances, strategy, labels)
Creates a matrix.

mlinsights.mlmodel._kmeans_constraint_._constraint_association(leftover, counters, labels, leftclose, distances_close, centers, X, x_squared_norms, limit, strategy, state=None)
Completes the constraint kmeans.
 Parameters
X – features
labels – initialized labels (unused)
centers – initialized centers
x_squared_norms – norm of X
limit – number of points to associate per cluster
leftover – number of points to associate at the end
counters – allocated array
leftclose – allocated array
labels – allocated array
distances_close – allocated array
strategy – strategy used to sort points before mapping them to a cluster
state – random state

mlinsights.mlmodel._kmeans_constraint_._constraint_association_distance(leftover, counters, labels, leftclose, distances_close, centers, X, x_squared_norms, limit, strategy, state=None)
Completes the constraint kmeans. The function sorts points by distance to the closest cluster and associates them in that order. It handles the farthest point first and maps it to its closest center.
 Parameters
X – features
labels – initialized labels (unused)
centers – initialized centers
x_squared_norms – norm of X
limit – number of points to associate per cluster
leftover – number of points to associate at the end
counters – allocated array
leftclose – allocated array
labels – allocated array
distances_close – allocated array
strategy – strategy used to sort points before mapping them to a cluster
state – random state (unused)
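
The ordering described above can be sketched as follows. This is a simplified illustration, not the library code: `balanced_assign_sketch` is a hypothetical name, the preallocated arrays and the leftover handling are omitted, and a point whose closest cluster is already full is assumed to fall back to its next closest cluster.

```python
import numpy as np


def balanced_assign_sketch(X, centers, limit):
    """Hypothetical helper: capacity-limited assignment, farthest points first."""
    X = np.asarray(X, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # Squared Euclidean distances, shape (n_samples, n_clusters).
    dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = np.full(X.shape[0], -1, dtype=int)
    counters = np.zeros(centers.shape[0], dtype=int)
    # Process points by decreasing distance to their closest center.
    order = np.argsort(-dist.min(axis=1), kind="stable")
    for i in order:
        # Try clusters from closest to farthest until one has room.
        for c in np.argsort(dist[i], kind="stable"):
            if counters[c] < limit:
                labels[i] = c
                counters[c] += 1
                break
    return labels, counters


labels, counters = balanced_assign_sketch(
    X=[[0.0], [1.0], [2.0], [10.0]], centers=[[0.0], [10.0]], limit=2)
```

With `limit=2`, the point at 0 is processed after the points at 2 and 1 have filled the first cluster, so it ends up in the second cluster. The sketch leaves a point labelled -1 if every cluster is full.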

mlinsights.mlmodel._kmeans_constraint_._constraint_association_gain(leftover, counters, labels, leftclose, distances_close, centers, X, x_squared_norms, limit, strategy, state=None)
Completes the constraint kmeans. Follows the method described in Samesize kMeans Variation.
 Parameters
X – features
labels – initialized labels (unused)
centers – initialized centers
x_squared_norms – norm of X
limit – number of points to associate per cluster
leftover – number of points to associate at the end
counters – allocated array
leftclose – allocated array
labels – allocated array
distances_close – allocated array
strategy – strategy used to sort points before mapping them to a cluster
state – random state

mlinsights.mlmodel._kmeans_constraint_._constraint_association_weights(X, centers, sw, weights)
Associates points to clusters.
 Parameters
X – features
centers – centers
sw – sample weights
weights – cluster weights
 Returns
labels

mlinsights.mlmodel._kmeans_constraint_._constraint_kmeans_weights(X, labels, sample_weight, centers, inertia, it, max_iter, verbose=0, state=None, learning_rate=1.0, history=False, fLOG=None)
Runs the KMeans iterations while weighting clusters relative to one another.
 Parameters
X – features
labels – initialized labels (unused)
sample_weight – sample weight
centers – initialized centers
inertia – initialized inertia (unused)
it – number of iterations already done
max_iter – maximum number of iterations
verbose – verbosity level
state – random state
learning_rate – learning rate
history – keeps all centers across iterations
fLOG – logging function (must be specified, otherwise verbose has no effect)
 Returns
tuple (best_labels, best_centers, best_inertia, weights, it)

mlinsights.mlmodel._kmeans_constraint_._inertia(X, sw)
Computes total weighted inertia.
 Parameters
X – features
sw – sample weights
 Returns
inertia
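
A minimal sketch of one way to compute a total weighted inertia. It assumes the inertia is the weighted sum of squared distances to the weighted mean of X, and that missing sample weights mean uniform weights; `total_inertia_sketch` is a hypothetical name and the library may define the quantity differently.

```python
import numpy as np


def total_inertia_sketch(X, sw=None):
    """Hypothetical helper: weighted sum of squared distances to the weighted mean."""
    X = np.asarray(X, dtype=float)
    if sw is None:
        # Assumption: no sample weights means every point weighs 1.
        sw = np.ones(X.shape[0])
    sw = np.asarray(sw, dtype=float)
    # Weighted barycenter of all points.
    mu = (X * sw[:, None]).sum(axis=0) / sw.sum()
    # Squared distance of every point to the barycenter.
    sq = ((X - mu) ** 2).sum(axis=1)
    return float((sq * sw).sum())


value = total_inertia_sketch(
    [[0.0, 0.0], [2.0, 0.0], [4.0, 0.0]], sw=[1.0, 1.0, 2.0])
```

In this example the weighted mean is (2.5, 0), and the weighted squared distances 6.25, 0.25 and 4.5 add up to 11.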

mlinsights.mlmodel._kmeans_constraint_._labels_inertia_weights(X, centers, sw, weights, labels, total_inertia)
Computes weighted inertia. It also adds a fraction of the whole inertia depending on how balanced the clusters are.
 Parameters
X – features
centers – centers
sw – sample weights
weights – cluster weights
labels – labels
total_inertia – total inertia
 Returns
inertia

mlinsights.mlmodel._kmeans_constraint_._randomize_index(index, weights)
Randomizes index depending on the value. Swaps indexes. Modifies index.

mlinsights.mlmodel._kmeans_constraint_._switch_clusters(labels, distances)
Tries to switch clusters. Modifies labels in place.
 Parameters
labels – labels
distances – distances

mlinsights.mlmodel._kmeans_constraint_.constraint_kmeans(X, labels, sample_weight, centers, inertia, iter, max_iter, strategy='gain', verbose=0, state=None, learning_rate=1.0, history=False, fLOG=None)
Completes the constraint kmeans.
 Parameters
X – features
labels – initialized labels (unused)
sample_weight – sample weight
centers – initialized centers
inertia – initialized inertia (unused)
iter – number of iterations already done
max_iter – maximum number of iterations
strategy – strategy used to sort observations before mapping them to clusters
verbose – verbosity level
state – random state
learning_rate – used by strategy ‘weights’
history – returns the list of centers across iterations
fLOG – logging function (must be specified, otherwise verbose has no effect)
 Returns
tuple (best_labels, best_centers, best_inertia, iter, all_centers)

mlinsights.mlmodel._kmeans_constraint_.constraint_predictions(X, centers, strategy, state=None)
Computes the predictions but tries to assign the same number of points to each cluster.
 Parameters
X – features
centers – centers of each cluster
strategy – strategy used to sort points before mapping them to a cluster
state – random state
 Returns
labels, distances, distances_close

mlinsights.mlmodel._kmeans_constraint_.linearize_matrix(mat, *adds)
Linearizes a matrix into a new one with 3 columns: value, row, column. The output format is similar to csr_matrix but null values are kept.
 Parameters
mat – matrix
adds – additional square matrices
 Returns
new matrix
adds defines additional matrices: the function appends columns on the right side and fills them with the corresponding values taken from the additional matrices.
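
The output layout can be illustrated with a small dense sketch. It assumes every additional matrix has the same shape as mat and handles only dense arrays; `linearize_sketch` is a hypothetical name, not the library function.

```python
import numpy as np


def linearize_sketch(mat, *adds):
    """Hypothetical helper: one output row (value, row, column, *extras) per cell."""
    mat = np.asarray(mat, dtype=float)
    # Row and column index of every cell, in row-major order.
    rows, cols = np.indices(mat.shape)
    columns = [mat.ravel(), rows.ravel(), cols.ravel()]
    for add in adds:
        add = np.asarray(add, dtype=float)
        if add.shape != mat.shape:
            raise ValueError("additional matrices must have the shape of mat")
        # Each additional matrix contributes one extra column on the right.
        columns.append(add.ravel())
    return np.column_stack(columns)


out = linearize_sketch([[1.0, 0.0], [3.0, 4.0]],
                       [[10.0, 20.0], [30.0, 40.0]])
```

The zero cell at row 0, column 1 is kept as its own output row, which is the difference from a sparse format that only stores non-null values.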