module mltricks.kmeans_constraint_

Short summary

module papierstat.mltricks.kmeans_constraint_

Implements the class ConstraintKMeans.

source on GitHub

Functions

function – truncated documentation
_compute_strategy_coefficient – Creates a matrix.
_constraint_association – Completes the constraint k-means.
_constraint_association_distance – Completes the constraint k-means.
_constraint_association_gain – Completes the constraint k-means.
constraint_kmeans – Completes the constraint k-means.
constraint_predictions – Computes the predictions while trying to assign the same number of points to each cluster.
linearize_matrix – Linearizes a matrix into a new one with 3 columns: value, row, column. The output format is similar to csr_matrix

Documentation

Implements the class ConstraintKMeans.


papierstat.mltricks.kmeans_constraint_._compute_strategy_coefficient(distances, strategy, labels)[source]

Creates a matrix


papierstat.mltricks.kmeans_constraint_._constraint_association(leftover, counters, labels, leftclose, distances_close, centers, X, x_squared_norms, limit, strategy, state=None)[source]

Completes the constraint k-means.

Parameters:
  • X – features
  • labels – initialized labels (unused)
  • centers – initialized centers
  • x_squared_norms – squared norm of each row of X
  • limit – number of points to associate per cluster
  • leftover – number of points to associate at the end
  • counters – allocated array
  • leftclose – allocated array
  • labels – allocated array
  • distances_close – allocated array
  • strategy – strategy used to sort points before mapping them to a cluster
  • state – random state


papierstat.mltricks.kmeans_constraint_._constraint_association_distance(leftover, counters, labels, leftclose, distances_close, centers, X, x_squared_norms, limit, strategy, state=None)[source]

Completes the constraint k-means.

Parameters:
  • X – features
  • labels – initialized labels (unused)
  • centers – initialized centers
  • x_squared_norms – squared norm of each row of X
  • limit – number of points to associate per cluster
  • leftover – number of points to associate at the end
  • counters – allocated array
  • leftclose – allocated array
  • labels – allocated array
  • distances_close – allocated array
  • strategy – strategy used to sort points before mapping them to a cluster
  • state – random state (unused)


papierstat.mltricks.kmeans_constraint_._constraint_association_gain(leftover, counters, labels, leftclose, distances_close, centers, X, x_squared_norms, limit, strategy, state=None)[source]

Completes the constraint k-means.

Parameters:
  • X – features
  • labels – initialized labels (unused)
  • centers – initialized centers
  • x_squared_norms – squared norm of each row of X
  • limit – number of points to associate per cluster
  • leftover – number of points to associate at the end
  • counters – allocated array
  • leftclose – allocated array
  • labels – allocated array
  • distances_close – allocated array
  • strategy – strategy used to sort points before mapping them to a cluster
  • state – random state

See Same-size k-Means Variation.
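The idea behind a gain-based ordering can be sketched as follows. This is a simplified illustration, not the papierstat implementation: the function name balanced_assign_by_gain is hypothetical, the gain is assumed to be the difference between a point's distance to its farthest and to its closest center, and the points are sorted only once instead of being re-sorted each time a cluster fills up.

```python
import numpy as np


def balanced_assign_by_gain(X, centers):
    """Hypothetical helper: greedy same-size assignment ordered by gain."""
    n, k = X.shape[0], centers.shape[0]
    limit = -(-n // k)  # ceil(n / k): maximum number of points per cluster
    # Squared Euclidean distances between every point and every center, shape (n, k).
    dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # Assumed definition of the gain: what a point loses if it ends up in its
    # worst cluster instead of its best one.
    gain = dist.max(axis=1) - dist.min(axis=1)
    labels = np.full(n, -1, dtype=int)
    counts = np.zeros(k, dtype=int)
    # Points with the largest gain choose first.
    for i in np.argsort(-gain):
        # Try the clusters from closest to farthest, skipping the full ones.
        for c in np.argsort(dist[i]):
            if counts[c] < limit:
                labels[i] = c
                counts[c] += 1
                break
    return labels


if __name__ == "__main__":
    X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.], [0.2, 0.2], [5., 5.]])
    centers = np.array([[0., 0.], [5., 5.]])
    labels = balanced_assign_by_gain(X, centers)
    print(labels, np.bincount(labels, minlength=2))
```

Because the capacity of each cluster is ceil(n / k), every point always finds a cluster that is not full, at the cost of sending some points to a center that is not their closest one.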


papierstat.mltricks.kmeans_constraint_.constraint_kmeans(X, labels, sample_weight, centers, inertia, precompute_distances, iter, max_iter, strategy='gain', verbose=0, state=None, fLOG=None)[source]

Completes the constraint k-means.

Parameters:
  • X – features
  • labels – initialized labels (unused)
  • sample_weight – sample weight
  • centers – initialized centers
  • inertia – initialized inertia (unused)
  • precompute_distances – precompute distances (used in _label_inertia)
  • iter – number of iterations already done
  • max_iter – maximum number of iterations
  • strategy – strategy used to sort observations before mapping them to clusters
  • verbose – verbosity level
  • state – random state
  • fLOG – logging function (must be specified, otherwise verbose has no effect)
Returns:

tuple (best_labels, best_centers, best_inertia, iter)
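The overall loop of a size-constrained k-means can be sketched as below. This is a minimal illustration under stated assumptions, not the code of constraint_kmeans: the names _balanced_labels and constrained_kmeans_sketch are hypothetical, sample weights, strategies and logging are ignored, and the balanced assignment is a plain greedy pass ordered by distance to the closest center. It only shows how an assignment step with a size limit alternates with a center update and how a tuple shaped like (best_labels, best_centers, best_inertia, iter) can be produced.

```python
import numpy as np


def _balanced_labels(X, centers):
    """Hypothetical helper: greedy assignment with at most ceil(n / k) points per cluster."""
    n, k = X.shape[0], centers.shape[0]
    limit = -(-n // k)
    dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = np.full(n, -1, dtype=int)
    counts = np.zeros(k, dtype=int)
    # Points closest to a center are assigned first.
    for i in np.argsort(dist.min(axis=1)):
        for c in np.argsort(dist[i]):
            if counts[c] < limit:
                labels[i] = c
                counts[c] += 1
                break
    # Inertia: sum of squared distances to the assigned centers.
    inertia = float(dist[np.arange(n), labels].sum())
    return labels, inertia


def constrained_kmeans_sketch(X, centers, max_iter=20):
    """Alternates balanced assignment and center update, keeps the best solution."""
    centers = np.array(centers, dtype=float)
    best_labels, best_centers, best_inertia = None, None, np.inf
    it = 0
    for it in range(1, max_iter + 1):
        labels, inertia = _balanced_labels(X, centers)
        if inertia < best_inertia:
            best_labels, best_centers, best_inertia = labels, centers.copy(), inertia
        # Each center moves to the mean of its points (unchanged if the cluster is empty).
        new_centers = np.vstack([
            X[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(centers.shape[0])
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return best_labels, best_centers, best_inertia, it


if __name__ == "__main__":
    X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.],
                  [10., 10.], [10., 11.], [11., 10.], [11., 11.]])
    print(constrained_kmeans_sketch(X, [[0., 0.], [10., 10.]]))
```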


papierstat.mltricks.kmeans_constraint_.constraint_predictions(X, centers, strategy, state=None)[source]

Computes the predictions while trying to assign the same number of points to each cluster.

Parameters:
  • X – features
  • centers – center of each cluster
  • strategy – strategy used to sort points before mapping them to a cluster
  • state – random state
Returns:

labels, distances, distances_close


papierstat.mltricks.kmeans_constraint_.linearize_matrix(mat, *adds)[source]

Linearizes a matrix into a new one with 3 columns value, row, column. The output format is similar to csr_matrix but null values are kept.

Parameters:
  • mat – matrix
  • adds – additional square matrices
Returns:

new matrix

adds defines additional matrices; each one adds a column on the right side, filled with the corresponding values taken from that matrix.
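The output layout described above can be illustrated with a short sketch. It is not the papierstat code: the name linearize_sketch is hypothetical, the rows are assumed to be emitted in row-major order, and every additional matrix is assumed to have the same shape as mat.

```python
import numpy as np


def linearize_sketch(mat, *adds):
    """Hypothetical re-implementation: one output row per cell of mat.

    Columns are value, row index, column index, followed by one column
    per additional matrix. Null values are kept.
    """
    mat = np.asarray(mat, dtype=float)
    rows, cols = np.indices(mat.shape)
    columns = [mat.ravel(), rows.ravel(), cols.ravel()]
    for add in adds:
        add = np.asarray(add, dtype=float)
        if add.shape != mat.shape:
            # Assumption of this sketch: additional matrices match mat's shape.
            raise ValueError("additional matrices must have the same shape as mat")
        columns.append(add.ravel())
    return np.column_stack(columns)


if __name__ == "__main__":
    mat = [[1, 0], [3, 4]]
    extra = [[10, 20], [30, 40]]
    print(linearize_sketch(mat))
    print(linearize_sketch(mat, extra))
```

With a 2x2 input, the result has four rows, one per cell, including the cell whose value is zero.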
