module mltricks.kmeans_constraint_

Short summary

module papierstat.mltricks.kmeans_constraint_

Implements the ConstraintKMeans class.

source on GitHub

Functions

function – truncated documentation

  • _compute_strategy_coefficient – Creates a matrix

  • _constraint_association – Completes the constraint k-means.

  • _constraint_association_distance – Completes the constraint k-means.

  • _constraint_association_gain – Completes the constraint k-means.

  • constraint_kmeans – Completes the constraint k-means.

  • constraint_predictions – Computes the predictions but tries to assign the same number of points to each cluster.

  • linearize_matrix – Linearizes a matrix into a new one with 3 columns: value, row, column. The output format is similar to csr_matrix

Documentation

papierstat.mltricks.kmeans_constraint_._compute_strategy_coefficient(distances, strategy, labels)[source]

Creates a matrix

papierstat.mltricks.kmeans_constraint_._constraint_association(leftover, counters, labels, leftclose, distances_close, centers, X, x_squared_norms, limit, strategy, state=None)[source]

Completes the constraint k-means.

Parameters
  • X – features

  • labels – initialized labels (unused), passed as an allocated array

  • centers – initialized centers

  • x_squared_norms – squared norms of the rows of X

  • limit – number of points to associate per cluster

  • leftover – number of points to associate at the end

  • counters – allocated array

  • leftclose – allocated array

  • distances_close – allocated array

  • strategy – strategy used to sort points before mapping them to a cluster

  • state – random state

papierstat.mltricks.kmeans_constraint_._constraint_association_distance(leftover, counters, labels, leftclose, distances_close, centers, X, x_squared_norms, limit, strategy, state=None)[source]

Completes the constraint k-means.

Parameters
  • X – features

  • labels – initialized labels (unused), passed as an allocated array

  • centers – initialized centers

  • x_squared_norms – squared norms of the rows of X

  • limit – number of points to associate per cluster

  • leftover – number of points to associate at the end

  • counters – allocated array

  • leftclose – allocated array

  • distances_close – allocated array

  • strategy – strategy used to sort points before mapping them to a cluster

  • state – random state (unused)

papierstat.mltricks.kmeans_constraint_._constraint_association_gain(leftover, counters, labels, leftclose, distances_close, centers, X, x_squared_norms, limit, strategy, state=None)[source]

Completes the constraint k-means.

Parameters
  • X – features

  • labels – initialized labels (unused), passed as an allocated array

  • centers – initialized centers

  • x_squared_norms – squared norms of the rows of X

  • limit – number of points to associate per cluster

  • leftover – number of points to associate at the end

  • counters – allocated array

  • leftclose – allocated array

  • distances_close – allocated array

  • strategy – strategy used to sort points before mapping them to a cluster

  • state – random state

See Same-size k-Means Variation.
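The idea of sorting points before mapping them to clusters can be illustrated with a small sketch. It is loosely inspired by the same-size k-means variation referenced above and is not the code of this module: the function name same_size_init_sketch, the priority formula (distance to the nearest center minus distance to the farthest one) and the limit computed as the ceiling of n / k are all assumptions made for illustration. Whether the 'gain' strategy computes exactly this quantity is not stated in the documentation.

```python
import numpy


def same_size_init_sketch(distances):
    """Hypothetical helper: balanced labels from an (n, k) distance matrix."""
    distances = numpy.asarray(distances, dtype=float)
    n, k = distances.shape
    limit = -(-n // k)  # ceiling of n / k, assumed cluster capacity
    # Benefit of the best over the worst assignment: the most negative
    # values belong to points that lose the most if they are misplaced.
    priority = distances.min(axis=1) - distances.max(axis=1)
    order = numpy.argsort(priority, kind="stable")
    labels = numpy.full(n, -1, dtype=int)
    counters = numpy.zeros(k, dtype=int)
    for i in order:
        # Try clusters from closest to farthest, keep the first with room.
        for c in numpy.argsort(distances[i], kind="stable"):
            if counters[c] < limit:
                labels[i] = c
                counters[c] += 1
                break
    return labels


# Squared distances of the points 0, 1, 2, 5 to the centers 0 and 5.
distances = numpy.array([[0., 25.], [1., 16.], [4., 9.], [25., 0.]])
labels = same_size_init_sketch(distances)
```

In this example an unconstrained assignment would put three points in the first cluster; the sketch moves the point with the smallest loss to the second cluster so that both hold two points.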

papierstat.mltricks.kmeans_constraint_.constraint_kmeans(X, labels, sample_weight, centers, inertia, precompute_distances, iter, max_iter, strategy='gain', verbose=0, state=None, fLOG=None)[source]

Completes the constraint k-means.

Parameters
  • X – features

  • labels – initialized labels (unused)

  • sample_weight – sample weights

  • centers – initialized centers

  • inertia – initialized inertia (unused)

  • precompute_distances – precompute distances (used in _label_inertia)

  • iter – number of iterations already done

  • max_iter – maximum number of iterations

  • strategy – strategy used to sort observations before mapping them to clusters

  • verbose – verbosity level

  • state – random state

  • fLOG – logging function (must be specified, otherwise verbose has no effect)

Returns

tuple (best_labels, best_centers, best_inertia, iter)

papierstat.mltricks.kmeans_constraint_.constraint_predictions(X, centers, strategy, state=None)[source]

Computes the predictions but tries to assign the same number of points to each cluster.

Parameters
  • X – features

  • centers – centers of the clusters

  • strategy – strategy used to sort points before mapping them to a cluster

  • state – random state

Returns

labels, distances, distances_close
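A balanced prediction of this kind can be sketched with a simple greedy rule: visit all (point, cluster) pairs by increasing distance and accept a pair when the point is still free and the cluster is not full. This is only an illustration under stated assumptions, not the implementation of constraint_predictions: the name balanced_labels_sketch, the greedy rule and the capacity (ceiling of n / k) are hypothetical, the strategy and state parameters are ignored, and only labels and distances are returned (not distances_close).

```python
import numpy


def balanced_labels_sketch(X, centers):
    """Hypothetical helper: labels with at most ceil(n / k) points per cluster."""
    X = numpy.asarray(X, dtype=float)
    centers = numpy.asarray(centers, dtype=float)
    n, k = X.shape[0], centers.shape[0]
    limit = -(-n // k)  # ceiling of n / k, assumed cluster capacity
    # Squared Euclidean distance between every point and every center.
    diff = X[:, None, :] - centers[None, :, :]
    distances = (diff ** 2).sum(axis=2)
    labels = numpy.full(n, -1, dtype=int)
    counters = numpy.zeros(k, dtype=int)
    # Flat indices of the (point, cluster) pairs, closest pairs first.
    for flat in numpy.argsort(distances, axis=None, kind="stable"):
        i, c = divmod(int(flat), k)
        if labels[i] == -1 and counters[c] < limit:
            labels[i] = c
            counters[c] += 1
    return labels, distances


X = numpy.array([[0.0], [0.1], [0.2], [5.0]])
centers = numpy.array([[0.0], [5.0]])
labels, distances = balanced_labels_sketch(X, centers)
```

Because the total capacity k * limit is at least n, every point receives a label by the time all pairs have been visited.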

papierstat.mltricks.kmeans_constraint_.linearize_matrix(mat, *adds)[source]

Linearizes a matrix into a new one with 3 columns: value, row, column. The output format is similar to csr_matrix but null values are kept.

Parameters
  • mat – matrix

  • adds – additional square matrices

Returns

new matrix

adds defines additional matrices: each one adds a column on the right side, filled with the corresponding values taken from that matrix.
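The layout described above can be reproduced with a few NumPy calls. The sketch below is an illustration, not the code of linearize_matrix: the name linearize_sketch, the row-major ordering of the output rows and the requirement that every additional matrix has the same shape as mat are assumptions.

```python
import numpy


def linearize_sketch(mat, *adds):
    """Hypothetical helper: one output row (value, row, column, extras) per cell."""
    mat = numpy.asarray(mat, dtype=float)
    # Row and column index of every cell, in row-major order.
    rows, cols = numpy.indices(mat.shape)
    columns = [mat.ravel(), rows.ravel(), cols.ravel()]
    for add in adds:
        add = numpy.asarray(add, dtype=float)
        if add.shape != mat.shape:
            raise ValueError("additional matrices must have the same shape as mat")
        # Each additional matrix contributes one extra column on the right.
        columns.append(add.ravel())
    return numpy.column_stack(columns)


mat = numpy.array([[1.0, 0.0], [3.0, 4.0]])
extra = numpy.array([[10.0, 20.0], [30.0, 40.0]])
lin = linearize_sketch(mat, extra)
```

Here lin has one row per cell of mat, including the cell holding 0, and a fourth column with the values of extra at the same positions.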
