module mlmodel._kmeans_022
Short summary#
module mlinsights.mlmodel._kmeans_022
Implements k-means with norms L1 and L2.
Functions#
function | truncated documentation |
---|---|
_assign_labels_array | Compute label assignment and inertia for a dense array. Return the inertia (sum of squared distances to the centers). … |
_assign_labels_csr | Compute label assignment and inertia for a CSR input. Return the inertia (sum of squared distances to the centers). |
_centers_dense | M step of the K-means EM algorithm. Computation of cluster centers / means. |
_centers_sparse | M step of the K-means EM algorithm. Computation of cluster centers / means. |
_labels_inertia_precompute_dense | Computes labels and inertia using a full distance matrix. This will overwrite the ‘distances’ array in-place. |
_labels_inertia_skl | E step of the K-means EM algorithm. Compute the labels and the inertia of the given samples and centers. This will … |
Documentation#
Implements k-means with norms L1 and L2.
- mlinsights.mlmodel._kmeans_022._assign_labels_array(X, sample_weight, x_squared_norms, centers, labels, distances)#
Compute label assignment and inertia for a dense array. Return the inertia (sum of squared distances to the centers).
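As an illustration of what a dense label assignment computes, here is a minimal NumPy sketch. It is not the library's implementation: the helper name `assign_labels_dense_sketch` is hypothetical, and the sketch assumes the squared Euclidean (L2) distance, expanded as ‖x‖² − 2x·c + ‖c‖² so that precomputed `x_squared_norms` can be reused.

```python
import numpy as np


def assign_labels_dense_sketch(X, sample_weight, x_squared_norms, centers):
    """Hypothetical helper: nearest-center labels and weighted inertia (L2)."""
    # Squared distances via ||x||^2 - 2 x.c + ||c||^2,
    # shape (n_samples, n_clusters).
    center_sq_norms = (centers ** 2).sum(axis=1)
    sq_dist = (x_squared_norms[:, None]
               - 2.0 * X @ centers.T
               + center_sq_norms[None, :])
    # Rounding can produce tiny negative values; clip them to zero.
    np.maximum(sq_dist, 0.0, out=sq_dist)
    labels = sq_dist.argmin(axis=1)
    min_sq_dist = sq_dist[np.arange(X.shape[0]), labels]
    inertia = float((sample_weight * min_sq_dist).sum())
    return labels, min_sq_dist, inertia


X = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 10.0]])
centers = np.array([[0.0, 0.0], [10.0, 10.0]])
weights = np.ones(X.shape[0])
labels, min_sq_dist, inertia = assign_labels_dense_sketch(
    X, weights, (X ** 2).sum(axis=1), centers)
```

In this toy input the first two samples fall in cluster 0 and the third in cluster 1, so only the second sample contributes to the inertia.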
- mlinsights.mlmodel._kmeans_022._assign_labels_csr(X, sample_weight, x_squared_norms, centers, labels, distances)#
Compute label assignment and inertia for a CSR input. Return the inertia (sum of squared distances to the centers).
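The same assignment can be sketched for CSR input by touching only the stored entries of each row. The helper below is hypothetical and is not the library's code. It takes the three raw CSR arrays (the `data`, `indices` and `indptr` attributes of a `scipy.sparse.csr_matrix`) so that it runs with NumPy alone, and it assumes squared Euclidean distances.

```python
import numpy as np


def assign_labels_csr_sketch(data, indices, indptr, sample_weight,
                             x_squared_norms, centers):
    """Hypothetical helper: labels and weighted inertia for CSR rows (L2)."""
    n_samples = len(indptr) - 1
    center_sq_norms = (centers ** 2).sum(axis=1)
    labels = np.empty(n_samples, dtype=np.int32)
    inertia = 0.0
    for i in range(n_samples):
        start, end = indptr[i], indptr[i + 1]
        # Dot product of row i with every center, using stored entries only.
        dots = centers[:, indices[start:end]] @ data[start:end]
        sq_dist = x_squared_norms[i] - 2.0 * dots + center_sq_norms
        labels[i] = sq_dist.argmin()
        # Clip tiny negative values caused by rounding.
        inertia += sample_weight[i] * max(sq_dist[labels[i]], 0.0)
    return labels, float(inertia)


# CSR encoding of [[0, 0], [1, 0], [10, 10]]: row 0 stores no entries.
data = np.array([1.0, 10.0, 10.0])
indices = np.array([0, 0, 1])
indptr = np.array([0, 0, 1, 3])
centers = np.array([[0.0, 0.0], [10.0, 10.0]])
x_squared_norms = np.array([0.0, 1.0, 200.0])
labels, inertia = assign_labels_csr_sketch(
    data, indices, indptr, np.ones(3), x_squared_norms, centers)
```

The per-row loop keeps the sketch readable; a production version would normally run this loop in compiled code.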
- mlinsights.mlmodel._kmeans_022._centers_dense(X, sample_weight, labels, n_clusters, distances)#
M step of the K-means EM algorithm. Computes the cluster centers (means).
- Parameters:
X – array-like, shape (n_samples, n_features)
sample_weight – array-like, shape (n_samples,) The weights for each observation in X.
labels – array of integers, shape (n_samples,) Current label assignment.
n_clusters – int Number of desired clusters.
distances – array-like, shape (n_samples,) Distance to closest cluster for each sample.
- Returns:
centers : array, shape (n_clusters, n_features) The resulting centers.
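A minimal sketch of a weighted M step on dense data follows. The helper name `centers_dense_sketch` is hypothetical and the code is not taken from the library. It assumes each center is the weighted mean of its assigned samples, ignores the `distances` argument, and leaves empty clusters at zero.

```python
import numpy as np


def centers_dense_sketch(X, sample_weight, labels, n_clusters):
    """Hypothetical helper: weighted mean of the samples in each cluster."""
    centers = np.zeros((n_clusters, X.shape[1]))
    weight_sums = np.zeros(n_clusters)
    # Accumulate weighted sums per cluster; np.add.at handles repeated labels.
    np.add.at(centers, labels, X * sample_weight[:, None])
    np.add.at(weight_sums, labels, sample_weight)
    # Empty clusters keep a zero center in this sketch (no reassignment).
    nonempty = weight_sums > 0
    centers[nonempty] /= weight_sums[nonempty, None]
    return centers


X = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 10.0]])
sample_weight = np.array([1.0, 3.0, 1.0])
labels = np.array([0, 0, 1])
centers = centers_dense_sketch(X, sample_weight, labels, n_clusters=2)
```

Because the second sample carries three times the weight of the first, the center of cluster 0 is pulled toward it.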
- mlinsights.mlmodel._kmeans_022._centers_sparse(X, sample_weight, labels, n_clusters, distances)#
M step of the K-means EM algorithm. Computes the cluster centers (means).
- Parameters:
X – scipy.sparse.csr_matrix, shape (n_samples, n_features)
sample_weight – array-like, shape (n_samples,) The weights for each observation in X.
labels – array of integers, shape (n_samples,) Current label assignment.
n_clusters – int Number of desired clusters.
distances – array-like, shape (n_samples,) Distance to closest cluster for each sample.
- Returns:
centers : array, shape (n_clusters, n_features) The resulting centers.
- mlinsights.mlmodel._kmeans_022._labels_inertia_precompute_dense(norm, X, sample_weight, centers, distances)#
Computes labels and inertia using a full distance matrix.
This will overwrite the ‘distances’ array in-place.
- Parameters:
norm – ‘L1’ or ‘L2’
X – numpy array, shape (n_samples, n_features) Input data.
sample_weight – array-like, shape (n_samples,) The weights for each observation in X.
centers – numpy array, shape (n_clusters, n_features) Cluster centers which data is assigned to.
distances – numpy array, shape (n_samples,) Pre-allocated array in which distances are stored.
- Returns:
labels : numpy array, dtype=numpy.int, shape (n_samples,) Indices of clusters that samples are assigned to.
inertia : float Sum of squared distances of samples to their closest cluster center.
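To make the role of the `norm` argument concrete, here is a hedged NumPy sketch that builds the full sample-to-center distance matrix and overwrites a pre-allocated `distances` buffer. The helper name is hypothetical. The sketch assumes that ‘L2’ means squared Euclidean distance and that ‘L1’ means the plain sum of absolute differences; the library's exact conventions may differ.

```python
import numpy as np


def labels_inertia_full_matrix_sketch(norm, X, sample_weight, centers,
                                      distances):
    """Hypothetical helper: labels and inertia from an all-pairs matrix."""
    # Pairwise differences, shape (n_samples, n_clusters, n_features).
    diff = X[:, None, :] - centers[None, :, :]
    if norm == "L2":
        all_dist = (diff ** 2).sum(axis=2)   # squared Euclidean distance
    elif norm == "L1":
        all_dist = np.abs(diff).sum(axis=2)  # Manhattan distance
    else:
        raise ValueError(f"norm must be 'L1' or 'L2', got {norm!r}")
    labels = all_dist.argmin(axis=1).astype(np.int32)
    min_dist = all_dist[np.arange(X.shape[0]), labels]
    distances[:] = min_dist  # overwrite the caller's buffer in place
    inertia = float((min_dist * sample_weight).sum())
    return labels, inertia


X = np.array([[0.0, 0.0], [5.0, 1.0]])
centers = np.array([[5.0, 0.0], [3.0, 3.0]])
weights = np.ones(2)

buf_l1 = np.empty(2)
labels_l1, inertia_l1 = labels_inertia_full_matrix_sketch(
    "L1", X, weights, centers, buf_l1)

buf_l2 = np.empty(2)
labels_l2, inertia_l2 = labels_inertia_full_matrix_sketch(
    "L2", X, weights, centers, buf_l2)
```

The first sample is assigned to a different center under each norm, which shows that the choice of norm can change the clustering and not only the inertia value.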
- mlinsights.mlmodel._kmeans_022._labels_inertia_skl(X, sample_weight, x_squared_norms, centers, distances=None)#
E step of the K-means EM algorithm. Compute the labels and the inertia of the given samples and centers. This will compute the distances in-place.
- Parameters:
X – float64 array-like or CSR sparse matrix, shape (n_samples, n_features) The input samples to assign to the labels.
sample_weight – array-like, shape (n_samples,) The weights for each observation in X.
x_squared_norms – array, shape (n_samples,) Precomputed squared euclidean norm of each data point, to speed up computations.
centers – float array, shape (k, n_features) The cluster centers.
distances – float array, shape (n_samples,) Pre-allocated array to be filled in with each sample’s distance to the closest center.
- Returns:
labels : int array, shape (n_samples,) The resulting assignment.
inertia : float Sum of squared distances of samples to their closest cluster center.
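To show how an E step of this kind alternates with an M step, the sketch below runs a few iterations on toy data. Both `e_step` and `m_step` are hypothetical stand-ins written for this example, not the functions documented on this page, and the sketch assumes squared Euclidean distances and weighted means.

```python
import numpy as np


def e_step(X, sample_weight, centers):
    """Hypothetical E step: nearest center (squared L2), weighted inertia."""
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = sq_dist.argmin(axis=1)
    closest = sq_dist[np.arange(len(X)), labels]
    inertia = float((sample_weight * closest).sum())
    return labels, inertia


def m_step(X, sample_weight, labels, centers):
    """Hypothetical M step: weighted mean per cluster; empty ones unchanged."""
    new_centers = centers.copy()
    for k in range(len(centers)):
        mask = labels == k
        if mask.any():
            new_centers[k] = np.average(
                X[mask], axis=0, weights=sample_weight[mask])
    return new_centers


X = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]])
sample_weight = np.ones(len(X))
centers = X[[0, 1]].copy()  # deliberately poor start: both in one blob

previous_labels = None
for iteration in range(20):
    labels, inertia = e_step(X, sample_weight, centers)
    if previous_labels is not None and np.array_equal(labels, previous_labels):
        break  # assignments stopped changing
    centers = m_step(X, sample_weight, labels, centers)
    previous_labels = labels
```

Stopping when the labels no longer change is one simple convergence test; comparing successive inertia values against a tolerance is a common alternative.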