Machine learning tricks

Around scikit-learn

papierstat.mltricks.SkBaseLearnerCategory (self, colnameind = None, model = None, **kwargs)

Base class for a learner that trains one learner per category (modality) of a categorical variable.
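The idea of one learner per category can be sketched as follows. This is a minimal, dependency-free illustration and not the implementation of `SkBaseLearnerCategory`: the names `PerCategoryLearner`, `MeanRegressor`, and `col` are hypothetical, and `copy.deepcopy` stands in for the model cloning that a scikit-learn based version would more likely do with `sklearn.base.clone`.

```python
import copy
from collections import defaultdict


class MeanRegressor:
    """Tiny stand-in model (hypothetical): predicts the mean training target."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class PerCategoryLearner:
    """Hypothetical sketch: one copy of `model` per value of column `col`."""

    def __init__(self, col, model):
        self.col = col
        self.model = model

    def _features(self, row):
        # Drop the category column; the remaining values are the features.
        return [v for j, v in enumerate(row) if j != self.col]

    def fit(self, X, y):
        # Group row indices by the value found in the category column.
        groups = defaultdict(list)
        for i, row in enumerate(X):
            groups[row[self.col]].append(i)
        # Fit an independent copy of the model on each group.
        self.models_ = {}
        for cat, idx in groups.items():
            Xc = [self._features(X[i]) for i in idx]
            yc = [y[i] for i in idx]
            self.models_[cat] = copy.deepcopy(self.model).fit(Xc, yc)
        return self

    def predict(self, X):
        # Route each row to the model trained for its category.
        return [
            self.models_[row[self.col]].predict([self._features(row)])[0]
            for row in X
        ]


X = [["a", 1.0], ["a", 2.0], ["b", 3.0], ["b", 4.0]]
y = [10.0, 20.0, 100.0, 200.0]
learner = PerCategoryLearner(col=0, model=MeanRegressor()).fit(X, y)
predictions = learner.predict([["a", 5.0], ["b", 5.0]])
```

In this sketch a category that was not seen during training raises a `KeyError` at prediction time; a real implementation would have to decide how to handle that case.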

Notebooks related to this learner

Custom algorithms

papierstat.mltricks.ConstraintKMeans (self, n_clusters = 8, init = "k-means++", n_init = 10, max_iter = 300, tol = 0.0001, precompute_distances = "auto", verbose = 0, random_state = None, copy_x = True, n_jobs = 1, algorithm = "auto", balanced_predictions = False, strategy = "gain", kmeans0 = True)

Defines a constrained k-means. Clusters are modified so that they have equal sizes. The algorithm is initialized with a regular k-means and then continues with a modified version of it.

Predictions can be computed in two ways. The first keeps the prediction rule of the regular k-means algorithm, applied to the balanced clusters. The second computes balanced predictions over the test set. This implies that the prediction for a given observation may change depending on the set it belongs to.
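The effect of balanced predictions can be illustrated with a small sketch. It uses a simple greedy heuristic chosen for illustration (closest point-cluster pairs first, with a capacity of ceil(n / k) per cluster); this is an assumption and not necessarily what `ConstraintKMeans` does, and the function name `balanced_assign` is hypothetical.

```python
import math


def balanced_assign(points, centers):
    """Greedy balanced assignment: each cluster gets at most ceil(n / k) points."""
    n, k = len(points), len(centers)
    capacity = math.ceil(n / k)
    # All (distance, point index, cluster index) triples, closest first.
    pairs = sorted(
        (math.dist(p, c), i, j)
        for i, p in enumerate(points)
        for j, c in enumerate(centers)
    )
    labels = [None] * n
    sizes = [0] * k
    for _, i, j in pairs:
        # Assign a point to the closest cluster that still has room.
        if labels[i] is None and sizes[j] < capacity:
            labels[i] = j
            sizes[j] += 1
    return labels


centers = [(0.0, 1.0), (10.0, 0.0)]
points = [(0.0, 0.0), (0.0, 1.0), (0.0, 2.0), (10.0, 0.0)]

# Within the full set, the point (0, 2) is pushed to the second cluster
# because the first one is already full.
labels_full = balanced_assign(points, centers)

# Predicted alone, the same point goes to its nearest center.
labels_single = balanced_assign([(0.0, 2.0)], centers)
```

Here the observation (0, 2) receives label 1 when predicted together with the other three points and label 0 when predicted alone, which is the set dependence described above.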