module mlmodel.piecewise_estimator#

Inheritance diagram of mlinsights.mlmodel.piecewise_estimator

Short summary#

module mlinsights.mlmodel.piecewise_estimator

Implements a piecewise linear regression.

source on GitHub

Classes#

class – truncated documentation

  • PiecewiseClassifier – Uses a decision tree to split the space of features into buckets and trains a logistic regression (default) …

  • PiecewiseEstimator – Uses a decision tree to split the space of features into buckets and trains a linear regression on each of them. …

  • PiecewiseRegressor – Uses a decision tree to split the space of features into buckets and trains a linear regression (default) on …

Functions#

function

truncated documentation

_decision_function_piecewise_estimator

_fit_piecewise_estimator

_predict_piecewise_estimator

_predict_proba_piecewise_estimator

Properties#

property – truncated documentation

  • _repr_html_ – HTML representation of estimator. This is redundant with the logic of _repr_mimebundle_. The latter should …

  • n_estimators_ – Returns the number of estimators, which equals the number of buckets the data was split into.

Methods#

method – truncated documentation

  • __init__

  • _apply_predict_method – Generic predict method, works for predict_proba and decision_function as well.

  • _mapping_train

  • decision_function – Computes the decision function values.

  • fit – Trains the binner and an estimator on every bucket.

  • predict – Computes the predictions.

  • predict_proba – Computes the predicted probabilities.

  • transform_bins – Maps every row to its bucket, that is, to an estimator in self.estimators_.

Documentation#

Implements a piecewise linear regression.


class mlinsights.mlmodel.piecewise_estimator.PiecewiseClassifier(binner=None, estimator=None, n_jobs=None, random_state=None, verbose=False)#

Bases: PiecewiseEstimator, ClassifierMixin

Uses a decision tree to split the space of features into buckets and trains a logistic regression (default) on each of them. The second estimator is usually a sklearn.linear_model.LogisticRegression. It can also be sklearn.dummy.DummyClassifier to just get the average on each bucket.

The main issue with PiecewiseClassifier is that every bucket must contain at least one example of each class, which may not happen. To avoid that, training picks random examples from other buckets and adds them to any bucket that lacks a class.
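The bucket completion described above can be sketched in a few lines. This is a minimal illustration, not the code used by mlinsights: the helper complete_buckets, its arguments and the toy data are all hypothetical, and only the standard library is used.

```python
import random
from collections import defaultdict


def complete_buckets(labels, buckets, seed=0):
    """Return, for each bucket, the row indices to train on.

    Hypothetical helper: when a bucket lacks a class, one randomly chosen
    row of that class (necessarily from another bucket) is appended.
    """
    rng = random.Random(seed)
    classes = sorted(set(labels))
    rows_by_bucket = defaultdict(list)
    rows_by_class = defaultdict(list)
    for row, (label, bucket) in enumerate(zip(labels, buckets)):
        rows_by_bucket[bucket].append(row)
        rows_by_class[label].append(row)

    completed = {}
    for bucket, rows in rows_by_bucket.items():
        present = {labels[r] for r in rows}
        # Borrow one example for every class missing from this bucket.
        extra = [rng.choice(rows_by_class[c]) for c in classes if c not in present]
        completed[bucket] = rows + extra
    return completed


labels = [0, 0, 1, 1, 1, 0]
buckets = ["a", "a", "a", "b", "b", "c"]
training_rows = complete_buckets(labels, buckets)
```

Bucket "a" already holds both classes and is left unchanged, while buckets "b" and "c" each receive one borrowed row. Fixing the seed plays the role that random_state plays in the class: it makes the borrowed rows reproducible.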


Parameters:
  • binner – transformer or predictor which creates the buckets

  • estimator – predictor trained on every bucket

  • n_jobs – number of parallel jobs (for training and predicting)

  • random_state – random state used to pick random examples when a bucket does not contain examples of every class

  • verbose – boolean, or 'tqdm' to show a tqdm progress bar while fitting the estimators

binner allows the following values:

estimator allows the following values:

__init__(binner=None, estimator=None, n_jobs=None, random_state=None, verbose=False)#

decision_function(X)#

Computes the decision function values.

Parameters:

X – features, X is converted into an array if X is a dataframe

Returns:

decision function values

predict(X)#

Computes the predictions.

Parameters:

X – features, X is converted into an array if X is a dataframe

Returns:

predictions


predict_proba(X)#

Computes the predicted probabilities.

Parameters:

X – features, X is converted into an array if X is a dataframe

Returns:

predicted probabilities

class mlinsights.mlmodel.piecewise_estimator.PiecewiseEstimator(binner=None, estimator=None, n_jobs=None, verbose=False)#

Bases: BaseEstimator

Uses a decision tree to split the space of features into buckets and trains a linear regression on each of them. The second estimator can be a sklearn.linear_model.LinearRegression for a regression or a sklearn.linear_model.LogisticRegression for a classifier. It can also be a sklearn.dummy.DummyRegressor or a sklearn.dummy.DummyClassifier to just get the average on each bucket. When the buckets are defined by a decision tree and the estimator is linear, PiecewiseTreeRegressor optimizes the buckets based on the results of a linear regression, which usually gives better accuracy.


Parameters:
  • binner – transformer or predictor which creates the buckets

  • estimator – predictor trained on every bucket

  • n_jobs – number of parallel jobs (for training and predicting)

  • verbose – boolean, or 'tqdm' to show a tqdm progress bar while fitting the estimators

binner must be filled or must be:

estimator allows the following values:

__init__(binner=None, estimator=None, n_jobs=None, verbose=False)#

_apply_predict_method(X, method, parallelized, dimout)#

Generic predict method, works for predict_proba and decision_function as well.


_mapping_train(X, binner)#

fit(X, y, sample_weight=None)#

Trains the binner and an estimator on every bucket.

Parameters:
  • X – features, X is converted into an array if X is a dataframe

  • y – target

  • sample_weight – sample weights

Returns:

self: returns an instance of self.

Fitted attributes:

  • binner_: binner

  • estimators_: dictionary of estimators, each of them mapped to a leaf of the tree

  • mean_estimator_: estimator trained on the whole dataset, used in case the binner cannot find a bucket for a new observation

  • dim_: dimension of the output

  • mean_: average targets
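The role of mean_estimator_ as a fallback can be pictured as follows. This is a hedged sketch, not the library's code: plain Python callables keyed by a bucket id stand in for the fitted estimators, and predict_one is a hypothetical helper.

```python
# Hypothetical stand-ins for the fitted attributes: plain callables
# keyed by bucket id instead of scikit-learn estimators.
estimators = {
    0: lambda x: 2.0 * x,   # trained on bucket 0
    1: lambda x: 10.0 - x,  # trained on bucket 1
}


def mean_estimator(x):
    """Stand-in for the estimator trained on the whole dataset."""
    return 5.0


def predict_one(bucket, x):
    """Use the bucket's estimator when it exists, otherwise the fallback."""
    return estimators.get(bucket, mean_estimator)(x)


known = predict_one(0, 3.0)    # bucket 0 has its own estimator
unknown = predict_one(7, 3.0)  # no estimator was trained for bucket 7
```

Here the call for bucket 7 silently falls back to the global estimator instead of failing, which is the behaviour the attribute description suggests for observations the binner cannot place.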


property n_estimators_#

Returns the number of estimators, which equals the number of buckets the data was split into.

transform_bins(X)#

Maps every row to its bucket, that is, to an estimator in self.estimators_.

class mlinsights.mlmodel.piecewise_estimator.PiecewiseRegressor(binner=None, estimator=None, n_jobs=None, verbose=False)#

Bases: PiecewiseEstimator, RegressorMixin

Uses a decision tree to split the space of features into buckets and trains a linear regression (default) on each of them. The second estimator is usually a sklearn.linear_model.LinearRegression. It can also be sklearn.dummy.DummyRegressor to just get the average on each bucket.
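As a rough illustration of the idea, the sketch below fits one least-squares line per bucket on a one-dimensional toy problem. It does not use mlinsights or scikit-learn: hand-picked thresholds (edges) stand in for the decision tree, and fit_line, fit_piecewise and predict_piecewise are hypothetical names introduced for this example.

```python
from bisect import bisect_right


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b on a single bucket."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        # Degenerate bucket: fall back to a constant prediction.
        return 0.0, mean_y
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov / var_x
    return a, mean_y - a * mean_x


def fit_piecewise(xs, ys, edges):
    """Group the points by bucket, then fit one line per bucket."""
    grouped = {}
    for x, y in zip(xs, ys):
        bucket = bisect_right(edges, x)  # index of the interval holding x
        bx, by = grouped.setdefault(bucket, ([], []))
        bx.append(x)
        by.append(y)
    return {bucket: fit_line(bx, by) for bucket, (bx, by) in grouped.items()}


def predict_piecewise(models, edges, x):
    """Predict with the line fitted on the bucket that contains x."""
    a, b = models[bisect_right(edges, x)]
    return a * x + b


# y = |x| is made of two linear pieces that meet at 0.
xs = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]
ys = [abs(x) for x in xs]
edges = [0.0]  # assumed split point, chosen by hand for the example
models = fit_piecewise(xs, ys, edges)
left = predict_piecewise(models, edges, -1.5)
right = predict_piecewise(models, edges, 2.5)
```

A single line fitted on all six points would have a slope of zero, whereas the two per-bucket lines recover slopes of -1 and 1. In the class itself the split points come from the binner rather than being fixed in advance.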


Parameters:
  • binner – transformer or predictor which creates the buckets

  • estimator – predictor trained on every bucket

  • n_jobs – number of parallel jobs (for training and predicting)

  • verbose – boolean, or 'tqdm' to show a tqdm progress bar while fitting the estimators

binner allows the following values:

estimator allows the following values:

__init__(binner=None, estimator=None, n_jobs=None, verbose=False)#

predict(X)#

Computes the predictions.

Parameters:

X – features, X is converted into an array if X is a dataframe

Returns:

predictions


mlinsights.mlmodel.piecewise_estimator._decision_function_piecewise_estimator(i, est, X, association)#
mlinsights.mlmodel.piecewise_estimator._fit_piecewise_estimator(i, model, X, y, sample_weight, association, nb_classes, random_state)#
mlinsights.mlmodel.piecewise_estimator._predict_piecewise_estimator(i, est, X, association)#
mlinsights.mlmodel.piecewise_estimator._predict_proba_piecewise_estimator(i, est, X, association)#