module mlmodel.piecewise_estimator
Short summary#
module mlinsights.mlmodel.piecewise_estimator
Implements a piecewise linear regression.
Classes#
class | truncated documentation |
---|---|
PiecewiseClassifier | Uses a decision tree to split the space of features into buckets and trains a logistic regression (default) … |
PiecewiseEstimator | Uses a decision tree to split the space of features into buckets and trains a linear regression on each of them. … |
PiecewiseRegressor | Uses a decision tree to split the space of features into buckets and trains a linear regression (default) on … |
Functions#
function |
truncated documentation |
---|---|
Properties#
property | truncated documentation |
---|---|
_repr_html_ | HTML representation of estimator. This is redundant with the logic of _repr_mimebundle_. The latter should … |
n_estimators_ | Returns the number of estimators = the number of buckets the data was split in. |
Methods#
method | truncated documentation |
---|---|
__init__ | |
_apply_predict_method | Generic predict method, works for predict_proba and decision_function as well. |
_mapping_train | |
decision_function | Computes the decision function. |
fit | Trains the binner and an estimator on every bucket. |
predict | Computes the predictions. |
predict_proba | Computes the prediction probabilities. |
transform_bins | Maps every row to a tree in self.estimators_. |
Documentation#
Implements a piecewise linear regression.
- class mlinsights.mlmodel.piecewise_estimator.PiecewiseClassifier(binner=None, estimator=None, n_jobs=None, random_state=None, verbose=False)#
Bases: PiecewiseEstimator, ClassifierMixin
Uses a decision tree to split the space of features into buckets and trains a logistic regression (default) on each of them. The second estimator is usually a sklearn.linear_model.LogisticRegression. It can also be sklearn.dummy.DummyClassifier to just get the average on each bucket.
The main issue with PiecewiseClassifier is that the estimator fitted on a bucket needs at least one example of each class in that bucket, which is not guaranteed. To avoid this, training picks random examples from other buckets so that every class is represented in every bucket.
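A minimal sketch of that idea, not the library's implementation: the hypothetical helper `complete_bucket` below takes the labels and a boolean mask marking one bucket, and adds one randomly chosen row from outside the bucket for every class the bucket lacks. How many rows the library actually borrows, and how it selects them, is an assumption here.

```python
import numpy as np


def complete_bucket(y, in_bucket, random_state=0):
    """Return the row indices of one bucket, extended so every class appears.

    Hypothetical helper: for each class missing from the bucket, one row of
    that class is drawn at random among the rows outside the bucket.
    """
    rng = np.random.RandomState(random_state)
    idx = np.flatnonzero(in_bucket)
    present = set(y[idx].tolist())
    extra = []
    for label in np.unique(y):
        if label not in present:
            candidates = np.flatnonzero((y == label) & ~in_bucket)
            extra.append(rng.choice(candidates))
    return np.concatenate([idx, np.array(extra, dtype=idx.dtype)])


# The bucket holds rows 0, 1, 2, which all belong to class 0.
y = np.array([0, 0, 0, 1, 1, 0])
in_bucket = np.array([True, True, True, False, False, False])
rows = complete_bucket(y, in_bucket)
```

After the call, `rows` contains the three original rows plus one borrowed row of class 1, so a classifier fitted on `y[rows]` sees both classes.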
- Parameters:
binner – transformer or predictor which creates the buckets
estimator – predictor trained on every bucket
n_jobs – number of parallel jobs (for training and predicting)
random_state – to pick up random examples when buckets do not contain enough examples of each class
verbose – boolean, or 'tqdm' to use tqdm to fit the estimators
binner allows the following values:
- 'tree': the model is sklearn.tree.DecisionTreeClassifier
- 'bins': the model is sklearn.preprocessing.KBinsDiscretizer
- any instantiated model
estimator allows the following values:
- None: the model is sklearn.linear_model.LogisticRegression
- any instantiated model
- __init__(binner=None, estimator=None, n_jobs=None, random_state=None, verbose=False)#
- decision_function(X)#
Computes the decision function, using the estimator fitted on the bucket each row falls in.
- Parameters:
X – features, X is converted into an array if X is a dataframe
- Returns:
decision function values
- predict(X)#
Computes the predictions.
- Parameters:
X – features, X is converted into an array if X is a dataframe
- Returns:
predictions
- predict_proba(X)#
Computes the prediction probabilities.
- Parameters:
X – features, X is converted into an array if X is a dataframe
- Returns:
prediction probabilities
- class mlinsights.mlmodel.piecewise_estimator.PiecewiseEstimator(binner=None, estimator=None, n_jobs=None, verbose=False)#
Bases: BaseEstimator
Uses a decision tree to split the space of features into buckets and trains a linear regression on each of them. The second estimator can be a sklearn.linear_model.LinearRegression for a regression or a sklearn.linear_model.LogisticRegression for a classifier. It can also be a sklearn.dummy.DummyRegressor or a sklearn.dummy.DummyClassifier to just get the average on each bucket. When the buckets are defined by a decision tree and the estimator is linear, PiecewiseTreeRegressor optimizes the buckets based on the results of a linear regression, which usually gives better accuracy.
- Parameters:
binner – transformer or predictor which creates the buckets
estimator – predictor trained on every bucket
n_jobs – number of parallel jobs (for training and predicting)
verbose – boolean, or 'tqdm' to use tqdm to fit the estimators
binner must be specified; it can be:
- 'bins': the model is sklearn.preprocessing.KBinsDiscretizer
- any instantiated model
estimator allows the following values:
- None: the model is sklearn.linear_model.LinearRegression
- any instantiated model
- __init__(binner=None, estimator=None, n_jobs=None, verbose=False)#
- _apply_predict_method(X, method, parallelized, dimout)#
Generic predict method, works for predict_proba and decision_function as well.
- _mapping_train(X, binner)#
- fit(X, y, sample_weight=None)#
Trains the binner and an estimator on every bucket.
- Parameters:
X – features, X is converted into an array if X is a dataframe
y – target
sample_weight – sample weights
- Returns:
self: returns an instance of self.
Fitted attributes:
- binner_: binner
- estimators_: dictionary of estimators, each of them mapped to a leaf of the tree
- mean_estimator_: estimator trained on the whole dataset, used in case the binner cannot find a bucket for a new observation
- dim_: dimension of the output
- mean_: average targets
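The fitted attributes can be pictured with plain scikit-learn objects. The sketch below is an illustration under assumptions, not the code of fit: it uses a shallow DecisionTreeRegressor as the binner, keys a dictionary of LinearRegression models by leaf index (playing the role of estimators_), and trains one fallback model on the whole dataset (playing the role of mean_estimator_). The data and the tree settings are made up for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(0)
X = rng.uniform(-1, 1, size=(200, 1))
y = np.abs(X[:, 0]) + rng.normal(scale=0.01, size=200)

# Binner: a shallow tree whose leaves define the buckets.
binner = DecisionTreeRegressor(max_depth=2, min_samples_leaf=20, random_state=0)
binner.fit(X, y)
leaves = binner.apply(X)  # leaf index of every training row

# One linear model per bucket, keyed by leaf index.
estimators = {
    leaf: LinearRegression().fit(X[leaves == leaf], y[leaves == leaf])
    for leaf in np.unique(leaves)
}

# Fallback model trained on the whole dataset.
mean_estimator = LinearRegression().fit(X, y)

# Compare the training error of the piecewise model and of the single model.
pred = np.empty_like(y)
for leaf, est in estimators.items():
    mask = leaves == leaf
    pred[mask] = est.predict(X[mask])
piecewise_mse = float(np.mean((pred - y) ** 2))
global_mse = float(np.mean((mean_estimator.predict(X) - y) ** 2))
```

On this V-shaped target a single line fits poorly, while one line per bucket follows each branch, so `piecewise_mse` comes out below `global_mse`.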
- property n_estimators_#
Returns the number of estimators = the number of buckets the data was split in.
- transform_bins(X)#
Maps every row to a tree in self.estimators_.
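One way to picture this mapping when the binner is a decision tree is to ask the tree for the leaf each row lands in and then renumber the leaves from 0. The sketch below assumes buckets are numbered by sorted leaf node id; the numbering that transform_bins itself uses is not stated in this page, so treat that choice as hypothetical.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# A step function: the tree needs a single split to separate the two levels.
X = np.arange(20, dtype=float).reshape(-1, 1)
y = (X[:, 0] >= 10).astype(float)

tree = DecisionTreeRegressor(max_depth=1).fit(X, y)

# Node ids of the leaves reached by the training rows, sorted.
leaf_ids = np.unique(tree.apply(X))

# Map new rows to a bucket number between 0 and len(leaf_ids) - 1.
X_new = np.array([[2.0], [15.0]])
bucket = np.searchsorted(leaf_ids, tree.apply(X_new))
```

Each value in `bucket` can then serve as the key that selects the estimator trained for that bucket.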
- class mlinsights.mlmodel.piecewise_estimator.PiecewiseRegressor(binner=None, estimator=None, n_jobs=None, verbose=False)#
Bases: PiecewiseEstimator, RegressorMixin
Uses a decision tree to split the space of features into buckets and trains a linear regression (default) on each of them. The second estimator is usually a sklearn.linear_model.LinearRegression. It can also be sklearn.dummy.DummyRegressor to just get the average on each bucket.
- Parameters:
binner – transformer or predictor which creates the buckets
estimator – predictor trained on every bucket
n_jobs – number of parallel jobs (for training and predicting)
verbose – boolean, or 'tqdm' to use tqdm to fit the estimators
binner allows the following values:
- 'tree': the model is sklearn.tree.DecisionTreeRegressor
- 'bins': the model is sklearn.preprocessing.KBinsDiscretizer
- any instantiated model
estimator allows the following values:
- None: the model is sklearn.linear_model.LinearRegression
- any instantiated model
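The 'bins' option can be illustrated with scikit-learn alone. The following is a sketch of the technique under stated assumptions, not the behaviour of PiecewiseRegressor itself: it works on a single feature, uses KBinsDiscretizer with ordinal encoding and uniform bins to assign buckets, fits one LinearRegression per bucket, and routes new rows to the model of their bucket. The number of bins, the strategy, and the helper name `predict` are choices made for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import KBinsDiscretizer

rng = np.random.RandomState(0)
X = rng.uniform(0, 4, size=(400, 1))
# A noiseless "tent": rises up to x = 2, then falls.
y = np.where(X[:, 0] < 2, X[:, 0], 4 - X[:, 0])

# Binner: four equal-width intervals, encoded as bucket numbers 0..3.
binner = KBinsDiscretizer(n_bins=4, encode="ordinal", strategy="uniform")
buckets = binner.fit_transform(X)[:, 0].astype(int)

# One linear regression per bucket.
models = {
    b: LinearRegression().fit(X[buckets == b], y[buckets == b])
    for b in np.unique(buckets)
}


def predict(X_new):
    """Route every row to the model fitted on its bucket."""
    b_new = binner.transform(X_new)[:, 0].astype(int)
    out = np.empty(X_new.shape[0])
    for b, model in models.items():
        mask = b_new == b
        if mask.any():
            out[mask] = model.predict(X_new[mask])
    return out


pred = predict(np.array([[0.5], [3.5]]))
```

Both query points sit on a straight branch of the tent, so each per-bucket line recovers the target there, which a single linear regression over the whole range could not do.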
- __init__(binner=None, estimator=None, n_jobs=None, verbose=False)#
- predict(X)#
Computes the predictions.
- Parameters:
X – features, X is converted into an array if X is a dataframe
- Returns:
predictions
- mlinsights.mlmodel.piecewise_estimator._decision_function_piecewise_estimator(i, est, X, association)#
- mlinsights.mlmodel.piecewise_estimator._fit_piecewise_estimator(i, model, X, y, sample_weight, association, nb_classes, random_state)#
- mlinsights.mlmodel.piecewise_estimator._predict_piecewise_estimator(i, est, X, association)#
- mlinsights.mlmodel.piecewise_estimator._predict_proba_piecewise_estimator(i, est, X, association)#