# module ml.roc

## Short summary

module mlstatpy.ml.roc

## Classes

| class | truncated documentation |
| --- | --- |
| `ROC` | Helper to draw a ROC curve |

## Properties

| property | truncated documentation |
| --- | --- |
| `Data` | returns the underlying dataframe |

## Methods

| method | truncated documentation |
| --- | --- |
| `__init__` | Initialisation with a dataframe and two or three columns: |
| `__len__` | usual |
| `__repr__` | show first elements, precision rate |
| `__str__` | show first elements, precision rate |
| `auc` | Computes the area under the curve. |
| `auc_interval` | Determines a confidence interval for the AUC with bootstrap. |
| `compute_roc_curve` | Computes a ROC curve with nb points; if nb == -1, there are as many points as the data contains, … |
| `confusion` | Computes the confusion matrix for a specific score or all if score is None. |
| `plot` | Plot a ROC curve. |
| `precision` | Computes the precision. |
| `random_cloud` | resample among the data |
| `roc_intersect` | The ROC curve is defined by a set of points. This function interpolates those points to determine … |
| `roc_intersect_interval` | Computes a confidence interval for the value returned by roc_intersect(). |

## Documentation

class mlstatpy.ml.roc.ROC(y_true=None, y_score=None, sample_weight=None, df=None)

Bases: object

Helper to draw a ROC curve

Initialisation with a dataframe and two or three columns:

• column 1: score (y_score)
• column 2: expected answer (boolean) (y_true)
• column 3: weight (optional) (sample_weight)

Parameters:

• y_true – whether or not the answer is true, that is, whether the prediction is right; if df is None, y_true, y_score and sample_weight must be filled
• y_score – prediction score
• sample_weight – weights
• df – dataframe, array or list; it must contain 2 or 3 columns, always in the same order

class CurveType

Bases: enum.Enum

Curve types

• PROBSCORE: 1 - False Positive / True Positive
• ERRPREC: error / recall
• RECPREC: precision / recall
• ROC: False Positive / True Positive
• SKROC: False Positive / True Positive (scikit-learn)

Data

returns the underlying dataframe

__init__(y_true=None, y_score=None, sample_weight=None, df=None)

__len__()

usual

__repr__()

Shows the first elements and the precision rate.

__str__()

Shows the first elements and the precision rate.

auc(cloud=None)

Computes the area under the curve.

Parameters:

• cloud – data or None to use self.data; the function assumes the data is sorted

Returns: AUC

The first column is the label, the second one is the score, the third one is the weight.
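
As an illustration of what an area under the ROC curve measures, here is a minimal pure-Python sketch. It is not the mlstatpy implementation: the function name `auc_pairwise` and its argument layout are hypothetical, and it uses the pairwise definition (the weighted probability that a positive example receives a higher score than a negative one, ties counting for one half) rather than whatever method `auc` uses internally.

```python
def auc_pairwise(labels, scores, weights=None):
    """Weighted AUC from its pairwise definition (illustrative sketch only).

    labels  -- booleans or 0/1, one per observation
    scores  -- numeric scores, one per observation
    weights -- optional positive weights, one per observation
    """
    if weights is None:
        weights = [1.0] * len(labels)
    pos = [(s, w) for y, s, w in zip(labels, scores, weights) if y]
    neg = [(s, w) for y, s, w in zip(labels, scores, weights) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    total = 0.0
    for s_pos, w_pos in pos:
        for s_neg, w_neg in neg:
            if s_pos > s_neg:
                total += w_pos * w_neg          # correctly ordered pair
            elif s_pos == s_neg:
                total += 0.5 * w_pos * w_neg    # tie counts for one half
    norm = sum(w for _, w in pos) * sum(w for _, w in neg)
    return total / norm


labels = [1, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.2]
auc_value = auc_pairwise(labels, scores)
print(round(auc_value, 4))
```

The double loop is quadratic in the number of observations, which is fine for a demonstration but not for large datasets; sorting the data first, as the `cloud` parameter description suggests, allows a faster computation.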

auc_interval(bootstrap=10, alpha=0.95)

Determines a confidence interval for the AUC with bootstrap.

Parameters:

• bootstrap – number of random estimations
• alpha – defines the confidence interval

Returns: dictionary of values
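
The general idea of a bootstrap confidence interval for the AUC can be sketched as follows. This is a hedged illustration, not the library's code: the helper names, the `seed` argument, the dictionary keys, and the reading of `alpha` as a confidence level (0.95 meaning a 95% interval) are all assumptions made for the example.

```python
import random


def auc(pairs):
    """AUC of (label, score) pairs, or None if one class is missing."""
    pos = [s for y, s in pairs if y]
    neg = [s for y, s in pairs if not y]
    if not pos or not neg:
        return None
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_interval(pairs, bootstrap=10, alpha=0.95, seed=0):
    """Percentile bootstrap interval for the AUC (illustrative sketch)."""
    if auc(pairs) is None:
        raise ValueError("data must contain both classes")
    rng = random.Random(seed)
    values = []
    while len(values) < bootstrap:
        # Resample the data with replacement, keeping the original size.
        sample = [rng.choice(pairs) for _ in pairs]
        value = auc(sample)
        if value is not None:  # skip resamples that contain a single class
            values.append(value)
    values.sort()
    tail = (1.0 - alpha) / 2.0
    lower = values[int(tail * (len(values) - 1))]
    upper = values[int(round((1.0 - tail) * (len(values) - 1)))]
    return {
        "auc": auc(pairs),
        "lower": lower,
        "upper": upper,
        "mean": sum(values) / len(values),
    }


data = list(zip([1, 1, 0, 1, 0, 1, 0, 0],
                [0.9, 0.8, 0.7, 0.6, 0.2, 0.75, 0.4, 0.65]))
result = bootstrap_auc_interval(data, bootstrap=200, alpha=0.95, seed=1)
print(result)
```

With the default of 10 resamples the percentile bounds are very coarse; a larger value of `bootstrap` gives a more stable interval at a proportional cost.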

compute_roc_curve(nb=100, curve=<CurveType.ROC: 5>, bootstrap=False)

Computes a ROC curve with nb points. If nb == -1, there are as many points as the data contains. If bootstrap == True, it draws random numbers to build a confidence interval based on the bootstrap method.

Parameters:

• nb – number of points for the curve
• curve – see CurveType
• bootstrap – builds the curve after resampling

Returns: DataFrame (metrics and threshold)

If curve is SKROC, the parameter nb is not taken into account. It should be set to 0.
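
To make the notion of a ROC curve concrete, the sketch below sweeps the decision threshold over the scores and records the false positive rate and true positive rate at each step. It is an assumption-laden illustration: `roc_points` is a hypothetical helper, it emits one point per distinct score rather than honouring an `nb` parameter, and it returns plain tuples instead of a DataFrame.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate, threshold) triples.

    Illustrative sketch: observations are visited by decreasing score, and a
    point is emitted after each group of equal scores.
    """
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")

    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    points = [(0.0, 0.0, float("inf"))]  # threshold above every score
    tp = fp = 0
    for rank, i in enumerate(order):
        if labels[i]:
            tp += 1
        else:
            fp += 1
        is_last = rank == len(order) - 1
        # Only emit a point when the next score differs, so ties are grouped.
        if is_last or scores[order[rank + 1]] != scores[i]:
            points.append((fp / n_neg, tp / n_pos, scores[i]))
    return points


labels = [1, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.2]
points = roc_points(labels, scores)
for fpr, tpr, threshold in points:
    print(fpr, tpr, threshold)
```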

confusion(score=None, nb=10, curve=<CurveType.ROC: 5>, bootstrap=False)

Computes the confusion matrix for a specific score or all if score is None.

Parameters:

• score – score or None
• nb – number of scores (if score is None)
• curve – see CurveType
• bootstrap – builds the curve after resampling

Returns: one row if score is specified, many rows if score is None
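
For readers unfamiliar with the term, a confusion matrix at a given score threshold can be computed as in the following sketch. The function `confusion_at`, the `>=` convention for predicting the positive class, and the dictionary layout are assumptions for illustration; the columns actually returned by `confusion` may differ.

```python
def confusion_at(labels, scores, threshold):
    """Count TP, FP, FN, TN when predicting positive for score >= threshold."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if predicted_positive and y:
            counts["TP"] += 1
        elif predicted_positive and not y:
            counts["FP"] += 1
        elif y:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


labels = [1, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.2]
cm = confusion_at(labels, scores, threshold=0.7)
print(cm)
```

Calling such a function for several thresholds yields one row per threshold, which mirrors the "many rows if score is None" behaviour described above.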

plot(nb=100, curve=<CurveType.ROC: 5>, bootstrap=0, ax=None, thresholds=False, **kwargs)

Plot a ROC curve.

Parameters:

• nb – number of points
• curve – see CurveType
• bootstrap – number of curves for the bootstrap (0 for None)
• ax – axis
• thresholds – use thresholds for the X axis
• kwargs – sent to pandas.plot

Returns: ax

precision()

Computes the precision.

random_cloud()

Resamples among the data.

Returns: DataFrame

roc_intersect(roc, x)

The ROC curve is defined by a set of points. This function interpolates those points to determine y for any x.

Parameters:

• roc – ROC curve
• x – x

Returns: y
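
The interpolation step can be pictured with a short sketch. It assumes piecewise-linear interpolation between consecutive curve points, which the source does not confirm, and `interpolate_curve` is a hypothetical stand-in that takes a list of (x, y) tuples rather than the structure `roc_intersect` expects.

```python
from bisect import bisect_right


def interpolate_curve(points, x):
    """Linearly interpolate y at x on a curve given as (x, y) points."""
    points = sorted(points)
    xs = [p[0] for p in points]
    if x < xs[0]:
        return points[0][1]   # left of the curve: clamp to the first point
    if x >= xs[-1]:
        return points[-1][1]  # right of the curve: clamp to the last point
    # bisect_right guarantees xs[j - 1] <= x < xs[j], so x1 > x0 below.
    j = bisect_right(xs, x)
    x0, y0 = points[j - 1]
    x1, y1 = points[j]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


curve = [(0.0, 0.0), (0.2, 0.6), (1.0, 1.0)]
print(interpolate_curve(curve, 0.1))
print(interpolate_curve(curve, 0.6))
```

On a vertical segment (several points sharing the same x), this sketch returns the highest y at that x, because the sorted points place the largest y last.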

roc_intersect_interval(x, nb, curve=<CurveType.ROC: 5>, bootstrap=10, alpha=0.05)

Computes a confidence interval for the value returned by roc_intersect.

Parameters:

• roc – ROC curve
• x – x
• curve – see CurveType

Returns: dictionary

