module `ml.roc`#

Short summary#

module mlstatpy.ml.roc

About ROC.

Classes#

class	truncated documentation
`ROC`	Helper to draw a ROC curve.

Properties#

property	truncated documentation
`Data`	Returns the underlying dataframe.

Methods#

method	truncated documentation
`__init__`	Initialisation with a dataframe and two or three columns:
`__len__`	Returns the number of observations.
`__repr__`	Shows first elements, precision rate.
`__str__`	Shows first elements, precision rate.
`auc`	Computes the area under the curve (AUC).
`auc_interval`	Determines a confidence interval for the AUC with bootstrap.
`compute_roc_curve`	Computes a ROC curve with nb points; if nb == -1, there are as many points as the data contains, …
`confusion`	Computes the confusion matrix for a specific score or all if score is None.
`plot`	Plots a ROC curve.
`precision`	Computes the precision.
`random_cloud`	Resamples among the data.
`roc_intersect`	The ROC curve is defined by a set of points. This function interpolates those points to determine …
`roc_intersect_interval`	Computes a confidence interval for the value returned by `roc_intersect()`.

Documentation#

About ROC.

source on GitHub

class mlstatpy.ml.roc.ROC(y_true=None, y_score=None, sample_weight=None, df=None)#

Bases: object

Helper to draw a ROC curve.

Initialisation with a dataframe and two or three columns:

column 1: score (y_score)
column 2: expected answer (boolean) (y_true)
column 3: weight (optional) (sample_weight)

Parameters:

y_true – if df is None, y_true, y_score and (optionally) sample_weight must be provided. y_true tells whether or not the answer is true: a true value means the prediction is right.
y_score – score of the prediction
sample_weight – weights
df – dataframe, array or list; it must contain 2 or 3 columns, always in the same order
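
As an illustration of the convention above, the following sketch builds the two mandatory columns from a classifier's output. It uses plain Python only; the names `predicted`, `expected` and `confidence` are hypothetical and not part of mlstatpy.

```python
# Hypothetical classifier output: predicted label, expected label, confidence.
predicted = ["cat", "dog", "cat", "dog"]
expected = ["cat", "cat", "cat", "dog"]
confidence = [0.9, 0.6, 0.4, 0.8]

# y_true is True when the prediction is right, y_score is the confidence.
y_true = [p == e for p, e in zip(predicted, expected)]
y_score = list(confidence)

# Rows in the column order described above: score first, then the answer.
rows = list(zip(y_score, y_true))
print(rows)
```

Judging from the signature, these lists would then be passed as `ROC(y_true=y_true, y_score=y_score)`, or the rows as `df`.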

class CurveType(value)#

Bases: Enum

Curve types:

PROBSCORE: 1 - False Positive / True Positive
ERRPREC: error / recall
RECPREC: precision / recall
ROC: False Positive / True Positive
SKROC: False Positive / True Positive (scikit-learn)

property Data#

Returns the underlying dataframe.

__init__(y_true=None, y_score=None, sample_weight=None, df=None)#

Initialisation with a dataframe and two or three columns:

column 1: score (y_score)
column 2: expected answer (boolean) (y_true)
column 3: weight (optional) (sample_weight)

Parameters:

y_true – if df is None, y_true, y_score and (optionally) sample_weight must be provided. y_true tells whether or not the answer is true: a true value means the prediction is right.
y_score – score of the prediction
sample_weight – weights
df – dataframe, array or list; it must contain 2 or 3 columns, always in the same order

__len__()#

Returns the number of observations.

__repr__()#

Shows first elements, precision rate.

__str__()#

Shows first elements, precision rate.

auc(cloud=None)#

Computes the area under the curve (AUC).

Parameters: cloud – data, or None to use self.data; the function assumes the data is sorted.
Returns: AUC

The first column is the label, the second one is the score, the third one is the weight.
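
To make the quantity concrete, here is a minimal sketch of a weighted AUC computed as the share of (positive, negative) pairs that the score ranks correctly. It is an illustration in plain Python, not the library's implementation; the function name `weighted_auc` is hypothetical, and unlike the method above it does not require sorted data.

```python
def weighted_auc(labels, scores, weights=None):
    """AUC as the weighted share of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts for one half.
    """
    if weights is None:
        weights = [1.0] * len(labels)
    pos = [(s, w) for y, s, w in zip(labels, scores, weights) if y]
    neg = [(s, w) for y, s, w in zip(labels, scores, weights) if not y]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    total = 0.0
    good = 0.0
    for sp, wp in pos:
        for sn, wn in neg:
            pair = wp * wn  # weight of the pair
            total += pair
            if sp > sn:
                good += pair
            elif sp == sn:
                good += 0.5 * pair
    return good / total


labels = [True, True, False, True, False]
scores = [0.9, 0.8, 0.7, 0.6, 0.2]
print(weighted_auc(labels, scores))
```

The double loop is quadratic in the number of observations; it is chosen here for readability rather than speed.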

auc_interval(bootstrap=10, alpha=0.95)#

Determines a confidence interval for the AUC with bootstrap.

Parameters:

bootstrap – number of random estimations
alpha – defines the confidence interval

Returns:

dictionary of values
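
The bootstrap idea can be sketched as follows: resample the observations with replacement, recompute the AUC each time, and read the interval from the sorted estimates. This is a hedged illustration in plain Python, not the library's code. The helper names, the `seed` argument, the keys of the returned dictionary and the `level` parameter (read here as the confidence level) are all assumptions of this sketch and may differ from how the method interprets `alpha`.

```python
import random


def pairwise_auc(labels, scores):
    """Share of (positive, negative) pairs ranked correctly, ties count 1/2."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_interval(labels, scores, bootstrap=10, level=0.95, seed=0):
    """Percentile bootstrap interval for the AUC (hypothetical helper)."""
    if bootstrap < 1:
        raise ValueError("bootstrap must be at least 1")
    if all(labels) or not any(labels):
        raise ValueError("both classes are required")
    rng = random.Random(seed)
    n = len(labels)
    values = []
    while len(values) < bootstrap:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [labels[i] for i in idx]
        if all(ys) or not any(ys):
            continue  # skip resamples with a single class: AUC is undefined
        values.append(pairwise_auc(ys, [scores[i] for i in idx]))
    values.sort()
    k = int((1.0 - level) / 2.0 * len(values))  # estimates cut in each tail
    return {
        "auc": pairwise_auc(labels, scores),
        "interval": (values[k], values[len(values) - 1 - k]),
        "mean": sum(values) / len(values),
    }


labels = [True, True, False, True, False, False, True, False]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
result = bootstrap_auc_interval(labels, scores, bootstrap=200)
print(result)
```

With only a handful of resamples the interval is coarse; a larger `bootstrap` value gives smoother bounds at a proportional cost.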

compute_roc_curve(nb=100, curve=CurveType.ROC, bootstrap=False)#

Computes a ROC curve with nb points. If nb == -1, the curve has as many points as the data contains. If bootstrap == True, it draws random numbers to build a confidence interval based on the bootstrap method.

Parameters:

nb – number of points for the curve
curve – see CurveType
bootstrap – builds the curve after resampling

Returns:

DataFrame (metrics and threshold)

If curve is SKROC, the parameter nb is not taken into account. It should be set to 0.
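
For readers unfamiliar with how such a curve is built, the sketch below computes one (threshold, false positive rate, true positive rate) point per distinct score, which is the textbook construction behind the False Positive / True Positive curve type. It is written in plain Python under that assumption and is not the library's implementation; `roc_points` is a hypothetical name.

```python
def roc_points(labels, scores):
    """Return (threshold, false positive rate, true positive rate) triples.

    One point per distinct score, from the highest threshold to the lowest;
    an observation is predicted positive when its score >= threshold.
    """
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes are required")
    points = []
    for th in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= th and y)
        fp = sum(1 for y, s in zip(labels, scores) if s >= th and not y)
        points.append((th, fp / n_neg, tp / n_pos))
    return points


labels = [True, True, False, True, False]
scores = [0.9, 0.8, 0.7, 0.6, 0.2]
for point in roc_points(labels, scores):
    print(point)
```

Keeping the threshold next to each pair of rates mirrors the "metrics and threshold" layout of the returned DataFrame.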

confusion(score=None, nb=10, curve=CurveType.ROC, bootstrap=False)#

Computes the confusion matrix for a specific score or all if score is None.

Parameters:

score – score or None
nb – number of scores (if score is None)
curve – see CurveType
bootstrap – builds the curve after resampling

Returns:

One row if score is specified, many rows if score is None
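
The one-row versus many-rows behaviour can be illustrated with a small plain-Python sketch. It is not the library's code: the function names, the column names and the choice of one row per distinct score are assumptions made for the example.

```python
def confusion_at(labels, scores, threshold):
    """Confusion counts when every score >= threshold is accepted."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for y, s in zip(labels, scores):
        if s >= threshold:
            counts["TP" if y else "FP"] += 1
        else:
            counts["FN" if y else "TN"] += 1
    return counts


def confusion_table(labels, scores, threshold=None):
    """One row for a given threshold, one row per distinct score otherwise."""
    if threshold is not None:
        thresholds = [threshold]
    else:
        thresholds = sorted(set(scores))
    return [dict(threshold=t, **confusion_at(labels, scores, t)) for t in thresholds]


labels = [True, True, False, True, False]
scores = [0.9, 0.8, 0.7, 0.6, 0.2]
print(confusion_table(labels, scores, 0.7))
print(len(confusion_table(labels, scores)))
```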

plot(nb=100, curve=CurveType.ROC, bootstrap=0, ax=None, thresholds=False, **kwargs)#

Plots a ROC curve.

Parameters:

nb – number of points
curve – see CurveType
bootstrap – number of curves for the bootstrap (0 for none)
ax – axis
thresholds – use thresholds for the X axis
kwargs – sent to pandas.plot

Returns:

ax

precision()#

Computes the precision.

random_cloud()#

Resamples among the data.

Returns: DataFrame

roc_intersect(roc, x)#

The ROC curve is defined by a set of points. This function interpolates those points to determine y for any x.

Parameters:

roc – ROC curve
x – x

Returns:

y
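
The interpolation described above can be sketched with a piecewise-linear lookup. This is a minimal illustration in plain Python that assumes the curve is given as a list of (x, y) pairs; the function name and the clamping outside the known range are choices of this sketch, not documented behaviour of the method.

```python
from bisect import bisect_left


def interpolate_curve(points, x):
    """Return y at x by linear interpolation between the surrounding points.

    points is a list of (x, y) pairs; values of x outside the known range
    are clamped to the first or last point.
    """
    pts = sorted(points)
    xs = [p[0] for p in pts]
    if x <= xs[0]:
        return pts[0][1]
    if x >= xs[-1]:
        return pts[-1][1]
    i = bisect_left(xs, x)  # first index with xs[i] >= x, so xs[i - 1] < x
    (x0, y0), (x1, y1) = pts[i - 1], pts[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


curve = [(0.0, 0.0), (0.5, 0.8), (1.0, 1.0)]
print(interpolate_curve(curve, 0.25))
print(interpolate_curve(curve, 0.75))
```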

roc_intersect_interval(x, nb, curve=CurveType.ROC, bootstrap=10, alpha=0.05)#

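Computes a confidence interval for the value returned by roc_intersect.

Parameters:

x – x
nb – number of points for the curve
curve – see CurveType
bootstrap – number of random estimations
alpha – defines the confidence interval

Returns:

dictionary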
