Custom Scoring Functions

Classification Scores

lightmlboard.metrics.classification.multi_label_jaccard(exp, val, exc=True)

Applies to a multi-label classification problem. Computes the average Jaccard index between two sequences of sets of labels (see Multi-label classification).

Parameters:
  • exp – expected labels: list of tuples, list of sets, filename, stream (comma-separated values), or dict

  • val – submitted labels: list of tuples, list of sets, filename, stream (comma-separated values), or dict

  • exc – raises an exception if there are not enough submitted items

Returns:

score

E = \frac{1}{n} \sum_{i=1}^n \frac{|C_i \cap P_i|}{|C_i \cup P_i|}
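The formula above can be illustrated with a minimal standalone sketch. It does not call lightmlboard; the helper name average_jaccard is hypothetical, and the handling of two empty label sets (counted as a perfect match) is an assumption of this sketch, not documented behaviour of multi_label_jaccard.

```python
def average_jaccard(expected, predicted):
    """Average Jaccard index between two equal-length sequences of label sets."""
    if len(expected) != len(predicted):
        raise ValueError("both sequences must have the same length")
    if not expected:
        raise ValueError("at least one item is required")
    total = 0.0
    for c, p in zip(expected, predicted):
        c, p = set(c), set(p)
        union = c | p
        # Assumption of this sketch: two empty label sets count as a perfect match.
        total += len(c & p) / len(union) if union else 1.0
    return total / len(expected)


expected = [{"a", "b"}, {"c"}, {"a", "d"}]
predicted = [{"a"}, {"c"}, {"b"}]
score = average_jaccard(expected, predicted)
print(score)
```

The three items contribute 1/2, 1 and 0 respectively, so the average is 0.5.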


Regression Scores

lightmlboard.metrics.regression_custom.l1_reg_max(exp, val, max_val=180, nomax=False, exc=True)

Implements an L1 scoring function that ignores errors above the threshold max_val.

Parameters:
  • exp – list of values or numpy.array

  • val – list of values or numpy.array

  • max_val – every value above max_val is replaced by max_val before computing the differences

  • nomax – if True, removes every value equal to or above max_val from the expected set, then computes the score

  • exc – raises an exception if there are not enough submitted items

Returns:

score

If max_val==180, the function computes:

E = \frac{1}{n} \sum_{i=1}^n \frac{\left| \min (Y_i, 180) - \min(f(X_i), 180) \right|}{180}
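A minimal standalone sketch of this formula follows. It does not call lightmlboard; the helper name l1_capped is hypothetical, and generalising the divisor from 180 to max_val is an assumption of this sketch. The nomax option is not modelled.

```python
def l1_capped(expected, predicted, max_val=180):
    """Mean absolute error after capping both sides at max_val, scaled by max_val."""
    if len(expected) != len(predicted):
        raise ValueError("both sequences must have the same length")
    if not expected:
        raise ValueError("at least one item is required")
    total = sum(
        abs(min(y, max_val) - min(f, max_val))
        for y, f in zip(expected, predicted)
    )
    # Assumption of this sketch: the error is normalised by max_val,
    # which matches the documented formula when max_val == 180.
    return total / (len(expected) * max_val)


expected = [10, 200, 90]
predicted = [100, 250, 90]
score = l1_capped(expected, predicted)
print(score)
```

Here the second pair contributes no error because both values are capped at 180, so the score is 90 / (3 × 180) = 1/6.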

The computation is faster if numpy arrays are used for exp and val. exp and val can also be filenames or streams. In that case, the function expects to find two columns, id and value, in both files or streams.
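The two-column layout can be pictured with the following sketch, which reads two in-memory streams and aligns their values by id. It is only an illustration: the helper read_id_value is hypothetical, and the comma separator, the absence of a header row, and the matching of rows by id are assumptions rather than documented behaviour of l1_reg_max.

```python
import csv
import io


def read_id_value(stream):
    """Read rows of `id,value` into a dict mapping id -> float value."""
    return {row[0]: float(row[1]) for row in csv.reader(stream) if row}


# Two streams listing the same ids in a different order.
exp_stream = io.StringIO("a,10\nb,200\nc,90\n")
val_stream = io.StringIO("c,90\na,100\nb,250\n")

exp_by_id = read_id_value(exp_stream)
val_by_id = read_id_value(val_stream)

# Align both sides on the expected ids; a missing id raises KeyError here.
ids = sorted(exp_by_id)
exp_values = [exp_by_id[i] for i in ids]
val_values = [val_by_id[i] for i in ids]
print(exp_values, val_values)
```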
