# Introduction

ML.net is a machine learning library implemented in C# by Microsoft. This project aims at showing how to extend it with custom transforms or learners. It implements standard abstractions in C# such as dataframes and pipelines following the scikit-learn API. ML.net exposes two APIs. The first one is structured as a streaming API: it merges every experiment into a single sequence of transforms and learners, possibly handling one out-of-memory dataset. The second API is built on top of the first one and proposes an easier way to build pipelines with multiple datasets. This second API is also the one used by wrappers for other languages such as NimbusML. Let's first see how this library can be used without any addition.
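
To make the second, higher-level API concrete, here is a minimal sketch of a pipeline written against the `MLContext` entry point of the current Microsoft.ML package. The column layout and the choice of the `LbfgsMaximumEntropy` trainer are assumptions for illustration; the exact surface may differ in the API version discussed in this document.

```csharp
using Microsoft.ML;
using Microsoft.ML.Data;

// Schema of one input row; the column indices are assumptions for illustration.
public class IrisRow
{
    [LoadColumn(0)] public float Label;
    [LoadColumn(1, 4), VectorType(4)] public float[] Features;
}

public static class PipelineSketch
{
    public static void Main()
    {
        var ml = new MLContext(seed: 0);

        // The data is streamed lazily; nothing is materialized until Fit() runs.
        IDataView data = ml.Data.LoadFromTextFile<IrisRow>(
            "iris.txt", separatorChar: '\t', hasHeader: true);

        // A single sequence of transforms and a learner, as in the first API.
        var pipeline = ml.Transforms.Conversion.MapValueToKey("Label")
            .Append(ml.MulticlassClassification.Trainers.LbfgsMaximumEntropy());

        ITransformer model = pipeline.Fit(data);
        ml.Model.Save(model, data.Schema, "logistic_regression.zip");
    }
}
```

The call to `MapValueToKey` converts the float label into the key type expected by multi-class trainers; the rest of the chain mirrors the command-line example below.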

## Command line

ML.net provides a simple command-line language for defining machine learning pipelines. We use it on the Iris dataset to train a logistic regression.

| Label | Sepal_length | Sepal_width | Petal_length | Petal_width |
|-------|--------------|-------------|--------------|-------------|
| 0     | 3.5          | 1.4         | 0.2          | 5.1         |
| 0     | 3.0          | 1.4         | 0.2          | 4.9         |
| 0     | 3.2          | 1.3         | 0.2          | 4.7         |
| 0     | 3.1          | 1.5         | 0.2          | 4.6         |
| 0     | 3.6          | 1.4         | 0.2          | 5.0         |


The pipeline is simply defined by a logistic regression named mlr (for MultiLogisticRegression). Options are given inside {...}: the parameter data= specifies the data file, and loader= specifies the format and the column names.

<<<

train
data=iris.txt
loader=text{col=Label:R4:0 col=Features:R4:1-4 header=+}
tr=mlr{maxiter=5}
out=logistic_regression.zip

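The loader= option packs each column declaration into one token of the form Name:Type:Range, e.g. Features:R4:1-4 maps source columns 1 through 4 onto a single vector column of single-precision floats. As a rough sketch of how such a token could be decomposed (a hypothetical helper for illustration, not ML.net's actual parser):

```csharp
using System;

public static class ColSpec
{
    // Parse a token such as "Features:R4:1-4" into its three parts.
    // Illustrative only; ML.net's own parser handles more variants.
    public static (string Name, string Type, int First, int Last) Parse(string token)
    {
        var parts = token.Split(':');
        var range = parts[2].Split('-');
        int first = int.Parse(range[0]);
        int last = range.Length > 1 ? int.Parse(range[1]) : first;
        return (parts[0], parts[1], first, last);
    }

    public static void Main()
    {
        var c = Parse("Features:R4:1-4");
        Console.WriteLine($"{c.Name} {c.Type} {c.First}..{c.Last}");
        // Features R4 1..4
    }
}
```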

The documentation of every component is available through the command line. An example for multi-class logistic regression:

<<<

? mlr


>>>

Help for MultiClassClassifierTrainer, Trainer: 'MultiClassLogisticRegression'
Aliases: MulticlassLogisticRegressionPredictorNew, mlr, multilr
showTrainingStats=[+|-]             Show statistics of training examples.
                                    Default value:'-' (short form stat)
l2Weight=<float>                    L2 regularization weight
                                    Default value:'1' (short form l2)
l1Weight=<float>                    L1 regularization weight
                                    Default value:'1' (short form l1)
optTol=<float>                      Tolerance parameter for optimization
                                    convergence. Lower = slower, more accurate
                                    Default value:'1E-07' (short form ot)
memorySize=<int>                    Memory size for L-BFGS. Lower=faster, less
                                    accurate. Default value:'20' (short form m)
maxIterations=<int>                 Maximum iterations.
                                    Default value:'2147483647' (short form maxiter)
sgdInitializationTolerance=<float>  Run SGD to initialize LR weights,
                                    converging to this tolerance.
                                    Default value:'0' (short form sgd)
quiet=[+|-]                         If set to true, produce no output during
                                    training. Default value:'-' (short form q)
initWtsDiameter=<float>             Init weights diameter
                                    Default value:'0' (short form initwts)
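
The same hyper-parameters also surface in the programmatic API. As a hedged sketch, here is how they map onto the options object of the multi-class logistic regression trainer in the current Microsoft.ML package; the type and property names (`LbfgsMaximumEntropyMulticlassTrainer.Options`, etc.) are taken from that package and may differ in the version this document targets.

```csharp
using Microsoft.ML;
using Microsoft.ML.Trainers;

public static class TrainerOptionsSketch
{
    public static void Main()
    {
        var ml = new MLContext();

        // Mirrors the command-line flags l2Weight, l1Weight, optTol,
        // memorySize and maxIterations shown in the help output above.
        var options = new LbfgsMaximumEntropyMulticlassTrainer.Options
        {
            L2Regularization = 1f,
            L1Regularization = 1f,
            OptimizationTolerance = 1e-7f,
            HistorySize = 20,
            MaximumNumberOfIterations = 5
        };

        var trainer = ml.MulticlassClassification.Trainers.LbfgsMaximumEntropy(options);
    }
}
```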