CSharp API

ML.NET is a C# library. This page collects a few examples and some exploratory code.

Example with the Iris dataset in C#

<<<

// One row of the Iris file; each field is mapped to a column index.
public class IrisObservation
{
    [Column("0")]
    [ColumnName("Label")]
    public string Label;

    [Column("1")]
    public float Sepal_length;

    [Column("2")]
    public float Sepal_width;

    [Column("3")]
    public float Petal_length;

    [Column("4")]
    public float Petal_width;
}

// Output of the clustering model for one observation.
public class IrisPrediction
{
    public uint PredictedLabel;

    [VectorType(4)]
    public float[] Score;
}

// Note: Console and Select below need System and System.Linq in scope.
public static void example_iris()
{
    var iris = string.Format("{0}/iris.txt", RELPATH);

    using (var env = new ConsoleEnvironment())
    {
        // Describe the tab-separated file: a label column and four measurements.
        var args = new TextLoader.Arguments()
        {
            Separator = "\t",
            HasHeader = true,
            Column = new[] {
                new TextLoader.Column("Label", DataKind.R4, 0),
                new TextLoader.Column("Sepal_length", DataKind.R4, 1),
                new TextLoader.Column("Sepal_width", DataKind.R4, 2),
                new TextLoader.Column("Petal_length", DataKind.R4, 3),
                new TextLoader.Column("Petal_width", DataKind.R4, 4),
            }
        };

        // Concatenate the four measurements into one Features vector,
        // then cluster it with k-means (3 clusters).
        var reader = new TextLoader(env, args);
        var concat = new ColumnConcatenatingEstimator(env,
                                                      "Features", "Sepal_length",
                                                      "Sepal_width", "Petal_length", "Petal_width");
        var km = new KMeansPlusPlusTrainer(env, "Features", clustersCount: 3);
        var pipeline = concat.Append(km);

        // Load the data and train the pipeline.
        IDataView trainingDataView = reader.Read(new MultiFileSource(iris));
        var model = pipeline.Fit(trainingDataView);

        // Predict the cluster of a single observation.
        var obs = new IrisObservation()
        {
            Sepal_length = 3.3f,
            Sepal_width = 1.6f,
            Petal_length = 0.2f,
            Petal_width = 5.1f,
        };

        var engine = model.MakePredictionFunction<IrisObservation, IrisPrediction>(env);
        var res = engine.Predict(obs);
        Console.WriteLine("Type of pipeline: {0}", pipeline.GetType());
        Console.WriteLine("Type of engine: {0}", engine.GetType());
        Console.WriteLine("------------");
        Console.WriteLine("PredictedLabel: {0}", res.PredictedLabel);
        Console.WriteLine("Score: {0}", string.Join(", ",
                                                    res.Score.Select(c => c.ToString())));
    }
}

>>>

    Initializing centroids
    Centroids initialized, starting main trainer
    Model trained successfully on 150 instances
    Type of pipeline: Microsoft.ML.Runtime.Data.EstimatorChain`1[Microsoft.ML.Runtime.Data.ClusteringPredictionTransformer`1[Microsoft.ML.Trainers.KMeans.KMeansPredictor]]
    Type of engine: Microsoft.ML.Runtime.Data.PredictionFunction`2[DynamicCS.DynamicCSFunctions_example_iris+IrisObservation,DynamicCS.DynamicCSFunctions_example_iris+IrisPrediction]
    ------------
    PredictedLabel: 1
    Score: 0.04319, 23.49886, 10.22621
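
The output above suggests how the two prediction fields relate. The sketch below assumes that each Score entry is the distance from the observation to one centroid and that PredictedLabel is the 1-based index of the closest centroid. This matches the printed values but is not stated on this page. The class and method names are hypothetical and do not belong to ML.NET.

```csharp
using System;

// Hypothetical helper, not part of ML.NET: recovers the predicted
// cluster from a score vector, assuming scores are distances to centroids.
public static class KMeansScoreDemo
{
    // Returns the 1-based index of the smallest score.
    public static uint NearestCluster(float[] scores)
    {
        if (scores == null || scores.Length == 0)
            throw new ArgumentException("scores must not be empty");

        int best = 0;
        for (int i = 1; i < scores.Length; i++)
        {
            if (scores[i] < scores[best])
                best = i;
        }
        return (uint)(best + 1);
    }

    public static void Main()
    {
        // Score values copied from the output above.
        var score = new[] { 0.04319f, 23.49886f, 10.22621f };
        Console.WriteLine("PredictedLabel: {0}", NearestCluster(score));
    }
}
```

With the scores printed above, the smallest value is the first one, so the sketch reports cluster 1, in agreement with the PredictedLabel line.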

DataFrame in C#

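This code can be shortened by using a DataFrame and a custom implementation of the pipeline. The result mixes command-line syntax with C#: transforms and the trainer are given as command-line strings, while loading the data and calling the pipeline stay in C#.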

<<<

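public static void dataframe_iris()
{
    var iris = string.Format("{0}/iris.txt", RELPATH);

    using (var env = new ConsoleEnvironment())
    {
        // Read the tab-separated file into a DataFrame.
        var df = DataFrameIO.ReadCsv(iris, sep: '\t',
                                     dtypes: new ColumnType[] {NumberType.R4});
        // Transform written in command-line syntax: concatenate
        // the second and third columns into a Features vector.
        var concat = string.Format("Concat{{col=Features:{0},{1}}}",
                                   df.Columns[1], df.Columns[2]);
        // "mlr" designates the trainer by its command-line name.
        var pipe = new ScikitPipeline(new[] {concat}, "mlr");
        pipe.Train(df, "Features", "Label");

        // Predictions are returned in a new DataFrame.
        DataFrame pred = null;
        pipe.Predict(df, ref pred);
        Console.WriteLine(pred.Head());
    }
}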

>>>

    LBFGS multi-threading will attempt to load dataset into memory. In case of out-of-memory issues, add 'numThreads=1' to the trainer arguments and 'cache=-' to the command line arguments to turn off multi-threading.
    Beginning optimization
    num vars: 9
    improvement criterion: Mean Improvement
    L1 regularization selected 7 of 9 weights.
    Not training a calibrator because it is not needed.
    Wrote 5 rows of length 11
    Label,Sepal_length,Sepal_width,Petal_length,Petal_width,6,2:PredictedLabel
    0,3.5,1.4,0.2,5.1,3.5,1.4,0,0.972407043,0.0275927335,2.01959764E-07
    0,3,1.4,0.2,4.9,3,1.4,0,0.9636723,0.0363273062,2.65891174E-07
    0,3.2,1.3,0.2,4.7,3.2,1.3,0,0.9743216,0.0256783385,1.3373139E-07
    0,3.1,1.5,0.2,4.6,3.1,1.5,0,0.9565067,0.0434927456,4.47395223E-07
    0,3.6,1.4,0.2,5,3.6,1.4,0,0.973891,0.0261087269,1.910979E-07
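
The Concat string in the function above relies on string.Format brace escaping: doubled braces produce literal braces, so the result is a transform specification in command-line syntax. The sketch below builds the same kind of string for any number of input columns. The helper name BuildConcat is hypothetical and is not part of the library. The column names come from the header of the output above.

```csharp
using System;

// Hypothetical helper: builds a command-line style Concat specification.
public static class ConcatSpecDemo
{
    // "{{" and "}}" are escaped braces in a .NET format string.
    public static string BuildConcat(string output, params string[] inputs)
    {
        return string.Format("Concat{{col={0}:{1}}}",
                             output, string.Join(",", inputs));
    }

    public static void Main()
    {
        // Prints: Concat{col=Features:Sepal_length,Sepal_width}
        Console.WriteLine(BuildConcat("Features", "Sepal_length", "Sepal_width"));
    }
}
```

Passing all four measurement columns to the helper would give a specification covering the full feature set rather than the two columns used in the example.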