CSharp API

ML.NET is a C# library. This page gathers a couple of examples and some exploratory notes.

Example with the Iris dataset in C#


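// One row of the Iris data: a label and four measurements.
public class IrisObservation
{
    public string Label;

    public float Sepal_length;

    public float Sepal_width;

    public float Petal_length;

    public float Petal_width;
}

// Output of the clustering model for one observation.
public class IrisPrediction
{
    public uint PredictedLabel;

    public float[] Score;
}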

public static void example_iris()
{
    // RELPATH is not defined in this snippet; it is expected to hold
    // the folder that contains iris.txt.
    var iris = string.Format("{0}/iris.txt", RELPATH);

    using (var env = new ConsoleEnvironment())
    {
        // Describe the tab-separated file: one label column, four feature columns.
        var args = new TextLoader.Arguments()
        {
            Separator = "\t",
            HasHeader = true,
            Column = new[] {
                new TextLoader.Column("Label", DataKind.R4, 0),
                new TextLoader.Column("Sepal_length", DataKind.R4, 1),
                new TextLoader.Column("Sepal_width", DataKind.R4, 2),
                new TextLoader.Column("Petal_length", DataKind.R4, 3),
                new TextLoader.Column("Petal_width", DataKind.R4, 4),
            }
        };

        var reader = new TextLoader(env, args);

        // Concatenate the four features into one "Features" column,
        // then cluster it with k-means (3 clusters).
        var concat = new ColumnConcatenatingEstimator(env,
                                                      "Features", "Sepal_length",
                                                      "Sepal_width", "Petal_length", "Petal_width");
        var km = new KMeansPlusPlusTrainer(env, "Features", clustersCount: 3);
        var pipeline = concat.Append(km);

        // Load the data and train the pipeline.
        IDataView trainingDataView = reader.Read(new MultiFileSource(iris));
        var model = pipeline.Fit(trainingDataView);

        // Predict the cluster of a single observation.
        var obs = new IrisObservation()
        {
            Sepal_length = 3.3f,
            Sepal_width = 1.6f,
            Petal_length = 0.2f,
            Petal_width = 5.1f,
        };

        var engine = model.MakePredictionFunction<IrisObservation, IrisPrediction>(env);
        var res = engine.Predict(obs);
        Console.WriteLine("Type of pipeline: {0}", pipeline.GetType());
        Console.WriteLine("Type of engine: {0}", engine.GetType());
        Console.WriteLine("PredictedLabel: {0}", res.PredictedLabel);
        Console.WriteLine("Score: {0}", string.Join(", ",
                                                    res.Score.Select(c => c.ToString())));
    }
}


    Initializing centroids
    Centroids initialized, starting main trainer
    Model trained successfully on 150 instances
    Type of pipeline: Microsoft.ML.Runtime.Data.EstimatorChain`1[Microsoft.ML.Runtime.Data.ClusteringPredictionTransformer`1[Microsoft.ML.Trainers.KMeans.KMeansPredictor]]
    Type of engine: Microsoft.ML.Runtime.Data.PredictionFunction`2[DynamicCS.DynamicCSFunctions_example_iris+IrisObservation,DynamicCS.DynamicCSFunctions_example_iris+IrisPrediction]
    PredictedLabel: 1
    Score: 0.04319, 23.49886, 10.22621
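
The output above is consistent with reading Score as one value per cluster, with PredictedLabel being the 1-based index of the smallest value. The sketch below only reproduces that reading in plain C#. The helper name ClosestCluster is hypothetical and not part of ML.NET, and treating the scores as distances to the centroids is an assumption drawn from the printed values.

```csharp
using System;

public static class KMeansScoreSketch
{
    // Hypothetical helper (not an ML.NET API): returns the 1-based index
    // of the smallest score, assuming each score is a distance to a centroid.
    public static uint ClosestCluster(float[] scores)
    {
        if (scores == null || scores.Length == 0)
            throw new ArgumentException("scores must not be empty", nameof(scores));

        int best = 0;
        for (int i = 1; i < scores.Length; i++)
        {
            if (scores[i] < scores[best])
                best = i;
        }
        return (uint)(best + 1);
    }

    public static void Main()
    {
        // Scores copied from the output above.
        var scores = new float[] { 0.04319f, 23.49886f, 10.22621f };
        // prints "Closest cluster: 1"
        Console.WriteLine("Closest cluster: {0}", ClosestCluster(scores));
    }
}
```

With the printed scores, the smallest value is the first one, which matches PredictedLabel: 1.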

DataFrame in C#

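The code above can be shortened by using a DataFrame and a custom implementation of the pipeline. The result is a mix of command-line syntax and C#: the transform and the trainer are given as command-line strings, while the data is handled in C#.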


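public static void dataframe_iris()
{
    // RELPATH is not defined in this snippet; it is expected to hold
    // the folder that contains iris.txt.
    var iris = string.Format("{0}/iris.txt", RELPATH);

    using (var env = new ConsoleEnvironment())
    {
        // Read the tab-separated file into a DataFrame.
        var df = DataFrameIO.ReadCsv(iris, sep: '\t',
                                     dtypes: new ColumnType[] { NumberType.R4 });

        // Command-line style transform: concatenate two columns into "Features".
        var concat = string.Format("Concat{{col=Features:{0},{1}}}",
                                   df.Columns[1], df.Columns[2]);

        // "mlr" is the command-line name of the trainer.
        var pipe = new ScikitPipeline(new[] { concat }, "mlr");
        pipe.Train(df, "Features", "Label");

        // Predictions are written into a new DataFrame.
        DataFrame pred = null;
        pipe.Predict(df, ref pred);
    }
}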


    LBFGS multi-threading will attempt to load dataset into memory. In case of out-of-memory issues, add 'numThreads=1' to the trainer arguments and 'cache=-' to the command line arguments to turn off multi-threading.
    Beginning optimization
    num vars: 9
    improvement criterion: Mean Improvement
    L1 regularization selected 7 of 9 weights.
    Not training a calibrator because it is not needed.
    Wrote 5 rows of length 11
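
The transform given to ScikitPipeline is an ordinary string, so it can be inspected without loading any data. The following is a minimal sketch, assuming the second and third columns of iris.txt are named Sepal_length and Sepal_width as in the first example; the names are hard-coded here instead of being read from df.Columns, and BuildConcat is a hypothetical helper. It shows that the doubled braces in string.Format produce literal braces. Two features and three classes would also match the "num vars: 9" line in the log, that is 3 x (2 weights + 1 bias), though that reading is an inference.

```csharp
using System;

public static class ConcatStringSketch
{
    // Hypothetical helper: builds the command-line style Concat transform.
    // "{{" and "}}" are the escapes for literal braces in string.Format.
    public static string BuildConcat(string first, string second)
    {
        return string.Format("Concat{{col=Features:{0},{1}}}", first, second);
    }

    public static void Main()
    {
        // Column names are assumptions taken from the first example.
        // prints "Concat{col=Features:Sepal_length,Sepal_width}"
        Console.WriteLine(BuildConcat("Sepal_length", "Sepal_width"));
    }
}
```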