Fast Forest Classification

This documentation is generated from the sources available in the dotnet/machinelearning repository and is released under the MIT License.

Type: binaryclassifiertrainer
Aliases: FastForestClassification, FastForest, ff, ffc
Namespace: Microsoft.ML.Trainers.FastTree
Assembly: Microsoft.ML.FastTree.dll
Microsoft Documentation: Fast Forest Classification


Uses a random forest learner to perform binary classification.


| Name | Short name | Default | Description |
| --- | --- | --- | --- |
| allowEmptyTrees | allowempty | True | When a root split is impossible, allow training to proceed |
| baggingSize | bag | 1 | Number of trees in each bag (0 disables bagging) |
| baggingTrainFraction | bagfrac | 0.7 | Percentage of training examples used in each bag |
| bias |  | 0 | Bias for calculating the gradient for each feature bin for a categorical feature |
| bundling | bundle | None | Bundle low-population bins. Bundle.None(0): no bundling, Bundle.AggregateLowPopulation(1): bundle low-population bins, Bundle.Adjacent(2): bundle adjacent low-population bins |
| categoricalSplit | cat | False | Whether to split based on multiple categorical feature values |
| compressEnsemble | cmp | False | Compress the tree ensemble |
| diskTranspose | dt |  | Whether to utilize the disk or the data's native transposition facilities (where applicable) when performing the transpose |
| entropyCoefficient | e | 0 | The entropy (regularization) coefficient, between 0 and 1 |
| executionTimes | et | False | Print execution time breakdown to stdout |
| featureCompressionLevel | fcomp | 1 | The level of feature compression to use |
| featureFirstUsePenalty | ffup | 0 | The feature first-use penalty coefficient |
| featureFlocks | flocks | True | Whether to collectivize features during dataset preparation to speed up training |
| featureFraction | ff | 0.7 | The fraction of features (chosen randomly) to use on each iteration |
| featureReusePenalty | frup | 0 | The feature re-use penalty (regularization) coefficient |
| featureSelectSeed | r3 | 123 | The seed for active feature selection |
| gainConfidenceLevel | gainconf | 0 | Tree-fitting gain confidence requirement (should be in the range [0,1)) |
| histogramPoolSize | ps | -1 | The number of histograms in the pool (between 2 and numLeaves) |
| maxBins | mb | 255 | Maximum number of distinct values (bins) per feature |
| maxCategoricalGroupsPerNode | mcg | 64 | Maximum categorical split groups to consider when splitting on a categorical feature. Split groups are a collection of split points. This is used to reduce overfitting when there are many categorical features |
| maxCategoricalSplitPoints | maxcat | 64 | Maximum categorical split points to consider when splitting on a categorical feature |
| maxTreeOutput | mo | 100 | Upper bound on the absolute value of a single tree's output |
| maxTreesAfterCompression | cmpmax | -1 | Maximum number of trees after compression |
| minDocsForCategoricalSplit | mdo | 100 | Minimum categorical document count in a bin to consider it for a split |
| minDocsPercentageForCategoricalSplit | mdop | 0.001 | Minimum categorical document percentage in a bin to consider it for a split |
| minDocumentsInLeafs | mil | 10 | The minimum number of documents allowed in a leaf of a regression tree, out of the subsampled data |
| numLeaves | nl | 20 | The maximum number of leaves in each regression tree |
| numThreads | t |  | The number of threads to use |
| numTrees | iter | 100 | Total number of decision trees to create in the ensemble |
| parallelTrainer | parag | Microsoft.ML.Trainers.FastTree.SingleTrainerFactory | Allows choosing the parallel FastTree learning algorithm |
| printTestGraph | graph | False | Print the metrics graph for the first test set |
| printTrainValidGraph | graphtv | False | Print train and validation metrics in a graph |
| quantileSampleCount | qsc | 100 | Number of labels to be sampled from each leaf to form the distribution |
| rngSeed | r1 | 123 | The seed of the random number generator |
| smoothing | s | 0 | Smoothing parameter for tree regularization |
| softmaxTemperature | smtemp | 0 | The temperature of the randomized softmax distribution for choosing the feature |
| sparsifyThreshold | sp | 0.7 | Sparsity level needed to use the sparse feature representation |
| splitFraction | sf | 0.7 | The fraction of features (chosen randomly) to use on each split |
| testFrequency | tf | 2147483647 | Calculate metric values for train/valid/test every k rounds |
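A minimal sketch of using this trainer through the ML.NET API, where `numberOfLeaves`, `numberOfTrees`, and `minimumExampleCountPerLeaf` correspond to the `numLeaves`, `numTrees`, and `minDocumentsInLeafs` options above. The `ModelInput` shape and the synthetic data are illustrative assumptions, not part of the documented API.

```csharp
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

// Illustrative input schema (assumption): a 4-element feature vector and a boolean label.
public class ModelInput
{
    [VectorType(4)]
    public float[] Features { get; set; }
    public bool Label { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var mlContext = new MLContext(seed: 123);

        // Synthetic data, just enough to satisfy minimumExampleCountPerLeaf.
        var rows = new List<ModelInput>();
        for (int i = 0; i < 100; i++)
        {
            rows.Add(new ModelInput
            {
                Features = new float[] { i, i % 3, i % 5, i % 7 },
                Label = i % 2 == 0
            });
        }
        var data = mlContext.Data.LoadFromEnumerable(rows);

        // FastForest binary trainer; parameter names map to the options table above.
        var trainer = mlContext.BinaryClassification.Trainers.FastForest(
            labelColumnName: "Label",
            featureColumnName: "Features",
            numberOfLeaves: 20,          // numLeaves (nl)
            numberOfTrees: 100,          // numTrees (iter)
            minimumExampleCountPerLeaf: 10); // minDocumentsInLeafs (mil)

        var model = trainer.Fit(data);
    }
}
```

Note that, unlike FastTree, the FastForest trainer produces uncalibrated scores; apply a calibrator if probability outputs are needed.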