Split Train Test Transform

This documentation is generated from the sources available at xadupre/machinelearningext and released under the MIT License.

Type: datatransform
Aliases: SplitTrainTestTransform, SplitTrainTest
Namespace: Scikit.ML.ModelSelection
Assembly: Scikit.ML.ModelSelection.dll


Splits a dataset into train and test sets.
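The idea behind a ratio-based split can be sketched in a few lines of Python. This is only an illustration of the concept, not the transform's actual implementation: the function name `split_labels`, the example ratios, and the seed value are assumptions made for this sketch. Only the column name `splitTrain0Test1` and the notions of ratios and a split seed come from the parameter list on this page.

```python
import random
from itertools import accumulate


def split_labels(n_rows, ratios=(0.66, 0.34), seed=42):
    """Return one part index per row (0 = train, 1 = test for two ratios).

    Hypothetical helper: the ratios and seed used here are example values,
    not defaults documented for the transform.
    """
    total = sum(ratios)
    # Cumulative thresholds in (0, 1], e.g. [0.66, 1.0].
    bounds = [c / total for c in accumulate(ratios)]
    rng = random.Random(seed)  # fixed seed gives a reproducible split
    labels = []
    for _ in range(n_rows):
        u = rng.random()
        # Pick the first part whose cumulative bound exceeds the draw.
        part = next((i for i, b in enumerate(bounds) if u < b), len(bounds) - 1)
        labels.append(part)
    return labels


# Add the indicator column to a toy dataset, then filter on it.
rows = [{"x": i} for i in range(10)]
for row, part in zip(rows, split_labels(len(rows))):
    row["splitTrain0Test1"] = part

train = [r for r in rows if r["splitTrain0Test1"] == 0]
test = [r for r in rows if r["splitTrain0Test1"] == 1]
```

Because each row is assigned independently, the observed proportions only approximate the requested ratios on small datasets.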


Name | Short name | Default | Description
cacheFile | c | | File name of the cache if stored on disk.
filename | f | | Names of saved datasets (idv only), null for none.
newColumn | col | splitTrain0Test1 | Name of the added column.
numThreads | nt | | Number of threads used to fill the cache.
poolRows | pool | 1000 | When shuffling the output, the number of output rows to keep in that pool. Note that shuffling of output is completely distinct from shuffling of input.
ratios | a | System.String[] | Array of ratios.
reuse | r | False | Reuse the previous cache.
saverSettings | saver | binary | Saver settings if data is saved on disk (default is binary).
seed | | | The random seed used to split the datasets.
seedShuffle | shuffle | | The random seed used to shuffle. If unspecified, the random state is derived from the environment instead.
shuffleInput | si | True | Whether we should attempt to shuffle the source data. By default on, but can be turned off for efficiency.
tag | | | To tag the split views (one tag per view).
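
The poolRows description suggests a bounded-pool shuffle: rows are buffered in a pool of fixed size and emitted in random order as new rows arrive. The sketch below shows that general technique in Python, assuming this is what the parameter controls. The function `pool_shuffle` and its arguments are hypothetical names for illustration and are not part of the library.

```python
import random


def pool_shuffle(rows, pool_rows=1000, seed=None):
    """Yield rows in a locally shuffled order using a bounded pool.

    Hypothetical helper illustrating a pooled shuffle; a smaller pool
    uses less memory but mixes rows less thoroughly.
    """
    rng = random.Random(seed)  # seed=None draws state from the environment
    pool = []
    for row in rows:
        pool.append(row)
        if len(pool) >= pool_rows:
            # Move a random row to the end of the pool and emit it.
            i = rng.randrange(len(pool))
            pool[i], pool[-1] = pool[-1], pool[i]
            yield pool.pop()
    # Input exhausted: drain what is left in random order.
    rng.shuffle(pool)
    yield from pool


shuffled = list(pool_shuffle(range(20), pool_rows=5, seed=0))
```

With `pool_rows=1` the input order is preserved, and with a pool at least as large as the dataset the result is a full shuffle.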