module datasets.artificial¶
Short summary¶
module pymlbenchmark.datasets.artificial
Artificial datasets.
Functions¶
function | truncated documentation |
---|---|
random_binary_classification | Returns data for a binary classification problem (linear) with N observations and dim features. |
random_regression | Returns data for a regression problem (linear) with N observations and dim features. |
Documentation¶
Artificial datasets.
- pymlbenchmark.datasets.artificial.rand(d0, d1, ..., dn)¶
Random values in a given shape.
Note
This is a convenience function for users porting code from Matlab, and wraps random_sample. That function takes a tuple to specify the size of the output, which is consistent with other NumPy functions like numpy.zeros and numpy.ones.
Create an array of the given shape and populate it with random samples from a uniform distribution over [0, 1).
Parameters¶
- d0, d1, …, dn : int, optional
The dimensions of the returned array, must be non-negative. If no argument is given a single Python float is returned.
Returns¶
- out : ndarray, shape (d0, d1, ..., dn)
Random values.
See Also¶
random
Examples¶
>>> np.random.rand(3,2)
array([[ 0.14022471,  0.96360618], #random
       [ 0.37601032,  0.25528411], #random
       [ 0.49313049,  0.94909878]]) #random
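The note above states that rand wraps random_sample, which takes the shape as a tuple. A minimal sketch of the two calling conventions, assuming NumPy is available as np; the seed value 0 is an arbitrary choice made here so that both calls draw the same stream.

```python
import numpy as np

# rand takes the dimensions as separate positional arguments.
np.random.seed(0)
a = np.random.rand(3, 2)

# random_sample takes the shape as a single tuple.
np.random.seed(0)
b = np.random.random_sample((3, 2))

# Both draw from the uniform distribution over [0, 1); after reseeding
# with the same value, the two arrays hold the same samples.
same = np.array_equal(a, b)
in_range = bool(((a >= 0) & (a < 1)).all())

# With no argument, rand returns a single float rather than an array.
scalar = np.random.rand()

print(a.shape, same, in_range)
```

The only difference between the two calls is how the shape is passed, which is why the tuple form lines up with functions such as numpy.zeros and numpy.ones.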
- pymlbenchmark.datasets.artificial.randn(d0, d1, ..., dn)¶
Return a sample (or samples) from the “standard normal” distribution.
Note
This is a convenience function for users porting code from Matlab, and wraps standard_normal. That function takes a tuple to specify the size of the output, which is consistent with other NumPy functions like numpy.zeros and numpy.ones.
Note
New code should use the standard_normal method of a default_rng() instance instead; please see the Quick Start.
If positive int_like arguments are provided, randn generates an array of shape (d0, d1, ..., dn), filled with random floats sampled from a univariate “normal” (Gaussian) distribution of mean 0 and variance 1. A single float randomly sampled from the distribution is returned if no argument is provided.
Parameters¶
- d0, d1, …, dn : int, optional
The dimensions of the returned array, must be non-negative. If no argument is given a single Python float is returned.
Returns¶
- Z : ndarray or float
A (d0, d1, ..., dn)-shaped array of floating-point samples from the standard normal distribution, or a single such float if no parameters were supplied.
See Also¶
standard_normal : Similar, but takes a tuple as its argument.
normal : Also accepts mu and sigma arguments.
random.Generator.standard_normal : Which should be used for new code.
Notes¶
For random samples from N(mu, sigma^2), use:
sigma * np.random.randn(...) + mu
Examples¶
>>> np.random.randn()
2.1923875335537315 # random
Two-by-four array of samples from N(3, 6.25):
>>> 3 + 2.5 * np.random.randn(2, 4)
array([[-4.49401501,  4.00950034, -1.81814867,  7.29718677], # random
       [ 0.39924804,  4.68456316,  4.99394529,  4.84057254]]) # random
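The note in this entry recommends the standard_normal method of a default_rng() instance for new code. The following sketch shows the same shift-and-scale recipe with both interfaces; the values of mu and sigma mirror the example above, while the seed and the sample size of 100,000 are arbitrary choices made here for illustration.

```python
import numpy as np

mu, sigma = 3.0, 2.5

# Legacy interface: dimensions are separate positional arguments.
legacy = mu + sigma * np.random.randn(2, 4)

# Generator interface: the shape is passed as a tuple.
rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
modern = mu + sigma * rng.standard_normal((2, 4))

# On a large sample, the empirical mean and standard deviation
# should land close to mu and sigma.
big = mu + sigma * rng.standard_normal(100_000)

print(legacy.shape, modern.shape)
print(big.mean(), big.std())
```

Note that sigma is the standard deviation, so the variance of the resulting samples is sigma squared (6.25 here).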
- pymlbenchmark.datasets.artificial.random_binary_classification(N, dim)¶
Returns data for a binary classification problem (linear) with N observations and dim features.
- Parameters:
N – number of observations
dim – number of features
- Returns:
X, y
<<<
from pymlbenchmark.datasets import random_binary_classification
X, y = random_binary_classification(3, 6)
print(y)
print(X)
>>>
[1 0 0]
[[0.851 0.121 0.711 0.263 0.44  0.297]
 [0.047 0.95  0.434 0.104 0.564 0.943]
 [0.777 0.043 0.647 0.264 0.349 0.635]]
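This page does not show how the labels are derived from the features. As an illustration only, here is one common way to build a linearly separable binary problem with N observations and dim features. The function sketch_binary_classification, its seed parameter, and the median threshold are all hypothetical and are not taken from pymlbenchmark.

```python
import numpy as np

def sketch_binary_classification(n, dim, seed=0):
    """Hypothetical generator; not the pymlbenchmark implementation."""
    rng = np.random.default_rng(seed)
    X = rng.random((n, dim))        # features uniform over [0, 1)
    w = rng.standard_normal(dim)    # random linear coefficients
    scores = X @ w                  # one linear score per observation
    # Threshold at the median so that the two classes are balanced.
    y = (scores > np.median(scores)).astype(np.int64)
    return X, y

X, y = sketch_binary_classification(100, 6)
print(X.shape, y.shape, int(y.sum()))
```

Because the label depends only on the sign of a linear score minus a constant, a linear classifier can separate the two classes exactly in this sketch.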
- pymlbenchmark.datasets.artificial.random_regression(N, dim)¶
Returns data for a regression problem (linear) with N observations and dim features.
- Parameters:
N – number of observations
dim – number of features
- Returns:
X, y
<<<
from pymlbenchmark.datasets import random_regression
X, y = random_regression(3, 6)
print(y)
print(X)
>>>
[1.706 3.282 2.874]
[[0.359 0.48  0.899 0.103 0.274 0.381]
 [0.329 0.699 0.851 0.45  0.775 0.199]
 [0.634 0.994 0.122 0.658 0.772 0.622]]
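As with the classification helper, the page does not show how the target is computed. The sketch below illustrates one plausible construction of a linear regression dataset and checks that least squares recovers the coefficients. The function sketch_regression, its noise and seed parameters, and the returned coefficient vector are hypothetical and are not part of pymlbenchmark.

```python
import numpy as np

def sketch_regression(n, dim, noise=0.1, seed=0):
    """Hypothetical generator; not the pymlbenchmark implementation."""
    rng = np.random.default_rng(seed)
    X = rng.random((n, dim))        # features uniform over [0, 1)
    w = rng.standard_normal(dim)    # true linear coefficients
    # Linear target plus Gaussian noise of standard deviation `noise`.
    y = X @ w + noise * rng.standard_normal(n)
    return X, y, w

X, y, w = sketch_regression(200, 6)

# Ordinary least squares should recover w up to the noise level.
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(X.shape, y.shape)
print(np.abs(w_hat - w).max())
```

Unlike the classification case, y is real-valued here, which matches the floating-point targets printed in the example above.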