module ml.matrices#

Short summary#

module mlstatpy.ml.matrices

Algorithms about matrices.

source on GitHub

Functions#

function

truncated documentation

• gram_schmidt – Applies the Gram–Schmidt process. For performance reasons, …

• linear_regression – Solves the linear regression problem: finds \beta which minimizes \Vert y - X\beta\Vert, based on the …

• norm2 – Computes the square norm for all rows of a matrix.

• streaming_gram_schmidt – Solves the linear regression problem: finds \beta which minimizes \Vert y - X\beta\Vert, based on …

• streaming_gram_schmidt_update – Updates matrix P_k to produce P_{k+1}, the matrix P in algorithm Streaming Linear Regression. …

• streaming_linear_regression – Streaming algorithm to solve a linear regression. See Streaming Linear Regression.

• streaming_linear_regression_gram_schmidt – Streaming algorithm to solve a linear regression with the Gram-Schmidt algorithm. See Streaming Linear Regression version Gram-Schmidt. …

• streaming_linear_regression_gram_schmidt_update – Updates coefficients \beta_k to produce \beta_{k+1} in Streaming Linear Regression. …

• streaming_linear_regression_update – Updates coefficients \beta_k to produce \beta_{k+1} in Streaming Linear Regression. The …

Documentation#

Algorithms about matrices.

source on GitHub

mlstatpy.ml.matrices.gram_schmidt(mat, change=False)#

Applies the Gram–Schmidt process. For performance reasons, every row is considered as a vector.

Parameters:
  • mat – matrix

  • change – if True, also returns the change of basis matrix

Returns:

new matrix or (new matrix, change matrix)

The function assumes the matrix mat is horizontal: it has more columns than rows.

Note

The implementation could be improved by directly calling BLAS functions.

<<<

import numpy
from mlstatpy.ml.matrices import gram_schmidt

X = numpy.array([[1., 2., 3., 4.],
                 [5., 6., 6., 6.],
                 [5., 6., 7., 8.]])
T, P = gram_schmidt(X, change=True)
print(T)
print(P)

>>>

    [[ 0.183  0.365  0.548  0.73 ]
     [ 0.736  0.502  0.024 -0.453]
     [ 0.651 -0.67  -0.181  0.308]]
    [[ 0.183  0.     0.   ]
     [-0.477  0.243  0.   ]
     [-1.814 -1.81   2.303]]
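Under the row-wise convention used above, the returned matrices satisfy T = P @ X and T @ T.T = I. A minimal numpy sketch of a row-wise Gram–Schmidt (an illustration of the process, not the library's implementation) that exhibits the same invariants:

```python
import numpy

def row_gram_schmidt(mat):
    """Orthonormalizes the rows of *mat*; returns (T, P) with T = P @ mat."""
    base = mat.astype(float).copy()
    change = numpy.identity(mat.shape[0])
    for i in range(base.shape[0]):
        # Remove the components along the previously orthonormalized rows,
        # mirroring every operation on the change of basis matrix.
        for j in range(i):
            coef = base[i] @ base[j]
            base[i] -= coef * base[j]
            change[i] -= coef * change[j]
        # Normalize the remaining direction.
        norm = numpy.linalg.norm(base[i])
        base[i] /= norm
        change[i] /= norm
    return base, change

X = numpy.array([[1., 2., 3., 4.],
                 [5., 6., 6., 6.],
                 [5., 6., 7., 8.]])
T, P = row_gram_schmidt(X)
print(numpy.allclose(T, P @ X))                     # change of basis holds
print(numpy.allclose(T @ T.T, numpy.identity(3)))   # rows are orthonormal
```

Both checks print True as long as the rows of X are linearly independent, which is the case here.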

source on GitHub

mlstatpy.ml.matrices.linear_regression(X, y, algo=None)#

Solves the linear regression problem: finds \beta which minimizes \Vert y - X\beta\Vert, based on the algorithm Arbre de décision optimisé pour les régressions linéaires.

Parameters:
  • X – features

  • y – targets

  • algo – None to use the standard algorithm \beta = (X'X)^{-1} X'y, "gram", "qr"

Returns:

beta

<<<

import numpy
from mlstatpy.ml.matrices import linear_regression

X = numpy.array([[1., 2., 3., 4.],
                 [5., 6., 6., 6.],
                 [5., 6., 7., 8.]]).T
y = numpy.array([0.1, 0.2, 0.19, 0.29])
beta = linear_regression(X, y, algo="gram")
print(beta)

>>>

    [ 0.077  0.037 -0.032]

algo=None computes \beta = (X'X)^{-1} X'y. algo='qr' uses a QR decomposition and calls function dtrtri to invert an upper triangular matrix. algo='gram' uses gram_schmidt and then computes the solution of the linear regression (see above for a link to the algorithm).
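The two first strategies can be reproduced with plain numpy. The sketch below (an illustration of the formulas, not the library's code) solves the normal equations for algo=None and mimics the QR path with a triangular solve; both must agree:

```python
import numpy

X = numpy.array([[1., 2., 3., 4.],
                 [5., 6., 6., 6.],
                 [5., 6., 7., 8.]]).T
y = numpy.array([0.1, 0.2, 0.19, 0.29])

# algo=None: beta = (X'X)^{-1} X'y, i.e. the normal equations.
beta_normal = numpy.linalg.solve(X.T @ X, X.T @ y)

# algo='qr' idea: X = QR, then solve the upper triangular system R beta = Q'y.
Q, R = numpy.linalg.qr(X)
beta_qr = numpy.linalg.solve(R, Q.T @ y)

print(beta_normal)
print(numpy.allclose(beta_normal, beta_qr))
```

The QR route avoids forming X'X explicitly, which is numerically safer when X is ill-conditioned.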

source on GitHub

mlstatpy.ml.matrices.norm2(X)#

Computes the square norm for all rows of a matrix.
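Given the row-wise convention used throughout this module, a plausible numpy equivalent (an assumption about the behavior, not the library source) is:

```python
import numpy

def norm2_rows(X):
    """Square of the Euclidean norm of each row of X."""
    return (X ** 2).sum(axis=1)

X = numpy.array([[3., 4.],
                 [1., 1.]])
print(norm2_rows(X))  # -> [25.  2.]
```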

source on GitHub

mlstatpy.ml.matrices.streaming_gram_schmidt(mat, start=None)#

Solves the linear regression problem: finds \beta which minimizes \Vert y - X\beta\Vert, based on algorithm Streaming Gram-Schmidt.

Parameters:
  • mat – matrix

  • start – first row to start iteration, mat.shape[1] by default

Returns:

iterator on the change of basis matrices P_k

The function assumes the matrix mat is horizontal: it has more columns than rows.

<<<

import numpy
from mlstatpy.ml.matrices import streaming_gram_schmidt

X = numpy.array([[1, 0.5, 10., 5., -2.],
                 [0, 0.4, 20, 4., 2.],
                 [0, 0.7, 20, 4., 2.]], dtype=float).T

for i, p in enumerate(streaming_gram_schmidt(X.T)):
    print("iteration", i, "\n", p)
    t = X[:i + 3] @ p.T
    print(t.T @ t)

>>>

    iteration 0 
     [[ 0.099  0.     0.   ]
     [-0.953  0.482  0.   ]
     [-0.287 -3.338  3.481]]
    [[ 1.000e+00 -1.749e-15 -2.234e-15]
     [-1.749e-15  1.000e+00  1.387e-14]
     [-2.234e-15  1.387e-14  1.000e+00]]
    iteration 1 
     [[ 0.089  0.     0.   ]
     [-0.308  0.177  0.   ]
     [-0.03  -3.334  3.348]]
    [[ 1.000e+00 -4.441e-16 -1.793e-15]
     [-4.441e-16  1.000e+00  2.377e-15]
     [-1.793e-15  2.377e-15  1.000e+00]]
    iteration 2 
     [[ 0.088  0.     0.   ]
     [-0.212  0.128  0.   ]
     [-0.016 -3.335  3.342]]
    [[ 1.000e+00 -9.714e-17 -6.210e-15]
     [-9.714e-17  1.000e+00  1.978e-16]
     [-6.210e-15  1.978e-16  1.000e+00]]

source on GitHub

mlstatpy.ml.matrices.streaming_gram_schmidt_update(Xk, Pk)#

Updates matrix P_k to produce P_{k+1}, the matrix P in algorithm Streaming Linear Regression. The function modifies the matrix Pk given as an input.

Parameters:
  • Xk – kth row

  • Pk – matrix P at iteration k-1

source on GitHub

mlstatpy.ml.matrices.streaming_linear_regression(mat, y, start=None)#

Streaming algorithm to solve a linear regression. See Streaming Linear Regression.

Parameters:
  • mat – features

  • y – expected target

Returns:

iterator on coefficients

<<<

import numpy
from mlstatpy.ml.matrices import streaming_linear_regression, linear_regression

X = numpy.array([[1, 0.5, 10., 5., -2.],
                 [0, 0.4, 20, 4., 2.],
                 [0, 0.7, 20, 4., 3.]], dtype=float).T
y = numpy.array([1., 0.3, 10, 5.1, -3.])

for i, bk in enumerate(streaming_linear_regression(X, y)):
    bk0 = linear_regression(X[:i + 3], y[:i + 3])
    print("iteration", i, bk, bk0)

>>>

    iteration 0 [ 1.     0.667 -0.667] [ 1.     0.667 -0.667]
    iteration 1 [ 1.03   0.682 -0.697] [ 1.03   0.682 -0.697]
    iteration 2 [ 1.036  0.857 -0.875] [ 1.036  0.857 -0.875]
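The streaming recursion can be emulated with plain numpy by maintaining the Gram matrix X'X and the vector X'y, applying a rank-1 update for each new row, and re-solving. This is a sketch of the underlying idea, not the library's exact update:

```python
import numpy

X = numpy.array([[1, 0.5, 10., 5., -2.],
                 [0, 0.4, 20, 4., 2.],
                 [0, 0.7, 20, 4., 3.]], dtype=float).T
y = numpy.array([1., 0.3, 10, 5.1, -3.])

dim = X.shape[1]
# Initialize with the first dim observations.
XtX = X[:dim].T @ X[:dim]
Xty = X[:dim].T @ y[:dim]
for k in range(dim, X.shape[0]):
    # Rank-1 update with the new observation (X_k, y_k).
    XtX += numpy.outer(X[k], X[k])
    Xty += X[k] * y[k]
    bk = numpy.linalg.solve(XtX, Xty)
    # Batch solution on the same rows, for comparison.
    bk0 = numpy.linalg.solve(X[:k + 1].T @ X[:k + 1], X[:k + 1].T @ y[:k + 1])
    print(k, bk, numpy.allclose(bk, bk0))
```

At every step the streaming and batch solutions coincide, exactly as in the example above.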

source on GitHub

mlstatpy.ml.matrices.streaming_linear_regression_gram_schmidt(mat, y, start=None)#

Streaming algorithm to solve a linear regression with Gram-Schmidt algorithm. See Streaming Linear Regression version Gram-Schmidt.

Parameters:
  • mat – features

  • y – expected target

Returns:

iterator on coefficients

<<<

import numpy
from mlstatpy.ml.matrices import streaming_linear_regression_gram_schmidt, linear_regression

X = numpy.array([[1, 0.5, 10., 5., -2.],
                 [0, 0.4, 20, 4., 2.],
                 [0, 0.7, 20, 4., 3.]], dtype=float).T
y = numpy.array([1., 0.3, 10, 5.1, -3.])

for i, bk in enumerate(streaming_linear_regression_gram_schmidt(X, y)):
    bk0 = linear_regression(X[:i + 3], y[:i + 3])
    print("iteration", i, bk, bk0)

>>>

    iteration 0 [ 1.     0.667 -0.667] [ 1.     0.667 -0.667]
    iteration 1 [ 1.03   0.682 -0.697] [ 1.03   0.682 -0.697]
    iteration 2 [ 1.036  0.857 -0.875] [ 1.036  0.857 -0.875]

source on GitHub

mlstatpy.ml.matrices.streaming_linear_regression_gram_schmidt_update(Xk, yk, Xkyk, Pk, bk)#

Updates coefficients \beta_k to produce \beta_{k+1} in Streaming Linear Regression. The function modifies the matrix Pk given as an input.

Parameters:
  • Xk – kth row

  • yk – kth target

  • Xkyk – matrix X_{1..k}' y_{1..k} (updated by the function)

  • Pk – Gram-Schmidt matrix produced by the streaming algorithm (updated by the function)

  • bk – current coefficient (updated by the function)

source on GitHub

mlstatpy.ml.matrices.streaming_linear_regression_update(Xk, yk, XkXk, bk)#

Updates coefficients \beta_k to produce \beta_{k+1} in Streaming Linear Regression. The function modifies the matrix XkXk given as an input.

Parameters:
  • Xk – kth row

  • yk – kth target

  • XkXk – matrix X_{1..k}' X_{1..k} (updated by the function)

  • bk – current coefficient (updated by the function)

source on GitHub