API

Tools

td3a_cpp.tools.measure_time(stmt, context, repeat=10, number=50, div_by_number=True)

Measures a statement and returns the results as a dictionary.

Parameters
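  • stmt – statement to measure, as a string

  • context – dictionary of the variables the statement needs

  • repeat – number of experiments the results are averaged over

  • number – number of executions of the statement in each experiment

  • div_by_number – if True, divide the measured durations by number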

Returns

dictionary

<<<

from math import cos
import pprint
from td3a_cpp.tools import measure_time

res = measure_time("cos(x)", context=dict(cos=cos, x=5.))
pprint.pprint(res)

>>>

    {'average': 5.427557043731212e-07,
     'context_size': 232,
     'deviation': 3.379480387959381e-08,
     'max_exec': 6.397999823093414e-07,
     'min_exec': 5.105999298393727e-07,
     'number': 50,
     'repeat': 10}

See Timer.repeat for a better understanding of the parameters repeat and number. Each raw measurement is the duration of number executions of the main statement; it is divided by number when div_by_number is True.
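The roles of repeat, number and div_by_number can be sketched with the standard timeit module alone. This is only an illustration, assuming the function wraps Timer.repeat as the note suggests; the helper name sketch_measure is hypothetical and is not part of td3a_cpp.

```python
from math import cos
from timeit import Timer


def sketch_measure(stmt, context, repeat=10, number=50, div_by_number=True):
    """Hypothetical helper, not the td3a_cpp implementation."""
    timer = Timer(stmt, globals=context)
    # Timer.repeat returns `repeat` durations; each one covers
    # `number` consecutive executions of the statement.
    durations = timer.repeat(repeat=repeat, number=number)
    if div_by_number:
        # Convert each duration into a time per single execution.
        durations = [d / number for d in durations]
    return {
        "average": sum(durations) / len(durations),
        "min_exec": min(durations),
        "max_exec": max(durations),
        "repeat": repeat,
        "number": number,
    }


res = sketch_measure("cos(x)", context=dict(cos=cos, x=5.0))
print(res["repeat"], res["number"])
```

With div_by_number=False the same dictionary would report the time of 50 executions instead of one, which is worth keeping in mind when comparing results obtained with different values of number.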

td3a_cpp.tools.measure_time_dim(stmt, contexts, repeat=10, number=50, div_by_number=True, verbose=0)

Measures a statement multiple times, once per context, with function measure_time().

Parameters
  • stmt – statement to measure, as a string

  • contexts – list of dictionaries, each holding the variables the statement needs; every context must include the field ‘x_name’, which is copied into the result

  • repeat – number of experiments the results are averaged over

  • number – number of executions of the statement in each experiment

  • div_by_number – if True, divide the measured durations by number

  • verbose – if > 0, use tqdm to display progress

Returns

yields one dictionary per context

<<<

import pprint
import numpy
from td3a_cpp.tools import measure_time_dim

res = list(measure_time_dim(
    "cos(x)",
    contexts=[dict(cos=numpy.cos, x=numpy.arange(10), x_name=10),
              dict(cos=numpy.cos, x=numpy.arange(100), x_name=100)]))
pprint.pprint(res)

>>>

    [{'average': 1.3470353791490195e-05,
      'context_size': 232,
      'deviation': 3.682136324970792e-07,
      'max_exec': 1.4192238450050353e-05,
      'min_exec': 1.3026059605181217e-05,
      'number': 50,
      'repeat': 10,
      'x_name': 10},
     {'average': 2.556998818181455e-05,
      'context_size': 232,
      'deviation': 3.4348012296358253e-07,
      'max_exec': 2.622612053528428e-05,
      'min_exec': 2.527392003685236e-05,
      'number': 50,
      'repeat': 10,
      'x_name': 100}]

See Timer.repeat for a better understanding of the parameters repeat and number. Each raw measurement is the duration of number executions of the main statement; it is divided by number when div_by_number is True.
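The contract on contexts (one result per context, with ‘x_name’ copied into it) can be illustrated by a small generator built on timeit. The name sketch_measure_dim is hypothetical and the real function may be organized differently; this is a sketch of the idea only.

```python
from timeit import Timer


def sketch_measure_dim(stmt, contexts, repeat=10, number=50):
    """Hypothetical generator, not the td3a_cpp implementation."""
    for context in contexts:
        # x_name is only a label for the experiment; it is copied
        # into the result so each row can be identified later.
        if "x_name" not in context:
            raise ValueError("every context must include 'x_name'")
        durations = Timer(stmt, globals=context).repeat(
            repeat=repeat, number=number)
        yield {
            # Average time of a single execution of the statement.
            "average": sum(durations) / (len(durations) * number),
            "x_name": context["x_name"],
        }


contexts = [dict(f=sum, x=list(range(n)), x_name=n) for n in (10, 100)]
rows = list(sketch_measure_dim("f(x)", contexts))
print([row["x_name"] for row in rows])
```

Because the results are yielded lazily, nothing is measured until the generator is consumed, for instance with list() as in the example.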

Tutorial

dot

td3a_cpp.tutorial.pydot(va, vb)

Implements the dot product between two vectors.

Parameters
  • va – first vector

  • vb – second vector

Returns

dot product
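For reference, a pure-Python dot product can be written in a few lines. The function pydot_sketch below is a hypothetical stand-in used for illustration; it is not claimed to match the body of pydot.

```python
def pydot_sketch(va, vb):
    """Hypothetical pure-Python dot product, for illustration only."""
    if len(va) != len(vb):
        raise ValueError("both vectors must have the same length")
    total = 0.0
    # Accumulate the products of matching coordinates.
    for a, b in zip(va, vb):
        total += a * b
    return total


print(pydot_sketch([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))  # 32.0
```

Such a loop is the natural baseline against which the Cython and C variants of this section can be timed.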

td3a_cpp.tutorial.cblas_ddot()

Computes a dot product with cblas_ddot.

Parameters
  • x – first vector, dtype must be float64

  • y – second vector, dtype must be float64

Returns

dot product

td3a_cpp.tutorial.dot_cython.dot_product()

Python dot product, but written in a Cython file.

Parameters
  • va – first vector

  • vb – second vector

Returns

dot product

float32

td3a_cpp.tutorial.dot_cython.sdot_cython_array()

Dot product implemented with C types.

Parameters
  • va – first vector, dtype must be float32

  • vb – second vector, dtype must be float32

Returns

dot product

td3a_cpp.tutorial.dot_cython.sdot_cython_array_optim()

Dot product implemented with C types, with checks disabled (see compiler directives).

Parameters
  • va – first vector, dtype must be float32

  • vb – second vector, dtype must be float32

Returns

dot product

td3a_cpp.tutorial.dot_cython.sdot_array()

Dot product implemented with C types, with checks disabled (see compiler directives) and nogil. It is a wrapper for a C function, since such functions cannot be exposed to the Python world (the GIL is disabled).

Parameters
  • va – first vector, dtype must be float32

  • vb – second vector, dtype must be float32

Returns

dot product

td3a_cpp.tutorial.dot_cython.sdot_array_16()

Dot product implemented with C types, with checks disabled (see compiler directives) and nogil. It is a wrapper for a C function, since such functions cannot be exposed to the Python world (the GIL is disabled). Computation is done 16 by 16 to benefit from branching.

Parameters
  • va – first vector, dtype must be float32

  • vb – second vector, dtype must be float32

Returns

dot product
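The "16 by 16" computation presumably means that the loop handles 16 elements per iteration, with a separate pass for the leftover elements. The Python sketch below illustrates that structure under this assumption; dot_by_16_sketch is a hypothetical name, and the library's C code is not reproduced here.

```python
def dot_by_16_sketch(va, vb):
    """Hypothetical blocked dot product; not the library's C code."""
    n = len(va)
    total = 0.0
    full = n - n % 16  # index where the last complete block ends
    for start in range(0, full, 16):
        # One block of 16 products handled together; a C version
        # would typically write these 16 terms out explicitly.
        block = 0.0
        for k in range(16):
            block += va[start + k] * vb[start + k]
        total += block
    # Remaining elements when the length is not a multiple of 16.
    for i in range(full, n):
        total += va[i] * vb[i]
    return total


va = [float(i) for i in range(40)]
vb = [1.0] * 40
print(dot_by_16_sketch(va, vb))  # 780.0
```

In Python this layout brings no speed-up; it only shows how the index range is split into complete blocks and a remainder.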

td3a_cpp.tutorial.dot_cython.sdot_array_16_sse()

Dot product implemented with C types, with checks disabled (see compiler directives) and nogil. It is a wrapper for a C function, since such functions cannot be exposed to the Python world (the GIL is disabled). Computation is done 16 by 16 to benefit from branching and uses AVX instructions.

Parameters
  • va – first vector, dtype must be float32

  • vb – second vector, dtype must be float32

Returns

dot product

double = float64

td3a_cpp.tutorial.dot_cython.ddot_cython_array()

Dot product implemented with C types.

Parameters
  • va – first vector, dtype must be float64

  • vb – second vector, dtype must be float64

Returns

dot product

td3a_cpp.tutorial.dot_cython.ddot_cython_array_optim()

Dot product implemented with C types, with checks disabled (see compiler directives).

Parameters
  • va – first vector, dtype must be float64

  • vb – second vector, dtype must be float64

Returns

dot product

td3a_cpp.tutorial.dot_cython.ddot_array()

Dot product implemented with C types, with checks disabled (see compiler directives) and nogil. It is a wrapper for a C function, since such functions cannot be exposed to the Python world (the GIL is disabled).

Parameters
  • va – first vector, dtype must be float64

  • vb – second vector, dtype must be float64

Returns

dot product

td3a_cpp.tutorial.dot_cython.ddot_array_16()

Dot product implemented with C types, with checks disabled (see compiler directives) and nogil. It is a wrapper for a C function, since such functions cannot be exposed to the Python world (the GIL is disabled). Computation is done 16 by 16 to benefit from branching.

Parameters
  • va – first vector, dtype must be float64

  • vb – second vector, dtype must be float64

Returns

dot product

td3a_cpp.tutorial.dot_cython.ddot_array_16_sse()

Dot product implemented with C types, with checks disabled (see compiler directives) and nogil. It is a wrapper for a C function, since such functions cannot be exposed to the Python world (the GIL is disabled). Computation is done 16 by 16 to benefit from branching and uses AVX instructions.

Parameters
  • va – first vector, dtype must be float64

  • vb – second vector, dtype must be float64

Returns

dot product

openmp

td3a_cpp.tutorial.dot_cython_omp.get_omp_max_threads()

Returns the number of threads.

td3a_cpp.tutorial.dot_cython_omp.ddot_cython_array_omp()

Dot product implemented with Cython and C types using prange (OpenMP in Cython).

Parameters
  • va – first vector, dtype must be float64

  • vb – second vector, dtype must be float64

  • chunksize – see prange

  • schedule – 0 for a simple prange, 1 for ‘static’, 2 for ‘dynamic’

Returns

dot product
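The idea behind a chunked parallel loop can be sketched in plain Python: the index range is cut into pieces of chunksize elements, each piece yields a partial sum, and the partial sums are then added together. The example below uses a thread pool from the standard library and does not reproduce OpenMP scheduling; the names partial_dot and chunked_dot_sketch, and the max_workers parameter, are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_dot(va, vb, start, stop):
    # Dot product restricted to indices [start, stop).
    return sum(va[i] * vb[i] for i in range(start, stop))


def chunked_dot_sketch(va, vb, chunksize=4, max_workers=2):
    """Hypothetical chunked reduction, for illustration only."""
    n = len(va)
    bounds = [(s, min(s + chunksize, n)) for s in range(0, n, chunksize)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        parts = pool.map(lambda b: partial_dot(va, vb, *b), bounds)
        # Reduction step: combine the partial sums of every chunk.
        return sum(parts)


va = [float(i) for i in range(10)]
vb = [2.0] * 10
print(chunked_dot_sketch(va, vb))  # 90.0
```

Python threads give no speed-up for this kind of loop because of the GIL, which is one reason the library relies on prange and nogil code instead.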

td3a_cpp.tutorial.dot_cython_omp.ddot_array_openmp()

Dot product using OpenMP inside C++ code.

Parameters
  • va – first vector, dtype must be float64

  • vb – second vector, dtype must be float64

Returns

dot product

td3a_cpp.tutorial.dot_cython_omp.ddot_array_openmp_16()

Dot product using OpenMP inside C++ code; parallelizes the 16 by 16 computation.

Parameters
  • va – first vector, dtype must be float64

  • vb – second vector, dtype must be float64

Returns

dot product

filter

td3a_cpp.tutorial.experiment_cython.pyfilter_dmax()

Replaces all values greater than mx with mx. Python inside Cython.

Parameters
  • va – first vector

  • mx – maximum
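The filter itself is a simple capping operation. The sketch below assumes the vector is modified in place, which the absence of a return value suggests but the documentation does not state; filter_max_sketch is a hypothetical name.

```python
def filter_max_sketch(values, mx):
    """Hypothetical in-place filter; not the library's implementation."""
    for i in range(len(values)):
        # Cap any value above the threshold at the threshold.
        if values[i] > mx:
            values[i] = mx


data = [0.5, 3.0, 1.5, 7.0]
filter_max_sketch(data, 2.0)
print(data)  # [0.5, 2.0, 1.5, 2.0]
```

The variants listed in this section differ in how this loop is compiled and written, not in what it computes.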

td3a_cpp.tutorial.experiment_cython.filter_dmax_cython()

Replaces all values greater than mx with mx. Simple Cython.

Parameters
  • va – first vector

  • mx – maximum

td3a_cpp.tutorial.experiment_cython.filter_dmax_cython_optim()

Replaces all values greater than mx with mx. Simple Cython with bounds checking and wraparound disabled.

Parameters
  • va – first vector

  • mx – maximum

td3a_cpp.tutorial.experiment_cython.cyfilter_dmax()

Replaces all values greater than mx with mx. Wraps a C function implemented in Cython.

Parameters
  • va – first vector

  • mx – maximum

td3a_cpp.tutorial.experiment_cython.cfilter_dmax()

Replaces all values greater than mx with mx. Wraps a C function implemented in C.

Parameters
  • va – first vector

  • mx – maximum

td3a_cpp.tutorial.experiment_cython.cfilter_dmax2()

Replaces all values greater than mx with mx. Wraps a C function implemented in C, but uses the operator ? : instead of the keyword if.

Parameters
  • va – first vector

  • mx – maximum

td3a_cpp.tutorial.experiment_cython.cfilter_dmax4()

Replaces all values greater than mx with mx. Wraps a C function implemented in C, but uses the operator ? : instead of the keyword if. Goes 4 by 4.

Parameters
  • va – first vector

  • mx – maximum

td3a_cpp.tutorial.experiment_cython.cfilter_dmax16()

Replaces all values greater than mx with mx. Wraps a C function implemented in C, but uses the operator ? : instead of the keyword if. Goes 16 by 16.

Parameters
  • va – first vector

  • mx – maximum

matrix multiplication

td3a_cpp.tutorial.mul_cython_omp.dmul_cython_omp()

Matrix multiplication implemented with Cython and C types using prange (OpenMP in Cython).

Parameters
  • va – first matrix, dtype must be float64

  • vb – second matrix, dtype must be float64

  • algo – algorithm (see below)

  • parallel – kind of parallelization (see below)

Returns

matrix multiplication
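As a point of comparison, the textbook matrix product can be written as a triple loop over plain lists of rows. This sketch does not correspond to any particular value of algo or parallel, and matmul_sketch is a hypothetical name; it only shows the computation that such functions accelerate.

```python
def matmul_sketch(a, b):
    """Hypothetical naive matrix product on lists of rows."""
    n, k, m = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions must match")
    result = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            # Entry (i, j) is the dot product of row i of a
            # and column j of b.
            result[i][j] = sum(a[i][p] * b[p][j] for p in range(k))
    return result


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
print(matmul_sketch(a, b))  # [[19.0, 22.0], [43.0, 50.0]]
```

Each row of the result depends only on one row of the first matrix, so the outer loop is a natural candidate for parallelization.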