module fastdata.pandas2numpy

Short summary

module cpyquickhelper.fastdata.pandas2numpy

Fast data manipulations.


Functions

function – truncated documentation

  • df2array – Converts a dataframe into a numpy.array without copying. pandas is merging consecutive columns sharing …

  • df2arrays – Converts a dataframe into a list of tuples (column name, numpy.array) without copying. pandas …

Documentation

Fast data manipulations.


cpyquickhelper.fastdata.pandas2numpy.df2array(df, check=True)[source]

Converts a dataframe into a numpy.array without copying. pandas merges consecutive columns sharing the same type into one memory block. As a consequence, the function can be used only if all the data is stored in a single block with a single type.

Parameters
  • df – dataframe

  • check – verifies that the operation can be done (True) or skips the verification (False)

Returns

numpy.array

See the dataframe's internal data member, _data.
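The single-type requirement can be checked beforehand with the public pandas API. The sketch below is only an illustration: has_single_dtype is a hypothetical helper, not part of cpyquickhelper, and it tests a necessary condition (one dtype for all columns) without inspecting how pandas actually stores the blocks.

```python
from pandas import DataFrame


def has_single_dtype(df):
    """Hypothetical helper: True if every column of df has the same dtype.

    This only checks the dtype condition; it does not verify that
    pandas stores the columns in a single memory block.
    """
    return len(set(df.dtypes)) == 1


# Two float columns: a candidate for a single-block conversion.
same = DataFrame({"a": [3.4, 3.5], "b": [5.6, 5.7]})
# A float column and a string column: more than one dtype.
mixed = DataFrame({"a": [3.4, 3.5], "c": ["e", "r"]})

print(has_single_dtype(same))
print(has_single_dtype(mixed))
```

A dataframe that fails this test holds more than one type, so df2arrays is the better fit for it.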

See also

df2arrays


cpyquickhelper.fastdata.pandas2numpy.df2arrays(df, sep=', ', check=True)[source]

Converts a dataframe into a list of tuples (column name, numpy.array) without copying. pandas merges consecutive columns sharing the same type into one memory block; the function extracts these blocks.

Parameters
  • df – dataframe

  • check – verifies that the operation can be done (True) or skips the verification (False)

  • sep – separator used to join the names of the columns merged into one block

Returns

a list of tuples (column, array)

Example:

<<<

from pandas import DataFrame
from cpyquickhelper.fastdata import df2arrays

df = DataFrame([dict(a=3.4, b=5.6, c="e"),
                dict(a=3.5, b=5.7, c="r")])
arr = df2arrays(df)
print(arr)

>>>

    [('a,b', array([[3.4, 3.5],
           [5.6, 5.7]])), ('c', array([['e', 'r']], dtype=object))]
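The grouping visible in this output (columns a and b merged under the name 'a,b', column c alone) can be sketched in pure Python. This is a minimal illustration, not the library's implementation: group_consecutive_columns is a hypothetical helper, the dtypes are given as plain strings, and the separator "," is chosen to match the printed output.

```python
from itertools import groupby


def group_consecutive_columns(columns, sep=","):
    """Groups consecutive (name, dtype) pairs sharing the same dtype.

    Hypothetical helper, not part of cpyquickhelper or pandas.
    Returns a list of (joined column names, dtype) tuples.
    """
    groups = []
    # groupby only merges neighbours, so two columns of the same dtype
    # separated by a column of another dtype stay in distinct groups.
    for dtype, pairs in groupby(columns, key=lambda pair: pair[1]):
        names = [name for name, _ in pairs]
        groups.append((sep.join(names), dtype))
    return groups


# Column names and dtypes of the dataframe used in the example.
columns = [("a", "float64"), ("b", "float64"), ("c", "object")]
blocks = group_consecutive_columns(columns)
print(blocks)  # [('a,b', 'float64'), ('c', 'object')]
```

When this grouping yields exactly one group, the dataframe holds a single type, which is the situation df2array requires.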

See also

df2array
