.. _f-dataframe:

module ``df.dataframe``
=======================

.. inheritance-diagram:: pandas_streaming.df.dataframe

Short summary
+++++++++++++

module ``pandas_streaming.df.dataframe``

Defines a streaming dataframe.

:githublink:`%|py|7`

Classes
+++++++

+----------------------------------------+-------------------------------------------------------------------------------------------------------------------------+
| class                                  | truncated documentation                                                                                                 |
+========================================+=========================================================================================================================+
| :class:`StreamingDataFrame`            | Defines a streaming dataframe. The goal is to reduce the memory footprint. The class takes a function which creates ... |
+----------------------------------------+-------------------------------------------------------------------------------------------------------------------------+
| :class:`StreamingDataFrameSchemaError` | Reveals an issue with inconsistent schemas.                                                                             |
+----------------------------------------+-------------------------------------------------------------------------------------------------------------------------+
| :class:`StreamingSeries`               | Seen as a :class:`StreamingDataFrame` of one column.                                                                    |
+----------------------------------------+-------------------------------------------------------------------------------------------------------------------------+

Properties
++++++++++

+-----------------+-----------------------------------------------------------------------------------------------------------------+
| property        | truncated documentation                                                                                         |
+=================+=================================================================================================================+
| :meth:`columns` | See :epkg:`pandas:DataFrame:columns`.                                                                           |
+-----------------+-----------------------------------------------------------------------------------------------------------------+
| :meth:`columns` | See :epkg:`pandas:DataFrame:columns`.                                                                           |
+-----------------+-----------------------------------------------------------------------------------------------------------------+
| :meth:`dtypes`  | See :epkg:`pandas:DataFrame:dtypes`.                                                                            |
+-----------------+-----------------------------------------------------------------------------------------------------------------+
| :meth:`dtypes`  | See :epkg:`pandas:DataFrame:dtypes`.                                                                            |
+-----------------+-----------------------------------------------------------------------------------------------------------------+
| :meth:`shape`   | This is the kind of operation you do not want to do when a file is large because it goes through the whole ...  |
+-----------------+-----------------------------------------------------------------------------------------------------------------+
| :meth:`shape`   | This is the kind of operation you do not want to do when a file is large because it goes through the whole ...  |
+-----------------+-----------------------------------------------------------------------------------------------------------------+

Static Methods
++++++++++++++

+----------------------------+-------------------------------------------------------------------------------------------------------------------------------------+
| staticmethod               | truncated documentation                                                                                                             |
+============================+=====================================================================================================================================+
| :py:meth:`_process_kwargs` | Filters out parameters for the constructor of this class.                                                                           |
+----------------------------+-------------------------------------------------------------------------------------------------------------------------------------+
| :py:meth:`_process_kwargs` | Filters out parameters for the constructor of this class.                                                                           |
+----------------------------+-------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`read_csv`           | Reads a :epkg:`csv` file or buffer as an iterator on :epkg:`DataFrame`. The signature is the same as :epkg:`pandas:read_csv`. ...   |
+----------------------------+-------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`read_csv`           | Reads a :epkg:`csv` file or buffer as an iterator on :epkg:`DataFrame`. The signature is the same as :epkg:`pandas:read_csv`. ...   |
+----------------------------+-------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`read_df`            | Splits a :epkg:`DataFrame` into small chunks mostly for unit testing purposes.                                                      |
+----------------------------+-------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`read_df`            | Splits a :epkg:`DataFrame` into small chunks mostly for unit testing purposes.                                                      |
+----------------------------+-------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`read_json`          | Reads a :epkg:`json` file or buffer as an iterator on :epkg:`DataFrame`. The signature is the same as :epkg:`pandas:read_json`. ... |
+----------------------------+-------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`read_json`          | Reads a :epkg:`json` file or buffer as an iterator on :epkg:`DataFrame`. The signature is the same as :epkg:`pandas:read_json`. ... |
+----------------------------+-------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`read_str`           | Reads a :epkg:`DataFrame` as an iterator on :epkg:`DataFrame`. The signature is the same as :epkg:`pandas:read_csv`. ...            |
+----------------------------+-------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`read_str`           | Reads a :epkg:`DataFrame` as an iterator on :epkg:`DataFrame`. The signature is the same as :epkg:`pandas:read_csv`. ...            |
+----------------------------+-------------------------------------------------------------------------------------------------------------------------------------+

Methods
+++++++

+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| method                         | truncated documentation                                                                                                                 |
+================================+=========================================================================================================================================+
| :py:meth:`__add__`             | Performs an addition on every value, assuming it is meaningful.                                                                         |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :py:meth:`__del__`             | Calls every function in `_delete_`.                                                                                                     |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :py:meth:`__del__`             | Calls every function in `_delete_`.                                                                                                     |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :py:meth:`__getitem__`         | Implements some of the functionalities :epkg:`pandas` offers for the operator ``[]``.                                                   |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :py:meth:`__getitem__`         | Implements some of the functionalities :epkg:`pandas` offers for the operator ``[]``.                                                   |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :py:meth:`__init__`            |                                                                                                                                         |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :py:meth:`__init__`            |                                                                                                                                         |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :py:meth:`__iter__`            | Iterator on a large file with a sliding window. Each window is a :epkg:`DataFrame`. The method stores a ...                             |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :py:meth:`__iter__`            | Iterator on a large file with a sliding window. Each window is a :epkg:`DataFrame`. The method stores a ...                             |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :py:meth:`__setitem__`         | Only a limited set of operators is supported.                                                                                           |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :py:meth:`__setitem__`         | Only a limited set of operators is supported.                                                                                           |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :py:meth:`_concath`            |                                                                                                                                         |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :py:meth:`_concath`            |                                                                                                                                         |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :py:meth:`_concatv`            |                                                                                                                                         |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :py:meth:`_concatv`            |                                                                                                                                         |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :py:meth:`_reservoir_sampling` | Uses the reservoir sampling algorithm to draw a random sample ...                                                                       |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :py:meth:`_reservoir_sampling` | Uses the reservoir sampling algorithm to draw a random sample ...                                                                       |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`add_column`             | Implements some of the functionalities :epkg:`pandas` offers for the operator ``[]``.                                                   |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`add_column`             | Implements some of the functionalities :epkg:`pandas` offers for the operator ``[]``.                                                   |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`apply`                  | Applies :epkg:`pandas:DataFrame:apply`. This function returns a :class:`StreamingDataFrame`.                                            |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`apply`                  | Applies :epkg:`pandas:Series:apply`. This function returns a :class:`StreamingSeries`.                                                  |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`applymap`               | Applies :epkg:`pandas:DataFrame:applymap`. This function returns a :class:`StreamingDataFrame`.                                         |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`applymap`               | Applies :epkg:`pandas:DataFrame:applymap`. This function returns a :class:`StreamingDataFrame`.                                         |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`concat`                 | Concatenates :epkg:`dataframes`. The function ensures all :epkg:`pandas:DataFrame` or :class:`StreamingDataFrame` ...                   |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`concat`                 | Concatenates :epkg:`dataframes`. The function ensures all :epkg:`pandas:DataFrame` or :class:`StreamingDataFrame` ...                   |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`describe`               | Calls :epkg:`pandas:DataFrame:describe` on every piece of the datasets. *percentiles* are not really accurate ...                       |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`describe`               | Calls :epkg:`pandas:DataFrame:describe` on every piece of the datasets. *percentiles* are not really accurate ...                       |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`drop`                   | Applies :epkg:`pandas:DataFrame:drop`. This function returns a :class:`StreamingDataFrame`.                                             |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`drop`                   | Applies :epkg:`pandas:DataFrame:drop`. This function returns a :class:`StreamingDataFrame`.                                             |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`ensure_dtype`           | Ensures the :epkg:`dataframe` *df* has types indicated in dtypes. Changes it if not.                                                    |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`ensure_dtype`           | Ensures the :epkg:`dataframe` *df* has types indicated in dtypes. Changes it if not.                                                    |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`fillna`                 | Replaces the missing values, calls :epkg:`pandas:DataFrame:fillna`.                                                                     |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`fillna`                 | Replaces the missing values, calls :epkg:`pandas:DataFrame:fillna`.                                                                     |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`get_kwargs`             | Returns the parameters used to call the constructor.                                                                                    |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`get_kwargs`             | Returns the parameters used to call the constructor.                                                                                    |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`groupby`                | Implements the streaming :epkg:`pandas:DataFrame:groupby`. We assume the result holds in memory. The out-of-memory ...                  |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`groupby`                | Implements the streaming :epkg:`pandas:DataFrame:groupby`. We assume the result holds in memory. The out-of-memory ...                  |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`groupby_streaming`      | Implements the streaming :epkg:`pandas:DataFrame:groupby`. We assume the result holds in memory. The out-of-memory ...                  |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`groupby_streaming`      | Implements the streaming :epkg:`pandas:DataFrame:groupby`. We assume the result holds in memory. The out-of-memory ...                  |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`head`                   | Returns the first rows as a :epkg:`DataFrame`.                                                                                          |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`head`                   | Returns the first rows as a :epkg:`DataFrame`.                                                                                          |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`is_stable`              | Tells if the :epkg:`dataframe` is supposed to be stable.                                                                                |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`is_stable`              | Tells if the :epkg:`dataframe` is supposed to be stable.                                                                                |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`iterrows`               | See :epkg:`pandas:DataFrame:iterrows`.                                                                                                  |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`iterrows`               | See :epkg:`pandas:DataFrame:iterrows`.                                                                                                  |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`merge`                  | Merges two :class:`StreamingDataFrame` and returns :class:`StreamingDataFrame`. *right* can be either a :class:`StreamingDataFrame` ... |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`merge`                  | Merges two :class:`StreamingDataFrame` and returns :class:`StreamingDataFrame`. *right* can be either a :class:`StreamingDataFrame` ... |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`sample`                 | See :epkg:`pandas:DataFrame:sample`. Only *frac* is available, otherwise choose :meth:`reservoir_sampling`. ...                         |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`sample`                 | See :epkg:`pandas:DataFrame:sample`. Only *frac* is available, otherwise choose :meth:`reservoir_sampling`. ...                         |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`sort_values`            | Sorts the streaming dataframe by values.                                                                                                |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`sort_values`            | Sorts the streaming dataframe by values.                                                                                                |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`tail`                   | Returns the last rows as a :epkg:`DataFrame`. The size of chunks must be greater than ``n`` to get ``n`` ...                            |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`tail`                   | Returns the last rows as a :epkg:`DataFrame`. The size of chunks must be greater than ``n`` to get ``n`` ...                            |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`to_csv`                 | Saves the :epkg:`DataFrame` into a string. See :epkg:`pandas:DataFrame.to_csv`.                                                         |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`to_csv`                 | Saves the :epkg:`DataFrame` into a string. See :epkg:`pandas:DataFrame.to_csv`.                                                         |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`to_dataframe`           | Converts everything into a single :epkg:`DataFrame`.                                                                                    |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`to_dataframe`           | Converts everything into a single :epkg:`DataFrame`.                                                                                    |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`to_df`                  | Converts everything into a single :epkg:`DataFrame`.                                                                                    |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`to_df`                  | Converts everything into a single :epkg:`DataFrame`.                                                                                    |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`train_test_split`       | Randomly splits a :epkg:`dataframe` into smaller pieces. The function returns streams of file names. It ...                             |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`train_test_split`       | Randomly splits a :epkg:`dataframe` into smaller pieces. The function returns streams of file names. It ...                             |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`where`                  | Applies :epkg:`pandas:DataFrame:where`. *inplace* must be False. This function returns a :class:`StreamingDataFrame`. ...               |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+
| :meth:`where`                  | Applies :epkg:`pandas:DataFrame:where`. *inplace* must be False. This function returns a :class:`StreamingDataFrame`. ...               |
+--------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------+

Documentation
+++++++++++++

.. automodule:: pandas_streaming.df.dataframe
    :members:
    :special-members: __init__
    :show-inheritance: