module pandashelper.tblformat

Short summary

module pyquickhelper.pandashelper.tblformat

To format a pandas dataframe

Functions

  • df2html – Converts the table into an HTML string.

  • df2rst – Builds a string in RST format from a dataframe.

  • enumerate_split_df – Splits a dataframe by columns to display shorter dataframes.

Documentation

pyquickhelper.pandashelper.tblformat.df2html(self, class_table=None, class_td=None, class_tr=None, class_th=None)

Converts the table into an HTML string.

Parameters
  • self – dataframe (to be added as a class method)

  • class_table – adds a class to the tag table (None for none)

  • class_td – adds a class to the tag td (None for none)

  • class_tr – adds a class to the tag tr (None for none)

  • class_th – adds a class to the tag th (None for none)
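The effect of the four class parameters can be illustrated with a small standalone sketch. The helper `rows_to_html` below is hypothetical and is not the implementation of df2html; it only assumes that each parameter, when not None, ends up as a `class` attribute on the matching tag.

```python
from html import escape


def rows_to_html(header, rows, class_table=None, class_td=None,
                 class_tr=None, class_th=None):
    """Hypothetical helper: renders rows as an HTML table with optional classes."""

    def tag(name, cls):
        # Emit the class attribute only when a class name was given.
        return f'<{name} class="{escape(cls)}">' if cls else f"<{name}>"

    lines = [tag("table", class_table)]
    # Header row: one th cell per column name.
    cells = "".join(f"{tag('th', class_th)}{escape(str(h))}</th>" for h in header)
    lines.append(f"{tag('tr', class_tr)}{cells}</tr>")
    # Body rows: one td cell per value.
    for row in rows:
        cells = "".join(f"{tag('td', class_td)}{escape(str(v))}</td>" for v in row)
        lines.append(f"{tag('tr', class_tr)}{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


html_text = rows_to_html(["A", "B"], [[0, "text"]],
                         class_table="tbl", class_td="cell")
print(html_text)
```

Leaving a parameter to None produces a bare tag, which matches the "None for none" wording of the parameter descriptions.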

pyquickhelper.pandashelper.tblformat.df2rst(df, add_line=True, align='l', column_size=None, index=False, list_table=False, title=None, header=True, sep=', ', number_format=None, replacements=None, split_row=None, split_row_level='+', split_col_common=None, split_col_subsets=None, filter_rows=None, label_pattern=None)

Builds a string in RST format from a dataframe.

Parameters
  • df – dataframe

  • add_line – (bool) add a line separator between each row

  • align – r or l or c (right, left or center)

  • column_size – something like [1, 2, 5] to multiply the column size, a dictionary (if list_table is False) to overwrite a column size like {'col_name1': 20} or {3: 20}

  • index – add the index

  • list_table – use the list-table directive instead of a grid table

  • title – used only if list_table is True

  • header – add one header

  • sep – separator used when df is a string, in which case it is the name of a file to load

  • number_format – formats numbers in a specific way, if number_format is an integer, the pattern is replaced by {numpy.float64: '{:.2g}'} (if number_format is 2), see also pyformat.info

  • replacements – replacements just before converting into RST (dictionary)

  • split_row – displays several tables, one column is used as the name of each section

  • split_row_level – title level if option split_row is used

  • split_col_common – splits the dataframe by columns, see enumerate_split_df

  • split_col_subsets – splits the dataframe by columns, see enumerate_split_df

  • filter_rows – None or function to remove rows, signature def filter_rows(df: DataFrame) -> DataFrame

  • label_pattern – if split_row is used, the function may insert a label in front of every section, example: ".. _lpy-{section}:"

Returns

string
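The number_format parameter maps a type to a format pattern such as '{:.2g}'. The sketch below shows what such a pattern does to a few values. The helper `format_cell` is hypothetical and uses the builtin `float` instead of `numpy.float64` to stay self-contained; it is not the library's code.

```python
def format_cell(value, number_format):
    """Hypothetical helper: applies the pattern registered for the value's type."""
    pattern = number_format.get(type(value))
    # Values whose type has no pattern are converted with str().
    return pattern.format(value) if pattern else str(value)


# Same shape as the dictionary described for number_format=2.
number_format = {float: "{:.2g}"}
cells = [format_cell(v, number_format) for v in [0.0, 1e-5, 3.14159, "text", 7]]
print(cells)
```

With two significant digits, 3.14159 becomes 3.1 while strings and integers are left untouched because only `float` has a pattern.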

If list_table is False, the format is the following.

None values are replaced by an empty string (4 spaces). It produces the following result:

+------------------------+------------+----------+----------+
| Header row, column 1   | Header 2   | Header 3 | Header 4 |
| (header rows optional) |            |          |          |
+========================+============+==========+==========+
| body row 1, column 1   | column 2   | column 3 | column 4 |
+------------------------+------------+----------+----------+
| body row 2             | ...        | ...      |          |
+------------------------+------------+----------+----------+
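A grid table of this kind can be assembled from column widths and two kinds of rule lines. The following is a minimal sketch under the assumption that every cell fits on one line; `grid_table` is a hypothetical helper, not the function df2rst uses internally.

```python
def grid_table(header, rows):
    """Hypothetical helper: builds an RST grid table from single-line cells."""
    # None becomes an empty string, every other value goes through str().
    table = [["" if c is None else str(c) for c in row]
             for row in [header] + rows]
    # Each column is as wide as its longest cell.
    widths = [max(len(row[i]) for row in table) for i in range(len(header))]

    def rule(char):
        # '-' separates body rows, '=' closes the header.
        return "+" + "+".join(char * (w + 2) for w in widths) + "+"

    def line(row):
        return "| " + " | ".join(c.ljust(w) for c, w in zip(row, widths)) + " |"

    out = [rule("-"), line(table[0]), rule("=")]
    for row in table[1:]:
        out.append(line(row))
        out.append(rule("-"))
    return "\n".join(out)


text = grid_table(["A", "B", "C"],
                  [[0.0, "text", None], [1e-05, None, "longer text"]])
print(text)
```

The rule after every body row corresponds to add_line=True in the parameter list.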

If list_table is True, the format is the following:

.. list-table:: title
    :widths: 15 10 30
    :header-rows: 1

    * - Treat
      - Quantity
      - Description
    * - Albatross
      - 2.99
      - anythings
    ...
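The list-table layout is simpler to generate because it needs no column alignment. A minimal sketch, assuming one header row and a hypothetical helper named `list_table` (not the library's implementation):

```python
def list_table(header, rows, title="", widths=None):
    """Hypothetical helper: builds an RST list-table directive."""
    out = [f".. list-table:: {title}".rstrip()]
    if widths:
        out.append("    :widths: " + " ".join(str(w) for w in widths))
    out.append("    :header-rows: 1")
    out.append("")
    for row in [header] + rows:
        for i, cell in enumerate(row):
            # The first cell of a row opens a new list item.
            prefix = "    * - " if i == 0 else "      - "
            out.append(prefix + str(cell))
    return "\n".join(out)


text = list_table(["Treat", "Quantity", "Description"],
                  [["Albatross", 2.99, "anythings"]],
                  title="title", widths=[15, 10, 30])
print(text)
```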

<<<

from pandas import DataFrame
from pyquickhelper.pandashelper import df2rst

df = DataFrame([{'A': 0, 'B': 'text'},
                {'A': 1e-5, 'C': 'longer text'}])
print(df2rst(df))

>>>

    +-------+------+-------------+
    | A     | B    | C           |
    +=======+======+=============+
    | 0.0   | text |             |
    +-------+------+-------------+
    | 1e-05 |      | longer text |
    +-------+------+-------------+

Changed in version 1.8: Parameter number_format was added.

Changed in version 1.9: NaN values are replaced by an empty string even if number_format is not None. Parameters replacements, split_row, split_col_subsets, split_col_common, filter_rows were added.

pyquickhelper.pandashelper.tblformat.enumerate_split_df(df, common, subsets)

Splits a dataframe by columns to display shorter dataframes.

Parameters
  • df – dataframe

  • common – common columns

  • subsets – subsets of columns

Returns

split dataframes

<<<

from pandas import DataFrame
from pyquickhelper.pandashelper.tblformat import enumerate_split_df

df = DataFrame([{'A': 0, 'B': 'text'},
                {'A': 1e-5, 'C': 'longer text'}])
res = list(enumerate_split_df(df, ['A'], [['B'], ['C']]))
print(res[0])
print('-----')
print(res[1])

>>>

             A     B
    0  0.00000  text
    1  0.00001   NaN
    -----
             A            C
    0  0.00000          NaN
    1  0.00001  longer text
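The output above suggests that each result keeps the common columns and adds one subset of columns. A minimal sketch of that idea with plain pandas indexing follows; `split_by_columns` is a hypothetical name and the real function may handle more cases.

```python
from pandas import DataFrame


def split_by_columns(df, common, subsets):
    """Hypothetical helper: yields one dataframe per subset of columns."""
    for subset in subsets:
        # Common columns come first, followed by the columns of the subset.
        yield df[list(common) + list(subset)]


df = DataFrame([{'A': 0, 'B': 'text'},
                {'A': 1e-5, 'C': 'longer text'}])
parts = list(split_by_columns(df, ['A'], [['B'], ['C']]))
for part in parts:
    print(list(part.columns), part.shape)
```

Rows are not filtered, so each part has as many rows as the original dataframe and missing values stay as NaN.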
