module homeblog.table_formula_stat

Short summary

module ensae_teaching_cs.homeblog.table_formula_stat

Contains TableFormulaStat.

source on GitHub

Classes

class – truncated documentation

_TableFormulaStat – Contains various statistical functions.

Methods

method – truncated documentation

Gini – Computes the Gini by calling GiniCurve().

GiniCurve – Computes the Gini curve.

summary – Produces a summary of each column.

summary_column – Produces a summary of a column; if the column is numerical, it computes the min, max, quantile, mean, median and standard deviation. …

Documentation

Contains TableFormulaStat.

class ensae_teaching_cs.homeblog.table_formula_stat._TableFormulaStat

Bases: object

Contains various statistical functions.

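# Assumption: TableFormula is importable from the sibling module table_formula.
from ensae_teaching_cs.homeblog.table_formula import TableFormula

# One column "sum_y" holding eleven identical values.
table = TableFormula("sum_y#1#1#1#1#1#1#1#1#1#1#1".replace(" ", "\t").replace("#", "\n"))
gini = table.Gini(lambda v: v["sum_y"])
print(gini)  # expects 1

# Same column with two larger values (5 and 10) appended.
table = TableFormula("sum_y#1#1#1#1#1#1#1#1#1#1#1#5#10".replace(" ", "\t").replace("#", "\n"))
gini = table.Gini(lambda v: v["sum_y"])
print(gini)  # expects a value much lower than 1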

Gini(functionY, functionX=None, isXdx=False)

Computes the Gini by calling GiniCurve. It takes the following parameters:

Parameters:
  • functionY – incomes

  • functionX – sum of persons having an income below Y (or having exactly Y if isXdx is True)

  • isXdx – whether functionX counts the persons whose income equals Y (True) or is below Y (False); if True, the (X, Y) couples are sorted

Returns:

a curve (x, Gini(x))
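To make the quantity concrete, here is a minimal sketch of the textbook Gini coefficient on a plain list of incomes. The helper `gini_coefficient` is hypothetical and is not the code behind `Gini`; in particular, the textbook convention returns 0 for a perfectly equal distribution, and the scaling used by the method documented here may differ.

```python
def gini_coefficient(values):
    """Textbook Gini coefficient (hypothetical helper, not the library code).

    Returns 0 for perfect equality and approaches 1 for maximal inequality.
    """
    data = sorted(values)
    n = len(data)
    total = sum(data)
    if n == 0 or total == 0:
        raise ValueError("need at least one value and a non-zero total")
    # Rank-weighted sum over the sorted incomes, ranks starting at 1.
    weighted = sum(rank * v for rank, v in enumerate(data, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


equal = gini_coefficient([1] * 11)
unequal = gini_coefficient([1] * 11 + [5, 10])
print(equal, unequal)
```

With this convention, adding the two larger incomes raises the coefficient, since the distribution becomes less equal.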

GiniCurve(functionY, functionX=None, isXdx=False)

Computes the Gini curve. It takes the following parameters:

Parameters:
  • functionY – incomes

  • functionX – sum of persons having an income below Y (or having exactly Y if isXdx is True)

  • isXdx – whether functionX counts the persons whose income equals Y (True) or is below Y (False); if True, the (X, Y) couples are sorted

Returns:

a curve (x, Gini(x))
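The curve described by these parameters can be illustrated with a Lorenz-style construction: sort the incomes, then accumulate population and income shares. The sketch below is an assumption about what such a curve looks like, not the implementation of `GiniCurve`; the function `lorenz_curve` and its `counts` argument (the number of persons having each income, loosely mirroring the isXdx=True case) are hypothetical.

```python
def lorenz_curve(incomes, counts=None):
    """Return [(cumulative population share, cumulative income share), ...].

    Hypothetical helper: `counts[i]` is the number of persons whose income
    is `incomes[i]`; it defaults to one person per income.
    """
    if counts is None:
        counts = [1] * len(incomes)
    if len(counts) != len(incomes):
        raise ValueError("incomes and counts must have the same length")
    # Sort the (income, count) couples by increasing income.
    pairs = sorted(zip(incomes, counts))
    total_people = sum(c for _, c in pairs)
    total_income = sum(y * c for y, c in pairs)
    if total_people == 0 or total_income == 0:
        raise ValueError("need a non-zero population and a non-zero total income")
    curve = [(0.0, 0.0)]
    people = 0
    income = 0
    for y, c in pairs:
        people += c
        income += y * c
        curve.append((people / total_people, income / total_income))
    return curve


print(lorenz_curve([3, 1], counts=[1, 3]))
```

The curve always starts at (0, 0) and ends at (1, 1); the further it sags below the diagonal, the more unequal the distribution.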

summary()

Produces a summary of each column.

Returns:

TableFormulaStat

summary_column(column_name)

Produces a summary of a column. If the column is numerical, it computes the min, max, quantile, mean, median and standard deviation. If it is not, it counts the number of distinct values. The function treats an empty column as a non-numerical column and does not take None values into account.

Parameters:

column_name – column name

Returns:

dictionary
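The behaviour described above can be sketched on a plain Python list. This is only an illustration under stated assumptions: the function `summarize_column` and the dictionary keys it returns are hypothetical, the standard deviation is the population one, and quantiles are left out for brevity.

```python
import statistics


def summarize_column(values):
    """Summarize a list of values, ignoring None (hypothetical helper)."""
    present = [v for v in values if v is not None]
    # An empty column counts as non-numerical; bool is excluded on purpose.
    numeric = bool(present) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    if not numeric:
        return {"numerical": False, "distinct": len(set(present))}
    return {
        "numerical": True,
        "min": min(present),
        "max": max(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        "std": statistics.pstdev(present),
    }


print(summarize_column([1, 2, None, 3]))
print(summarize_column(["a", "b", "a", None]))
```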
