module graphhelper.corrplot

Short summary

module pyensae.graphhelper.corrplot

CorrPlot functionalities.

It comes from corrplot.py, which I copied here because the original module does not work properly on Python 3 (import issues). See also the biokit license.

author

Thomas Cokelaer

references

http://cran.r-project.org/web/packages/corrplot/vignettes/corrplot-intro.html

source on GitHub

Classes

  • Corrplot – An implementation of correlation plotting tools (corrplot). The class requires scipy. …

Methods

  • __init__ – Plots the content of square matrix that contains correlation values.

  • _add_patches

  • _set_default_cmap

  • order – Rearranges the order of rows and columns after clustering.

  • plot – Plots the correlation matrix from the content of df (dataframe). By default, the correlation is …

Documentation

class pyensae.graphhelper.corrplot.Corrplot(data, na=0)[source]

Bases: pyensae.graphhelper.linkage.Linkage

An implementation of correlation plotting tools (corrplot). The class requires scipy.

Here is a simple example with a pandas dataframe as input. Its values are not restricted to the range -1 to 1, so the correlation matrix is computed from it:

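import pandas
import numpy

# Ten columns named A..J; column k holds 10 random values shifted by its rank
letters = "ABCDEFGHIJKLM"[0:10]
df = pandas.DataFrame({k: numpy.random.random(10) + ord(k) - 65 for k in letters})

import matplotlib.pyplot as plt
plt.close('all')
plt.style.use('ggplot')

# Import path taken from the class location documented on this page
from pyensae.graphhelper.corrplot import Corrplot
c = Corrplot(df)
c.plot()
plt.show()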

This class requires the colormap module.

Plots the content of square matrix that contains correlation values.

Parameters
  • data – input can be a pandas dataframe, a Python list of lists, or a numpy matrix. If the input is a square matrix whose values all lie between -1 and 1, it is used as the correlation matrix. Otherwise (values outside that range, or a matrix or list of lists that is not square), the correlation is computed from it. The data or computed correlation is stored in the df attribute.

  • na – replace NA values with this value (default 0)

The params attribute contains some tunable parameters for the colorbar in the plot() method.

# can be a list of lists, the correlation matrix is then a 2x2 matrix
c = Corrplot([[1,1], [2,4], [3,3], [4,4]])
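
The rule described for the data parameter can be sketched with numpy alone. This is a minimal illustration of the documented behaviour, not the code used by Corrplot: the helper name to_correlation is hypothetical, and the choice to ignore NA values during the range check is an assumption.

```python
import numpy as np

def to_correlation(data, na=0.0):
    """Hypothetical helper: return a square correlation matrix built from data."""
    arr = np.asarray(data, dtype=float)
    is_square = arr.ndim == 2 and arr.shape[0] == arr.shape[1]
    # Assumption: NaN entries are ignored when checking the [-1, 1] range
    in_range = bool(np.all(np.abs(arr[~np.isnan(arr)]) <= 1))
    if not (is_square and in_range):
        # Columns are variables, rows are observations
        arr = np.corrcoef(arr, rowvar=False)
    # Replace missing values, mirroring the na parameter
    return np.where(np.isnan(arr), na, arr)

# Four observations of two variables: the input is not square,
# so a 2x2 correlation matrix is computed
raw = [[1, 1], [2, 4], [3, 3], [4, 4]]
corr = to_correlation(raw)
print(corr.shape)
```

A matrix that is already square with values between -1 and 1 goes through this sketch unchanged, apart from the NA replacement.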

__init__(data, na=0)[source]

_add_patches(df, method, fill, ax, diagonal=True)[source]

_set_default_cmap()[source]

order(method='complete', metric='euclidean', inplace=False)[source]

Rearranges the order of rows and columns after clustering.

Parameters
  • method – any scipy linkage method (e.g., single, average, centroid, median, ward). See scipy.cluster.hierarchy.linkage.

  • metric – any scipy distance (e.g., euclidean, hamming, jaccard). See scipy.spatial.distance or scipy.cluster.hierarchy.

  • inplace (bool) – if set to True, the dataframe is replaced

You probably do not need to call this method directly. Use plot() with its two parameters order_metric and order_method instead.
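
The reordering step can be sketched with scipy's hierarchical clustering. This is an illustration under stated assumptions, not the code behind order(): the helper name reorder is hypothetical, and treating each row of the correlation matrix as one observation to cluster is an assumption. Only scipy.cluster.hierarchy.linkage and leaves_list are used.

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage

def reorder(corr, method="complete", metric="euclidean"):
    """Hypothetical helper: reorder rows and columns by clustering order."""
    corr = np.asarray(corr, dtype=float)
    # Assumption: each row of the matrix is one observation to cluster
    tree = linkage(corr, method=method, metric=metric)
    # Leaf order of the dendrogram, read from left to right
    idx = leaves_list(tree)
    # Apply the same permutation to rows and columns
    return corr[np.ix_(idx, idx)], idx

# Variables 0 and 2 are strongly correlated, as are variables 1 and 3
corr = np.array([
    [1.0, 0.1, 0.9, 0.1],
    [0.1, 1.0, 0.1, 0.9],
    [0.9, 0.1, 1.0, 0.1],
    [0.1, 0.9, 0.1, 1.0],
])
ordered, idx = reorder(corr)
print(idx)
```

After reordering, strongly correlated variables sit next to each other, which is what makes block structure visible in the plot.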

params = None

Tunable parameters for the plot() method.

plot(fig=None, grid=True, rotation=30, lower=None, upper=None, shrink=0.9, axisbg='white', colorbar=True, label_color='black', fontsize='small', edgecolor='black', method='ellipse', order_method='complete', order_metric='euclidean', cmap=None, ax=None, binarise_color=False, figsize=None)[source]

Plots the correlation matrix from the content of df (dataframe).

By default, the correlation is shown on both the upper and lower triangles and is symmetric with respect to the diagonal. The symbols are ellipses by default and can be changed with the method parameter, e.g. to rectangles. The same symbol is used on both sides, but you can choose one symbol for the upper side and another for the lower side with the upper and lower parameters.

Parameters
  • fig – Create a new figure by default. If an instance of an existing figure is provided, the corrplot is overlaid on the figure provided. Can also be the number of the figure.

  • grid – add grid (Defaults to grey color). You can set it to False or a color.

  • rotation – rotate labels on y-axis

  • lower – if set to a valid method, plots the data on the lower left triangle

  • upper – if set to a valid method, plots the data on the upper right triangle

  • shrink (float) – maximum space used by a symbol, as a fraction of its cell (0.9 means 90%). If negative values are provided, the absolute value is taken. If greater than 1, the symbols will overlap.

  • axisbg – color of the background (defaults to white).

  • colorbar – add the colorbar (defaults to True).

  • label_color (str) – color of the labels (defaults to black).

  • fontsize – size of the fonts (defaults to ‘small’).

  • method – shape to be used, one of ‘ellipse’, ‘square’, ‘rectangle’, ‘color’, ‘text’, ‘circle’, ‘number’, ‘pie’.

  • order_method – see order().

  • order_metric – see order().

  • cmap – a valid cmap from matplotlib or colormap package (e.g., ‘jet’, or ‘copper’). Default is red/white/blue colors.

  • binarise_color – use two colors only, one for negative values and one for positive values.

  • ax – a matplotlib axes.

  • figsize – figure size passed to the newly created figure.

Returns

ax (matplotlib axes)

The colorbar can be tuned with the parameters stored in params. Here is an example. See notebook for other examples:

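from pyensae.graphhelper import corrplot

# dataframe is any pandas dataframe accepted by the constructor
c = corrplot.Corrplot(dataframe)
c.plot(cmap=('Orange', 'white', 'green'))
c.plot(method='circle')
c.plot(colorbar=False, shrink=.8, upper='circle')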
