module finance.astock

Short summary

module pyensae.finance.astock

Downloads stock prices (from Yahoo website) and other prices.

source on GitHub

Classes

class – truncated documentation
StockPrices – Defines a class containing stock prices, provides basic functions, the class uses pandas to load the data.
StockPricesException – Raised by StockPrices classes.
StockPricesHTTPException – Raised by StockPrices classes.

Properties

property – truncated documentation
dataframe – Returns the dataframe.
shape – Returns the number of observations.
tick – Returns the tick name.

Static Methods

staticmethod – truncated documentation
available_dates – Returns the list of values (Open, High, Low, Close or Volume) from each stock for all the available dates …
covariance – Computes the covariance matrix (of returns).
draw – Draws a graph showing one or several time series. The example was taken from date_demo.py. …

Methods

method – truncated documentation
__getitem__ – Overloads the getitem operator to get a StockPrices object.
__init__
__len__ – Returns the number of observations.
df – Returns the dataframe.
FirstDate – Returns the first date.
head – Returns the first rows (usual head).
keep_dates – Removes undesired dates.
LastDate – Returns the last date.
missing – Returns the list of missing dates from a superset of trading dates.
plot – See draw.
returns – Builds the series of returns.
tail – Returns the last rows (usual tail).
to_csv – Saves the file in text format, see to_csv.
to_excel – Saves the file in Excel format, see to_excel.

Documentation

Downloads stock prices (from Yahoo website) and other prices.

class pyensae.finance.astock.StockPrices(tick, url='google', folder='cache', begin=None, end=None, sep=', ', intern=False, use_dtime=False)[source]

Bases: object

Defines a class containing stock prices, provides basic functions, the class uses pandas to load the data.

Retrieve stock prices from the Yahoo source

from pyensae.finance import StockPrices
prices = StockPrices(tick="NASDAQ:MSFT")
print(prices.dataframe.head())

The class loads stock prices either from a URL or from a folder where the data was cached. If a file <folder>/<tick>.<day1>.<day2>.txt already exists, the data is read from there. Otherwise, it is downloaded.
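
The caching rule described above can be sketched as follows. This is a minimal illustration, not the code of pyensae: the helpers cached_file and load_or_download, the date format used for <day1> and <day2>, and the fake download function are assumptions made for the example.

```python
import os
import tempfile


def cached_file(folder, tick, day1, day2):
    """Hypothetical helper: builds <folder>/<tick>.<day1>.<day2>.txt."""
    return os.path.join(folder, f"{tick}.{day1}.{day2}.txt")


def load_or_download(folder, tick, day1, day2, download):
    """Reads the cached file if it exists, otherwise calls download() and caches the result."""
    path = cached_file(folder, tick, day1, day2)
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as f:
            return f.read(), "cache"
    os.makedirs(folder, exist_ok=True)  # the cache folder is created if missing
    content = download()
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)
    return content, "download"


def fake_download():
    """Stand-in for a real provider, no network access."""
    return "Date,Close\n2020-01-02,10.0\n"


with tempfile.TemporaryDirectory() as tmp:
    folder = os.path.join(tmp, "cache")
    first = load_or_download(folder, "MSFT", "2020-01-01", "2020-12-31", fake_download)
    second = load_or_download(folder, "MSFT", "2020-01-01", "2020-12-31", fake_download)
    print(first[1], second[1])
```

The second call finds the file written by the first one, so the provider is only contacted once per tick and period.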

A couple of providers have been implemented, but they are not easy to keep up to date because website policies change on a regular basis. If url is 'yahoo', the data is downloaded from Yahoo Finance (see CAC 40). The CAC 40 composition is described on Wikipedia (CAC 40). However, Yahoo Finance introduced cookies in May 2017, which makes the download harder to automate. The default provider could be Google Finance, which has now been integrated into the search engine.

Tick names depend on the data provider. More details: European Markets Information. You can also go to quandl and get the tick for the module quandl.

As of May 14th, the following error appears when using url='yahoo'; it comes from an error in pandas_datareader:

ImmediateDeprecationError(DEP_ERROR_MSG.format('Yahoo Daily'))
pandas_datareader.exceptions.ImmediateDeprecationError:
Yahoo Daily has been immediately deprecated due to large breaks in the API without the
introduction of a stable replacement. Pull Requests to re-enable these data
connectors are welcome.

See https://github.com/pydata/pandas-datareader/issues

url='yahoo_new' should solve the issue. It relies on yahoo-historical. Data can be downloaded for a specific period of time. If no period is specified, the largest available one is used.

Compute the average returns and correlation matrix

import pyensae, pandas
from pyensae.finance import StockPrices
from pyensae.datasource import download_data

# download the CAC 40 composition from my website (for Yahoo)
download_data('cac40_2013_11_11.txt', website='xd')

# download all the prices (if not already done) and store them into files
actions = pandas.read_csv("cac40_2013_11_11.txt", sep="\t")

# we remove stocks with not enough historical data
stocks = { k:StockPrices(tick = k) for k,v in actions.values }
dates = StockPrices.available_dates(stocks.values())
stocks = {k:v for k,v in stocks.items() if len(v.missing(dates)) <= 10}
print("nb left", len(stocks))

# we remove dates with missing prices
dates = StockPrices.available_dates(stocks.values())
ok = dates[dates["missing"] == 0]
print("all dates before", len(dates), " after:" , len(ok))
for k in stocks:
    stocks[k] = stocks[k].keep_dates(ok)

# we compute correlation matrix and returns
ret, cor = StockPrices.covariance(stocks.values(), cov = False, ret = True)

You should also look at pyensae and notebooks. If you use Google Finance as a provider, the tick name is usually prefixed by the marketplace (NASDAQ for example). The export does not work for all marketplaces. Another provider was added, yahoo_new, which delegates the task of getting data from Yahoo Finance to the module yahoo-historical.

Parameters:
  • tick – tick name, ex NASDAQ:MSFT
  • url – if 'yahoo', downloads the data from there if it was not done before; a custom url is also possible; 'google', 'yahoo_new', 'quandl' are predefined values
  • folder – cache folder (created if it does not exist)
  • begin – first day (datetime), see below
  • end – last day (datetime), see below
  • sep – column separator
  • intern – do not use unless you know what to do (see __getitem__)
  • use_dtime – if True, use DateTime instead of string

FirstDate()[source]

Returns the first date.

LastDate()[source]

Returns the last date.

__getitem__(key)[source]

Overloads the getitem operator to get a StockPrices object.

Parameters: key – key
Returns: StockPrices

__init__(tick, url='google', folder='cache', begin=None, end=None, sep=', ', intern=False, use_dtime=False)[source]
Parameters:
  • tick – tick name, ex NASDAQ:MSFT
  • url – if 'yahoo', downloads the data from there if it was not done before; a custom url is also possible; 'google', 'yahoo_new', 'quandl' are predefined values
  • folder – cache folder (created if it does not exist)
  • begin – first day (datetime), see below
  • end – last day (datetime), see below
  • sep – column separator
  • intern – do not use unless you know what to do (see __getitem__)
  • use_dtime – if True, use DateTime instead of string

__len__()[source]
Returns: number of observations

static available_dates(listStockPrices, missing=True, field='Close')[source]

Returns the list of values (Open or High or Low or Close or Volume) from each stock for all the available_dates for a list of stock prices.

A missing date is a date for which there is at least one stock price and one missing stock price.

If missing is True, a column is added which gives the number of missing stock prices for each date.

Parameters:
  • listStockPrices – list of StockPrices
  • missing – True or False
  • field – which field to use to fill the matrix
Returns: matrix with the available dates for each stock
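
As an illustration of the missing column, the following sketch builds the same kind of matrix with pandas on synthetic data. It is not the implementation of available_dates; the series, the values and the variable names are assumptions.

```python
import pandas as pd

# Synthetic Close prices, each series indexed by date strings.
closes = {
    "A": pd.Series({"2020-01-02": 10.0, "2020-01-03": 10.5, "2020-01-06": 10.4}),
    "B": pd.Series({"2020-01-02": 20.0, "2020-01-06": 20.6}),
    "C": pd.Series({"2020-01-03": 5.1, "2020-01-06": 5.0}),
}

# One row per date seen in at least one series, one column per stock;
# a stock without a price on a given date gets NaN.
matrix = pd.DataFrame(closes).sort_index()

# Number of stocks without a price on each date.
matrix["missing"] = matrix[list(closes)].isna().sum(axis=1)
print(matrix)

# Dates for which every stock has a price.
complete = matrix[matrix["missing"] == 0]
print(list(complete.index))
```

Filtering on missing == 0 mirrors the step of the CAC 40 example that keeps only the dates available for every stock.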

static covariance(listStockPrices, missing=True, field='Close', cov=True, ret=False)[source]

Computes the covariance matrix (of returns).

Parameters:
  • listStockPrices – list of StockPrices
  • field – which field to use to fill the matrix
  • cov – if True, returns the covariance, otherwise, the correlations
  • ret – if True, also add the returns
Returns: a square dataframe, or two dataframes (returns, correlation) if ret is True
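
The difference between cov=True and cov=False can be illustrated with plain pandas. The sketch below assumes simple daily returns computed with pct_change, which may differ from what covariance does internally; the prices are synthetic.

```python
import numpy as np
import pandas as pd

# Synthetic Close prices for three stocks on five consecutive days.
prices = pd.DataFrame(
    {
        "A": [10.0, 10.5, 10.4, 10.9, 11.0],
        "B": [20.0, 20.6, 20.1, 20.8, 21.3],
        "C": [5.0, 4.9, 5.2, 5.1, 5.3],
    },
    index=pd.date_range("2020-01-06", periods=5, freq="D"),
)

# Simple returns; the first row has no previous price and is dropped.
returns = prices.pct_change().dropna()

cov = returns.cov()   # covariance matrix of the returns
cor = returns.corr()  # correlation matrix of the returns

# The correlation is the covariance rescaled by the standard deviations.
std = np.sqrt(np.diag(cov.values))
rescaled = cov / np.outer(std, std)
print(cor.round(3))
```

Both matrices are square with one row and one column per stock; the correlation matrix has ones on its diagonal.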

dataframe

Returns the dataframe.

df()[source]

Returns the dataframe.

static draw(listStockPrices, begin=None, end=None, field='Close', date_format=None, existing=None, axis=1, ax=None, label_prefix=None, color=None, **args)[source]

Draws a graph showing one or several time series. The example was taken from date_demo.py.

Parameters:
  • listStockPrices – list of StockPrices (or one StockPrices if it is the only one)
  • begin – first date (datetime) or None to take the first one
  • end – last included date (datetime) or None to take the last one
  • field – Open, High, Low, Close, Adj Close, Volume
  • date_format – %Y or %Y-%m or %Y-%m-%d or None if you prefer the function to choose
  • axis – 1 or 2, it only works if existing is not None. If axis is 2, the function draws the curves on the second axis.
  • label_prefix – prefix added to the curve labels
  • color – curve color
  • args – other parameters to give to plt.subplots
  • ax – use existing axes
Returns: axes

The parameter figsize of the method subplots can change the graph size (see the example below).

graph of a financial series

from pyensae.finance import StockPrices

cache = "cache"  # folder where downloaded prices are cached
stocks = [StockPrices("NASDAQ:MSFT", folder=cache),
          StockPrices("NASDAQ:GOOGL", folder=cache),
          StockPrices("NASDAQ:AAPL", folder=cache)]
fig, ax, plt = StockPrices.draw(stocks)
fig.savefig("image.png")
fig, ax, plt = StockPrices.draw(stocks, begin="2010-01-01", figsize=(16, 8))
plt.show()

You can also chain the graphs and add a series on a second graph:

from pyensae.finance import StockPrices

cache = "cache"  # folder where downloaded prices are cached
stock = StockPrices("NASDAQ:MSFT", folder=cache)
stock2 = StockPrices("NASDAQ:GOOGL", folder=cache)
fig, ax, plt = stock.plot(figsize=(16, 8))
fig, ax, plt = stock2.plot(existing=(fig, ax), axis=2)
plt.show()

Changed in version 1.1: Parameter existing was removed and parameter ax was added. If the dates overlap, the method autofmt_xdate should be called.

head()[source]

Returns the first rows (usual head).

keep_dates(trading_dates)[source]

Removes undesired dates.

Parameters: trading_dates – dates
Returns: new series

missing(trading_dates)[source]

Returns the list of missing dates from a superset of trading dates.

Parameters: trading_dates – trading dates (DataFrame with the dates in the column Date or in the index)
Returns: missing dates (or None if issues)
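
keep_dates and missing can be pictured as a filter and a set difference on dates. The following is a minimal pandas sketch on synthetic data, assuming the dates are stored as strings in a column Date; it does not reproduce the actual implementation and the variable names are illustrative.

```python
import pandas as pd

# Prices of one stock; 2020-01-03 is absent and 2020-01-04 is not a trading date.
prices = pd.DataFrame(
    {
        "Date": ["2020-01-02", "2020-01-04", "2020-01-06"],
        "Close": [10.0, 10.2, 10.4],
    }
)

# Superset of trading dates, for example gathered from several stocks.
trading_dates = pd.DataFrame(
    {"Date": ["2020-01-02", "2020-01-03", "2020-01-06", "2020-01-07"]}
)

# Dates of the superset without a price for this stock (idea behind missing).
missing_dates = sorted(set(trading_dates["Date"]) - set(prices["Date"]))
print(missing_dates)

# Rows whose date belongs to the trading dates (idea behind keep_dates).
kept = prices[prices["Date"].isin(trading_dates["Date"])].reset_index(drop=True)
print(kept)
```

The filter builds a new dataframe and leaves the original prices untouched, which matches the "new series" returned by keep_dates.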

plot(begin=None, end=None, field='Close', date_format=None, existing=None, axis=1, ax=None, label_prefix=None, color=None, **args)[source]

See draw.

returns()[source]

Builds the series of returns.

Parameters: col – column to use to compute the returns
Returns: StockPrices
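
A series of returns is derived from consecutive prices. The sketch below assumes simple returns, p(t) / p(t-1) - 1, computed on a synthetic Close series; the method returns may rely on a different definition (log returns, for instance).

```python
import pandas as pd

# Synthetic Close prices indexed by date strings.
close = pd.Series(
    [100.0, 102.0, 99.96],
    index=["2020-01-02", "2020-01-03", "2020-01-06"],
)

# Each price is divided by the previous one; the first date has no
# previous price, yields NaN and is dropped.
simple_returns = (close / close.shift(1) - 1).dropna()
print(simple_returns)
```

The resulting series has one observation less than the price series.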

shape

Returns: number of observations

tail()[source]

Returns the last rows (usual tail).

tick

Returns the tick name.

to_csv(filename, sep='\t', index=False, **params)[source]

Saves the file in text format, see pandas DataFrame.to_csv.

Parameters:
  • filename – filename
  • sep – separator
  • index – to keep or drop the index
  • params – other parameters
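
With the defaults sep='\t' and index=False, the output is tab-separated and does not contain the index. The following sketch shows the corresponding pandas call on a synthetic dataframe, writing into an in-memory buffer rather than a file; it assumes the method forwards its arguments to DataFrame.to_csv.

```python
import io

import pandas as pd

# Synthetic prices standing in for the dataframe held by StockPrices.
df = pd.DataFrame({"Date": ["2020-01-02", "2020-01-03"], "Close": [10.0, 10.5]})

# Same separator and index settings as the defaults of to_csv above.
buffer = io.StringIO()
df.to_csv(buffer, sep="\t", index=False)
text = buffer.getvalue()
print(text)

# Reading the text back with the same separator restores the columns.
back = pd.read_csv(io.StringIO(text), sep="\t")
```

Any other keyword accepted by DataFrame.to_csv would travel through params in the same way.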

to_excel(excel_writer, **params)[source]

Saves the file in Excel format, see pandas DataFrame.to_excel.

exception pyensae.finance.astock.StockPricesException[source]

Bases: Exception

Raised by StockPrices classes.

exception pyensae.finance.astock.StockPricesHTTPException[source]

Bases: pyensae.finance.astock.StockPricesException

Raised by StockPrices classes.
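
Because StockPricesHTTPException derives from StockPricesException, catching the base class also catches HTTP failures. The sketch below uses local stand-in classes with the same hierarchy so that it runs without pyensae or network access; the function fetch is hypothetical.

```python
class StockPricesException(Exception):
    """Stand-in for pyensae.finance.astock.StockPricesException."""


class StockPricesHTTPException(StockPricesException):
    """Stand-in for pyensae.finance.astock.StockPricesHTTPException."""


def fetch(tick):
    """Hypothetical download function which always fails."""
    raise StockPricesHTTPException(f"unable to download prices for {tick}")


caught = None
try:
    fetch("NASDAQ:MSFT")
except StockPricesException as exc:
    # The handler for the base class also receives the HTTP subclass.
    caught = exc
    print(type(exc).__name__, exc)
```

Code that needs to treat download errors separately can place an except clause for StockPricesHTTPException before the one for StockPricesException.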
