module finance.astock

Short summary

module pyensae.finance.astock

Downloads stock prices (from Yahoo website) and other prices.

source on GitHub

Classes

class – truncated documentation

  • StockPrices – Defines a class containing stock prices and provides basic functions; the class uses pandas to load the data.

  • StockPricesException – Raised by StockPrices classes.

  • StockPricesHTTPException – Raised by StockPrices classes.

Properties

property – truncated documentation

  • dataframe – Returns the dataframe.

  • shape

  • tick – Returns the tick name.

Static Methods

staticmethod – truncated documentation

  • available_dates – Returns the list of values (Open or High or Low or Close or Volume) from each stock for all the available_dates …

  • covariance – Computes the covariance matrix (of returns).

  • draw – Draws a graph showing one or several time series. The example was taken from date_demo.py. …

Methods

method – truncated documentation

  • __getitem__ – Overloads the getitem operator to get a StockPrices object.

  • __init__

  • __len__

  • df – Returns the dataframe.

  • FirstDate – Returns the first date.

  • head – Returns the first rows of the dataframe (the usual head).

  • keep_dates – Removes undesired dates.

  • LastDate – Returns the last date.

  • missing – Returns the list of missing dates from a superset of trading dates.

  • plot – See draw.

  • returns – Builds the series of returns.

  • tail – Returns the last rows of the dataframe (the usual tail).

  • to_csv – Saves the data in text format, see pandas to_csv.

  • to_excel – Saves the data in Excel format, see pandas to_excel.

Documentation

Downloads stock prices (from Yahoo website) and other prices.

class pyensae.finance.astock.StockPrices(tick, url='google', folder='cache', begin=None, end=None, sep=', ', intern=False, use_dtime=False)[source]

Bases: object

Defines a class containing stock prices and provides basic functions; the class uses pandas to load the data.

Retrieve stock prices from the Yahoo source

from pyensae.finance import StockPrices
prices = StockPrices(tick="NASDAQ:MSFT")
print(prices.dataframe.head())

The class loads stock prices from either a url or a folder where the data was cached. If a file <folder>/<tick>.<day1>.<day2>.txt already exists, the data is read from it. Otherwise, it is downloaded.
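
The lookup described above can be sketched as follows. This is a minimal illustration, not the library's implementation: cache_filename and load_or_download are hypothetical helpers, the ISO date format in the file name is an assumption, and the download is replaced by a fake function so that the sketch runs without network access.

```python
import os
import tempfile
from datetime import date


def cache_filename(folder, tick, begin, end):
    """Build <folder>/<tick>.<day1>.<day2>.txt (hypothetical helper).

    The ISO date format is an assumption; the library may format days differently.
    """
    name = "{0}.{1}.{2}.txt".format(tick, begin.isoformat(), end.isoformat())
    return os.path.join(folder, name)


def load_or_download(folder, tick, begin, end, download):
    """Return (text, origin): read the cached file if present, else download it."""
    os.makedirs(folder, exist_ok=True)  # the cache folder is created if missing
    path = cache_filename(folder, tick, begin, end)
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as f:
            return f.read(), "cache"
    text = download()
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    return text, "download"


# Usage with a fake downloader (invented data).
with tempfile.TemporaryDirectory() as tmp:
    folder = os.path.join(tmp, "cache")
    fake = lambda: "Date,Close\n2020-01-02,10.0\n"
    first = load_or_download(folder, "MSFT", date(2020, 1, 1), date(2020, 1, 31), fake)
    second = load_or_download(folder, "MSFT", date(2020, 1, 1), date(2020, 1, 31), fake)
    print(first[1], second[1])
```

The first call writes the file and the second one reads it back, which is the behaviour the paragraph describes for a repeated request with the same tick and period.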

A couple of providers have been implemented, but they are difficult to keep up to date because website policies change on a regular basis. If url is 'yahoo', the data is downloaded from Yahoo Finance (for example, CAC 40). The CAC 40 composition is described by Wikipedia (CAC 40). However, Yahoo Finance introduced the use of cookies in May 2017, which makes the download harder to automate. The default provider is Google Finance (url='google'), which has now been integrated into the search engine. Tick names depend on the data provider. More details: European Markets Information. You can also go to quandl and get the tick for the module quandl.

As of May 14th, the following error appears when using url='yahoo'; it comes from pandas-datareader:

ImmediateDeprecationError(DEP_ERROR_MSG.format('Yahoo Daily'))
pandas_datareader.exceptions.ImmediateDeprecationError:
Yahoo Daily has been immediately deprecated due to large breaks in the API without the
introduction of a stable replacement. Pull Requests to re-enable these data
connectors are welcome.

See https://github.com/pydata/pandas-datareader/issues

url='yahoo_new' should solve the issue. It relies on yahoo-historical. Data can be downloaded for a specific period of time (parameters begin and end). If no period is specified, the largest available one is used.

Compute the average returns and correlation matrix

import pyensae, pandas
from pyensae.finance import StockPrices
from pyensae.datasource import download_data

# download the CAC 40 composition from my website (for Yahoo)
download_data('cac40_2013_11_11.txt', website='xd')

# download all the prices (if not already done) and store them into files
actions = pandas.read_csv("cac40_2013_11_11.txt", sep="\t")

# we remove stocks with not enough historical data
stocks = { k:StockPrices(tick = k) for k,v in actions.values }
dates = StockPrices.available_dates(stocks.values())
stocks = {k:v for k,v in stocks.items() if len(v.missing(dates)) <= 10}
print("nb left", len(stocks))

# we remove dates with missing prices
dates = StockPrices.available_dates(stocks.values())
ok = dates[dates["missing"] == 0]
print("all dates before", len(dates), " after:" , len(ok))
for k in stocks:
    stocks[k] = stocks[k].keep_dates(ok)

# we compute correlation matrix and returns
ret, cor = StockPrices.covariance(stocks.values(), cov = False, ret = True)

You should also look at pyensae and notebooks. If you use Google Finance as a provider, the tick name is usually prefixed by the marketplace (NASDAQ for example). The export does not work for all marketplaces. Another provider, yahoo_new, delegates the task of getting data from Yahoo Finance to the module yahoo-historical.

Parameters
  • tick – tick name, ex NASDAQ:MSFT

  • url – data provider: if 'yahoo', downloads the data from there if it was not done before; a url is also possible; 'google', 'yahoo_new', 'quandl' are predefined values

  • folder – cache folder (created if it does not exist)

  • begin – first day (datetime), see above

  • end – last day (datetime), see above

  • sep – column separator

  • intern – do not use unless you know what to do (see __getitem__)

  • use_dtime – if True, use DateTime instead of string

FirstDate()[source]

Returns the first date.

LastDate()[source]

Returns the last date.

__getitem__(key)[source]

Overloads the getitem operator to get a StockPrices object.

Parameters

key – key

Returns

StockPrices

__init__(tick, url='google', folder='cache', begin=None, end=None, sep=', ', intern=False, use_dtime=False)[source]
Parameters
  • tick – tick name, ex NASDAQ:MSFT

  • url – data provider: if 'yahoo', downloads the data from there if it was not done before; a url is also possible; 'google', 'yahoo_new', 'quandl' are predefined values

  • folder – cache folder (created if it does not exist)

  • begin – first day (datetime), see above

  • end – last day (datetime), see above

  • sep – column separator

  • intern – do not use unless you know what to do (see __getitem__)

  • use_dtime – if True, use DateTime instead of string

__len__()[source]
Returns

number of observations

static available_dates(listStockPrices, missing=True, field='Close')[source]

Returns the list of values (Open or High or Low or Close or Volume) from each stock for all the available_dates for a list of stock prices.

A missing date is a date for which there is at least one stock price and one missing stock price.

If missing is True, a column is added which gives the number of missing stock prices for each date.

Parameters
  • listStockPrices – list of StockPrices

  • missing – True or False

  • field – which field to use to fill the matrix

Returns

matrix with the available dates for each stock
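
The definition of a missing date can be illustrated with plain dictionaries. The sketch below is only a stand-in for the pandas-based method: the function name mirrors available_dates, but the input format (one {date: price} dictionary per tick), the tick names AAA and BBB and the prices are invented for the example.

```python
def available_dates(series_by_tick):
    """Align several {date: price} series on the union of their dates.

    Each row holds one value per tick (None when the price is missing)
    and a "missing" count, similar to the extra column described above.
    """
    all_dates = sorted(set().union(*[s.keys() for s in series_by_tick.values()]))
    ticks = sorted(series_by_tick)
    rows = []
    for d in all_dates:
        values = {t: series_by_tick[t].get(d) for t in ticks}
        n_missing = sum(1 for v in values.values() if v is None)
        rows.append({"Date": d, **values, "missing": n_missing})
    return rows


# Invented closing prices: BBB has no price on 2020-01-03.
prices = {
    "AAA": {"2020-01-02": 10.0, "2020-01-03": 10.5, "2020-01-06": 10.2},
    "BBB": {"2020-01-02": 20.0, "2020-01-06": 19.0},
}
table = available_dates(prices)
complete = [row["Date"] for row in table if row["missing"] == 0]
print(complete)
```

Keeping only the rows where the count is zero corresponds to the filter dates[dates["missing"] == 0] used in the class example.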

static covariance(listStockPrices, missing=True, field='Close', cov=True, ret=False)[source]

Computes the covariances matrix (of returns).

Parameters
  • listStockPrices – list of StockPrices

  • field – which field to use to fill the matrix

  • cov – if True, returns the covariance, otherwise, the correlations

  • ret – if True, also add the returns

Returns

a square dataframe, or two dataframes (returns, covariance or correlation matrix) if ret is True
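
As a rough illustration of what the two outputs contain, the sketch below computes returns and then a covariance or correlation matrix in pure Python. It assumes simple returns (p_t - p_{t-1}) / p_{t-1} and the sample covariance (divisor n - 1, which is the pandas default); the helper names and the prices are hypothetical, and the library itself relies on pandas.

```python
import math


def simple_returns(prices):
    """Simple returns (p_t - p_{t-1}) / p_{t-1}; one value fewer than prices."""
    return [(cur - prev) / prev for prev, cur in zip(prices, prices[1:])]


def sample_cov(x, y):
    """Sample covariance with divisor n - 1."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def matrix_of_returns(closes, cov=True):
    """Square matrix as nested dicts: covariances if cov, else correlations."""
    rets = {tick: simple_returns(p) for tick, p in closes.items()}
    result = {}
    for a in rets:
        result[a] = {}
        for b in rets:
            value = sample_cov(rets[a], rets[b])
            if not cov:
                # normalise by the two standard deviations
                value /= math.sqrt(sample_cov(rets[a], rets[a]) *
                                   sample_cov(rets[b], rets[b]))
            result[a][b] = value
    return rets, result


# Invented closing prices, already aligned on the same dates.
closes = {
    "AAA": [100.0, 102.0, 101.0, 105.0],
    "BBB": [50.0, 49.0, 50.5, 50.0],
}
rets, cor = matrix_of_returns(closes, cov=False)
for tick, row in cor.items():
    print(tick, {k: round(v, 3) for k, v in row.items()})
```

The prices must share the same dates before this computation makes sense, which is why the class example removes dates with missing prices first.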

dataframe

Returns the dataframe.

df()[source]

Returns the dataframe.

static draw(listStockPrices, begin=None, end=None, field='Close', date_format=None, existing=None, axis=1, ax=None, label_prefix=None, color=None, **args)[source]

Draws a graph showing one or several time series. The example was taken from date_demo.py.

Parameters
  • listStockPrices – list of StockPrices (or one StockPrices if it is the only one)

  • begin – first date (datetime) or None to take the first one

  • end – last included date (datetime) or None to take the last one

  • field – Open, High, Low, Close, Adj Close, Volume

  • date_format – %Y or %Y-%m or %Y-%m-%d or None if you prefer the function to choose

  • args – other arguments to send to plt.subplots

  • axis – 1 or 2, it only works if existing is not None. If axis is 2, the function draws the curves on the second axis.

  • label_prefix – to prefix curve label

  • color – curve color

  • ax – use existing axes

Returns

axes

The parameter figsize of the method subplots can change the graph size (see the example below).

Graph of a financial series

import matplotlib.pyplot as plt
from pyensae.finance import StockPrices

cache = "cache"  # cache folder (default value of the parameter folder)
stocks = [StockPrices("NASDAQ:MSFT", folder=cache),
          StockPrices("NASDAQ:GOOGL", folder=cache),
          StockPrices("NASDAQ:AAPL", folder=cache)]

# draw is documented as returning the axes
ax = StockPrices.draw(stocks)
ax.get_figure().savefig("image.png")

# figsize is forwarded to plt.subplots
ax = StockPrices.draw(stocks, begin="2010-01-01", figsize=(16, 8))
plt.show()

You can also chain the graphs and add a series on a second axis:

import matplotlib.pyplot as plt
from pyensae.finance import StockPrices

cache = "cache"  # cache folder
stock = StockPrices("NASDAQ:MSFT", folder=cache)
stock2 = StockPrices("NASDAQ:GOOGL", folder=cache)

# plot is documented as returning the axes; since version 1.1
# they are passed back through ax instead of existing
ax = stock.plot(figsize=(16, 8))
stock2.plot(ax=ax, axis=2)
plt.show()

Changed in version 1.1: Parameter existing was removed and parameter ax was added. If the dates overlap, the matplotlib method autofmt_xdate should be called.

head()[source]

Returns the first rows of the dataframe (the usual head).

keep_dates(trading_dates)[source]

Removes undesired dates.

Parameters

trading_dates – dates

Returns

new series

missing(trading_dates)[source]

Returns the list of missing dates from a superset of trading dates.

Parameters

trading_dates – trading dates (DataFrame with the dates in the column Date or in the index)

Returns

missing dates (or None if issues)
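
The relation between missing and keep_dates can be sketched with plain Python containers. This is an illustration under assumptions, not the library's code: a series is represented here as a {date: price} dictionary and trading dates as a list, whereas the real methods work on pandas dataframes.

```python
def missing(series, trading_dates):
    """Dates of trading_dates that are absent from series ({date: price})."""
    return [d for d in trading_dates if d not in series]


def keep_dates(series, trading_dates):
    """New series restricted to the dates listed in trading_dates."""
    wanted = set(trading_dates)
    return {d: p for d, p in series.items() if d in wanted}


# Invented data: the series lacks 2020-01-03 and has an extra 2020-01-01.
trading_dates = ["2020-01-02", "2020-01-03", "2020-01-06"]
series = {"2020-01-01": 9.9, "2020-01-02": 10.0, "2020-01-06": 10.2}

print(missing(series, trading_dates))
print(keep_dates(series, trading_dates))
```

missing reports what the series lacks with respect to the reference dates, while keep_dates drops what the series has in excess.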

plot(begin=None, end=None, field='Close', date_format=None, existing=None, axis=1, ax=None, label_prefix=None, color=None, **args)[source]

See draw.

returns()[source]

Builds the series of returns.

Parameters

col – column to use to compute the returns

Returns

StockPrices
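
A series of returns has one observation fewer than the price series it comes from. The sketch below shows this on invented prices; it assumes simple returns (change relative to the previous price), which may differ from the formula the library applies, and the function is a hypothetical stand-in working on a list of (date, price) pairs.

```python
def returns(series):
    """Turn [(date, price), ...] sorted by date into [(date, return), ...].

    Assumes simple returns; the first date has no previous price and is dropped.
    """
    out = []
    for (_, previous), (day, current) in zip(series, series[1:]):
        out.append((day, (current - previous) / previous))
    return out


# Invented closing prices.
closes = [("2020-01-02", 100.0), ("2020-01-03", 110.0), ("2020-01-06", 99.0)]
rets = returns(closes)
for day, value in rets:
    print(day, round(value, 4))
```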

shape

Returns the number of observations.

tail()[source]

Returns the last rows of the dataframe (the usual tail).

tick

Returns the tick name.

to_csv(filename, sep='\t', index=False, **params)[source]

Saves the data in text format, see pandas to_csv.

Parameters
  • filename – filename

  • sep – separator

  • index – to keep or drop the index

  • params – other parameters

to_excel(excel_writer, **params)[source]

Saves the data in Excel format, see pandas to_excel.

exception pyensae.finance.astock.StockPricesException[source]

Bases: Exception

Raised by StockPrices classes.

exception pyensae.finance.astock.StockPricesHTTPException[source]

Bases: pyensae.finance.astock.StockPricesException

Raised by StockPrices classes.
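
Because StockPricesHTTPException derives from StockPricesException, the order of the except clauses matters when both are handled. The sketch below uses stand-in classes that only mirror the documented hierarchy, so that it runs without pyensae; fetch and describe are hypothetical helpers, and the idea that the HTTP variant signals a failed download is an assumption based on its name.

```python
class StockPricesException(Exception):
    """Stand-in mirroring the documented base exception."""


class StockPricesHTTPException(StockPricesException):
    """Stand-in mirroring the documented HTTP exception."""


def fetch(tick, fail_with=None):
    """Hypothetical loader that raises on demand, to exercise the handlers."""
    if fail_with is not None:
        raise fail_with("unable to load " + tick)
    return tick


def describe(tick, fail_with=None):
    """Handle the most specific exception first, then the base class."""
    try:
        fetch(tick, fail_with)
        return "ok"
    except StockPricesHTTPException:
        return "download failed"
    except StockPricesException:
        return "other StockPrices error"


print(describe("NASDAQ:MSFT"))
print(describe("NASDAQ:MSFT", StockPricesHTTPException))
print(describe("NASDAQ:MSFT", StockPricesException))
```

With the real classes, the same pattern applies after importing them from pyensae.finance.astock; a single except StockPricesException clause catches both.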
