Downloading historical timeseries in Python Pandas

I have been meaning to fetch historical price data for ETFs for a Python project, and realized this is not as easy as I thought. In this post and the next couple I will go through my attempts at downloading the data, and explain why I had to give up on this method.

First up: Importing the data directly from finance.yahoo.com into pandas.

import pandas as pd
import numpy as np
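# Note: pandas.io.data was later moved out of pandas into the separate
# pandas-datareader package; on newer pandas versions the equivalent import is
#   from pandas_datareader.data import DataReader
from pandas.io.data import DataReader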
from datetime import datetime
from dateutil.relativedelta import relativedelta


HORIZON = 3 # All figures are based on timeseries dating back 3 years


start_date = datetime.today() - relativedelta(years=HORIZON)

asset_names = ["^FTSE", "GSG", "TIP", "IYY"]
assets = []
asset_returns = pd.DataFrame(columns = asset_names)

for name in asset_names:
    # Download daily price data for this ticker from Yahoo Finance
    data = DataReader(name, "yahoo", start=start_date)
    # Simple daily return: change in close relative to the previous day's close
    daily_returns = data['Close'].diff() / data['Close'].shift(1)
    asset_returns[name] = daily_returns
    assets.append({'name':  name,
                   'mean':  np.mean(daily_returns),
                   'stdev': np.std(daily_returns)
                   })

This works beautifully. I only ran into issues because the data quality and availability on Yahoo are not great, so I had to look for other sources. But if you are looking for a quick and easy way of getting some financial timeseries into Python, pandas.io.data.DataReader is almost impossible to beat.
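If you want to check the return and summary calculations without hitting Yahoo at all, here is a minimal offline sketch. The helper name summarize_returns, the tickers "AAA" and "BBB", and the prices are all made up for illustration; the only assumption is that you have a DataFrame of closing prices with one column per asset.

```python
import pandas as pd


def summarize_returns(prices):
    """Compute simple daily returns and per-asset mean / stdev.

    `prices` is assumed to be a DataFrame of closing prices,
    one column per asset, indexed by date (hypothetical input).
    """
    # Simple return: (P_t - P_{t-1}) / P_{t-1}; the first row has no
    # previous price, so it is NaN and gets dropped.
    returns = (prices / prices.shift(1) - 1).dropna(how="all")
    stats = pd.DataFrame({
        "mean": returns.mean(),
        "stdev": returns.std(ddof=0),  # population stdev, like np.std's default
    })
    return returns, stats


# Made-up closing prices, NOT real market data
dates = pd.date_range("2024-01-01", periods=4, freq="D")
prices = pd.DataFrame(
    {"AAA": [100.0, 110.0, 99.0, 99.0],
     "BBB": [50.0, 50.0, 55.0, 49.5]},
    index=dates,
)

returns, stats = summarize_returns(prices)
print(returns)
print(stats)
```

Swapping the made-up prices for the 'Close' columns you downloaded should give the same kind of table as the assets list above, one row per ticker.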
