I have been meaning to fetch historical price data for ETFs for a Python project, and realized this is not as easy as I thought. In this post and the next couple, I will go through my attempts at downloading the data and explain why I had to give up on this first method.
First up: Importing the data directly from finance.yahoo.com into pandas.
import pandas as pd
import numpy as np
# Note: pandas.io.data was later moved out of pandas into the separate
# pandas-datareader package (from pandas_datareader.data import DataReader).
from pandas.io.data import DataReader
from datetime import datetime
from dateutil.relativedelta import relativedelta

HORIZON = 3  # All figures are based on time series dating back 3 years
start_date = datetime.today() - relativedelta(years=HORIZON)
asset_names = ["^FTSE", "GSG", "TIP", "IYY"]
# One column of daily returns per ticker
asset_returns = pd.DataFrame(columns=asset_names)
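for name in asset_names:
    # Download daily quotes for this ticker from Yahoo Finance
    data = DataReader(name, "yahoo", start=start_date)
    # Simple daily return: change in close relative to the previous day's close
    daily_returns = data['Close'].diff() / data['Close'].shift(1)
    asset_returns[name] = daily_returns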
This works beautifully. In the end, though, I ran into issues because data quality and availability on Yahoo are not great, and I had to look for other sources. Still, if you are looking for a quick and easy way of getting some financial time series into Python, pandas.io.data.DataReader is almost impossible to beat.
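The returns calculation does not depend on where the prices come from, so it can be tried offline before trusting any download. Below is a minimal sketch, assuming a small hand-made table of closing prices (the numbers are invented for illustration, and the helper names daily_returns and missing_quotes are hypothetical). It uses one common definition of a simple daily return, the change in close divided by the previous day's close, and counts missing quotes as a rough first look at data quality.

```python
import pandas as pd

# Invented closing prices for illustration only; not real market data.
dates = pd.date_range("2020-01-01", periods=5, freq="D")
prices = pd.DataFrame(
    {
        "GSG": [10.0, 11.0, 11.0, 9.9, 9.9],
        "TIP": [100.0, None, 102.0, 102.0, 51.0],  # one missing quote
    },
    index=dates,
)


def daily_returns(close: pd.DataFrame) -> pd.DataFrame:
    """Simple return per column: (close_t - close_{t-1}) / close_{t-1}."""
    return close.diff() / close.shift(1)


def missing_quotes(close: pd.DataFrame) -> pd.Series:
    """Number of missing closing prices per ticker."""
    return close.isna().sum()


returns = daily_returns(prices)
gaps = missing_quotes(prices)

print(returns)
print(gaps)
```

Note that a single missing quote removes two returns, the one for the day of the gap and the one for the day after, which is one reason patchy source data becomes a problem quickly.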