Sure, you can look at my post about Freely Available Financial Data. Unfortunately, even using the trick about the Yahoo CSV URL will only allow you to download 200 at a time, and it is tough to do correctly.
I'll show you in this post how to download historical prices for every S&P 500 stock using two programs: R and Excel. First off, if you aren't familiar with R, see my post on Getting Started in R, which should get you ready to go for this post.
The Tickers
What you'll really need first is a list of the ticker symbols for every stock in the index. Luckily, there exists a Wikipedia page for everything these days - even a list of S&P 500 companies. Highlight with your mouse and copy the entire table of companies, then paste into Excel. It should look something like this.Freshly copied table from Wikipedia page. |
Open up RStudio, and import the .csv file you just created by going to Tools - Import Dataset - From Text File. Give the imported data the name companies and deselect the Strings as factors option as shown below.
Importing list of companies. |
The Data
It is time to load up the workhorse of this process, the quantmod package. Select it under the Packages tab or type library(quantmod). This will give us access to all of the functions within the package. The quantmod package allows a user to download financial data by using back-end APIs. The first function of interest here is the getSymbols() function. Call the getSymbols() function and pass it the tickers object. Type: getSymbols(tickers)Be patient here, this will take a few minutes depending on how many tickers you request and your connection speed. This function will download the historical data from Yahoo Finance in the normal format you'd get if you went to the website and downloaded it yourself. Thus, it contains Date, Open, Close, High, Low, Volume, and Adjusted Close data. Notice that it says it is pausing for a second between requests, this is due to the restrictions Yahoo places on the requests - they are limited to 200 tickers at a time. By pausing intermittently, the requests become smaller chunks, and it can download all of the tickers. This will create an xts object for every single ticker containing the data for that stock, and it will be named with the ticker symbol.
While that is running, we can prepare the next few steps. Go back to your Excel sheet containing the list of companies. In the next column, we will prepare a concatenated string. First, some quick background on the quantmod functions. As stated previously, the getSymbols() command will return an xts object with certain data series in it. We can get at those specific data series using the functions below:
- Op() - Open price
- Cl() - Close price
- Hi() - High price
- Lo() - Low price
- Vo() - Volume
- Ad() - Adjusted close price
So, if I just want the adjusted close price for CAT, I type Ad(CAT) and I get it. To store that in a vector, just use the <- syntax as is normal in R.
But what if I only want a specific date range? That's possible too. Suppose I wanted from July 1, 2014 to August 31, 2014, I would type Ad(CAT['2014-07-01::2014-08-31']). The dates must always be in yyyy-mm-dd format, so I must type 07 for July and not just 7. If I just wanted from July 1, 2014 through today, I would type Ad(CAT['2014-07-01::']), essentially just not restricting the end of the data set.
We must now construct a column of commands like Ad() or Hi(). For this example, suppose we want adjusted close prices from July 1, 2014 until today. In a blank column in Excel, type =CONCATENATE( "Ad(", A2, "['2014-07-01::']),") which will create a column of functions like the above. Again, copy this down to the end of the data set. Afterwards, copy the entire column and then paste it as values to convert it to plain text. Also, don't forget to delete that pesky comma off of the very last item. Copy the data in this column, not by clicking on the column header, but by highlighting the specific cells (don't forget your CTRL+SHIFT+DOWN shortcut here!).
Add the concatenated strings in Column C. |
Hopefully by now quantmod has finished downloading the data for all of your stocks. If not, I'm sorry. Perhaps this is all the justification you need to upgrade to a faster internet package - I don't know. Anyways, when it is done go back to the RStudio console and type stockprices <- cbind( and delete the closing parenthesis so that the function isn't finished and you will get the + sign.
R is waiting for more input. |
Now paste in the data we copied from Excel, make sure there is no comma at the end of it, and hit enter. You should still have the + sign indicating R is waiting for more input. Now add the closing parenthesis from before to complete the cbind() function. R will now execute the cbind() function which will bind all of those columns together in a matrix called stockprices for you, and it may not finish immediately, so be patient with it. When the cbind() command is done, you can type View(stockprices) to see the matrix in the viewer if you'd like.
The final step is to save the table to a file, which can be done very easily from R. We will use the write.csv() method. Unfortunately, there is one little hiccup to this process; we can't use write.csv(stockprices, file="whatever.csv") because the dates will not be saved. Luckily for you, I know the workaround! The proper code is write.csv(as.data.frame(stockprices), file="yourfilenamehere.csv") where yourfilenamehere.csv is whatever you want to name the file. Open the file in Excel to see the results if you want - good as gold!
Finished product. |
Summary
Using this method, you can download historical data for any securities with data on Yahoo Finance or Google Finance. I've shown you how to download the historical adjusted close prices for every stock in the S&P 500 Index, probably saving you lots and lots of time. After the first time using this process, you'll have everything figured out and ready to go and will only have to wait for the data to download. If you found this useful, great! If you have any problems with the procedure, let me know and I'll try my best to help. Do you have an even better (also free) method? Feel free to share in the comments.
No comments:
Post a Comment