Trading is often a game of information, analysis, and luck. Having better information than the next guy can be the difference between making a profit and losing money, so I try to have as much information as I can get my hands on. Recently, a few algo traders introduced me to the website Alpha Vantage, which is a free-to-use API for downloading market data, including 1 min time series and a bunch of indicators. The downside to using Alpha Vantage is that it only allows 5 calls per minute and 500 calls per day. Regardless, it is still a great free tool.
Since I’ve been learning how to write python scripts for the past two months in my Data Analysis class, I figured it would be a good exercise to figure out a way to get 1 min data for a few stocks I follow out of Alpha Vantage and into the SQL database where I store the rest of my market data for my project, code-named Edge.
I shared my script with Reddit, and a user pointed out that it is actually bad practice to write SQL directly into the script and that I should be using Object Relational Mapping (ORM) instead. I am looking into that and will update the script once I have a handle on what I need to do to replace the raw SQL. I also know it is very bad practice to have a visible API key within the script, but this is just meant to be a “plug and play” script that is easy for a noobie like me to use. In the meantime, my python script does what I need it to do: it downloads the 1 min data as a CSV file and then uploads that data into a table in my Microsoft SQL Server database:
#Dependencies
import datetime as dt
import urllib.request as req
import time
import pyodbc
import csv

#Get today's datetime information
date = dt.datetime.now()

#API KEY from Alpha Vantage
api_key = ''

#list of stock symbols to track
syms = ['MSFT', 'SPY', 'AAPL', 'DIA', 'QQQ', 'GS', 'GE', 'IBM', 'JPM', 'XLF',
        'AA', 'BABA', 'TWTR', 'XHB', 'INTC', 'C', 'CZR', 'MGM', 'SQ', 'BAC',
        'AMD', 'FB', 'VXX', 'TSLA', 'IWM', 'GLD', 'SVXY', 'EEM', 'FCX', 'WMT']

#variables to store year/month/day numbers to append to the file name
y = date.strftime("%Y")
m = date.strftime("%m")
d = date.strftime("%d")

#connect to the SQL Server database
conn = pyodbc.connect('Driver={SQL Server};'
                      'Server=your server here;'
                      'Database=your database here;'
                      'Trusted_Connection=yes;')
cursor = conn.cursor()

#iterate through the symbol list
for sym in syms:
    #sleep for 15 seconds to stay under the 5-calls-per-minute API limit
    time.sleep(15)
    #build the URL for the API call
    url = ('https://www.alphavantage.co/query?function=TIME_SERIES_INTRADAY'
           '&symbol=' + sym + '&interval=1min&outputsize=full'
           '&apikey=' + api_key + '&datatype=csv')
    #build the file path to save the CSV
    save_path = 'your save path here' + sym + y + m + d + '.csv'
    print(url)
    #read the page and store it
    get_url = req.urlopen(url)
    data = get_url.read()
    get_url.close()
    #write the data to a file; the with block closes it automatically
    with open(save_path, 'wb') as tnsfile:
        tnsfile.write(data)
    #create the table name
    tableName = sym + y + m + d
    #set the database you want to use, create the table
    cursor.execute('use YOUR DATABASE HERE; Create Table ' + tableName +
                   '(symID int IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,'
                   ' sym varchar(5), [timestamp] datetime, [open] float,'
                   ' [high] float, [low] float, [close] float, [volume] float)')
    #open the CSV file, build the parameterized insert query
    with open(save_path, 'r') as f:
        reader = csv.reader(f)
        #read the header row so we know how many value placeholders we need
        columns = next(reader)
        query = ('insert into ' + tableName +
                 ' (sym, [timestamp], [open], [high], [low], [close], [volume])'
                 " values ('" + sym + "'," + ','.join('?' * len(columns)) + ')')
        #write each row of the CSV to the database, then commit
        for row in reader:
            cursor.execute(query, row)
        cursor.commit()
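For the ORM suggestion from Reddit, here is a minimal sketch of what the same insert could look like with SQLAlchemy. This is my rough understanding, not the final Edge code: the `Bar` model, its column names, and the in-memory SQLite database (standing in for SQL Server) are all illustrative choices of mine.

```python
from sqlalchemy import create_engine, Column, Integer, String, Float
from sqlalchemy.orm import declarative_base, Session

Base = declarative_base()

#Illustrative ORM model for one day's 1 min bars; the class and column
#names are my own choices, not anything from the original script.
class Bar(Base):
    __tablename__ = 'bars'
    symID = Column(Integer, primary_key=True)
    sym = Column(String(5))
    timestamp = Column(String)  #kept as a string to keep the sketch simple
    open = Column(Float)
    high = Column(Float)
    low = Column(Float)
    close = Column(Float)
    volume = Column(Float)

#an in-memory SQLite database stands in for SQL Server in this sketch
engine = create_engine('sqlite://')
Base.metadata.create_all(engine)

#the ORM builds and parameterizes the SQL for you; no hand-written strings
with Session(engine) as session:
    session.add(Bar(sym='MSFT', timestamp='2018-01-02 09:31:00',
                    open=85.0, high=85.2, low=84.9, close=85.1, volume=1000.0))
    session.commit()
    count = session.query(Bar).count()
```

The appeal is that the table definition and the insert both come from the one `Bar` class, so there is no string concatenation to get wrong.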
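As for the visible API key, one easy fix is to read it from an environment variable instead of hardcoding it. This is just a sketch; the variable name `ALPHAVANTAGE_API_KEY` is an arbitrary name I picked, not an official Alpha Vantage convention.

```python
import os

#read the Alpha Vantage key from an environment variable instead of
#hardcoding it; ALPHAVANTAGE_API_KEY is an arbitrary name chosen here
api_key = os.environ.get('ALPHAVANTAGE_API_KEY', '')
if not api_key:
    print('Warning: ALPHAVANTAGE_API_KEY is not set')
```

That way the script can still be shared "plug and play" without ever committing a real key.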
Before getting into writing web scraping scripts, I was gathering the data manually, and it was tedious and time consuming. Now I can spend more time doing analytics and less time manually building datasets! School is paying off already.