Trading is often a game of information, analysis, and luck. Having better information than the next guy can be the difference between making a profit and losing money, so I try to get my hands on as much information as I can. Recently, a few algo traders introduced me to Alpha Vantage, a free-to-use API for downloading market data, including 1-minute time series and a bunch of technical indicators. The downside to using Alpha Vantage is that it only allows 5 API calls per minute and 500 calls per day. Regardless, it is still a great free tool.
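For anyone who just wants to see what a single request looks like before the full script, here is a minimal sketch of one intraday call. The symbol, output file name, and placeholder API key are just examples, and the query parameters are the same ones the script below uses:

import urllib.request as req

api_key = 'YOUR_API_KEY'   # placeholder - use your own free Alpha Vantage key
sym = 'MSFT'               # example symbol

# Build the intraday query: 1-minute bars, full history, returned as CSV
url = ('https://www.alphavantage.co/query?function=TIME_SERIES_INTRADAY'
       '&symbol=' + sym + '&interval=1min&outputsize=full'
       '&apikey=' + api_key + '&datatype=csv')

# Download the CSV and save it locally
with req.urlopen(url) as resp:
    data = resp.read()

with open(sym + '.csv', 'wb') as f:
    f.write(data)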
Since I’ve been learning how to write Python scripts for the past two months in my Data Analysis class, I figured it would be a good exercise to find a way to pull 1-minute data for a few stocks I follow out of Alpha Vantage and into the SQL database where I store the rest of my market data for my project, code-named Edge.
I shared my script on Reddit, and a user pointed out that it is actually bad practice to write SQL directly into the script and that I should be using Object Relational Mapping (ORM) instead. I am looking into that (there is a rough sketch of what the ORM version might look like after the script below) and will update the script once I have a handle on what I need to do to replace the SQL. I also know it is very bad practice to have a visible API key within the script, but this is just meant to be a “plug and play” script that is easy for a newbie like me to use. In the meantime, my Python script does what I need it to do. It downloads the 1-minute data as a CSV file and then uploads that data into a table in my Microsoft SQL Server database:
# Dependencies
import datetime as dt
import urllib.request as req
import time
import pyodbc
import csv

# Get today's datetime information
date = dt.datetime.now()

# API key from Alpha Vantage
api_key = ''

# List of stock symbols to track
syms = ['MSFT', 'SPY', 'AAPL', 'DIA', 'QQQ', 'GS', 'GE', 'IBM', 'JPM', 'XLF',
        'AA', 'BABA', 'TWTR', 'XHB', 'INTC', 'C', 'CZR', 'MGM', 'SQ', 'BAC',
        'AMD', 'FB', 'VXX', 'TSLA', 'IWM', 'GLD', 'SVXY', 'EEM', 'FCX', 'WMT']

# Year/month/day strings to append to file and table names
y = date.strftime("%Y")
m = date.strftime("%m")
d = date.strftime("%d")

# Connect to the SQL Server database
conn = pyodbc.connect('Driver={SQL Server};'
                      'Server=your server here;'
                      'Database=your database here;'
                      'Trusted_Connection=yes;')
cursor = conn.cursor()

# Iterate through the symbol list
for sym in syms:
    # Sleep for 15 seconds to stay under the 5-calls-per-minute API limit
    time.sleep(15)

    # Create the URL for the API call
    url = ('https://www.alphavantage.co/query?function=TIME_SERIES_INTRADAY'
           '&symbol=' + sym + '&interval=1min&outputsize=full'
           '&apikey=' + api_key + '&datatype=csv')

    # Create the file path to save the CSV
    save_path = 'your save path here' + sym + y + m + d + '.csv'
    print(url)

    # Read the page and store it
    get_url = req.urlopen(url)
    data = get_url.read()
    get_url.close()

    # Write the data to a file, then close the file
    tnsfile = open(save_path, 'wb')
    tnsfile.write(data)
    tnsfile.close()

    # Create the table name, e.g. MSFT20180501
    tableName = sym + y + m + d

    # Set the database you want to use, then create the table
    cursor.execute('use YOUR DATABASE HERE; Create Table ' + tableName +
                   ' (symID int IDENTITY(1,1) NOT FOR REPLICATION NOT NULL, sym varchar(5), '
                   '[timestamp] datetime, [open] float, [high] float, [low] float, '
                   '[close] float, [volume] float)')

    # Open the CSV file and build the parameterized insert query
    with open(save_path, 'r') as f:
        reader = csv.reader(f)
        columns = next(reader)  # header row: timestamp, open, high, low, close, volume
        query = ("insert into " + tableName +
                 " (sym, [timestamp], [open], [high], [low], [close], [volume])"
                 " values ('" + sym + "',{1})")
        query = query.format(','.join(columns), ','.join('?' * len(columns)))

        # Write each row of the CSV to the database, then commit
        for row in reader:
            cursor.execute(query, row)
        cursor.commit()
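For reference, here is a rough sketch of what the ORM version suggested on Reddit might look like. I have not settled on a library yet, so this assumes SQLAlchemy, and the table name, server, database, and sample row are just placeholders that mirror the columns in the script above; treat it as a starting point rather than the finished rewrite:

import datetime as dt

from sqlalchemy import create_engine, Column, Integer, String, DateTime, Float
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Bar(Base):
    """One 1-minute bar for a symbol (columns mirror the table in the script above)."""
    __tablename__ = 'intraday_bars'   # hypothetical table name

    symID = Column(Integer, primary_key=True, autoincrement=True)
    sym = Column(String(5))
    timestamp = Column(DateTime)
    open = Column(Float)
    high = Column(Float)
    low = Column(Float)
    close = Column(Float)
    volume = Column(Float)

# Placeholder server/database; leaving out a username/password uses a trusted
# (Windows) connection through pyodbc, like the script above
engine = create_engine('mssql+pyodbc://your-server/your-database?driver=SQL+Server')

# Create the table if it does not exist yet
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

# Instead of hand-built INSERT strings, mapped objects get added to a session
with Session() as session:
    session.add(Bar(sym='MSFT', timestamp=dt.datetime(2018, 5, 1, 9, 31),
                    open=94.5, high=94.8, low=94.4, close=94.7, volume=125000.0))
    session.commit()

The appeal is that each CSV row would become a Bar object added to the session, and the library builds and parameterizes the SQL for me.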
Before getting into writing web scraping scripts, I was collecting the data manually, and it was tedious and time consuming. Now I can spend more time doing analytics and less time manually building datasets! School is paying off already.