This page documents project Edge: Finding an Edge in the Options Market by Analyzing Aggressive Option Order Flow. It is a work in progress. I have been working on collecting the data and writing analysis scripts for over a year.
Part I: Thesis
The working thesis is aggressive trades in the option market can lead to momentum in the underlying stock. By analyzing aggressive order flow, I hope to find patterns that lead me to make better trades to improve my profitability and consistency.
Part II: Data Collection and Cleaning
The option order flow I collect is a curated list of aggressive option trades. The un-curated, raw data is from Trade-Alert‘s Real-Time Alerts: Option blocks and sweeps. The curating is done by the Sang Lucci/Wall St. Jesus trading group.
I collect the data twice a month: 15th, and last day.
The website from which I get the data is not conducive to web scrapping. I tried writing a python script using Beautiful Soup, but it was unsuccessful due to their web design. Since I was unable to get the data automatically, and their website doesn’t allow native copying, I use a Google Chrome extension called Allow Copy to enable copying in the browser. I copy the data manually from their trading room. I paste the data into a spreadsheet to clean it up. The raw data I collect looks like this:
|trending_down||14:59:57||T||09-19 M||30||30.67||$105K||Long Put|
|trending_down||14:59:56||UVXY||03/22-19 W||41||37.9||$612K||Long Put|
Before importing the data into a Microsoft SQL database, I enhance it by adding dates and transforming the dollar value into a numeric value. The google-sheets formula for transforming the dollar value is this:
If the value ends with K, it knows to match and multiply by 1000; if it ends in M, match and multiply by 1000000.
Once completed, the dataset looks like this:
|trending_down||14:59:57||T||09-19 M||30||30.67||$105K||Long Put||03/15/2019||105000|
|trending_down||14:59:56||UVXY||03/22-19 W||41||37.9||$612K||Long Put||03/15/2019||612000|
After being enhanced in Google-Sheets, I save the data as a CSV file and use the import function to pull the data into my database. Since each collection is imported, I run a script to insert the data into a master table called FlowTotal:
INSERT INTO FlowTotal ( flowID ,Flow ,Received ,Sym ,Expiration ,Strike ,Ref ,Size ,Sentiment ,DATE ,spent ) SELECT FlowID ,Flow ,Received ,Sym ,Expiration ,Strike ,Ref ,Size ,Sentiment ,DATE ,spent FROM d0315
Flow total contains two additional columns: Sector, and Expiration Date (expirDate).
To populate the additional columns I use a series of scripts. First, I update the sectors in the flowTotal table from a master list in my sectors table as follows:
update f set sector = s.sector from flowTotal f inner join sectors s on s.sym = f.sym where f.sector IS NULL
If I do not have the sector in the master list, I add it to the table and re-run the update. Here is the process:
update f set sector = s.sector from flowTotal f inner join sectors s on s.sym = f.sym where f.sector IS NULL select distinct sym from flowTotal where sector is null INSERT INTO Sectors (sector, sym) select 'Cyclical Consumer Goods', 'LAUR' --Repeat the insert process until all nulls are accounted for --Rerun update
The process to insert the expiration dates is similar. I use a master table of expiration values and expiration dates. I keep a count of the expiration as well. The entire process looks like this:
--Find the expiration values that do not yet exist --in the ExpiRank table select f.expiration, count (f.expiration) as expicount from flowtotal f left join ExpiRank e on e.Expiration = f.Expiration where e.expiration is null group by f.expiration --Insert null the expiration values Insert into ExpiRank(Expiration, ExpiCount) select f.expiration, count (f.expiration) as expicount from flowtotal f left join ExpiRank e on e.Expiration = f.Expiration where e.expiration is null group by f.expiration --Update the date value associated with --the expiration value: Update expiRank set expiDate = '07/27/2018' where expiration = '07/27-18 W' --Update the number of times the expiration appears update expirank set expiCount = t.expicount from (select count (expiration) as expiCount, expiration from flowTotal group by expiration)t where t.expiration = expiRank.expiration --Update flowTotal to add the expiration dates --to the expirDate column update f set expirDate = e.expidate from ExpiRank e inner join FlowTotal f on f.Expiration = e.Expiration where expirDate is null
Once updated, the FlowTotal data looks like this. TotalID is the flowTotal primary key. FlowID is the key from the import.
|1||1||trending_up||14:58:55||WLL||12/01-17 W||24||23.22||$18K||Long Call||2017-11-29||18000||Energy||2017-12-01|
|2||2||trending_up||14:58:47||TEVA||12-17 M||16||15.05||$20K||Long Call||2017-11-29||20000||Healthcare||2017-12-15|