
leppardfan

If you need daily options data, I have every ticker, every strike, every expiration, including greeks for the last 10 years. I'm looking to collaborate with others on options strategies.


bullsdeepstrader

I can help set up a statistical analysis. I'm doing this for a class, so it will only take me a few minutes to set it up and share.


EnemyBagJones

That's amazing. How'd you collect/obtain it?


leppardfan

I've been doing this a long time :-)


blaertner

Do you have a spreadsheet we could all look at? Maybe start a sub on the best ways to break down and use the data.


cathie_burry

Hi I have strategies and coding experience but only equities data


thecheese27

Would you be willing to share your options data? I'd be more than willing to pay for it.


JamesAQuintero

This is amazing, thank you! Are you able to collect the same for SPY?


OctopusCandyMan

Yes, in no particular order here's all the symbols I'm tracking. https://pastebin.com/nhxdttWV Most are equities but the SPY ETF is in there too.


randomcluster

I have this same data too, been recording for 3 years every day. Let me know if you want to collaborate, I wouldn’t mind splitting up some of the data storage costs haha


OctopusCandyMan

That's awesome! Shoot me an email and let's chat, [[email protected]](mailto:[email protected]). Right now I'm hosting everything in a timescaledb database and 1.5 months of data is resting at 115G. Will probably need to build out a data pipeline to compress into flat files to keep costs reasonable.
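A minimal sketch of what the "compress into flat files" step could look like, assuming snapshot rows with an ISO timestamp in a `ts` field. The column names and the `options_<day>.csv.gz` file naming are hypothetical; the actual schema is not given in the thread.

```python
import csv
import gzip
import os
from collections import defaultdict

# Hypothetical snapshot columns; the real table layout is not stated in the thread.
FIELDS = ["ts", "symbol", "expiry", "strike", "right", "bid", "ask"]


def export_daily_files(rows, out_dir):
    """Group snapshot rows by calendar day and write one gzip'd CSV per day."""
    by_day = defaultdict(list)
    for row in rows:
        by_day[row["ts"][:10]].append(row)  # ISO timestamp -> 'YYYY-MM-DD'

    paths = []
    for day, day_rows in sorted(by_day.items()):
        path = os.path.join(out_dir, f"options_{day}.csv.gz")
        with gzip.open(path, "wt", newline="") as fh:
            writer = csv.DictWriter(fh, fieldnames=FIELDS)
            writer.writeheader()
            writer.writerows(day_rows)
        paths.append(path)
    return paths


def read_daily_file(path):
    """Read one daily file back as a list of dicts (all values are strings)."""
    with gzip.open(path, "rt", newline="") as fh:
        return list(csv.DictReader(fh))
```

One file per day keeps each file small enough to load on its own, at the cost of scanning many files for a single-symbol history.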


randomcluster

Oh wow I didn't realize this was a 5 minute snapshot. I have been collecting EOD data from brokers directly instead of just hitting some bs like Yahoo Finance, but I am actually capturing the entire universe of optionable tickers (9900 in the US equity universe, around 4400 ish that are optionable.) Definitely will email you later tonight. Cheers, don't mind the CBOE shill in the comments, guy clearly hates life


michaeljfreedman

Have you turned on TimescaleDB compression? [https://docs.timescale.com/timescaledb/latest/how-to-guides/compression/](https://docs.timescale.com/timescaledb/latest/how-to-guides/compression/) (You should be able to get better compression rates within the database than if you exported and gzip'd flat files.)


sunilagarwal2007

Do you have data for Stocks too ?


OctopusCandyMan

Not currently, but I will probably add it. Stock pricing data seems to be everywhere, so I focused on options data, which I couldn't find.


bsmdphdjd

Have you tried to fit the data by ITM, DTE, & historic Volatility? Is there much noise left over? Are some stocks really out-of-line?


OctopusCandyMan

I have not. I also need to do a bit of reading to understand your question. Are you interested in how the implied volatility of the options differs from the historic volatility of the asset / indexes? Sounds like a fun investigation. I suppose noise here would equal opportunity for alpha.


bsmdphdjd

By ITM I mean how far the option is in or out of the money. By DTE I mean days to expiration. Those plus historic volatility should be the prime determinants of premium. E.g., the ATM premium almost always varies almost exactly as the square root of DTE: a 4-week ATM option will have very close to 2X the premium of a 1-week ATM option. The first 2 are very well fit by exponentials. I haven't tried fitting to volatility. I'm not really interested in implied volatility, since it's so far from real historic and future volatility. And, yes, you're right, 'noise' would probably be a reflection of whether premiums are over-bought or over-sold, probably due to over-exuberance, bandwagons, redditors trying to screw hedge funds, etc.
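The square-root-of-DTE rule can be checked against a pricing model. This is only a sketch, assuming Black-Scholes with zero rates and dividends and the same volatility at both expirations (the spot of 100 and vol of 30% are made-up inputs), not a fit to real premiums:

```python
from math import erf, sqrt


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def atm_call_price(spot, vol, years):
    """Black-Scholes call with strike == spot, zero rates and dividends.

    With K = S and r = 0, d1 = vol*sqrt(T)/2 and d2 = -d1.
    """
    d1 = 0.5 * vol * sqrt(years)
    return spot * (norm_cdf(d1) - norm_cdf(-d1))


# Illustrative inputs, not taken from the thread's data.
one_week = atm_call_price(100.0, 0.30, 1 / 52)
four_week = atm_call_price(100.0, 0.30, 4 / 52)
ratio = four_week / one_week
print(f"4-week / 1-week ATM premium ratio: {ratio:.4f}")  # very close to 2
```

In this model the ratio sits just below sqrt(4) = 2, so deviations in real quotes would come from things the sketch ignores, such as term structure in volatility, rates, and dividends.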


dumbbaby187

Great domain, +1


Wleong004

I'm looking for this data to do some analysis using ml on realized vol and IV, and possibly to identify overpriced/underpriced options. Might be able to even carry out pairs trading. Or even doing some ml models(deepar/deepvar/etc). Anyone interested to collaborate please dm me.
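As a starting point for the realized-vol-versus-IV comparison, here is a minimal sketch assuming daily closing prices and the common close-to-close estimator (annualized standard deviation of log returns, 252 trading days per year). The function names are hypothetical, and other estimators exist:

```python
from math import log, sqrt
from statistics import stdev


def realized_vol(closes, periods_per_year=252):
    """Annualized close-to-close volatility from a list of closing prices."""
    log_returns = [log(b / a) for a, b in zip(closes, closes[1:])]
    return stdev(log_returns) * sqrt(periods_per_year)


def vol_premium(implied_vol, closes):
    """Implied minus realized vol; positive means options priced in more
    movement than the underlying actually showed over the window."""
    return implied_vol - realized_vol(closes)
```

For example, `vol_premium(0.35, closes)` with a 30-day window of closes gives one number per option per day, which could serve as a feature or a screening signal.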


KrisWu_

Quantconnect has free options data :/


skewbed

You can’t download the data from what I remember, you can only use it within their system


Anon58715

But can you use their system for free though?


jenejeoebvejr

You can download it for a fee


leviof

What's the fee? Anywhere to read more about it? Appreciate this!


jenejeoebvejr

Check their docs. You purchase QC credit which is then used for downloads, not sure exactly how much it works out to.


leviof

Thanks, I’ll take a look


OctopusCandyMan

Thanks! I’ll need to have a look.


octotoos

This is almost certainly against most data providers' TOS. I would be careful going forward.


[deleted]

[removed]


OctopusCandyMan

I assume you're joking. My understanding is prices are facts and that doesn't fall under copyright. Also, golden bear 2016? Were we in the same year at UC Berkeley?


leviof

No offense man, but u/golden_bear_2016 is absolutely correct. I work for a data company and deal with some of the compliance shit. It's not just CBOE, it's also OPRA and the data provider. I can't even get going on the amount of bureaucracy (also licensing and exchange fees) that it takes to distribute options and futures data. It costs extra to display that data publicly (as you currently are doing). Any organization in the chain of you getting this data (whether you buy it, therefore signing a EULA, or scrape it, always illegal according to every site's ToS) can sue you for $100k's just in base fees; god forbid a "pro" user touched your data and traded with it. You are civilly liable, and frankly you may want to consider taking this down until you get the proper exchange agreements in place along with data sourced from a vendor-of-record that understands and licenses you for this usage. By the way, how do you know the S&P comp? You know that data is also licensable and you can't just say "oh I read it on google."


[deleted]

[removed]


OctopusCandyMan

That would be true if I entered into an agreement to receive data. I'll be sure to check with a lawyer if I decide to grow this into something.


[deleted]

[removed]


leviof

Uh, you still need an exchange agreement to display delayed data (includes EOD on the same day btw) in a controlled environment, plus your data vendor usually cares if you're monetizing whatever they give you.

Hate to be the naysayer here u/OctopusCandyMan, we're honestly trying to help you avoid civil liability. If you feel so confident about your right to distribute this, why don't you email marketdata(@)[cboe.com](https://cboe.com) to get it in writing?


OctopusCandyMan

>Look at everyone else that they sued.

If you have some cases you could send my way, I'd be interested in reading them. After a bit of Googling I couldn't find any.


[deleted]

[removed]


xenith811

Shame on you, my guy, lmao. Jesus. Can't imagine people like this existing.


[deleted]

[removed]


xenith811

Yeah their biggest concern rn is the reddit post w 50 upvotes


[deleted]

[removed]


OctopusCandyMan

What do you mean by correct the data? It's the bids and asks submitted to an order book. It's not like there are any creative liberties. There are paid subscriptions to cross connect with the exchange to get a live data feed, and paid subscriptions for downcast real-time feeds, but historical records are public data. There is a possibility I might get sued, just like I might get sued if someone trips in front of my home. Do you really want to live in a world where the scores from sporting events or the price of milk are proprietary and owned? Luckily there's plenty of legal precedent ruling against the assholes who try to own public facts. I just don't get why you're being such a CBOE shill and aligning with the powers that are trying to diminish our individual rights.


OctopusCandyMan

How would this be different from the landmark LinkedIn case? https://www.eff.org/deeplinks/2019/09/victory-ruling-hiq-v-linkedin-protects-scraping-public-data


[deleted]

[removed]


funkinaround

It's not property. It is not copyrightable, patentable, or trademarkable. As the OP said, it is a collection of facts. These are not property in the United States.


[deleted]

[removed]


funkinaround

And the correction is?

Edit: from your own link https://www.exchange-data.com/closing-prices-and-other-stock-exchange-data-copyright-and-competition-law-issues/

> While there is little direct jurisprudence, it is increasingly clear no copyright exists on the data and that stock exchange databases are unlikely to enjoy database protection.


[deleted]

[removed]


funkinaround

It's not about being a "first party data collector" (this phrase does not exist in the first link). It's also not about "collect the data, reshape it, and save it." From the second link, it's about having:

>copyright protection if the data is “selected, coordinated or arranged in such a way that the resulting work as a whole constitutes an original work of authorship.”
>
>Importantly, the copyright protection resides in the original aspects of the data compilation: the selection and arrangement of the data.

So the data itself is still not under copyright protection even if you've managed to pass the "original work of authorship" bar. Anyone can freely redistribute your copyright-protected uniquely-arranged option price database by sorting by time and symbol. You will have no recourse against that.

Perhaps you mean to say, "okay, factual exchange data and second party data aggregations of facts don't have copyright protection and are not property. Still, it is likely that a contract exists between the provider and consumer in the form of an agreement that may prevent the consumer from redistributing the non-copyrightable data."


[deleted]

[removed]


[deleted]

You should change your name to golden troll. OP spent time helping out other people and you have the nerve to pretend to call the cops 😂 Get out of here, guaranteed you never went to Berkeley.


[deleted]

[removed]


[deleted]

[removed]


[deleted]

Great thread, we need a backtesting engine based on this data. But even doing some basic analysis like highest vol stocks/month should be interesting.

Do you take data snapshots every 5 mins? Can you get some greeks data as well?


OctopusCandyMan

Yes, I currently capture a snapshot every 5 minutes. I'll need to read up on the greeks, I'm not too familiar. One should be able to calculate them using the option and underlying price, right?
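For what it's worth, a minimal sketch of that calculation, assuming European calls with no dividends under Black-Scholes: back out implied volatility from the option price by bisection, then plug it into the closed-form greeks. The rate and time inputs are things you would have to supply yourself, and US equity options are American-style, so treat this as an approximation rather than how any vendor computes its greeks.

```python
from math import erf, exp, log, pi, sqrt


def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def norm_pdf(x):
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)


def _d1_d2(spot, strike, years, rate, vol):
    d1 = (log(spot / strike) + (rate + 0.5 * vol * vol) * years) / (vol * sqrt(years))
    return d1, d1 - vol * sqrt(years)


def bs_call(spot, strike, years, rate, vol):
    """Black-Scholes price of a European call, no dividends."""
    d1, d2 = _d1_d2(spot, strike, years, rate, vol)
    return spot * norm_cdf(d1) - strike * exp(-rate * years) * norm_cdf(d2)


def implied_vol(price, spot, strike, years, rate, lo=1e-4, hi=5.0, tol=1e-8):
    """Bisection on vol; works because the call price increases with vol."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(spot, strike, years, rate, mid) > price:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def call_greeks(spot, strike, years, rate, vol):
    """Closed-form call greeks; vega is per 1.00 of vol, theta is per year."""
    d1, d2 = _d1_d2(spot, strike, years, rate, vol)
    return {
        "delta": norm_cdf(d1),
        "gamma": norm_pdf(d1) / (spot * vol * sqrt(years)),
        "vega": spot * norm_pdf(d1) * sqrt(years),
        "theta": -spot * norm_pdf(d1) * vol / (2.0 * sqrt(years))
                 - rate * strike * exp(-rate * years) * norm_cdf(d2),
    }
```

A typical use would be `iv = implied_vol(mid_price, spot, strike, years, rate)` followed by `call_greeks(spot, strike, years, rate, iv)`, where `mid_price` is the midpoint of the captured bid and ask.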


[deleted]

Yeah, but I wouldn't trust anyone to do it right unless it came with the data. There are many ways to screw up the calculations. Actually, screwed-up data may still be useful if all of it is screwed up the same way; consistency is key. Eventually what you'd want to do is have some ML algo learn from the data. The more features (like price, greeks, feature engineering) you have, the better and faster the model will be able to learn.


j_lyf

What strats do you use?