
Trying to teach my wife pandas, and the thing she most wants to do is compute the 10-year projected return on Fortune 500 companies (Buffettology) based on their last ten years of financial reports. It's really hard to find a good data source, though: the data is either locked in PDFs, or Google results are dominated by rent-seeking data repackagers where it's hard to tell whether they have the data without jumping through hoops. Would love a source for that.
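
For concreteness, here is a minimal sketch of the kind of calculation I mean, assuming the usual Buffettology-style recipe: take the historical EPS growth rate, project EPS forward, apply an average historical P/E, and annualize against today's price. The function names and inputs (`cagr`, `projected_annual_return`, `eps_history`, `pe_history`) are made up for illustration, and it's plain Python rather than pandas so the arithmetic is easy to follow.

```python
def cagr(first, last, years):
    """Compound annual growth rate between two positive values."""
    if first <= 0 or last <= 0 or years <= 0:
        raise ValueError("CAGR needs positive values and a positive span")
    return (last / first) ** (1 / years) - 1


def projected_annual_return(eps_history, pe_history, current_price, horizon=10):
    """Rough projected annual return over `horizon` years.

    eps_history: annual EPS values, oldest first (e.g. 11 values = 10 years of growth)
    pe_history:  annual P/E ratios for the same period
    """
    # Historical EPS growth rate over the span covered by the reports.
    growth = cagr(eps_history[0], eps_history[-1], len(eps_history) - 1)

    # Assumption: EPS keeps compounding at the historical rate.
    future_eps = eps_history[-1] * (1 + growth) ** horizon

    # Assumption: the stock trades at its average historical P/E at the end.
    avg_pe = sum(pe_history) / len(pe_history)
    future_price = future_eps * avg_pe

    # Annualized return implied by buying at today's price.
    return cagr(current_price, future_price, horizon)
```

Both assumptions (constant growth, mean-reverting P/E) are strong ones, and the sketch ignores dividends entirely, so treat the output as a screening number rather than a forecast.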



I watch videos by Aswath Damodaran, Professor of Valuation at NYU. He recently did a video on the data he uses [0]. It might be worth emailing him to see whether the data sources he buys match your needs.

[0] https://youtu.be/M9pFTApeo_8

Additional link to his data site: http://people.stern.nyu.edu/adamodar/New_Home_Page/data.html


Look at the SEC/EDGAR page, where you can find the data in XML and JSON formats:

https://www.sec.gov/edgar/searchedgar/accessing-edgar-data.h...
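
A rough sketch of pulling annual figures out of the JSON, in case it helps. It assumes the `data.sec.gov` "company facts" endpoint and its usual response shape (`facts` → taxonomy → concept → `units` → list of entries with `start`, `end`, `val`, `form`, `filed`); check the page above before relying on that. The helper names and the `EarningsPerShareDiluted` default are my own choices for illustration. SEC asks for a descriptive `User-Agent` on automated requests.

```python
import json
import urllib.request
from datetime import date


def fetch_company_facts(cik, user_agent):
    """Download the company facts JSON for one CIK (assumed endpoint)."""
    url = f"https://data.sec.gov/api/xbrl/companyfacts/CIK{int(cik):010d}.json"
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def annual_values(facts, concept="EarningsPerShareDiluted",
                  unit="USD/shares", taxonomy="us-gaap"):
    """Return [(period_end, value), ...] for full-year 10-K figures, oldest first."""
    entries = facts["facts"][taxonomy][concept]["units"][unit]
    by_end = {}
    for e in entries:
        # Keep only figures reported in annual filings that have a period start.
        if e.get("form") != "10-K" or "start" not in e:
            continue
        # Skip quarterly figures that also show up inside a 10-K.
        days = (date.fromisoformat(e["end"]) - date.fromisoformat(e["start"])).days
        if days < 300:
            continue
        # The same year is repeated in later filings; keep the latest filed value.
        prev = by_end.get(e["end"])
        if prev is None or e["filed"] > prev["filed"]:
            by_end[e["end"]] = e
    return [(end, by_end[end]["val"]) for end in sorted(by_end)]
```

Something like `annual_values(fetch_company_facts(cik, "Your Name you@example.com"))` would then give a list you can drop straight into a pandas Series; the CIK and contact string there are placeholders.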


If you use Google's Dataset Search for SEC filings [1], you get outdated information. FTP access was removed years ago, but SEC filings are still a great example of large datasets. I built a side business at https://Last10K.com using Buffettology, and it provides 10 years of company annual reports (10-Ks). There's also an API at https://dev.Last10K.com that returns financial data from these filings in JSON or XML.

[1] https://datasetsearch.research.google.com/search?query=EDGAR...


Interesting. I was considering having her, as a side hustle, type these sheets into something I could then sell. Sounds like that's what you did. How did that work out?


Didn't see any contact details on your HN profile so feel free to contact me directly and I can provide details.


Grab the numbers from a few of the PDF files, then Google those exact numbers and see if you can find one of those "auto-generated news sites" that shows the same numbers, and scrape that?


Capital IQ



