Goal:
• Collect the data and metadata (including but not limited to headline, author, date written, and body text) of every news article from the last 10 years about any company publicly traded on a United States stock exchange. Store this data as a YAML file in an S3 bucket, and make the code general enough that it can be run every morning at 7:00 am EST to collect the news from the previous day and the current morning.
• Similarly, collect all threads and mentions of a stock on major social media platforms, including but not limited to X, Instagram, Threads, Stocktwits, LinkedIn, Blossom, Truth Social, and After Hour. Store the date and time of each comment or post, its author, and its contents (text and tables) in S3.
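
A minimal sketch of how the daily run described above might compute its collection window and write results to S3. Everything named here is an assumption made for illustration and is not specified in the goal: the `Record` fields, the `collection_window`, `s3_key`, and `upload` helpers, the key layout, and the reading of "EST" as New York local time. The fetching of articles and posts is left out entirely. `upload` additionally assumes that the third-party packages boto3 and PyYAML are installed and that AWS credentials are configured.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, time, timedelta
from zoneinfo import ZoneInfo

# Assumption: "7:00 am EST" is read as New York local time (observes daylight saving).
EASTERN = ZoneInfo("America/New_York")


@dataclass
class Record:
    """Hypothetical common shape for a news article or a social media post."""
    source: str        # "news" or a platform name
    ticker: str
    author: str
    published_at: str  # ISO 8601 timestamp
    headline: str      # empty for social posts
    body: str


def collection_window(run_at: datetime) -> tuple[datetime, datetime]:
    """Return (start, end) covering the previous calendar day plus the current morning.

    run_at must be timezone-aware; it is converted to Eastern time first.
    """
    local = run_at.astimezone(EASTERN)
    start = datetime.combine(local.date() - timedelta(days=1), time.min, tzinfo=EASTERN)
    return start, local


def s3_key(source: str, run_at: datetime) -> str:
    """Build a hypothetical object key, one YAML file per source per run date."""
    local = run_at.astimezone(EASTERN)
    return f"{source}/{local:%Y/%m/%d}.yaml"


def upload(records: list[Record], bucket: str, key: str) -> None:
    """Serialize records to YAML and write them to S3."""
    # Imported lazily so the helpers above work without these packages installed.
    import boto3
    import yaml

    body = yaml.safe_dump([asdict(r) for r in records], sort_keys=False, allow_unicode=True)
    boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=body.encode("utf-8"))
```

A scheduler (cron, for example) would call `collection_window(datetime.now(EASTERN))` at 7:00 am and pass the resulting bounds to whatever fetchers are written. The 10-year backfill could reuse the same helpers by iterating over past dates instead of the current one.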