Hi Preston, great work on this project! I really liked that you incorporated web scraping into this.
I'd also love to see the code for this. I'm not sure I understand the data correctly, but it seems to have a sequential element: observations running from January 2018 to January 2020.
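If so, GridSearchCV with its default cross-validation would not be appropriate here, because the default k-fold splits ignore time order, so the model can end up trained on later observations and validated on earlier ones. Instead, pass TimeSeriesSplit as the cv argument to GridSearchCV, so that each fold trains only on observations that come before its validation set.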
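Here is a minimal sketch of what I mean, in case it helps. I haven't seen your code, so everything specific in it is an assumption on my part: the data is synthetic, and the Ridge model and the alpha grid are placeholders for whatever estimator and parameters you are actually tuning. The only requirement is that the rows are sorted from oldest to newest.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

# Synthetic stand-in for the real data (assumption): 25 monthly observations,
# already sorted from oldest to newest.
rng = np.random.default_rng(0)
X = rng.normal(size=(25, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=25)

# Each fold trains on an initial stretch of rows and validates on the rows
# that come immediately after it, so no fold looks into the future.
tscv = TimeSeriesSplit(n_splits=4)

# Ridge and the alpha grid are placeholders, not a recommendation.
search = GridSearchCV(
    estimator=Ridge(),
    param_grid={"alpha": [0.01, 0.1, 1.0, 10.0]},
    cv=tscv,
    scoring="neg_mean_squared_error",
)
search.fit(X, y)

# Show which rows each fold used, to confirm the ordering is respected.
for train_idx, test_idx in tscv.split(X):
    print(f"train rows 0-{train_idx[-1]}, validate rows {test_idx[0]}-{test_idx[-1]}")
print("best alpha:", search.best_params_["alpha"])
```

The rest of your grid search can stay as it is; only the cv argument changes.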
Also, is this a panel data set, meaning the same units observed repeatedly over time? If so, it could be worthwhile to explore regression models designed specifically for panel data, such as fixed-effects or random-effects models.
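To make the panel idea concrete, here is a rough sketch of a fixed-effects (within) estimator, which is one common panel model. This is purely illustrative: the data is synthetic, and the column names entity, x and y are hypothetical, since I don't know what your data set actually looks like.

```python
import numpy as np
import pandas as pd

# Synthetic panel (assumption): 5 hypothetical entities observed for 25 months.
rng = np.random.default_rng(0)
n_entities, n_periods = 5, 25
entity = np.repeat(np.arange(n_entities), n_periods)

# Each entity has its own unobserved level, and x is correlated with it.
entity_effect = rng.normal(scale=3.0, size=n_entities)[entity]
x = rng.normal(size=n_entities * n_periods) + 0.5 * entity_effect
y = 2.0 * x + entity_effect + rng.normal(scale=0.1, size=x.size)
panel = pd.DataFrame({"entity": entity, "x": x, "y": y})

# Within transformation: subtract each entity's own mean from x and y,
# which removes anything that is constant within an entity.
entity_means = panel.groupby("entity")[["x", "y"]].transform("mean")
demeaned = panel[["x", "y"]] - entity_means

# OLS slope on the demeaned data is the fixed-effects estimate.
beta_fe = (demeaned["x"] * demeaned["y"]).sum() / (demeaned["x"] ** 2).sum()

# Pooled OLS slope for comparison; it ignores the entity-level differences.
beta_pooled = np.polyfit(panel["x"], panel["y"], 1)[0]

print(f"fixed-effects slope: {beta_fe:.3f}")
print(f"pooled OLS slope:    {beta_pooled:.3f}")
```

In the synthetic data the true slope is 2.0, so comparing the two printed values shows how much the entity-level differences can distort a pooled regression. For real work a dedicated panel library would be a better choice than hand-rolled demeaning.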