For my Data Science group project at UCSD, I wanted to explore a question that had always fascinated me: in an increasingly polarized political climate where distrust in traditional media is growing, does the decline in corporate media viewership coincide with the growth of political podcasts? While some group members proposed using existing Kaggle datasets, I was excited by the challenge of investigative data science: building everything from scratch.

The project became a journey of web scraping with Selenium, integrating with the Wayback Machine API, and building a YouTube Data API pipeline with caching to stay within Google's rate limits. We faced significant hurdles. Much of the podcast data didn't exist on YouTube, so we had to write algorithms to filter for valid channels, and we had to rotate API keys and cache channel IDs and playlist IDs extensively.

The statistical analysis produced some striking results. After removing outliers (17% of the data), podcasts showed 11.6% average growth versus a 15.8% decline for corporate media, with highly significant year × type interactions (p < 0.001). It was incredibly rewarding to see our hypothesis supported by rigorous statistical testing, and the project taught me that sometimes the most interesting insights come from building the data pipeline yourself rather than relying on pre-existing datasets.
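To make the caching and key-rotation idea concrete, here is a minimal sketch. It is not the project's actual code. The names `CachedLookup` and `QuotaExceeded`, the JSON cache file, and the injected `fetch` function are all hypothetical, chosen only for illustration. The fetcher stands in for whatever call resolves a podcast name to a channel or playlist ID, and it is assumed to raise `QuotaExceeded` when a key runs out of quota.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List


class QuotaExceeded(Exception):
    """Hypothetical error a fetcher raises when an API key is out of quota."""


class CachedLookup:
    """Cache name -> ID lookups on disk and rotate API keys on quota errors."""

    def __init__(self, fetch: Callable[[str, str], str],
                 api_keys: List[str], cache_path: str) -> None:
        if not api_keys:
            raise ValueError("at least one API key is required")
        self._fetch = fetch            # fetch(name, api_key) -> ID string
        self._keys = list(api_keys)
        self._key_index = 0            # index of the key currently in use
        self._cache_path = Path(cache_path)
        # Reload earlier results so repeated runs do not spend quota again.
        self._cache: Dict[str, str] = (
            json.loads(self._cache_path.read_text())
            if self._cache_path.exists() else {}
        )

    def get(self, name: str) -> str:
        # Serve from the cache whenever possible: no API call, no quota cost.
        if name in self._cache:
            return self._cache[name]
        # Otherwise try each key at most once, starting with the current one.
        for _ in range(len(self._keys)):
            key = self._keys[self._key_index]
            try:
                value = self._fetch(name, key)
            except QuotaExceeded:
                # Move to the next key and retry the same lookup.
                self._key_index = (self._key_index + 1) % len(self._keys)
                continue
            self._cache[name] = value
            # Persist after every successful lookup so a crash loses nothing.
            self._cache_path.write_text(json.dumps(self._cache))
            return value
        raise QuotaExceeded("all API keys are exhausted")
```

Passing the fetcher in as a parameter keeps the caching and rotation logic independent of any particular API client, so it can be exercised with a fake fetcher before spending real quota.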
© 2025 Clutch Studio. All rights reserved.