Whisper

Built a custom transcription pipeline around OpenAI's Whisper model to convert audio to text

I wanted to jumpstart my AI journey with a practical project that would teach me about audio processing and natural language understanding. Rather than relying on pre-built transcription tools, I built a complete transcription pipeline from the ground up around OpenAI's Whisper model. Along the way I learned what the 'turbo' model can do and built a clean, modular system for handling audio files.

The most rewarding part was the timestamped CSV output system. The goal wasn't just to transcribe audio to text, but to make that data useful for further analysis, subtitle generation, and research. I focused on clean, maintainable code with proper file path handling and modular functions. The main lesson was that even with a powerful pre-trained model like Whisper, much of the value comes from the infrastructure built around it, which is what makes the model useful in real-world applications.
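To make the timestamped CSV output concrete, here is a minimal sketch of how such a step could look. It is not the project's actual code: the function names (`format_timestamp`, `write_segments_csv`, `transcribe_to_csv`), the column layout, and the `HH:MM:SS.mmm` timestamp format are all assumptions for illustration. It relies on the `openai-whisper` package, whose `transcribe` result includes a `segments` list with `start`, `end`, and `text` fields.

```python
import csv
from pathlib import Path


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as HH:MM:SS.mmm (an assumed format)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def write_segments_csv(segments, csv_path):
    """Write segments (dicts with 'start', 'end', 'text') to a CSV file."""
    csv_path = Path(csv_path)
    # Create the output folder if it does not exist yet.
    csv_path.parent.mkdir(parents=True, exist_ok=True)
    with csv_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["start", "end", "text"])  # hypothetical column names
        for seg in segments:
            writer.writerow([
                format_timestamp(seg["start"]),
                format_timestamp(seg["end"]),
                seg["text"].strip(),
            ])
    return csv_path


def transcribe_to_csv(audio_path, csv_path, model_name="turbo"):
    """Transcribe an audio file with Whisper and save timestamped segments."""
    # Imported lazily so the CSV helpers work without Whisper installed.
    # Requires the openai-whisper package and ffmpeg on the system.
    import whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(str(audio_path))
    return write_segments_csv(result["segments"], csv_path)
```

Keeping the CSV writer separate from the model call means the output format can be tested with hand-written segments, without loading the model. A call such as `transcribe_to_csv("audio/interview.mp3", "output/interview.csv")` (hypothetical paths) would then produce one row per segment.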