AI-powered web application that automatically transcribes YouTube videos and translates them into Japanese using OpenAI Whisper and GPT. Bridges language barriers in video content with modern web technologies.
- Repository: https://github.com/insertfahim/vlog-translator
- Platform: Web (Next.js) / Desktop / Mobile
- Frontend Stack: Next.js 13+, TypeScript, Stitches CSS, Radix UI
- Backend Stack: Next.js API Routes, Node.js, Python, Shell Scripts
- AI Services: OpenAI Whisper API, OpenAI GPT API, yt-dlp
- Output Format: SRT Subtitles, Real-time Progress Streaming

Core Features
- YouTube Integration: Direct video URL processing from any YouTube video with high-quality audio extraction using yt-dlp
- AI Transcription: Speech-to-text conversion of the extracted audio using the OpenAI Whisper API
- Smart Translation: Context-aware Japanese translation with formal tone using OpenAI GPT API
- Real-time Processing: Live progress tracking with streaming responses and user-friendly status updates
- SRT Export: Standard subtitle format output ready for video editing and professional use
- Responsive Design: Modern UI that works seamlessly on desktop, tablet, and mobile devices
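
The SRT export step can be illustrated with a short sketch. SRT itself is a simple format: a sequence number, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line, the subtitle text, and a blank line between cues. The `Segment` shape and the function names below are assumptions made for illustration and are not taken from the repository.

```typescript
// Hypothetical segment shape: start/end in seconds plus the subtitle text.
interface Segment {
  start: number;
  end: number;
  text: string;
}

// Left-pad a non-negative integer with zeros to the given width.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// Convert seconds to an SRT timestamp (HH:MM:SS,mmm).
function toSrtTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Render segments as SRT cues: 1-based index, timing line, text,
// with one blank line separating consecutive cues.
function toSrt(segments: Segment[]): string {
  return segments
    .map(
      (seg, i) =>
        `${i + 1}\n${toSrtTimestamp(seg.start)} --> ${toSrtTimestamp(seg.end)}\n${seg.text.trim()}\n`
    )
    .join("\n");
}
```

Rounding to whole milliseconds before splitting into fields avoids carry errors such as a `1000` in the millisecond field.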


Processing Pipeline
The application runs a five-step pipeline: URL Input → Audio Extraction → AI Transcription → Translation → SRT Output. Progress is streamed to the client while the pipeline runs, so the user sees live status updates from the first step to the last.
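
One way to picture the progress updates is as a small event emitted whenever a pipeline step completes. The sketch below is an assumption about how such events could be modeled; the `PipelineProgress` shape, the function names, and the newline-delimited JSON encoding are hypothetical and may differ from what the project actually streams.

```typescript
// The five pipeline steps, in order, as named in the description above.
const STEPS = [
  "URL Input",
  "Audio Extraction",
  "AI Transcription",
  "Translation",
  "SRT Output",
] as const;

type Step = (typeof STEPS)[number];

// Hypothetical progress event sent to the client after a step completes.
interface PipelineProgress {
  step: Step;
  index: number;   // 1-based position of the completed step
  total: number;   // total number of steps
  percent: number; // overall completion, rounded to a whole percent
}

// Build the progress event for the step at the given 0-based index.
function progressFor(index: number): PipelineProgress {
  if (index < 0 || index >= STEPS.length) {
    throw new RangeError(`step index out of range: ${index}`);
  }
  return {
    step: STEPS[index],
    index: index + 1,
    total: STEPS.length,
    percent: Math.round(((index + 1) / STEPS.length) * 100),
  };
}

// Encode an event as one newline-delimited JSON line (an assumed wire format),
// so a client can split the streamed response on "\n" and parse each line.
function encodeProgress(p: PipelineProgress): string {
  return JSON.stringify(p) + "\n";
}
```

A server handler could write `encodeProgress(progressFor(i))` to the response after each step, and the client could update a progress bar as each line arrives.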

Use Cases & Applications
- Content Creators: Add professional Japanese subtitles to vlogs, tutorials, and educational content
- Language Learning: Generate accurate transcriptions and translations for study materials and comprehension practice
- Accessibility: Create subtitles for hearing-impaired viewers to improve content accessibility
- Documentation: Convert video content to searchable, indexed text for reference and archival purposes
- Translation Services: Professional Japanese localization for international content distribution

Technical Innovation
This platform integrates several AI services into a single web application: yt-dlp extracts the audio, OpenAI's Whisper transcribes it, and GPT translates the transcript into Japanese, with progress streamed to the browser in real time. The project demonstrates full-stack development and the orchestration of multiple AI services.
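
To make the translation step more concrete, here is a minimal sketch of how a context-aware, formal-tone request could be assembled as chat messages. The prompt wording, the `buildTranslationMessages` helper, and the `contextSize` parameter are all hypothetical; the project's actual prompts are not documented here, and no API call is made.

```typescript
// Message shape used by chat-style completion APIs (role plus text content).
interface ChatMessage {
  role: "system" | "user";
  content: string;
}

// Hypothetical helper: build the messages for translating one subtitle line.
// A few preceding lines are included so the model can resolve pronouns and
// topic references; the system message asks for a formal tone.
function buildTranslationMessages(
  line: string,
  previousLines: string[] = [],
  contextSize: number = 2
): ChatMessage[] {
  // Guard against slice(-0), which would return every previous line.
  const context = contextSize > 0 ? previousLines.slice(-contextSize) : [];
  const system =
    "You translate video subtitles into Japanese. " +
    "Use a formal, polite tone (です/ます style). " +
    "Return only the translated line.";
  const contextBlock =
    context.length > 0
      ? `Previous lines, for context only:\n${context.join("\n")}\n\n`
      : "";
  return [
    { role: "system", content: system },
    { role: "user", content: `${contextBlock}Translate this line:\n${line}` },
  ];
}
```

Translating line by line keeps each result aligned with its original SRT timing, while the short context window is one way to retain some continuity between lines.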