Tasnimul Mohammad Fahim

VlogTranslator 2024

AI-powered web application that automatically transcribes YouTube videos with OpenAI Whisper and translates the transcript into Japanese with OpenAI GPT. The result is exported as subtitles, making video content accessible across the language barrier.

Repository: https://github.com/insertfahim/vlog-translator
Platform: Web (Next.js) / Desktop / Mobile
Frontend Stack: Next.js 13+, TypeScript, Stitches CSS, Radix UI
Backend Stack: Next.js API Routes, Node.js, Python, Shell Scripts
AI Services: OpenAI Whisper API, OpenAI GPT API, yt-dlp
Output Format: SRT Subtitles, Real-time Progress Streaming

Core Features

YouTube Integration: Processes a YouTube video URL directly and extracts the audio track with yt-dlp
AI Transcription: Speech-to-text conversion of the extracted audio using the OpenAI Whisper API
Smart Translation: Context-aware Japanese translation with formal tone using OpenAI GPT API
Real-time Processing: Live progress tracking with streaming responses and user-friendly status updates
SRT Export: Standard subtitle format output ready for video editing and professional use
Responsive Design: Modern UI that works seamlessly on desktop, tablet, and mobile devices
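
As an illustration of the SRT Export feature, the following is a minimal sketch of how timed transcript segments could be serialized into the SRT subtitle format. The `Segment` shape and the function names are assumptions made for this example and are not taken from the repository; only the SRT layout itself (a sequence number, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line, then the text) is standard.

```typescript
// Hypothetical segment shape; the project's actual types may differ.
interface Segment {
  start: number; // start time in seconds
  end: number;   // end time in seconds
  text: string;  // translated subtitle text
}

// Left-pad a non-negative integer with zeros to a fixed width.
function pad(value: number, width: number): string {
  return String(value).padStart(width, "0");
}

// Convert seconds to the SRT timestamp form HH:MM:SS,mmm.
function formatSrtTimestamp(seconds: number): string {
  // Work in whole milliseconds to avoid floating-point drift.
  const totalMs = Math.max(0, Math.round(seconds * 1000));
  const ms = totalMs % 1000;
  const totalSeconds = Math.floor(totalMs / 1000);
  const s = totalSeconds % 60;
  const m = Math.floor(totalSeconds / 60) % 60;
  const h = Math.floor(totalSeconds / 3600);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Serialize segments as SRT cues: index, timing line, text, blank separator.
function toSrt(segments: Segment[]): string {
  return segments
    .map((seg, i) => {
      const timing = `${formatSrtTimestamp(seg.start)} --> ${formatSrtTimestamp(seg.end)}`;
      return `${i + 1}\n${timing}\n${seg.text.trim()}\n`;
    })
    .join("\n");
}
```

Calling `toSrt` on the translated segments would yield text that can be saved with a `.srt` extension and loaded into a video editor or player.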

Processing Pipeline

The application follows a 5-step processing pipeline: URL Input → Audio Extraction → AI Transcription → Translation → SRT Output. The steps run in sequence, each consuming the output of the previous one, and progress updates are streamed to the user in real time throughout the process.
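
The pipeline described above can be sketched as a sequential runner that reports progress before each stage. This is an illustrative outline only: the stage names come from the description, while `Handlers`, `runPipeline`, and the `onProgress` callback are hypothetical names, and each handler stands in for the real work (yt-dlp, Whisper, GPT) that the application performs.

```typescript
// The five stages named in the pipeline description.
const STAGES = [
  "URL Input",
  "Audio Extraction",
  "AI Transcription",
  "Translation",
  "SRT Output",
] as const;

type Stage = (typeof STAGES)[number];

// Progress event emitted before each stage starts.
interface Progress {
  stage: Stage;
  step: number;  // 1-based position of the stage
  total: number; // total number of stages
}

// Hypothetical per-stage handlers: each takes the previous stage's output.
type Handlers = Record<Stage, (input: string) => Promise<string>>;

// Run the stages in order, passing each result on to the next stage.
async function runPipeline(
  url: string,
  handlers: Handlers,
  onProgress: (progress: Progress) => void
): Promise<string> {
  let value = url;
  let step = 0;
  for (const stage of STAGES) {
    step += 1;
    onProgress({ stage, step, total: STAGES.length });
    value = await handlers[stage](value);
  }
  return value;
}
```

In a Next.js API route, the `onProgress` callback could write each event to a streamed response so the browser can update its status display; that wiring is left out here because it depends on how the route is set up.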

Use Cases & Applications

Content Creators: Add professional Japanese subtitles to vlogs, tutorials, and educational content
Language Learning: Generate accurate transcriptions and translations for study materials and comprehension practice
Accessibility: Create subtitles for hearing-impaired viewers to improve content accessibility
Documentation: Convert video content to searchable, indexed text for reference and archival purposes
Translation Services: Professional Japanese localization for international content distribution

Technical Innovation

This platform integrates multiple AI services within a modern web architecture. It combines OpenAI's Whisper for transcription and GPT for translation with yt-dlp audio extraction and real-time progress streaming, orchestrating these services end to end in a single full-stack Next.js application.
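
To make the translation step of this orchestration concrete, here is a minimal sketch of how a formal-tone Japanese translation request might be assembled as chat messages (a system instruction plus the transcript as user input). The prompt wording, the line-numbering scheme, and the name `buildTranslationMessages` are assumptions for illustration; the repository's actual prompt and request code may look different.

```typescript
// Chat message shape with a role and text content.
interface ChatMessage {
  role: "system" | "user";
  content: string;
}

// Build the messages for one translation request.
// Numbering the lines lets the response be matched back to subtitle cues.
function buildTranslationMessages(transcriptLines: string[]): ChatMessage[] {
  const numbered = transcriptLines
    .map((line, i) => `${i + 1}. ${line.trim()}`)
    .join("\n");
  return [
    {
      role: "system",
      // Assumed prompt wording; asks for a formal (です/ます) register.
      content:
        "You translate video transcripts into Japanese. " +
        "Use a formal, polite tone (です/ます style). " +
        "Return one translated line per input line and keep the numbering.",
    },
    { role: "user", content: numbered },
  ];
}
```

Sending the whole numbered transcript in one request, rather than line by line, gives the model surrounding context for each sentence, which is one plausible way to achieve the context-aware translation described above.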