I specialize in building advanced audio transcription systems and natural language processing (NLP) tools, with a focus on Persian/Farsi.
What I Can Build:
Audio Transcription Systems:
• Multi-engine transcription (OpenAI Whisper, Google Speech, custom models)
• GPU-accelerated processing (CUDA) with automatic CPU fallback
• Batch processing with real-time progress tracking
• Multiple input formats: MP3, WAV, M4A, FLAC, MP4, OGG, and more
• Multiple output formats: TXT, JSON (with precise timestamps), SRT subtitles
• High accuracy for Persian/Farsi, English, and 90+ other languages
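To give a sense of what the SRT output involves, here is a minimal sketch. It is illustrative only and is not code from a delivered project. It assumes each transcription segment arrives as a `(start_seconds, end_seconds, text)` tuple; the function names `format_srt_timestamp` and `segments_to_srt` are hypothetical.

```python
def format_srt_timestamp(seconds: float) -> str:
    """Convert a time in seconds to the SRT form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Render (start, end, text) tuples as SRT subtitle text.

    The tuple layout is an assumption made for this sketch; real engines
    return their own segment objects, which would need adapting first.
    """
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_srt_timestamp(start)} --> {format_srt_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # Subtitle blocks are separated by a blank line.
    return "\n".join(blocks)
```

Calling `segments_to_srt([(0.0, 1.5, "سلام")])` yields a single numbered block with the time range `00:00:00,000 --> 00:00:01,500`.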
Persian/Farsi NLP Tools:
• Text normalization and cleaning (Hazm library)
• Tokenization and lemmatization
• Persian text processing pipelines
• Audio-to-Persian-text with proper formatting
• Custom language models for Persian content
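As a rough illustration of what text normalization means for Persian, the sketch below maps a few Arabic code points that often leak into Persian text onto their Persian equivalents. It is a simplified, standard-library-only stand-in and does not reproduce what Hazm does internally; the function name `normalize_persian` and the choice of rules are assumptions for this example.

```python
import re

# Arabic code points that commonly appear in Persian text, mapped to Persian forms.
_PERSIAN_MAP = {
    0x064A: 0x06CC,  # Arabic yeh  -> Persian yeh
    0x0643: 0x06A9,  # Arabic kaf  -> Persian keheh
}
# Arabic-Indic digits (U+0660..U+0669) -> Persian digits (U+06F0..U+06F9).
_PERSIAN_MAP.update({0x0660 + i: 0x06F0 + i for i in range(10)})


def normalize_persian(text: str) -> str:
    """Unify look-alike letters and digits, then collapse runs of whitespace.

    Only spaces, tabs, and line breaks are collapsed, so the zero-width
    non-joiner (U+200C) used inside Persian words is left untouched.
    """
    text = text.translate(_PERSIAN_MAP)
    return re.sub(r"[ \t\r\n]+", " ", text).strip()
```

A production pipeline would normally layer a library such as Hazm on top of rules like these for tokenization and lemmatization.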
Recent Project:
Built a production-ready Persian Audio Transcriber featuring:
• 3 transcription engines (Faster-Whisper, OpenAI Whisper, Google Speech)
• GPU acceleration with automatic CPU fallback
• Parallel batch processing
• Persian text normalization
• SRT subtitle generation with precise timestamps
• Support for 10+ audio/video formats
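The GPU fallback and parallel batch processing listed above could look roughly like the following. This is a simplified sketch, not the project's actual code: `pick_device`, `transcribe_batch`, and the `transcribe_one` callback are hypothetical names, and the thread pool is just one reasonable way to run files in parallel.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so only CPU processing is possible.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def transcribe_batch(paths, transcribe_one, max_workers=4, on_progress=None):
    """Run transcribe_one(path) over many files and report progress.

    transcribe_one is a caller-supplied function (an assumption of this
    sketch). A failure on one file is stored as the exception object in
    the results instead of aborting the whole batch.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(transcribe_one, p): p for p in paths}
        for done, future in enumerate(as_completed(futures), start=1):
            path = futures[future]
            try:
                results[path] = future.result()
            except Exception as exc:
                results[path] = exc
            if on_progress:
                on_progress(done, len(futures))
    return results
```

A caller might pass `on_progress=lambda done, total: print(f"{done}/{total}")` to get a simple live counter, and use the value from `pick_device()` when loading the model.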
Tech Stack:
Python, OpenAI Whisper, Faster-Whisper, Google Speech API, CUDA, Hazm, FFmpeg, PyTorch
How It Works:
1. You describe your transcription/NLP requirements
2. I design the system architecture and workflow
3. I implement and optimize the solution (GPU acceleration, error handling)
4. I test thoroughly with your sample data
5. You receive working code, documentation, a setup guide, and an optional walkthrough
You Provide:
• Clear description of your use case
• Sample audio files (for transcription projects)
• Sample text data (for NLP projects)
• Preferred languages and output formats
• Any specific requirements or edge cases
Deliverables:
✓ Production-ready Python code with proper structure
✓ Comprehensive setup and usage documentation
✓ Performance-optimized implementation
✓ Error handling, logging, and progress tracking
✓ Requirements file and installation instructions
✓ Optional: video walkthrough or live demo
Timeline:
• Basic transcription script: 3–4 days
• Advanced multi-engine system: 5–7 days
• Custom NLP pipeline: 4–6 days
Payment: USDT/USDC/ETH via LaborX escrow.