I will build custom audio transcription and NLP tools

I specialize in building advanced audio transcription systems and natural language processing (NLP) tools, especially for Persian/Farsi.

 

What I Can Build:

 

Audio Transcription Systems:

• Multi-engine transcription (Whisper AI, Google Speech, custom models)

• GPU-accelerated processing (CUDA) with automatic CPU fallback

• Batch processing with real-time progress tracking

• Multiple input formats: MP3, WAV, M4A, FLAC, MP4, OGG, and more

• Multiple output formats: TXT, JSON (with precise timestamps), SRT subtitles

• High accuracy for Persian/Farsi and English, with support for 90+ other languages
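
To illustrate the SRT output with timestamps listed above, here is a minimal sketch of how transcribed segments could be rendered as subtitle cues. The `Segment` structure and the function names are hypothetical, chosen for this example only; a real project would adapt them to whatever the chosen transcription engine returns.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed span; times are in seconds (hypothetical structure)."""
    start: float
    end: float
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    return "\n".join(cues)
```

Working in integer milliseconds avoids floating-point drift when splitting a time into hours, minutes, and seconds.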

 

Persian/Farsi NLP Tools:

• Text normalization and cleaning (Hazm library)

• Tokenization and lemmatization

• Persian text processing pipelines

• Audio-to-Persian-text with proper formatting

• Custom language models for Persian content
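
As a rough idea of what Persian text normalization involves, the sketch below applies a few common clean-up rules using only the Python standard library. It is a simplified stand-in, not the Hazm implementation: the rule set, the `normalize_persian` name, and the `persian_digits` option are assumptions made for this example.

```python
import re

# Arabic letters that are commonly replaced by their Persian equivalents.
CHAR_MAP = str.maketrans({
    "\u064a": "\u06cc",  # Arabic yeh -> Persian yeh
    "\u0643": "\u06a9",  # Arabic kaf -> Persian kaf
})

# Western (0-9) and Arabic-Indic (U+0660..U+0669) digits -> Persian digits.
_PERSIAN_DIGITS = "".join(chr(0x06F0 + i) for i in range(10))
_ARABIC_DIGITS = "".join(chr(0x0660 + i) for i in range(10))
DIGIT_MAP = str.maketrans("0123456789" + _ARABIC_DIGITS, _PERSIAN_DIGITS * 2)

# Short-vowel diacritics (U+064B..U+0652) and tatweel (U+0640).
DIACRITICS = re.compile("[\u064b-\u0652\u0640]")


def normalize_persian(text: str, persian_digits: bool = True) -> str:
    """Apply a small, illustrative set of Persian normalization rules."""
    text = text.translate(CHAR_MAP)
    text = DIACRITICS.sub("", text)
    if persian_digits:
        text = text.translate(DIGIT_MAP)
    # Collapse runs of whitespace; the zero-width non-joiner is left intact.
    return re.sub(r"\s+", " ", text).strip()
```

A production pipeline would typically delegate this step to a dedicated library such as Hazm, which covers many more cases (spacing around affixes, punctuation, and so on).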

 

Recent Project:

Built a production-ready Persian Audio Transcriber featuring:

• 3 transcription engines (Faster-Whisper, OpenAI Whisper, Google Speech)

• GPU acceleration with automatic CPU fallback

• Parallel batch processing

• Persian text normalization

• SRT subtitle generation with precise timestamps

• Support for 10+ audio/video formats
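
The combination of several engines and automatic CPU fallback can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the `Engine` callable signature, `pick_device`, and `transcribe_with_fallback` are hypothetical names, and trying the engines in order until one succeeds is one possible design among several.

```python
from __future__ import annotations

from typing import Callable, Optional, Sequence

# Hypothetical engine interface: (audio_path, device) -> transcript text.
Engine = Callable[[str, str], str]


def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def transcribe_with_fallback(
    audio_path: str,
    engines: Sequence[tuple[str, Engine]],
    device: Optional[str] = None,
) -> tuple[str, str]:
    """Try each engine in order; return (engine_name, transcript)."""
    device = device or pick_device()
    errors = []
    for name, engine in engines:
        try:
            return name, engine(audio_path, device)
        except Exception as exc:  # record the failure and try the next engine
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all engines failed: " + "; ".join(errors))
```

Each real engine (Faster-Whisper, OpenAI Whisper, Google Speech) would be wrapped in a small function matching the `Engine` signature, so the fallback logic stays independent of any single library.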

 

Tech Stack:

Python, OpenAI Whisper, Faster-Whisper, Google Speech API, CUDA, Hazm, FFmpeg, PyTorch

 

How It Works:

1. You describe your transcription/NLP requirements

2. I design the system architecture and workflow

3. I implement and optimize the solution (GPU acceleration, error handling)

4. I test thoroughly with your sample data

5. You receive working code, documentation, a setup guide, and an optional walkthrough

 

You Provide:

• Clear description of your use case

• Sample audio files (for transcription projects)

• Sample text data (for NLP projects)

• Preferred languages and output formats

• Any specific requirements or edge cases

 

Deliverables:

✓ Production-ready Python code with proper structure

✓ Comprehensive setup and usage documentation

✓ Performance-optimized implementation

✓ Error handling, logging, and progress tracking

✓ Requirements file and installation instructions

✓ Optional: video walkthrough or live demo

 

Timeline:

• Basic transcription script: 3–4 days

• Advanced multi-engine system: 5–7 days

• Custom NLP pipeline: 4–6 days

 

Payment: USDT/USDC/ETH via LaborX escrow.

 

Terms of work
80
ETH, USDT, TIME
+53

More Gigs from Amir Aeiny
