I build AI assistants that answer questions from your documents – with source citations, answers grounded in your content rather than hallucinated, and predictable LLM costs.
Who this is for:
– SaaS companies wanting an in-product AI helper trained on their docs
– Web3 projects with technical documentation that users keep asking about in Discord
– Internal teams replacing "ask the senior" with "ask the bot"
– Anyone tired of generic ChatGPT that doesn't know their domain
What I deliver in the base package:
A working Telegram bot that answers questions from up to 50 pages of your documents. Each answer includes citations to source files. Token usage tracking per request. Source code, deployment instructions, and a runbook for adding new documents.
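The answer-with-citations and per-request token tracking can be sketched as below. `Answer`, `format_reply`, and `UsageLog` are illustrative names, not the shipped API; the real deliverable wires this shape into an aiogram handler.

```python
from dataclasses import dataclass, field

@dataclass
class Answer:
    text: str
    sources: list[str]   # file paths the answer was drawn from
    input_tokens: int
    output_tokens: int

def format_reply(ans: Answer) -> str:
    """Render the bot reply with numbered source citations appended."""
    cites = "\n".join(f"[{i + 1}] {src}" for i, src in enumerate(ans.sources))
    return f"{ans.text}\n\nSources:\n{cites}"

@dataclass
class UsageLog:
    """Per-request token accounting: one row per answered question."""
    rows: list[tuple[int, int]] = field(default_factory=list)

    def record(self, ans: Answer) -> None:
        self.rows.append((ans.input_tokens, ans.output_tokens))

    def totals(self) -> tuple[int, int]:
        """Total (input, output) tokens across all logged requests."""
        return (sum(r[0] for r in self.rows), sum(r[1] for r in self.rows))
```

In production the usage rows land in PostgreSQL rather than an in-memory list, which is what makes per-user and per-day cost reporting possible later.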
Add-ons available:
– Knowledge base expansion (500+ pages with optimized retrieval)
– Parsing your website, Notion, or Confluence as a live source
– Web widget instead of (or in addition to) Telegram
– Voice input via Whisper
– Cross-session memory per user
– Admin panel for non-technical content updates
– Cost dashboard: spend per user, per topic, per day
Why this isn't another generic RAG bot:
I use a hybrid architecture inspired by Karpathy's LLM Wiki pattern (April 2026). The first layer is a structured markdown knowledge base where information accumulates and gets refined over time. The second layer is vector search for large archives. In practice this costs noticeably less per query than pure RAG, and it produces more coherent answers, because the LLM works with curated, structured knowledge rather than disconnected retrieved chunks.
How I work:
1. We discuss scope: what documents, who users are, what failure modes matter.
2. I analyze your content and propose the architecture (single or hybrid).
3. Prototype on a subset of your data – you validate quality.
4. Full deployment with your documents loaded.
5. Source code, README, runbook.
Portfolio reference:
Long-running production AI assistant on a closed knowledge base. Python async, agentic tool-use loops, prompt-caching for cost control, two-tier model routing (fast classifier + heavy answerer). Git-versioned markdown vault as the knowledge layer, with ARQ for heavy background work.
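The two-tier model routing mentioned above can be illustrated as follows. The model ids are placeholders, and a toy heuristic stands in for the real fast-model classification call.

```python
FAST_MODEL = "claude-haiku"    # placeholder id for the cheap tier
HEAVY_MODEL = "claude-sonnet"  # placeholder id for the answering tier

def classify_difficulty(question: str) -> str:
    """Stand-in for the fast-classifier call: the real system asks a
    small LLM; here a keyword heuristic illustrates the decision."""
    hard_markers = ("compare", "why", "architecture", "trade-off")
    return "complex" if any(m in question.lower() for m in hard_markers) else "simple"

def route(question: str) -> str:
    """Pick which model tier answers this question."""
    return HEAVY_MODEL if classify_difficulty(question) == "complex" else FAST_MODEL
```

Routing most traffic to the cheap tier, plus prompt caching on the shared system context, is what keeps the per-query cost flat as usage grows.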
Stack:
Python 3.11+ async, Anthropic Claude, OpenAI, OpenRouter. SQLAlchemy 2.0, PostgreSQL, Redis, ARQ. Aiogram for Telegram. Structlog for observability. Docker, systemd.
Important about LLM costs:
You pay OpenAI/Anthropic directly with your own API key. Typical monthly cost for a moderately used bot: $5-30. The hybrid architecture reduces this further by avoiding redundant context loading.
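For a back-of-envelope estimate of that monthly figure, a sketch like the following works. The per-million-token prices are assumptions for illustration only; real prices vary by provider and model and change over time.

```python
PRICE_IN_PER_MTOK = 3.00    # assumed $ per 1M input tokens
PRICE_OUT_PER_MTOK = 15.00  # assumed $ per 1M output tokens

def monthly_cost(questions_per_day: int, in_tokens_per_q: int,
                 out_tokens_per_q: int, days: int = 30) -> float:
    """Estimated monthly API spend in dollars."""
    tok_in = questions_per_day * in_tokens_per_q * days
    tok_out = questions_per_day * out_tokens_per_q * days
    return tok_in / 1e6 * PRICE_IN_PER_MTOK + tok_out / 1e6 * PRICE_OUT_PER_MTOK
```

At 20 questions a day with roughly 2,000 input and 300 output tokens each, this lands around $6 per month, inside the quoted range; the hybrid architecture mainly shrinks the input-token term.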
Base delivery: 3-5 days. Complex setups with live source parsing: 1-2 weeks.
Payment: USDT, USDC, ETH, DAI, or other stablecoins via LaborX escrow. NDA work welcome.