The goal: a chatbot that actually does things
Most free chatbots are wrappers around a single LLM call. You type something, the model generates a response, and that’s it. No real data. No live information. No ability to generate images, solve math symbolically, or build you a working game.
The goal with helloandy.net’s AI Chat was different: build a chatbot that genuinely does useful things across many domains — weather, code, math, research, image generation, currency conversion, trivia — and make it completely free. No account, no API key, no credit card.
The result is a 16-mode chatbot powered by 18 free APIs with a smart LLM router that automatically detects what you’re asking and routes your question to the right mode. You don’t pick modes manually. Just ask naturally and the system figures it out.
How the smart router works
At the core is an LLM-powered query classifier. When you send a message, the router reads your input and classifies it into one of 16 categories with a single fast LLM call. The router uses zero-temperature inference for consistency — the same question should always route the same way.
The router prompt contains detailed classification rules with edge cases. For example, “how many feet in a mile?” routes to calculate (not lookup), while “what does HN think about X?” routes to research (not news). These distinctions matter because each mode has different API calls, prompt engineering, and output formats.
After classification, the system dispatches to the appropriate handler. Some modes are direct LLM calls. Others involve multi-step pipelines with external API calls, data enrichment, and synthesis. Here’s every mode in detail.
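The classify-then-dispatch step can be sketched roughly like this. Everything here is illustrative rather than the actual implementation: the mode list is partial, and `call_llm`, `ROUTER_PROMPT`, and the fallback behavior are assumptions.

```python
# Hypothetical sketch of the routing step. MODES is abbreviated and
# call_llm is a stand-in for the real OpenRouter client.
MODES = [
    "research", "news", "lookup", "calculate", "weather", "image",
    "qr", "data", "currency", "word", "trivia", "chat",
]

ROUTER_PROMPT = (
    "Classify the user message into exactly one of these modes: "
    + ", ".join(MODES)
    + ". Reply with the mode name only.\n\nMessage: {message}"
)

def route(message: str, call_llm) -> str:
    """One fast LLM call at temperature 0, so the same question
    always routes the same way."""
    raw = call_llm(ROUTER_PROMPT.format(message=message), temperature=0)
    mode = raw.strip().lower()
    return mode if mode in MODES else "chat"  # safe fallback
```

The temperature-0 call plus a closed label set is what makes routing deterministic; the fallback to a general chat mode is a guess at how unparseable classifier output would be handled.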
All 16 modes explained
The research engine: where depth matters
Research mode is the most complex and capable mode. It’s what sets this chatbot apart from simple LLM wrappers. Here’s the full pipeline:
- Query expansion — Your question is expanded into 3 semantically diverse sub-queries, with Jaccard similarity used to filter out near-duplicate phrasings. This broadens source coverage beyond what a single search would return.
- Source detection — The system analyzes your query to determine which APIs to search. Academic questions hit arXiv and Semantic Scholar. GitHub questions search repositories. Community-opinion questions go to Hacker News.
- Multi-source search — Searches run in parallel across up to 10 sources: DuckDuckGo, Wikipedia, arXiv, Semantic Scholar, Crossref, GitHub, Hacker News, Open Library, Wikidata, and World Bank.
- Credibility scoring — Every source gets a credibility score. Academic papers (arXiv, Semantic Scholar) score highest. Government and .edu domains rank above commercial sites. Wikipedia is high but not top.
- Forced-attribution synthesis — The top sources are extracted into structured fact blocks. The LLM must cite every single fact — there is no way to write an uncited claim. This produces 7+ real citations in every answer.
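The diversity filter in the query-expansion step could work along these lines. This is a sketch under assumptions: token-level Jaccard similarity and a 0.6 threshold are guesses, not the documented values.

```python
def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two queries."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0

def diverse_subqueries(candidates, threshold=0.6, limit=3):
    """Keep candidates that are sufficiently dissimilar from every
    sub-query already accepted, stopping at the target count."""
    kept = []
    for q in candidates:
        if all(jaccard(q, k) < threshold for k in kept):
            kept.append(q)
        if len(kept) == limit:
            break
    return kept
```

A reworded duplicate like "memory safety in rust" shares most tokens with "rust memory safety" and gets dropped, while a genuinely different angle survives.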
The result: research answers that cite real papers, real data, and real community discussions. Not hallucinated references — verified, clickable sources from authoritative databases.
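The credibility tiers described above (academic databases at the top, then .gov/.edu, then Wikipedia, then commercial domains) might be scored like this. The specific weights are invented for illustration; only the ordering comes from the text.

```python
from urllib.parse import urlparse

# Illustrative tiers only -- the real weights are internal.
SOURCE_SCORES = {
    "arxiv.org": 0.95, "semanticscholar.org": 0.95,
    "wikipedia.org": 0.8, "news.ycombinator.com": 0.6,
}

def credibility(url: str) -> float:
    host = urlparse(url).netloc.lower()
    for domain, score in SOURCE_SCORES.items():
        if host == domain or host.endswith("." + domain):
            return score
    if host.endswith(".gov") or host.endswith(".edu"):
        return 0.9  # above commercial sites, below academic databases
    return 0.5      # default for commercial domains
```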
Quality engineering: the ratchet process
Building a chatbot is easy. Making it consistently good is hard. The quality of this chatbot was systematically improved through an iterative process called the ratchet — a series of measured improvements where each iteration must score higher than the last.
The evaluation framework tests 10 diverse questions across 5 runs each, producing 50 data points per iteration. Every answer is scored on:
- Source quality — Are the cited sources authoritative and relevant?
- Citation density — How many real citations per paragraph?
- Completeness — Does the answer cover the question fully?
- Accuracy — Are the claims factually correct?
- Domain diversity — Are sources from multiple domains, not just Wikipedia?
Over 46 iterations, the median composite score improved from 5.2 to 8.12 — a 56% improvement through systematic source injection, citation engineering, prompt architecture, and synthesis pipeline redesign.
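The reduction from 50 data points to one ratchet score might look like the following sketch. Equal weighting of the five dimensions is an assumption; the article only states that a median composite is compared across iterations.

```python
from statistics import median

DIMENSIONS = ("source_quality", "citation_density",
              "completeness", "accuracy", "domain_diversity")

def composite(scores: dict) -> float:
    """Equal-weight mean of the five dimensions (weights assumed)."""
    return sum(scores[d] for d in DIMENSIONS) / len(DIMENSIONS)

def iteration_score(runs) -> float:
    """runs: one score dict per (question, run) pair --
    10 questions x 5 runs = 50 dicts per iteration. The ratchet
    accepts a change only if this median beats the previous one."""
    return median(composite(r) for r in runs)
```

The median (rather than the mean) keeps a single outlier run from accepting or rejecting an iteration.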
The 18 free APIs
Every API used by this chatbot is completely free. No API keys required for most — only OpenRouter (for LLM inference) and FRED (for economic data) need keys, and both offer free tiers.
| # | Service | Purpose | Used By |
|---|---|---|---|
| 1 | OpenRouter | LLM inference (all modes) | All modes |
| 2 | wttr.in | Real-time weather data | Weather |
| 3 | Pollinations.ai | AI image generation (FLUX) | Image |
| 4 | QR Server | QR code generation | QR |
| 5 | FRED | Federal Reserve economic data | Data |
| 6 | Frankfurter | ECB currency exchange rates | Currency |
| 7 | Free Dictionary | Word definitions | Word |
| 8 | Open Trivia DB | Trivia questions | Trivia |
| 9 | DuckDuckGo | Web search | Research, News, Lookup |
| 10 | Wikipedia | Encyclopedia | Research, Lookup |
| 11 | Hacker News | Tech community | Research, News |
| 12 | arXiv | Academic papers | Research |
| 13 | Semantic Scholar | Academic search | Research |
| 14 | Crossref | DOI/citation lookup | Research |
| 15 | GitHub | Repository search | Research |
| 16 | Open Library | Book search | Research |
| 17 | Wikidata | Structured knowledge | Research |
| 18 | World Bank | Development data | Research |
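Most of these services are plain HTTPS endpoints. As a sketch, here is how requests to two of the keyless APIs are constructed — wttr.in's `?format=j1` JSON endpoint and Frankfurter's `/latest` endpoint are both public, documented interfaces (the helper names are mine, not the chatbot's):

```python
from urllib.parse import quote, urlencode

def weather_url(city: str) -> str:
    """wttr.in returns full weather JSON with ?format=j1."""
    return f"https://wttr.in/{quote(city)}?format=j1"

def currency_url(amount: int, base: str, target: str) -> str:
    """Frankfurter converts via ECB reference rates."""
    query = urlencode({"amount": amount, "from": base, "to": target})
    return f"https://api.frankfurter.app/latest?{query}"
```

A GET on either URL with any HTTP client returns JSON — no key, no account, no rate-limit header dance.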
Bring your own key (BYOK)
The chatbot works out of the box with no configuration — it uses a shared free-tier OpenRouter API key. But if you want faster responses and no rate limiting, you can bring your own OpenRouter key.
Click the settings icon in the chat interface, paste your API key, and it’s stored locally in your browser. Your key is sent directly to OpenRouter — it’s never stored on the server. The free model (arcee-ai/trinity-large-preview) still costs nothing even with your own key, but you bypass the shared 15-requests-per-hour limit.
Format validation and self-correction
Several modes include a format-checker pipeline that validates LLM output structure and retries if needed. The calculate mode checks for the required Expression → Steps → Answer format. The HTML and game modes run a full lint pass — checking for matched tags, required elements, valid JavaScript, and responsive design patterns. If validation fails, the system provides specific error feedback and regenerates.
This means the chatbot can catch and fix its own mistakes before you see them. A game that crashes on load gets rebuilt. A calculation missing its answer line gets reformatted. Self-correction produces noticeably better output quality than raw single-pass generation.
What makes this different
There are thousands of AI chatbot wrappers. Here’s what this one does differently:
- Real data, not fabrication. Weather comes from wttr.in. Currency rates from the ECB. Economic data from the Federal Reserve. Research cites real papers from arXiv. Nothing is hallucinated.
- Automatic mode selection. You don’t pick modes from a dropdown. The smart router classifies your intent and dispatches to the right handler. It works for 99%+ of queries.
- Quality-engineered research. 46 iterations of measured improvement. Forced citation architecture that makes every claim traceable to a real source.
- Completely free. No account, no credit card, no freemium upsell. The same chatbot for everyone.
- Open harness. The evaluation harness is open source, with 10 question types, 5-run averaging, and composite scoring across source quality, citation density, and completeness.
helloandy.net provides free AI tools and tutorials for developers. No account required.