I've been building AI agent tools using the OpenRouter free API tier for the past few weeks. Here's what actually works — with working code examples and links to tools I shipped using these exact techniques.
The short version: 28 free models, 1,000 requests/day if you add $10 in credits, and one model that's clearly the workhorse: arcee-ai/trinity-large-preview:free.
---
First: The Key Setup
import requests

OPENROUTER_KEY = "sk-or-v1-your-key-here"
ENDPOINT = "https://openrouter.ai/api/v1/chat/completions"

def call(prompt, model="arcee-ai/trinity-large-preview:free", max_tokens=500):
    r = requests.post(ENDPOINT, json={
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }, headers={
        "Authorization": f"Bearer {OPENROUTER_KEY}",
        "HTTP-Referer": "https://yoursite.com",
    })
    msg = r.json()["choices"][0]["message"]
    # Handle thinking models (content=None, output in reasoning field)
    return msg.get("content") or msg.get("reasoning", "")
The content or reasoning fallback is *essential*. Several free models return content=None and put their output in the reasoning field. If you don't handle this, those models silently return empty strings.
---
10 Things to Build
1. AI Text Humanizer
Take AI-generated text and rewrite it to sound more natural. This is what I built for helloandy.net/humanizer.
The key insight: don't ask the model to "remove AI patterns." Ask it to *rewrite for rhythm and sentence variance*. The difference in output quality is significant.
HUMANIZE_PROMPT = """Rewrite this text to sound more natural.
Focus on:
- Varying sentence length (mix short punchy sentences with longer ones)
- Using active voice
- Starting sentences with different words
- Replacing vague intensifiers with specific language
Text: {text}
Return only the rewritten text."""
Three passes of this, each with a slightly different focus (rhythm, vocabulary, structure), produces noticeably better output than a single pass.
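The three-pass loop can be sketched like this. The pass focuses below are illustrative, and `call_fn` is the `call` helper from the setup section, injected as a parameter so the loop is easy to test:

```python
PASS_FOCUSES = [
    "Vary sentence length: mix short punchy sentences with longer ones.",
    "Swap vague intensifiers and stock phrases for specific vocabulary.",
    "Restructure: change which word each sentence starts with.",
]

def humanize(text, call_fn, focuses=PASS_FOCUSES):
    """Run one rewrite pass per focus, feeding each result into the next."""
    for focus in focuses:
        prompt = (
            "Rewrite this text to sound more natural.\n"
            f"Focus on: {focus}\n\n"
            f"Text: {text}\n\n"
            "Return only the rewritten text."
        )
        text = call_fn(prompt)
    return text
```

Each pass sees the previous pass's output, which is why the focuses are ordered from sentence-level to structural.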
2. JSON Extraction from Unstructured Text
Use response_format: json_object to reliably extract structured data:
def extract_json(text, schema_description):
    r = requests.post(ENDPOINT, json={
        "model": "arcee-ai/trinity-large-preview:free",
        "messages": [{
            "role": "user",
            "content": f"Extract this information as JSON: {schema_description}\n\nText: {text}"
        }],
        "response_format": {"type": "json_object"},
        "max_tokens": 500,
    }, headers={"Authorization": f"Bearer {OPENROUTER_KEY}"})
    return r.json()["choices"][0]["message"]["content"]
This is surprisingly reliable. I use it for extracting structured fields from email content, parsing requirements from natural language descriptions, and cleaning up messy data.
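Even with JSON mode on, I'd still parse defensively before trusting the output. A minimal sketch; the fence-stripping heuristics are my own assumption about occasional model behavior, not documented OpenRouter semantics:

```python
import json

def parse_extracted(raw):
    """Defensively parse model output: JSON mode is reliable, but models
    occasionally wrap the object in code fences or leading prose."""
    raw = raw.strip()
    # Strip markdown fences if present
    if raw.startswith("```"):
        raw = raw.strip("`")
        if raw.startswith("json"):
            raw = raw[4:]
    # Fall back to the first {...} span in the text
    start, end = raw.find("{"), raw.rfind("}")
    if start != -1 and end > start:
        raw = raw[start:end + 1]
    return json.loads(raw)
```

If `json.loads` still raises, that's your signal to retry the call rather than ship a half-parsed object downstream.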
3. Multi-Prompt Document Generator
Single prompts produce mediocre structured documents. A three-step pipeline produces publishable quality.
Here's the pattern I built for generating CLAUDE.md system prompts:
Step 1 → Extract requirements (temperature=0.3, max_tokens=500)
Given this agent description: "{desc}"
Return a JSON with: name, purpose, capabilities, constraints, environment
Step 2 → Generate document (temperature=0.6, max_tokens=1500)
Write a complete CLAUDE.md using these requirements: {json}
Include: identity, capabilities, iron laws, communication style, memory
Step 3 → Score and improve (temperature=0.5, max_tokens=1500)
This CLAUDE.md scored {score}/100. Weak areas: {dims}.
Rewrite to improve those specific sections.
Real results: single prompt = 56/100. Three-step pipeline = 85/100. The improvement pass alone adds ~27 points.
You can try the result at helloandy.net/claude-md-auditor.
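Wiring the three steps together might look like the sketch below. `call_fn` is the `call` helper from the setup, and `score_fn` stands in for whatever scorer you use (the auditor, in my case). Note the real pipeline also varies temperature per step, which the simple `call` helper doesn't expose:

```python
def generate_claude_md(desc, call_fn, score_fn):
    # Step 1: extract requirements as JSON
    reqs = call_fn(
        f'Given this agent description: "{desc}"\n'
        "Return a JSON with: name, purpose, capabilities, constraints, environment"
    )
    # Step 2: generate the full document from the requirements
    doc = call_fn(
        f"Write a complete CLAUDE.md using these requirements: {reqs}\n"
        "Include: identity, capabilities, iron laws, communication style, memory"
    )
    # Step 3: score it, then rewrite only the weak sections
    score, dims = score_fn(doc)
    if score < 85:
        doc = call_fn(
            f"This CLAUDE.md scored {score}/100. Weak areas: {dims}.\n"
            f"Rewrite to improve those specific sections.\n\n{doc}"
        )
    return doc
```

The 85 threshold is a knob: below it, you pay one extra API call for the improvement pass; above it, you ship the draft as-is.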
4. AI Writing Coach (Rule + LLM Hybrid)
Don't use an LLM for everything. For AI-writing detection and pattern-based rewrites, rule-based code is faster, cheaper, and more deterministic. Use the LLM only for the things rules can't do.
My AI Writing Coach uses:
- Rule-based detection: 36 AI vocabulary words, 16 structural patterns
- LLM rewrite: only for the final humanization pass
Cost per use: ~$0.00 for the detection (local), one API call for the rewrite.
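A toy version of the detection layer is below. The vocabulary and patterns here are a small illustrative sample of my own, not the real 36-word/16-pattern lists:

```python
import re

# Illustrative samples only; the production lists are much longer.
AI_VOCAB = {"delve", "tapestry", "furthermore", "moreover", "leverage"}
STRUCTURAL_PATTERNS = [
    r"(?i)\bin conclusion\b",
    r"(?i)\bit'?s (?:important|worth) (?:to note|noting)\b",
    r"(?i)\bnot only .{1,60}? but also\b",
]

def detect(text):
    """Return a list of (kind, match) hits. No API call needed."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = [("vocab", w) for w in words if w in AI_VOCAB]
    for pat in STRUCTURAL_PATTERNS:
        hits += [("pattern", m) for m in re.findall(pat, text)]
    return hits
```

If `detect` returns an empty list, you can skip the LLM rewrite entirely, which is where the cost savings come from.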
5. Skill/Prompt Template Generator
Generate SKILL.md files (agent skill definitions) from plain English descriptions:
SKILL_PROMPT = """Generate a SKILL.md for an AI agent skill.
Description: {desc}
Include YAML frontmatter with: name, description, trigger, model, version
Include sections: Description, TRIGGER WHEN, Output Format, Iron Laws, Examples
Make it specific and actionable."""
Test the output quality with the SKILL.md Linter — it scores frontmatter, trigger clarity, output format, iron laws, and examples.
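As a rough idea of what the frontmatter check involves, here's a toy linter. It assumes the SKILL.md starts with a `---`-delimited YAML block and only flags missing required keys; the real linter scores several more dimensions:

```python
import re

REQUIRED_KEYS = {"name", "description", "trigger", "model", "version"}

def lint_frontmatter(text):
    """Return the set of required keys missing from the YAML frontmatter."""
    m = re.match(r"^---\n(.*?)\n---", text, re.DOTALL)
    if not m:
        return REQUIRED_KEYS  # no frontmatter block at all
    keys = {line.split(":", 1)[0].strip()
            for line in m.group(1).splitlines() if ":" in line}
    return REQUIRED_KEYS - keys
```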
6. Multi-Model Consensus Voting
For high-stakes outputs, run the same prompt on 3 models and take the consensus:
MODELS = [
    "arcee-ai/trinity-large-preview:free",
    "openai/gpt-oss-20b:free",
    "mistralai/mistral-small-3.1-24b-instruct:free",
]

def consensus_classify(text, categories):
    from collections import Counter
    results = []
    for model in MODELS:
        r = call(f"Classify this as one of {categories}. Reply with just the category name.\n\n{text}", model=model)
        results.append(r.strip().lower())
    return Counter(results).most_common(1)[0][0]
When 2 out of 3 models agree, you get much more reliable classification than any single model.
7. Long Document Summarizer with Context Overflow Handling
For documents larger than a model's context window, split and summarize recursively:
def summarize_long(text, chunk_size=4000):
    chunks = [text[i:i+chunk_size] for i in range(0, len(text), chunk_size)]
    # Summarize each chunk
    summaries = []
    for chunk in chunks:
        s = call(f"Summarize this section in 3 bullet points:\n\n{chunk}", max_tokens=200)
        summaries.append(s)
    # If still too long, recurse
    combined = "\n".join(summaries)
    if len(combined) > chunk_size:
        return summarize_long(combined, chunk_size)
    # Final synthesis
    return call(f"Synthesize these summaries into a coherent overview:\n\n{combined}", max_tokens=500)
For documents under 131K tokens, just use arcee-ai/trinity-large-preview:free directly. For million-token documents, use openrouter/hunter-alpha (1M context window, experimental).
8. API Documentation Generator
Feed raw endpoint code, get clean API docs:
DOC_PROMPT = """Generate API documentation for this endpoint.
Code: {code}
Return markdown with:
- Endpoint URL and method
- Request body (JSON schema with types and descriptions)
- Response format (JSON schema with examples)
- Error codes
- 1 curl example"""
I use this to auto-generate README sections for the APIs on helloandy.net. Saves 20 minutes per endpoint.
9. Conversational Data Extractor
Build a multi-turn pipeline that asks clarifying questions until it has enough to produce structured output:
import json

def extract_with_clarification(initial_input, target_schema, max_turns=3):
    context = initial_input
    for turn in range(max_turns):
        # Check if we have enough information yet
        check = call(
            f"Given this information, can you fill this schema completely? {target_schema}\n\nInfo: {context}\n\nIf yes, return the JSON. If no, return a single question to ask.",
            max_tokens=300
        )
        if check.strip().startswith("{"):
            return json.loads(check)
        # Got a question — in a real app, ask the user
        print(f"Need to know: {check}")
        # ... get the answer and append it to context
    # Best effort after max turns
    return call(f"Fill this schema with what you know: {target_schema}\nContext: {context}", max_tokens=500)
10. Automated Content Repurposer
Take a long article and generate: Twitter thread, LinkedIn post, email newsletter section, Mastodon toot — all in one pipeline:
FORMATS = {
    "twitter_thread": "Convert to a 5-tweet thread. Start with a hook. Number each tweet.",
    "linkedin": "Write a 3-paragraph LinkedIn post. Professional but personal tone.",
    "mastodon": "Write a 500-character Mastodon toot. Include relevant hashtags.",
    "email_snippet": "Write a 2-sentence email newsletter preview with a 'Read more' CTA.",
}

def repurpose(article, formats=None):
    formats = formats or list(FORMATS.keys())
    results = {}
    for fmt in formats:
        instruction = FORMATS[fmt]
        results[fmt] = call(f"{instruction}\n\nArticle:\n{article[:3000]}", max_tokens=400)
    return results
---
What I Learned (The Hard Parts)
*Thinking models are a silent failure mode.* Several free models return content=None. If your code does response["choices"][0]["message"]["content"] directly, you get None → crash or empty string. Always use content or reasoning.
*openrouter/free routes unpredictably.* It currently routes to thinking models. Don't use it in production — use explicit model names.
*Most popular models get 429'd constantly.* llama-3.3-70b, qwen3-coder, mistral-small — all frequently rate-limited. Build a fallback chain.
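A minimal fallback chain for exactly this, repeating the setup constants so the snippet stands alone. The model order and 429 handling are my own defaults, and `post` is injectable for testing:

```python
ENDPOINT = "https://openrouter.ai/api/v1/chat/completions"
OPENROUTER_KEY = "sk-or-v1-your-key-here"

FALLBACK_CHAIN = [
    "arcee-ai/trinity-large-preview:free",
    "mistralai/mistral-small-3.1-24b-instruct:free",
    "openai/gpt-oss-20b:free",
]

def call_with_fallback(prompt, models=FALLBACK_CHAIN, post=None):
    if post is None:
        import requests  # only needed when no post function is injected
        post = requests.post
    for model in models:
        r = post(ENDPOINT, json={
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        }, headers={"Authorization": f"Bearer {OPENROUTER_KEY}"})
        if r.status_code == 429:
            continue  # rate-limited: try the next model in the chain
        msg = r.json()["choices"][0]["message"]
        return msg.get("content") or msg.get("reasoning", "")
    raise RuntimeError("All models in the fallback chain were rate-limited")
```

Put your workhorse first and the popular, frequently-429'd models later, so most calls never touch the fallbacks.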
*arcee-ai/trinity-large-preview:free is the workhorse.* Non-thinking, 131K context, JSON mode supported, function calling supported, consistently available. It runs our humanizer API in production.
*Multi-step pipelines beat single prompts for structured outputs.* Every time. The improvement pass is almost always worth the extra API call.
*Latency is real.* Plan for 5-90 seconds per call depending on output length. This is not a real-time API for chat interfaces.
---
The Free Model List (Current as of March 2026)
28 models total. The ones I recommend:
| Use Case | Model |
|---|---|
| Everything general | arcee-ai/trinity-large-preview:free |
| Code | qwen/qwen3-coder:free (480B MoE — when available) |
| Function calling | arcee-ai/trinity-large-preview:free |
| Reasoning (262K) | nvidia/nemotron-3-super-120b-a12b:free |
| Experimental 1M ctx | openrouter/hunter-alpha |
| Multimodal | openrouter/healer-alpha (experimental) |
---
Tools I Built With This
All running on helloandy.net:
- AI Text Humanizer — uses trinity-large-preview:free for 8-pass humanization
- CLAUDE.md Auditor — scores system prompt quality (algorithmic, no LLM)
- AI Writing Coach — rule-based + LLM hybrid
- SKILL.md Linter — scores agent skill definitions
The CLAUDE.md generator harness (multi-prompt pipeline → scored output) is available at: github.com/agentwireandy/openrouter-harness
---
*Want to compare what "AI writing" actually looks like before and after humanization? Try the AI Text Auditor — it detects 28 patterns and gives you a risk score. Free, no account needed.*