AI models can't be retrained every time you want them to do something new. But they can read instructions. The SKILL.md file format is a structured way to give an AI agent a new capability — a defined behavior it can activate on demand, with reliable outputs, without touching the base model or the main system prompt.
A SKILL.md file is a markdown document that teaches an AI agent a specific, reusable capability. Where a system prompt (or CLAUDE.md) defines the agent's general identity and behavior, a SKILL.md defines one skill in precise detail — the trigger conditions that activate it, the exact output format it produces, and the constraints it operates under.
The core insight is that language models learn from context. If you put a well-structured skill definition into an agent's context at the right moment, the agent will follow it with high consistency — especially when the skill includes examples. The SKILL.md format is a standardized way to structure those definitions so they're reusable, auditable, and improvable over time.
In Claude Code, skills are loaded automatically when placed in the right directory. In other agent frameworks, they're loaded into context at invocation time or registered in the agent's skill library for on-demand retrieval.
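For frameworks that load skills at invocation time, the frontmatter-plus-body layout is easy to parse mechanically. A minimal sketch of such a loader — the `load_skills` helper and the returned dict shape are assumptions for illustration, not any framework's actual API:

```python
import re
from pathlib import Path

def load_skills(skill_dir):
    """Parse every SKILL.md under skill_dir into frontmatter metadata + body."""
    skills = []
    for path in Path(skill_dir).rglob("SKILL.md"):
        text = path.read_text(encoding="utf-8")
        match = re.match(r"^---\n(.*?)\n---\n(.*)$", text, re.DOTALL)
        if not match:
            continue  # no frontmatter: skip rather than guess at structure
        frontmatter, body = match.groups()
        meta = {}
        for line in frontmatter.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                meta[key.strip()] = value.strip()
        skills.append({"meta": meta, "body": body, "path": str(path)})
    return skills
```

The loaded bodies can then be injected into context on demand or indexed by their `name` and `description` fields for retrieval.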
A skill is not a plugin, an API call, or a function. It doesn't add new tools or external capabilities to the agent. It adds a defined pattern of behavior — a reliable way the agent responds to a specific class of request. The agent still works within its existing capabilities; the skill tells it exactly how to apply those capabilities to this type of task.
Every SKILL.md file defines trigger conditions — the circumstances under which the skill should activate. When the agent receives a request, it matches the request against its loaded skill triggers. If a trigger matches, the skill's output template and constraints take precedence over default behavior.
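In practice the model itself judges whether a loaded skill's trigger conditions match, but the precedence rule can be sketched mechanically. The keyword-based matcher and the dict shape below are purely illustrative assumptions:

```python
def match_skill(request, skills):
    """Return the first skill whose triggers match and whose blockers don't.
    A real agent lets the model judge trigger conditions from the skill text;
    this keyword scan only illustrates the precedence rule."""
    text = request.lower()
    for skill in skills:
        fires = any(t in text for t in skill.get("triggers", []))
        blocked = any(b in text for b in skill.get("blockers", []))
        if fires and not blocked:
            return skill  # skill's template and constraints now take precedence
    return None  # no match: default behavior applies

# Illustrative skill record (not a real SKILL.md parse)
code_review = {
    "name": "Code Review",
    "triggers": ["review", "audit", "what's wrong"],
    "blockers": ["explain", "how does"],
}
```

Note that negative conditions are checked even when a positive condition matches — a near-miss request falls through to default behavior rather than forcing the skill's format.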
Trigger conditions come in three forms: positive conditions (ACTIVATE WHEN), negative conditions (DO NOT ACTIVATE WHEN), and concrete example prompts, both ones that should fire the skill and near-misses that shouldn't.
A complete SKILL.md file has six sections. Omitting any of them degrades reliability:
```markdown
---
name: [Skill Name]
description: [One sentence on what this skill does]
version: 1.0
---

## Trigger Conditions

ACTIVATE WHEN:
- [Condition 1 — specific and behavioral]
- [Condition 2]

DO NOT ACTIVATE WHEN:
- [Negative condition 1]
- [Negative condition 2]

TRIGGER EXAMPLES:
- "[Example prompt that should activate this skill]"
- "[Another example]"

NOT TRIGGER EXAMPLES:
- "[Example that looks similar but should not trigger]"

## Output Format
[Describe the exact structure of the output. Use headers, sections, or templates
as appropriate. This section is the spec — the model uses it as a direct template.]

## Iron Laws
1. NEVER [specific failure mode] — instead [recovery path]
2. NEVER [specific failure mode] — instead [recovery path]

## Error Handling
- If [error condition]: [what to do]
- If [error condition]: [what to do]

## Example
INPUT:
[A realistic example input]

OUTPUT:
[The complete expected output, matching the output format exactly]
```
The frontmatter (name, description, version) is used by skill registries and linters. The trigger section is the most important for reliability. The output format is the spec the model follows. The iron laws prevent the most common failure modes. The error handling prevents confusing refusals. The example is the regression test — without it, "matching the output format exactly" means something different every session.
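Because the sections are fixed, completeness can be checked mechanically before a skill ever reaches an agent. A minimal linter sketch, assuming the section names used in this article's template (other registries may use different names):

```python
REQUIRED_SECTIONS = [
    "## Trigger Conditions",
    "## Output Format",
    "## Iron Laws",
    "## Error Handling",
    "## Example",
]

def lint_skill(text):
    """Return a list of problems: missing frontmatter or missing sections."""
    problems = []
    if not text.startswith("---"):
        problems.append("missing frontmatter")
    for section in REQUIRED_SECTIONS:
        if section not in text:
            problems.append(f"missing section: {section}")
    return problems  # empty list means the skill is structurally complete
```

Running this in CI keeps a growing skill library from silently accumulating definitions that omit the example or the iron laws.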
Here's a complete SKILL.md for a code review skill:
```markdown
---
name: Code Review
description: Reviews code for bugs, security issues, and style problems
version: 1.2
---

## Trigger Conditions

ACTIVATE WHEN:
- User asks to review, check, or audit a code file or snippet
- User asks what's wrong with a piece of code
- User pastes code and asks for feedback

DO NOT ACTIVATE WHEN:
- User asks to explain how code works (use explanation, not review format)
- User asks to write new code from scratch
- User asks a single question about one line (answer directly, skip review format)

TRIGGER EXAMPLES:
- "Review this Python function for any issues"
- "What's wrong with this code?"
- "Can you audit this authentication handler?"

NOT TRIGGER EXAMPLES:
- "How does this sorting algorithm work?"
- "Write me a function that parses CSV"
- "What does line 12 do?"

## Output Format

### Summary
One sentence: what the code does and overall quality assessment.

### Issues Found
List each issue with:
- Severity: CRITICAL / HIGH / MEDIUM / LOW
- Location: file:line or function name
- Issue: what's wrong
- Fix: concrete suggestion

### Positive Notes
1-3 things the code does well (skip if nothing notable).

### Verdict
One of: APPROVE / APPROVE WITH CHANGES / REQUEST CHANGES / REJECT
Brief justification (1 sentence).

## Iron Laws
1. NEVER fabricate issues that don't exist to appear thorough —
   if the code is clean, say so in Positive Notes and APPROVE
2. NEVER mark a security vulnerability below HIGH severity
3. NEVER suggest rewrites for stylistic preference — only flag issues
   that affect correctness, security, or maintainability

## Error Handling
- If no code is provided: ask "Please paste the code you'd like me to review"
- If code is incomplete/truncated: note "Review based on partial code — full
  file review may reveal additional issues"
- If language is unclear: state the assumed language before beginning

## Example
INPUT:

    def get_user(user_id):
        query = "SELECT * FROM users WHERE id = " + user_id
        return db.execute(query)

OUTPUT:

### Summary
A database lookup function with a critical SQL injection vulnerability.

### Issues Found
- Severity: CRITICAL
- Location: get_user(), line 2
- Issue: String concatenation in SQL query allows SQL injection. Any
  unsanitized user_id is injected directly into the query.
- Fix: Use parameterized queries:
  db.execute("SELECT * FROM users WHERE id = ?", (user_id,))

### Positive Notes
(None notable for a 3-line function.)

### Verdict
REQUEST CHANGES — fix the SQL injection before any deployment.
```
Start by writing 10 example prompts the skill should handle and 5 it shouldn't. Derive your trigger conditions from those examples. The negative examples are usually more clarifying than the positive ones — they force you to articulate the exact boundary between this skill and adjacent behaviors.
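Those example prompts double as a regression suite for the trigger boundary. A sketch of a harness, where `activates` stands in for however your framework asks the agent whether a skill fires (the predicate is an assumption, not a real API):

```python
def check_triggers(should_fire, should_not_fire, activates):
    """Run boundary examples through an activates(prompt) -> bool predicate
    and report both kinds of failure: missed triggers and false triggers."""
    failures = []
    for prompt in should_fire:
        if not activates(prompt):
            failures.append(("missed", prompt))
    for prompt in should_not_fire:
        if activates(prompt):
            failures.append(("false trigger", prompt))
    return failures  # empty list means the boundary holds
```

Re-running this harness after every edit to the trigger section catches the regressions that a manual read-through misses.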
The most common failure is a vague trigger:

ACTIVATE WHEN: user asks about code

This fires on "explain this code", "write me code", "what's the history of Python", and dozens of other patterns that aren't code review. Compare:

ACTIVATE WHEN: user pastes code and asks for review, feedback, issues, or audit
DO NOT ACTIVATE WHEN: user asks for explanation, asks you to write new code, or asks a question about a single line without requesting review

The output format section should show the structure, not describe it. "A section for issues found, each with severity and description" is a description. Showing the actual headers and fields — "### Issues Found" with "Severity: / Location: / Issue: / Fix:" — is a template. The model uses templates as direct patterns. Descriptions require interpretation.
Run the skill on the hardest version of each type of input it handles. The iron laws that prevent real failures are more valuable than generic goodness constraints. A code review skill doesn't seem to need "NEVER fabricate issues" as an iron law until the session where it invents a problem on a clean file and erodes user trust.
Skills have a natural saturation point where additional instructions stop improving output quality. For focused, bounded tasks (code review, SQL generation, data extraction), saturation typically occurs around 400–600 tokens of skill definition. Beyond that, adding instructions dilutes the most important ones. Measure, don't guess — run the skill at different definition lengths and compare consistency.
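One way to measure: re-run the same input N times at each definition length and score how often the output contains every header the format requires. The scoring rule below is an assumption — any structural check works — and the run outputs would come from your agent, not from this function:

```python
def structural_consistency(outputs, required_headers):
    """Fraction of runs whose output contains every required header.
    outputs: list of strings, one per run of the skill on the same input."""
    if not outputs:
        return 0.0
    ok = sum(all(h in out for h in required_headers) for out in outputs)
    return ok / len(outputs)
```

Plotting this score against definition length makes the saturation point visible: consistency climbs, plateaus, and then often dips as instructions crowd each other out.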
The decision rule is straightforward: if a behavior applies to everything the agent does, it belongs in the system prompt or CLAUDE.md. If it applies to a specific class of task and needs detailed instructions, output templates, or examples, it belongs in a SKILL.md file.
System prompt / CLAUDE.md is the right place for:
- The agent's identity, tone, and general behavior
- Constraints that apply to every task, regardless of type
- Project-wide conventions the agent should always follow

SKILL.md is the right place for:
- A specific, reusable capability with defined trigger conditions
- Detailed output templates that one class of task must follow
- Iron laws, error handling, and examples scoped to that one capability
A practical signal: if you find yourself writing a section in your CLAUDE.md that needs its own examples and iron laws, it's probably a skill. Extract it to a SKILL.md file and reference it from the main config.
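The reference from the main config can be as simple as a short index. A sketch of what that might look like in a CLAUDE.md — the paths and section name here are illustrative, not a framework convention:

```markdown
## Skills
Detailed task behaviors live in their own files. Load the matching
skill before starting the task:
- Code review: skills/code-review/SKILL.md
- SQL generation: skills/sql-generation/SKILL.md
```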