llms.txt is a plain markdown file at the root of your domain that summarizes your site for language models. An AI crawler reads it in seconds, without parsing HTML, CSS, or waiting for JavaScript to execute. Anthropic, Vercel, Mintlify, FastAPI and Drizzle already implement it. This is the guide to do it right.
What it is exactly
The file lives at https://yourdomain.com/llms.txt. Plain markdown, no styling, no complicated metadata. Its goal is to answer, in fewer than 200 lines, the basic questions a model would ask about your site: what we are, what we offer, where to find more.
The standard was proposed by Jeremy Howard (Answer.AI) in September 2024. It is not a W3C standard yet, but adoption is moving fast among companies that depend on being cited by LLMs.
How it differs from files you already have
- robots.txt says what crawlers can read. Access control.
- sitemap.xml lists all indexable URLs with update dates. For traditional crawlers.
- llms.txt editorially summarizes the brand, in dense language, for AI crawlers.
The three coexist. They don't replace each other.
Recommended structure
The standard does not impose a rigid structure. The one working best in production looks like this:
# Your brand name
## What it is
A clear sentence of what you do and who you serve.
## Why it matters
The concrete problem you solve. In 1-2 lines.
## Services or products
- Item 1 — what it includes, in one line.
- Item 2 — what it includes, in one line.
- Item 3 — what it includes, in one line.
## Who we are
Team, location, relevant public sites or projects.
## Differentiators
What sets you apart, no marketingese. Facts.
## Cases or evidence
- Client / project X — what outcome we achieved.
- Client / project Y — what outcome we achieved.
## Contact
Email: hello@yourdomain.com
Web: https://yourdomain.com
## Languages
Spanish: https://yourdomain.com/es
English: https://yourdomain.com/en

The trick is writing as if your reader were an editor in a hurry. No empty adjectives, no long enumerations, no sentences that sound like a brochure.
When to add llms-full.txt
If your site has relevant technical documentation — APIs, frameworks, libraries, manuals — it's worth exposing it in a separate file: /llms-full.txt. A plain, ordered dump of all the documentation, with no HTML chrome.
Anthropic does this for their API. Drizzle does it for their ORM. The idea is that an agent programming against your product can read all the relevant documentation in a single request, without navigating separate pages.
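One way to produce such a dump is to concatenate your markdown sources at build time. Below is a minimal sketch, assuming your docs live as .md files under a single directory; `build_llms_full` and the directory layout are illustrative, not part of the standard.

```python
from pathlib import Path

def build_llms_full(docs_dir: Path, output: Path) -> int:
    """Concatenate every markdown doc into one plain file; returns the doc count."""
    parts = []
    for md in sorted(docs_dir.rglob("*.md")):
        rel = md.relative_to(docs_dir)
        # Prefix each doc with its path so an agent knows where a section came from.
        parts.append(f"# {rel}\n\n{md.read_text(encoding='utf-8').strip()}")
    output.write_text("\n\n---\n\n".join(parts) + "\n", encoding="utf-8")
    return len(parts)
```

Run something like this as part of your deploy, so /llms-full.txt never drifts from the docs it mirrors.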
Common mistakes
- Filling it with slogans. “Innovation leaders” tells a model nothing. Verifiable facts do.
- Copying the homepage as-is. The homepage carries marketing; llms.txt needs compressed, useful information.
- Forgetting to maintain it. When your services change or your positioning sharpens, llms.txt has to reflect that. Crawlers come back.
- Not declaring it in robots.txt. Not strictly necessary, but some crawlers discover it faster if you list it.
- Making it 5,000 words long. The file is designed for density. If you exceed 200-300 lines, focus is lost. Long form goes in llms-full.txt.
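The line budget is easy to enforce automatically. Here is a hypothetical lint sketch: `lint_llms_txt` and the `REQUIRED` markers are assumptions based on the template above, not anything the standard mandates.

```python
# Markers assumed from the template earlier in this guide -- adjust to your own headings.
REQUIRED = ("# ", "## Contact")

def lint_llms_txt(text: str, max_lines: int = 300) -> list:
    """Return a list of problems; an empty list means the file passes."""
    lines = text.splitlines()
    problems = []
    if len(lines) > max_lines:
        problems.append(f"{len(lines)} lines, over the {max_lines}-line budget")
    for marker in REQUIRED:
        if not any(line.startswith(marker) for line in lines):
            problems.append(f"no line starts with {marker!r}")
    return problems
```

Wiring a check like this into CI keeps the file dense as the site grows.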
How to verify it works
- Serve it with Content-Type: text/markdown or text/plain. Some crawlers get confused by strange MIME types.
- Ask an LLM with web search: “Read https://yourdomain.com/llms.txt and summarize what the brand does.” If the summary is accurate, the file is well written.
- Check your access logs for user-agents like GPTBot, Claude-Web, PerplexityBot. If you see requests to the file, the crawlers are consuming it.
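The first two checks can be scripted. A minimal sketch with the standard library; `mime_ok` and `check_llms_txt` are illustrative names, and the accepted types follow the recommendation above.

```python
from urllib.request import urlopen

ACCEPTED = ("text/markdown", "text/plain")

def mime_ok(content_type: str) -> bool:
    """True if the Content-Type (ignoring charset parameters) is one crawlers expect."""
    return content_type.split(";")[0].strip().lower() in ACCEPTED

def check_llms_txt(url: str) -> dict:
    """Fetch llms.txt and report status code, MIME type, and length."""
    with urlopen(url) as resp:
        status = resp.status
        mime = resp.headers.get("Content-Type", "")
        body = resp.read().decode("utf-8", errors="replace")
    return {
        "status": status,
        "mime": mime,
        "mime_ok": mime_ok(mime),
        "lines": len(body.splitlines()),
    }
```

Point `check_llms_txt` at your own domain after each deploy; a 200 status, an accepted MIME type, and a line count under budget are the three things worth asserting.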
Why publish it even if few respect it today
It's a near-zero-cost bet with material upside. Today some crawlers respect it; in 12 months more will. Your llms.txt will be ready when the ecosystem finishes adopting it — and it will be read when it matters.
The precedent: robots.txt appeared in 1994 as a convention with no technical enforcement. Today it's respected by every serious crawler. llms.txt is on the same curve, a decade later.