llms.txt
What is llms.txt?
llms.txt is a proposed web standard, introduced by Jeremy Howard of Answer.AI in September 2024, that gives websites a standardized way to offer LLM-friendly content. The core idea is that a website serves a markdown file at /llms.txt, at the site root, containing a curated overview of its content: brief background information, guidance for AI interpretation, and links to detailed markdown versions of key pages.
The standard addresses a fundamental problem: LLM context windows are too small to hold most websites in full, and converting complex HTML with navigation, ads, and JavaScript into plain text is imprecise. llms.txt gives models a clean, structured entry point to a website's content.
The Problem llms.txt Solves
- Context window limitations: LLMs cannot ingest entire websites, so llms.txt provides a curated summary
- HTML complexity: Converting HTML with navigation, ads, and scripts to LLM-friendly text is error-prone
- Content discovery: LLMs don't know which pages matter most, so llms.txt points them to the key ones
- Site understanding: A brief project description helps LLMs contextualize the content they consume
- No existing standard: robots.txt controls crawling; sitemap.xml lists pages; neither is designed for LLM inference-time use
llms.txt Format Specification
A file following the spec contains these sections, in this order. Only the H1 title is required:
- H1 title: Name of the project or site (required)
- Blockquote: Short summary with key information for understanding the site
- Additional markdown: More detailed information about the project (optional, any markdown except headings)
- H2 file lists: Zero or more sections, each introduced by an H2 heading, listing URLs where detailed content can be found; each entry is a markdown link, optionally followed by a colon and a description
- Optional section: An H2 titled "Optional" whose URLs can be skipped when a shorter context is needed
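The ordering above can be sketched as a small renderer. This is only an illustration: the `Link` and `LlmsTxt` containers and the `render` helper are hypothetical names, not part of the spec or of any published tool, and placing "Optional" last simply follows the order given in the list.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    name: str
    url: str
    notes: str = ""  # optional description shown after the colon


@dataclass
class LlmsTxt:
    title: str          # H1, the only required part
    summary: str = ""   # rendered as a blockquote
    details: str = ""   # free-form markdown without headings
    sections: dict[str, list[Link]] = field(default_factory=dict)


def render(doc: LlmsTxt) -> str:
    """Emit the parts in the order the list above gives them."""
    if not doc.title.strip():
        raise ValueError("an H1 title is required")
    blocks = [f"# {doc.title.strip()}"]
    if doc.summary:
        blocks.append(f"> {doc.summary.strip()}")
    if doc.details:
        blocks.append(doc.details.strip())
    # Assumption: keep "Optional" last so it is easy to drop when trimming.
    names = [n for n in doc.sections if n != "Optional"]
    if "Optional" in doc.sections:
        names.append("Optional")
    for name in names:
        lines = [f"## {name}", ""]
        for link in doc.sections[name]:
            item = f"- [{link.name}]({link.url})"
            if link.notes:
                item += f": {link.notes}"
            lines.append(item)
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"
```

Calling `render(LlmsTxt(title="Your Site Name", summary="..."))` with a `sections` mapping yields text in the same shape as a hand-written llms.txt file; an empty title raises `ValueError` because the H1 is the one required element.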
Basic example:
    # Your Site Name

    > A brief description of what this site is about and key information for understanding the content.

    Additional details about the site, its purpose, and how to interpret its content.

    ## Docs

    - [Getting Started Guide](https://yoursite.com/docs/start.html.md): Introduction and setup instructions
    - [API Reference](https://yoursite.com/docs/api.html.md): Complete API documentation
    - [FAQ](https://yoursite.com/faq.html.md): Frequently asked questions

    ## Blog

    - [Key Article 1](https://yoursite.com/blog/article1.html.md): Description of article
    - [Key Article 2](https://yoursite.com/blog/article2.html.md): Description of article

    ## Optional

    - [Archive](https://yoursite.com/archive.html.md): Older content that may not be essential
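To show how a consumer might read a file like the example, here is a minimal parsing sketch using only the Python standard library. The function names `parse_llms_txt` and `links_for_context` are hypothetical, the regular expression handles only simple `[name](url): notes` entries, and this is not how the llms_txt2ctx tool is implemented.

```python
import re

# Matches "- [name](url)" with an optional ": notes" tail.
LINK_RE = re.compile(
    r"^[-*]\s*\[(?P<name>[^\]]+)\]\((?P<url>[^)]+)\)(?::\s*(?P<notes>.*))?$"
)


def parse_llms_txt(text: str) -> dict:
    """Split llms.txt text into title, summary, details, and link sections."""
    result = {"title": None, "summary": "", "details": "", "sections": {}}
    summary, details = [], []
    current = None  # name of the H2 section being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("## "):
            current = line[3:].strip()
            result["sections"][current] = []
        elif line.startswith("# ") and result["title"] is None:
            result["title"] = line[2:].strip()
        elif current is not None:
            match = LINK_RE.match(line)
            if match:  # lines that are not link entries are skipped
                result["sections"][current].append(
                    {
                        "name": match["name"],
                        "url": match["url"],
                        "notes": match["notes"] or "",
                    }
                )
        elif line.startswith(">"):
            summary.append(line.lstrip("> ").strip())
        else:
            details.append(line)
    if result["title"] is None:
        raise ValueError("missing required H1 title")
    result["summary"] = " ".join(summary)
    result["details"] = "\n".join(details)
    return result


def links_for_context(parsed: dict, include_optional: bool = False) -> list:
    """Collect link URLs, leaving out the "Optional" section unless asked."""
    return [
        link["url"]
        for name, links in parsed["sections"].items()
        if include_optional or name != "Optional"
        for link in links
    ]
```

Passing `include_optional=False` (the default) mirrors the purpose of the "Optional" section: those links are the first to go when the context budget is tight.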
.md Page Extensions
The standard also proposes that pages with information useful for LLMs provide a clean markdown version at the same URL with .md appended. URLs without a file name append index.html.md instead:
- https://yoursite.com/page → https://yoursite.com/page.md
- https://yoursite.com/page.html → https://yoursite.com/page.html.md
- https://yoursite.com/dir/ → https://yoursite.com/dir/index.html.md
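The mapping can be expressed as a short helper. This is a sketch under stated assumptions: `markdown_url` is a hypothetical name, and keeping the query string while dropping the fragment is a choice made here for illustration, not something the proposal specifies.

```python
from urllib.parse import urlsplit, urlunsplit


def markdown_url(page_url: str) -> str:
    """Return the URL where a markdown version of page_url would live."""
    parts = urlsplit(page_url)
    path = parts.path or "/"
    if path.endswith("/"):
        # No file name in the URL: use index.html.md inside the directory.
        path += "index.html.md"
    else:
        path += ".md"
    # Assumption: keep scheme, host, and query unchanged; drop the fragment.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))
```

Under these assumptions a bare host such as `https://yoursite.com` is treated like a directory URL and maps to `https://yoursite.com/index.html.md`.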
llms.txt vs robots.txt vs sitemap.xml
| Standard | Purpose | Consumer | Format |
|---|---|---|---|
| robots.txt | Control crawling access | Web crawlers (search and AI bots) | Plain text rules |
| sitemap.xml | List all indexable URLs | Search engine bots | XML |
| llms.txt | Curated content for LLM inference | LLMs and AI agents | Markdown |
Key differences: sitemap.xml lists every indexable page, which is often too much for an LLM's context window; robots.txt controls crawler access rather than describing content; llms.txt provides a curated entry point with context.
Why llms.txt Matters for GEO
Implementing llms.txt can support generative engine optimization (GEO) goals:
- Improved citation accuracy: LLMs that can see how your content is organized are better placed to cite it accurately
- Content prioritization: You signal which content AI engines should treat as most important
- Reduced hallucinations: Curated markdown leaves less room for misinterpretation than raw HTML
- Brand context: The site summary helps AI engines understand what your brand does
- Content freshness: Linking to your most current content signals relevance
Implementation Tools
- llms_txt2ctx CLI: Python tool for parsing llms.txt and generating LLM context files
- VitePress plugin: Auto-generates llms.txt for VitePress documentation sites
- Docusaurus plugin: Auto-generates llms.txt for Docusaurus documentation sites
- Drupal Recipe: Full llms.txt support for Drupal 10.3+ sites
- llms-txt-php library: Programmatic llms.txt creation and parsing in PHP
- VS Code PagePilot: Extension that loads llms.txt context into VS Code Chat
Directories of existing llms.txt files: llmstxt.site and directory.llmstxt.cloud