
llms.txt Implementation

When to Use

You want AI coding assistants, documentation tools, and LLM-powered search to find and understand your site's content. The llms.txt standard (llmstxt.org, September 2024) is a lightweight convention: place a machine-readable index at /llms.txt so AI systems know what your site contains and where to find full content. The primary value today is for AI coding assistants (Cursor, Copilot, Claude Code) that read project docs; ChatGPT and Google AI Overviews do not yet honor the file at inference time.

Decision

| Situation | Approach | Effort |
| --- | --- | --- |
| Static site or docs site | Static llms.txt file in web root | Minimal |
| Drupal site, content changes rarely | Static file in /web/ committed to repo | Low |
| Drupal site, content changes frequently | Custom route that generates file dynamically | Medium |
| Large site with many content types | Build script: generate from MkDocs/CMS on deploy | Medium |
| You want full page content for RAG | Also generate llms-full.txt with page body | Medium |
| AI assistants are your primary audience | llms.txt as entry point to per-topic bundles | Dev docs pattern |

The llms.txt Format

Defined at llmstxt.org. The format is intentionally minimal — structured Markdown that LLMs can parse without special tooling.

# Site or Product Name

> One-paragraph description of what this site/product is and who it is for.
> Written for an LLM audience: factual, concise, complete.

## Section Name

- [Page Title](https://example.com/page/): One-line description of what this page covers
- [Another Page](https://example.com/other/): Description

## Another Section

- [Guide Title](https://example.com/guide/): Description

Format rules:

- H1: site or product name (the only element the spec strictly requires)
- Blockquote after H1: short description paragraph (optional in the spec, but always include it)
- H2: logical sections grouping related links
- Bullet list entries use the form: - [Title](URL): description
- Keep descriptions to one line; LLMs read these to decide whether to fetch the full page
- llms-full.txt follows the same structure but includes the full body of each page inline, for RAG vectorization
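The format rules above can be sketched as a small renderer. This is a minimal illustration, not part of the spec: the names Link and render_llms_txt are hypothetical, and the input is assumed to be already-curated titles, URLs, and descriptions.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """One index entry (hypothetical helper type for this sketch)."""
    title: str
    url: str
    description: str


def render_llms_txt(name: str, summary: str, sections: dict[str, list[Link]]) -> str:
    """Render H1, blockquote summary, then one H2 section per group of links."""
    lines = [f"# {name}", ""]
    # Each line of the summary becomes one blockquote line.
    lines += [f"> {part.strip()}" for part in summary.strip().splitlines()]
    for heading, links in sections.items():
        lines += ["", f"## {heading}", ""]
        for link in links:
            # Collapse whitespace so every description stays on one line.
            description = " ".join(link.description.split())
            lines.append(f"- [{link.title}]({link.url}): {description}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_llms_txt(
        "Example Docs",
        "Guides for configuring the example site.",
        {"Getting Started": [
            Link("Overview", "https://example.com/docs/overview/", "What the site does"),
        ]},
    ))
```

Keeping the data separate from the rendering makes it straightforward to feed the same entries from a CMS export or a docs build step.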

Confirmed Adopters (as of Q1 2026)

| Organization | URL | Notes |
| --- | --- | --- |
| Cloudflare | cloudflare.com/llms.txt | Developer docs, Workers, AI Gateway |
| Anthropic | anthropic.com/llms.txt | Claude API docs, model reference |
| Stripe | stripe.com/llms.txt | Payment API documentation |
| Supabase | supabase.com/llms.txt | Database and auth docs |
| Vercel | vercel.com/llms.txt | Deployment and framework docs |
| FastAPI | fastapi.tiangolo.com/llms.txt | Python web framework |

Adoption is concentrated in developer-tool companies because the primary current use case is AI coding assistants reading documentation.

Current Limitations

| Limitation | Detail |
| --- | --- |
| No AI company committed to honoring at inference time | As of Q1 2026, OpenAI, Google, and Anthropic have not confirmed they read llms.txt during ChatGPT/AI Overviews/Claude web answers |
| No standardized crawl protocol | There is no GPTBot or ClaudeBot equivalent that specifically fetches llms.txt |
| Discovery depends on tool implementation | AI coding assistants (Cursor, GitHub Copilot Workspace) actively support it; consumer AI chat does not yet |
| Spec is still evolving | llmstxt.org is community-maintained; breaking changes possible |

Primary value today: AI coding assistants. If your Drupal site serves developer documentation, llms.txt is worth implementing. For consumer content sites, it is a low-effort forward-looking signal.

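Pattern: Static File

Place the file at web/llms.txt in your Drupal repo. This works because the web server's rewrite rules (Drupal's default .htaccess, or try_files on nginx) only pass a request to Drupal's front controller, index.php, when no file exists at the requested path.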

# My Drupal Site

> Developer documentation and guides for configuring [Site Name] on Drupal 11.
> Covers [main topics]. For site builders and developers.

## Getting Started

- [Overview](https://example.com/docs/overview/): What this site does and how it is organized
- [Installation](https://example.com/docs/install/): Requirements and install steps

## Configuration

- [SEO Configuration](https://example.com/docs/seo/): Meta tags, structured data, sitemaps
- [Content Types](https://example.com/docs/content-types/): Available content models

## API Reference

- [REST API](https://example.com/api/): JSON:API endpoints and authentication

Commit this file to web/llms.txt. The web server serves it as a static asset without bootstrapping Drupal. Verify with curl https://example.com/llms.txt.
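Beyond fetching the file with curl, a structural check can catch a malformed index before it is committed. The sketch below is one possible lint, assuming the layout described in this guide; check_llms_txt and its messages are hypothetical, and the checks are deliberately stricter than the spec (which only requires the H1).

```python
import re
import sys

# A list item of the form: - [Title](https://...): description
LINK_ITEM = re.compile(r"^- \[[^\]]+\]\(https?://[^)\s]+\)(: .+)?$")


def check_llms_txt(text: str) -> list[str]:
    """Return human-readable problems; an empty list means all checks passed."""
    problems = []
    lines = [line.rstrip() for line in text.splitlines()]
    non_empty = [line for line in lines if line]
    if not non_empty or not non_empty[0].startswith("# "):
        problems.append("first non-empty line should be an H1 ('# Name')")
    if not any(line.startswith("> ") for line in lines):
        problems.append("no blockquote description found")
    if not any(line.startswith("## ") for line in lines):
        problems.append("no H2 sections found")
    for number, line in enumerate(lines, start=1):
        if line.startswith("- ") and not LINK_ITEM.match(line):
            problems.append(f"line {number}: list item is not '- [Title](URL): description'")
    return problems


if __name__ == "__main__":
    # Usage: python check_llms.py web/llms.txt
    if len(sys.argv) > 1:
        with open(sys.argv[1], encoding="utf-8") as handle:
            found = check_llms_txt(handle.read())
        print("\n".join(found) or "OK")
        sys.exit(1 if found else 0)
```

Run as a pre-commit hook or CI step, a non-zero exit code blocks a broken file from reaching the web root.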

Pattern: Custom Drupal Route

For sites where content changes frequently and you want the index auto-generated:

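# mymodule.routing.yml
mymodule.llms_txt:
  path: '/llms.txt'
  defaults:
    _controller: '\Drupal\mymodule\Controller\LlmsTxtController::index'
    _title: 'llms.txt'
  requirements:
    # Public route: anyone may fetch the index.
    _access: 'TRUE'
  options:
    no_cache: FALSE

<?php
// src/Controller/LlmsTxtController.php
namespace Drupal\mymodule\Controller;

use Drupal\Core\Controller\ControllerBase;
use Symfony\Component\HttpFoundation\Response;

class LlmsTxtController extends ControllerBase {

  // Serves the index as plain text. buildLlmsTxt() is this module's own
  // helper (not shown) that returns the Markdown string.
  public function index(): Response {
    $content = $this->buildLlmsTxt();
    return new Response($content, 200, [
      'Content-Type' => 'text/plain; charset=UTF-8',
      // Let browsers and proxies reuse the response for one hour.
      'Cache-Control' => 'public, max-age=3600',
    ]);
  }

}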

The controller queries published nodes by content type and builds the Markdown index. Cache with Drupal's render cache or a dedicated cache bin — regenerating on every request is expensive on large sites.

Pattern: Per-Topic Bundling (Dev Docs Sites)

This project's own llms.txt implementation uses per-topic bundles rather than a single flat file. The main /llms.txt is a directory of topics, each linking to /llms/topic-name.txt — a full content bundle for that topic.

# Dev Guides

> Atomic decision guides for Drupal 11, CSS, and frontend development.

## Drupal

- [Drupal SEO & GEO](https://example.com/llms/drupal-seo-geo.txt): 27 guides covering SEO, structured data, and GEO for Drupal 11
- [Drupal Forms](https://example.com/llms/drupal-forms.txt): Form API, validation, AJAX, and form alter patterns

Each /llms/topic-name.txt file contains all guides for that topic concatenated with section markers, making it efficient for AI assistants to load one file and get the full context for a domain.
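A build step for this pattern might look like the sketch below. It is not this project's actual script: the guides/<topic>/*.md layout, the build_bundles name, and the "=== name ===" section marker are all assumptions made for illustration.

```python
from pathlib import Path


def build_bundles(guides_dir: Path, out_dir: Path, base_url: str) -> str:
    """Write one bundle per topic folder and return the text of the top-level index."""
    bundle_dir = out_dir / "llms"
    bundle_dir.mkdir(parents=True, exist_ok=True)
    index = ["# Dev Guides", "", "> Per-topic bundles of guides.", "", "## Topics", ""]
    for topic_dir in sorted(p for p in guides_dir.iterdir() if p.is_dir()):
        guides = sorted(topic_dir.glob("*.md"))
        if not guides:
            continue  # skip empty topic folders
        parts = []
        for guide in guides:
            body = guide.read_text(encoding="utf-8").strip()
            # Hypothetical section marker showing where each guide starts.
            parts.append(f"=== {guide.stem} ===\n\n{body}\n")
        (bundle_dir / f"{topic_dir.name}.txt").write_text("\n".join(parts), encoding="utf-8")
        index.append(
            f"- [{topic_dir.name}]({base_url}/llms/{topic_dir.name}.txt): {len(guides)} guides"
        )
    index_text = "\n".join(index) + "\n"
    (out_dir / "llms.txt").write_text(index_text, encoding="utf-8")
    return index_text
```

Calling build_bundles(Path("guides"), Path("public"), "https://example.com") on deploy would regenerate both the index and every bundle in one pass, so the two cannot drift apart.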

llms-full.txt for RAG

If you are building a RAG (Retrieval-Augmented Generation) pipeline over your own content — for an AI chatbot, search, or documentation assistant — generate llms-full.txt with full page bodies:

# Site Name

> Description paragraph.

## Section Name

### [Page Title](https://example.com/page/)

Full page body content here...

---

### [Another Page](https://example.com/other/)

Full page body content here...

This layout suits ingestion pipelines that load content into a vector database: the --- separator between pages gives the chunker a clean page boundary. Size warning: llms-full.txt can be several MB for large sites; this is intentional for RAG use but not suitable for direct LLM context injection.
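To show how the separator aids chunking, here is a minimal sketch that splits the layout above into one record per page. The split_pages name and the record keys are hypothetical, and the sketch assumes page bodies never contain a line consisting only of ---; real content with horizontal rules or front matter would need a more distinctive separator.

```python
import re

# Matches a page heading such as: ### [Page Title](https://example.com/page/)
PAGE_HEADING = re.compile(
    r"^### \[(?P<title>[^\]]+)\]\((?P<url>[^)]+)\)[ \t]*$", re.MULTILINE
)


def split_pages(full_text: str) -> list[dict[str, str]]:
    """Split llms-full.txt into one record per page: title, url, and body."""
    pages = []
    # A line holding only '---' separates pages.
    for block in re.split(r"^---[ \t]*$", full_text, flags=re.MULTILINE):
        match = PAGE_HEADING.search(block)
        if match is None:
            continue  # block without a page heading
        pages.append({
            "title": match["title"],
            "url": match["url"],
            # Everything after the heading line is the page body.
            "body": block[match.end():].strip(),
        })
    return pages
```

Each record can then be embedded on its own, or split further by size, with the title and URL kept as metadata for citation.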

Common Mistakes

  • Wrong: Putting llms.txt in the Drupal public files directory (sites/default/files/) → Right: Place in web/ root alongside index.php; a file under sites/default/files/ is not served at /llms.txt
  • Wrong: Making llms.txt the same as your sitemap (XML + all URLs) → Right: llms.txt is a curated, human-readable index with descriptions; include only meaningful pages
  • Wrong: Expecting ChatGPT to cite you more because you have llms.txt → Right: Current value is for AI coding assistants, not consumer AI chat inference
  • Wrong: Generating llms.txt on every page request → Right: Cache aggressively; this file changes at most daily
  • Wrong: Skipping the blockquote description paragraph → Right: The spec makes the description optional, but it is the primary signal LLMs use to decide if this site is relevant, so always include it
