What is an llms.txt file?

An llms.txt file is a plain-text markdown document hosted at the root of a domain, like example.com/llms.txt, that gives large language models a curated index of the site's most important pages. It uses an H1 title, a blockquote summary, and H2-grouped link lists with short descriptions. The format was proposed by Jeremy Howard at Answer.AI in September 2024. Treat it as the LLM-readable counterpart to sitemap.xml. Sitemaps are exhaustive and built for ranking crawlers. llms.txt is short, curated, and built for answer engines that need to know which pages to read first when synthesizing a response about the brand.

Is llms.txt an official standard?

No. As of mid-2026, llms.txt is a community proposal maintained at llmstxt.org. It has no W3C ratification and no formal endorsement from Google, OpenAI, Anthropic, or Perplexity. Adoption is climbing among publishers, but no major AI engine has confirmed it as a ranking or selection input. That said, the cost to ship the file is roughly an afternoon, and the downside of publishing one is zero. Treatment center operators can treat it as a low-risk hedge.

How is llms.txt different from sitemap.xml and robots.txt?

Sitemap.xml lists every indexable URL for traditional search crawlers. Robots.txt sets crawl permissions for bots like GPTBot, ClaudeBot, and Googlebot. llms.txt is a curated, narrative map written specifically for LLMs that synthesize answers, not for ranking algorithms that return links. The three files coexist. A treatment center should publish all of them. Sitemap.xml is exhaustive. Robots.txt is permissive or restrictive. llms.txt is opinionated about which twenty or fifty pages the model should treat as canonical.

What should a treatment center include in its llms.txt?

A treatment center llms.txt should include the main services page, the admissions process page, the levels-of-care explainer, accreditation and licensing pages, clinical leadership bios with credentials, modality and program pages, and the insurance and admissions FAQs. These are the pages that define standing in any AI answer about the brand. Group them under Core, Modalities, and Optional sections. Keep total link count tight, usually between twenty and fifty entries.

Do AI engines actually read llms.txt yet?

Adoption is uneven. No major AI engine has formally confirmed that llms.txt influences answer selection. Early testing by independent SEO practitioners suggests some engines fetch the file opportunistically, but the behavior is not documented or guaranteed. What is documented is that publishers shipping the file, including Anthropic, Cloudflare, Hugging Face, and Stripe, are doing so because the cost is low and the curation discipline itself improves how their content is structured for AI ingestion.

What Is llms.txt? Format, Spec, and AEO Use Cases

llms.txt is a proposed plain-text standard, hosted at the root of a domain like /llms.txt, that gives large language models a curated, structured guide to the website’s most important content. It is the LLM-readable counterpart to sitemap.xml, sits alongside robots.txt, and is designed for AI-first discovery.

For treatment centers, llms.txt is an emerging signal in the same family as schema markup and answer engine optimization. It does not guarantee citation in ChatGPT or AI Overviews, but it gives any engine that reads the file a fast lane to the pages a facility actually wants quoted.

Key Takeaways

llms.txt is a proposed standard, not an official one. Jeremy Howard at Answer.AI introduced the proposal in September 2024. There is no W3C ratification and no confirmed adoption by Google, OpenAI, Anthropic, or Perplexity as of mid-2026.
The file lives at the root of the domain and uses plain markdown to tell an LLM which URLs matter most, with a blockquote summary up top and link lists grouped under H2 headers, each link carrying a short description.
llms.txt is additive to sitemap.xml and robots.txt, not a replacement. Sitemaps tell crawlers what exists. Robots.txt tells them what to skip. llms.txt tells an LLM what to read first when synthesizing an answer about the brand.
For treatment centers, the file is a curation surface. It is where a facility points models at clinical leadership bios, accreditation pages, modality explainers, insurance and admissions FAQs, and any content the operator wants quoted instead of guessed at.
Webserv ships its own llms.txt at webserv.io/llms.txt as proof-of-concept. The file points models at the MCP server, AI Information page, case studies, capabilities, and markdown alternates for every public URL.
Treat llms.txt as a low-effort, high-signal hedge. Cost to ship is roughly an afternoon. Downside is zero. Upside is being one of the first behavioral health brands in any given market with a citation-ready LLM map.

What llms.txt Is

llms.txt is a markdown file served at the root of a domain. Jeremy Howard at Answer.AI introduced the proposal in September 2024, framing it as a way to give models a clean entry point to a site. The spec is maintained at llmstxt.org.

The format is deliberately simple: a single H1 title, a blockquote summary of the site, then H2 sections grouping curated links with one-line descriptions.

The problem Howard set out to solve is concrete. Modern websites are heavy. HTML pages carry navigation, ads, scripts, and visual chrome that bury the text an LLM actually needs. Context windows are finite. A 30,000-token homepage may give a model two paragraphs of useful information about a company.

llms.txt is designed to fix that. It hands the model a short, curated, plain-text index of the pages worth reading. The model can fetch those links directly, often with markdown alternates, and skip the rest. The file is human-readable too, which is why it doubles as a brand-positioning document for AI engines.

Why llms.txt Matters for Treatment Centers

Behavioral health is a YMYL category. The information a treatment center publishes about levels of care, medications, dual diagnosis, accreditation, and insurance is the kind of content AI engines have to be careful with. Misquoting a rehab is a real-world harm, not a marketing miss.

That makes the curation layer valuable. An llms.txt file gives a facility a place to point models at the pages built and reviewed for clinical accuracy: licensed clinical leadership bios, accreditation pages, modality explainers, admissions and insurance FAQs, and clinical glossary entries.

It also flags what is canonical. If a facility has been renamed, acquired, or rebranded, the llms.txt file is the cleanest single place to declare which URL is the authoritative one. The same pattern applies to disambiguation between facilities with similar names in the same metro.

And it compounds with other AI visibility work. A facility investing in topical authority across modality clusters can use llms.txt to surface the hub pages of those clusters so the engine reads the strongest source first.

How llms.txt Works

The file lives at https://yourdomain.com/llms.txt. There is no required server header beyond serving it as plain text. Tools and crawlers that support the proposal look there the way a traditional crawler looks for robots.txt or sitemap.xml.
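A quick way to confirm the file is where a client would look is to derive the root-level URL and check the response type. The sketch below is a minimal illustration, not part of the proposal: the function names are hypothetical, and accepting text/markdown alongside text/plain is an assumption made here for convenience.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def llms_txt_url(site: str) -> str:
    """Return the root-level llms.txt URL for a site, ignoring any path or query."""
    parts = urlsplit(site if "://" in site else f"https://{site}")
    return urlunsplit((parts.scheme, parts.netloc, "/llms.txt", "", ""))


def is_text_content_type(content_type: str) -> bool:
    """True for text/plain or text/markdown, with or without a charset suffix.

    Treating text/markdown as acceptable is an assumption of this sketch.
    """
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in {"text/plain", "text/markdown"}


def check_site(site: str, timeout: float = 10.0) -> bool:
    """Fetch the file over the network and report whether it looks correctly served.

    urlopen raises an error for a 404 or other failure, so callers may want
    to wrap this in try/except.
    """
    with urlopen(llms_txt_url(site), timeout=timeout) as resp:
        content_type = resp.headers.get("Content-Type", "")
        return resp.status == 200 and is_text_content_type(content_type)
```

Calling check_site("example.com") performs a live request, so it belongs in a deploy check or a monitoring job rather than in page code.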

The internal structure is markdown. The spec calls for an H1 with the project or brand name, a blockquote summary describing the site, optional prose context, then one or more H2 sections that group curated links. Each link line is a markdown bullet containing a named link, optionally followed by a colon and a short description.

Many publishers also ship an optional llms-full.txt companion file, a convention that has grown up alongside the proposal. That version expands every linked page into one long markdown document the model can ingest in a single fetch. For a treatment center, llms-full.txt is the closest thing to handing an LLM the full briefing book on the facility.
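The companion file described above can be assembled from markdown versions of the linked pages. The sketch below assumes the page bodies are already available as markdown strings. The function name, the "Source:" line, and the one-H2-per-page layout are illustrative choices made here, not a layout the proposal prescribes.

```python
def build_llms_full(title: str, pages: list[tuple[str, str, str]]) -> str:
    """Concatenate pages into one markdown document.

    Each page is a (name, url, markdown_body) tuple. Bodies are included
    as-is, so headings inside a body are not re-levelled.
    """
    parts = [f"# {title}", ""]
    for name, url, body in pages:
        parts += [f"## {name}", "", f"Source: {url}", "", body.strip(), ""]
    # Collapse trailing blank lines to a single final newline.
    return "\n".join(parts).rstrip() + "\n"
```

Because the output grows with every page, the same curation rule applies: only pages worth a place in llms.txt should be expanded here.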

llms.txt Format Specification

The canonical structure published at llmstxt.org is straightforward. H1 is the site name. The first blockquote is a one-sentence summary. Optional paragraphs add context the model should carry into any answer.

H2 sections group links by purpose. The ## Optional section is reserved for pages the model can skip if context is tight.

# Example Recovery Center

> A residential and outpatient addiction treatment facility in Austin, TX, accredited by The Joint Commission.

## Core

- [Treatment Programs](https://example.com/programs): overview of residential, PHP, IOP, and outpatient levels of care.
- [Admissions](https://example.com/admissions): step-by-step admissions process, insurance verification, and what to expect on intake day.
- [Clinical Leadership](https://example.com/team/clinical): bios and credentials for the medical director, clinical director, and lead therapists.

## Modalities

- [CBT](https://example.com/modalities/cbt): cognitive behavioral therapy for substance use and co-occurring disorders.
- [DBT](https://example.com/modalities/dbt): dialectical behavior therapy program structure and indications.
- [MAT](https://example.com/modalities/mat): medication-assisted treatment options, including buprenorphine and naltrexone.

## Optional

- [Blog Archive](https://example.com/blog): published articles on recovery, family support, and clinical topics.

That structure is the entire format. There are no required custom fields, no XML, no schema validation. The discipline is in what gets included and what gets left out. An llms.txt with 400 links is not a curation. It is a sitemap with extra steps.
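Since there is no schema validation, a small linter can stand in for one. The sketch below parses the structure shown above (H1, blockquote, H2 sections, link bullets) and flags the obvious problems. It is a simplified reading of the format, not a reference parser: the names are hypothetical, and the default budget of 50 links is taken from the twenty-to-fifty range this article recommends rather than from the spec.

```python
import re
from dataclasses import dataclass, field

# Matches "- [Name](url)" with an optional ": description" tail.
LINK_RE = re.compile(
    r"^-\s*\[(?P<name>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<desc>.*))?$"
)


@dataclass
class LlmsTxt:
    title: str = ""
    summary: str = ""
    # Section heading -> list of (name, url, description).
    sections: dict[str, list[tuple[str, str, str]]] = field(default_factory=dict)


def parse_llms_txt(text: str) -> LlmsTxt:
    """Parse the H1, the first blockquote, and the links under each H2."""
    doc = LlmsTxt()
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# ") and not doc.title:
            doc.title = line[2:].strip()
        elif line.startswith("> ") and not doc.summary and current is None:
            doc.summary = line[2:].strip()
        elif line.startswith("## "):
            current = line[3:].strip()
            doc.sections[current] = []
        elif current is not None:
            match = LINK_RE.match(line)
            if match:
                doc.sections[current].append(
                    (match["name"], match["url"], match["desc"] or "")
                )
    return doc


def lint(doc: LlmsTxt, max_links: int = 50) -> list[str]:
    """Return a list of human-readable problems; empty means the file looks sane."""
    problems = []
    if not doc.title:
        problems.append("missing H1 title")
    if not doc.summary:
        problems.append("missing blockquote summary")
    total = sum(len(links) for links in doc.sections.values())
    if total == 0:
        problems.append("no links under any H2 section")
    if total > max_links:
        problems.append(f"{total} links exceeds budget of {max_links}")
    return problems


def without_optional(doc: LlmsTxt) -> LlmsTxt:
    """Drop the Optional section, as a consumer short on context might."""
    kept = {name: links for name, links in doc.sections.items() if name != "Optional"}
    return LlmsTxt(doc.title, doc.summary, kept)
```

Running lint in a publishing pipeline catches the 400-link failure mode before the file ships.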

llms.txt vs sitemap.xml vs robots.txt

The three files do different jobs and coexist on the same domain. Robots.txt sets crawl permissions. Sitemap.xml lists every indexable URL for traditional search crawlers. llms.txt is a curated, narrative map written for LLMs, not for ranking algorithms.

Sitemap.xml is exhaustive. It might list 800 URLs for a multi-location treatment center. llms.txt should be tight. Twenty to fifty links is typical, prioritized by what the operator wants the model to quote when someone asks about the facility.

Robots.txt is a permission layer. It tells GPTBot, ClaudeBot, PerplexityBot, and traditional Google crawlers what they can and cannot fetch. llms.txt assumes the model already has permission. The two files complement each other. If a facility blocks AI crawlers in robots.txt, an llms.txt file is mostly decorative.
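Whether robots.txt undercuts llms.txt can be checked directly. The sketch below uses Python's standard-library robots parser to list which AI crawlers are blocked from a given URL. The function name and the agent list are illustrative, the user-agent tokens are the ones named in this article, and the standard-library parser may not match every crawler's own interpretation of the rules.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in this article; extend as needed.
AI_AGENTS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def blocked_agents(robots_txt: str, url: str, agents=AI_AGENTS) -> list[str]:
    """Return the agents that the given robots.txt content forbids from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]
```

A non-empty result for https://yourdomain.com/llms.txt, or for the pages it links to, means the file is pointing those crawlers at content they are not allowed to read.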

Implementation Walk-Through for Treatment Centers

The build is short. A facility producing llms.txt for the first time can finish it in a working session. The harder part is deciding what to include. The file is a curation, so every link earns its slot.

What to include under Core

The Core section is what the model reads first. Put the pages that define the facility: the main services page, the admissions page, the levels-of-care explainer, the accreditation and licensing page, and the clinical leadership bios. These are the pages that establish standing in any AI answer about the brand.

What to include under Modalities

Group every modality and program page under one section: CBT, DBT, EMDR, MAT, family programs, and alumni programs. These pages are also where semantic triples tend to live, since each modality page describes a clinical relationship the model can extract and reuse.

What to include under Optional

Use Optional for blog archives, alumni stories, and topic clusters that build authority but do not need to be in the first fetch. The Optional section is also a good home for the facility’s glossary or definitions page, since those are reference assets the model may pull on demand.

What to leave out

Leave out thin location pages, duplicate landing pages built for paid traffic, outdated press releases, and any content that is not clinically reviewed. The file is a brand statement. Anything that gets cited from it should be content the operator would stand behind on the phone with a journalist.
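Once the Core, Modalities, and Optional decisions are made, the file itself can be generated from a small data structure so it stays in sync with the site. The sketch below is one way to do that. The function name and input shape are hypothetical, and the output follows the example format shown earlier in this article.

```python
def render_llms_txt(
    name: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Render an llms.txt document.

    sections maps an H2 heading to (title, url, description) tuples, in the
    order the headings should appear. Empty sections are skipped.
    """
    lines = [f"# {name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        if not links:
            continue  # Do not emit a heading with nothing under it.
        lines += [f"## {heading}", ""]
        lines += [f"- [{title}]({url}): {desc}" for title, url, desc in links]
        lines.append("")
    # Collapse trailing blank lines to a single final newline.
    return "\n".join(lines).rstrip() + "\n"
```

Keeping the link list in version control alongside the site makes each addition a reviewable change, which suits a file where every link has to earn its slot.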

Current Status and Adoption

As of mid-2026, llms.txt remains a proposal. Google, OpenAI, Anthropic, and Perplexity have not formally announced that they read the file as a ranking or selection input. Independent testing suggests some engines fetch the file opportunistically, but that behavior is neither documented nor guaranteed.

Adoption among publishers is climbing. Major documentation sites including Anthropic, Cloudflare, Hugging Face, and Stripe ship the file. The pattern is most common where the publisher already cares about being read accurately by developer-facing models. Behavioral health adoption is early and uneven.

The right operator stance is to ship the file now and treat it as a hedge. The cost is low. The upside, in the scenario where one or more major engines confirm the signal, is being early in a category where most competitors will need a quarter to catch up.

Webserv’s llms.txt as Proof of Concept

Webserv ships its own file at webserv.io/llms.txt. The structure follows the spec, with a few additions tuned for an agency that wants to be quoted accurately in AI Mode, AI Overviews, and ChatGPT.

The Webserv file opens with a one-sentence positioning statement: a digital marketing agency for behavioral health treatment centers. It then surfaces the MCP server, the AI Information page, the agent card, and the markdown alternate convention that every public URL on the domain supports.

The Pages section is a curated list of capability pages, case studies, playbooks, and resources. Each link carries a short description written for the model, not the human reader. The descriptions front-load the verbs and the proof.

The file is a working example treatment centers can borrow from. It is also a hint at where this category is going. The combination of llms.txt, markdown alternates, MCP endpoints, and the Google Knowledge Graph is starting to look like the AI-era press kit.

Shipping llms.txt as Part of an AEO Practice

llms.txt is one tactic inside a broader answer engine optimization practice. The file by itself does not win citations. The content it points at does.

Webserv’s AEO practice and SEO capability build the underlying content depth, clinical review, and structural signals that make a facility worth citing. The llms.txt file then puts that work on a shelf the engines can reach.
