Cloudflare's Agent Readiness Score Explained

Cloudflare's isitagentready.com scores sites across 4 dimensions for AI-agent compatibility. What each checks and the small-site fixes that matter.

By Rob · 14 June 2026 · 9 min read

Cloudflare quietly launched isitagentready.com as a free public tool that scores any site for AI-agent readiness. The four-dimension model (discoverability, content, access control, capabilities) is genuinely useful as a checklist - and the score itself is a useful conversation-starter for owners trying to decide what to actually do about AI bots in 2026.

The launch matters because, until now, there was no shared scoring framework. Site owners had a vague sense that they should do something about ChatGPT, Perplexity, and Claude crawling their content, but no specific gauge of where they stood. The Agent Readiness Score gives a number, breaks it into actionable chunks, and points at fixes. This guide walks through each dimension in plain English and gives a prioritised punch-list for small-site owners.

What does the Agent Readiness Score actually measure?

Four dimensions, weighted differently depending on the type of site Cloudflare detects.

  • Discoverability - can AI agents find your content cleanly? Checks llms.txt presence, robots.txt explicitness on AI crawlers, sitemap quality, and canonical clarity.
  • Content - is the content machine-parseable? Checks structured data (Schema.org / JSON-LD), HTML cleanliness, semantic markup, image alt text, and language clarity.
  • Access control - which AI agents do you explicitly allow or block? Checks robots.txt entries for the ~20 major AI crawlers, IP blocks, and HTTP-level controls.
  • Capabilities - can an AI agent actually act on your site beyond reading? Checks MCP endpoints, well-documented APIs, OAuth flows, and machine-friendly data formats.

The first three are reasonable for any site to engage with. The fourth (capabilities) is essentially asking whether your site has been deliberately designed as an AI-actionable surface - which is overkill for a blog, useful for a SaaS, and essential for an e-commerce destination.

How does the discoverability score work, and what should you do about it?

Three concrete checks dominate the discoverability score.

First, llms.txt. This is the emerging convention - a plain-text file at /llms.txt that tells AI agents what your site is about, what the most important content is, and how it is organised. The format is simple (a short overview plus a list of important URLs), it is well-described in the llms.txt specification, and almost no small UK sites have one yet. Implementing it is a 15-minute task: write a 200-word overview, list your 20 most important pages, save as /llms.txt at the root. The Agent Readiness Score notices its absence immediately.
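As a rough illustration of that 15-minute task, the sketch below assembles an llms.txt body in Python. It assumes the Markdown layout proposed at llmstxt.org (an H1 title, a blockquote summary, then H2 sections of links); the site name, URLs, and the `Page` and `build_llms_txt` names are hypothetical placeholders, not part of any official tooling.

```python
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    url: str
    note: str = ""  # optional one-line description shown after the link


def build_llms_txt(site_name: str, summary: str, sections: dict[str, list[Page]]) -> str:
    """Render an llms.txt body: H1 title, blockquote summary, H2 link lists."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, pages in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for page in pages:
            entry = f"- [{page.title}]({page.url})"
            if page.note:
                entry += f": {page.note}"
            lines.append(entry)
        lines.append("")
    # Collapse trailing blank lines to a single final newline.
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical site and pages, for illustration only.
example = build_llms_txt(
    "Example Smart Home Blog",
    "Independent reviews and guides for UK smart-home buyers.",
    {
        "Guides": [
            Page("Choosing a smart thermostat", "https://example.com/guides/thermostats", "buyer's guide"),
        ],
        "Reviews": [
            Page("Video doorbell round-up", "https://example.com/reviews/doorbells"),
        ],
    },
)
print(example)
```

Save the printed output as llms.txt at the site root. In practice the overview would run closer to the 200 words suggested above, and the sections would cover your 20 most important pages.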

Second, robots.txt explicitness. Most small sites have an implicit robots.txt - either the default 'allow everything' or a copy-pasted block that does not name AI crawlers specifically. The Agent Readiness Score rewards sites that explicitly name each major AI crawler (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Bytespider, CCBot, etc.) with an Allow or Disallow directive, even if the answer is 'Allow'. Explicit beats implicit because it signals that you have thought about it.
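One way to produce such a block is sketched below: a small Python helper that writes one User-agent group per crawler. The crawler names come from the list above; the allow/block choices are placeholders for your own decisions, and `render_ai_block` is a hypothetical helper name. The standard-library parser is used at the end only as a sanity check that the generated text reads back the way it was intended.

```python
from urllib.robotparser import RobotFileParser

# True = allow, False = block. These choices are placeholders, not a recommendation.
AI_CRAWLER_POLICY = {
    "GPTBot": True,
    "ClaudeBot": True,
    "PerplexityBot": True,
    "Google-Extended": True,
    "Bytespider": False,
    "CCBot": False,
}


def render_ai_block(policy: dict[str, bool]) -> str:
    """Return one explicit User-agent group per crawler."""
    groups = []
    for agent, allowed in policy.items():
        rule = "Allow: /" if allowed else "Disallow: /"
        groups.append(f"User-agent: {agent}\n{rule}")
    return "\n\n".join(groups) + "\n"


robots_block = render_ai_block(AI_CRAWLER_POLICY)
print(robots_block)

# Sanity check: parse the generated text and confirm each decision round-trips.
parser = RobotFileParser()
parser.parse(robots_block.splitlines())
for agent, allowed in AI_CRAWLER_POLICY.items():
    assert parser.can_fetch(agent, "https://example.com/any-page") == allowed
```

Append the printed block to your existing robots.txt rather than replacing the file, so any rules you already have for other bots stay in place.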

Third, sitemap quality. The score checks that the sitemap exists, is referenced from robots.txt, and lists canonical URLs only (no redirects, no 404s, no orphans). If a recent build problem has left sitemap entries pointing at stale slugs, the Agent Readiness check will pick it up.
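The sitemap checks can be approximated locally before re-running the score. The sketch below is an assumption about how you might self-audit, not a description of Cloudflare's implementation: it pulls the URLs out of a sitemap, finds Sitemap lines in robots.txt, and flags any URL whose status is not 200. The status lookup is injected as a function so the example runs offline against hypothetical data; in real use it would issue a request per URL without following redirects.

```python
import xml.etree.ElementTree as ET
from typing import Callable

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml: str) -> list[str]:
    """Return every <loc> value from a urlset sitemap."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS) if loc.text]


def sitemaps_in_robots(robots_txt: str) -> list[str]:
    """Return the URLs declared on 'Sitemap:' lines in robots.txt."""
    found = []
    for line in robots_txt.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "sitemap" and value.strip():
            found.append(value.strip())
    return found


def stale_entries(urls: list[str], status_of: Callable[[str], int]) -> dict[str, int]:
    """Map each URL that does not answer 200 to the status it returned."""
    stale = {}
    for url in urls:
        status = status_of(url)
        if status != 200:
            stale[url] = status
    return stale


# Hypothetical sample data so the sketch runs without network access.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/old-slug</loc></url>
</urlset>"""
SAMPLE_ROBOTS = "User-agent: *\nAllow: /\nSitemap: https://example.com/sitemap.xml\n"
FAKE_STATUS = {"https://example.com/": 200, "https://example.com/old-slug": 404}

stale = stale_entries(sitemap_urls(SAMPLE_SITEMAP), FAKE_STATUS.get)
print("Sitemaps declared in robots.txt:", sitemaps_in_robots(SAMPLE_ROBOTS))
print("Stale sitemap entries:", stale)
```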

What does the content score reward?

Two things really matter here: structured data and HTML cleanliness.

Structured data means JSON-LD blocks describing each page in Schema.org vocabulary. For a typical small site that is Article, Review, Comparison (ItemList), FAQPage, BreadcrumbList, and Organization at minimum. The Agent Readiness Score verifies the schemas are present, valid, and resolve to known Schema.org types. Missing or invalid JSON-LD drags the content score down materially.
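A quick local check for "present and valid" can be sketched with the Python standard library alone. The example below is a simplified assumption of what such a check might look like: it extracts JSON-LD script blocks from a page, counts the ones that fail to parse, and reports which expected types are missing. It does not verify the types against the Schema.org vocabulary and does not descend into @graph containers; `audit_jsonld` and the sample page are hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def audit_jsonld(html: str, expected_types: set[str]) -> dict:
    """Report which expected @type values are present and how many blocks fail to parse."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    found, invalid = set(), 0
    for raw in extractor.blocks:
        try:
            doc = json.loads(raw)
        except json.JSONDecodeError:
            invalid += 1
            continue
        for item in doc if isinstance(doc, list) else [doc]:
            if isinstance(item, dict):
                types = item.get("@type", [])
                found.update([types] if isinstance(types, str) else types)
    return {"found": found, "missing": expected_types - found, "invalid_blocks": invalid}


# Hypothetical page: one valid Article block and one truncated (invalid) block.
SAMPLE_PAGE = """<html><head>
<script type="application/ld+json">{"@context": "https://schema.org", "@type": "Article", "headline": "Test"}</script>
<script type="application/ld+json">{"@type": "FAQPage", </script>
</head><body></body></html>"""

report = audit_jsonld(SAMPLE_PAGE, {"Article", "BreadcrumbList"})
print(report)
```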

HTML cleanliness is more nebulous but the practical checks are: semantic tags (article, section, header rather than wall-to-wall div), heading hierarchy that reflects actual structure (one h1 per page, then h2s in order), image alt text present on every image, and language attributes set on the html element. Small sites with WordPress or static-site-generator boilerplate often get this right by accident; sites with hand-rolled templates often miss two or three of these.
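Those practical checks are simple enough to approximate yourself. The sketch below uses Python's built-in HTML parser to count h1 elements, spot skipped heading levels, find images with no alt attribute, and confirm a lang attribute on the html element. It is an illustrative self-audit under the assumptions stated in the comments, not a reproduction of how the Agent Readiness Score evaluates a page; the `MarkupAudit` class and sample markup are hypothetical.

```python
from html.parser import HTMLParser

SEMANTIC_TAGS = {"article", "section", "header", "main", "nav", "footer"}


class MarkupAudit(HTMLParser):
    """Collect a few basic markup-hygiene signals from one page."""

    def __init__(self):
        super().__init__()
        self.has_lang = False
        self.h1_count = 0
        self.images_missing_alt = 0
        self.heading_skips = []  # (previous_level, level) pairs such as (1, 3)
        self.semantic_tags = set()
        self._last_level = 0

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "html" and attr.get("lang"):
            self.has_lang = True
        elif tag == "img" and "alt" not in attr:
            # An empty alt="" is left alone: it can be deliberate for decorative images.
            self.images_missing_alt += 1
        elif tag in SEMANTIC_TAGS:
            self.semantic_tags.add(tag)
        elif len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            level = int(tag[1])
            if level == 1:
                self.h1_count += 1
            if self._last_level and level > self._last_level + 1:
                self.heading_skips.append((self._last_level, level))
            self._last_level = level


# Hypothetical page with one skipped heading level and one image lacking alt.
SAMPLE_HTML = """<html lang="en-GB"><body><article>
<h1>Title</h1><h3>Skipped level</h3>
<img src="a.jpg" alt="A thermostat on a wall"><img src="b.jpg">
</article></body></html>"""

audit = MarkupAudit()
audit.feed(SAMPLE_HTML)
print("lang set:", audit.has_lang)
print("h1 count:", audit.h1_count)
print("images missing alt:", audit.images_missing_alt)
print("heading skips:", audit.heading_skips)
print("semantic tags:", sorted(audit.semantic_tags))
```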

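The score does NOT penalise marketing copy, CSS-heavy designs, or interactive widgets directly - it cares about what the bot sees in the HTML it receives. Some AI crawlers execute JavaScript, but several have been reported to read only the initial HTML response, so keep the content that matters in the server-rendered markup. The bar is parseable, not stripped-down.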

Should you allow or block AI crawlers?

This is the access-control dimension and it is the one where individual judgement matters most. The short answer for most independent content sites is 'allow with awareness' - allow the major AI crawlers (GPTBot, ClaudeBot, PerplexityBot, Google-Extended) because being cited inside AI assistants is increasingly important for discoverability, but understand that this is a trade-off and revisit it periodically as the AI landscape settles.

The arguments for blocking are real: AI crawlers consume bandwidth, the content may be used to train models without compensation, and the eventual citation may not drive readers back to your site. The arguments for allowing are also real: if you are not cited, you are not in the conversation, and the bandwidth cost for a typical content site is modest.

Whichever way you decide, what the score rewards is explicitness. A clear robots.txt that names each crawler and gives an explicit Allow or Disallow scores better than the same site with a default-allow implicit policy. The score does not judge your choice - it judges whether you have made one.
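To see where an existing robots.txt stands on explicitness, a rough audit like the one below can help. It is a sketch under a simple assumption: a crawler counts as "explicit" if it has its own User-agent line, regardless of whether the rule is Allow or Disallow. The function names and the sample file are hypothetical, and the crawler list should be extended to whichever bots you care about.

```python
def named_user_agents(robots_txt: str) -> set[str]:
    """Return the lower-cased names on every User-agent line."""
    agents = set()
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0]  # drop trailing comments
        key, _, value = line.partition(":")
        if key.strip().lower() == "user-agent" and value.strip():
            agents.add(value.strip().lower())
    return agents


def explicitness_report(robots_txt: str, crawlers: list[str]) -> dict[str, bool]:
    """Map each crawler to True if robots.txt names it, False if it relies on a default."""
    named = named_user_agents(robots_txt)
    return {crawler: crawler.lower() in named for crawler in crawlers}


# Hypothetical robots.txt: a wildcard rule plus one named AI crawler.
SAMPLE_ROBOTS = """User-agent: *
Disallow: /admin

User-agent: GPTBot  # decided in March
Allow: /
"""

report = explicitness_report(SAMPLE_ROBOTS, ["GPTBot", "ClaudeBot", "PerplexityBot"])
for crawler, explicit in report.items():
    print(f"{crawler}: {'explicit' if explicit else 'implicit (falls back to *)'}")
```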

What is the capabilities dimension - and should small sites care?

Capabilities is asking: can an AI agent do something on your site beyond reading the page? The score checks for documented APIs, machine-friendly data formats, OAuth flows the agent can use, and increasingly MCP (Model Context Protocol) endpoints that let an agent invoke specific actions.

For a typical content site or smart-home blog, the answer is no, and that is fine - this dimension will score low, and that low score is honest. The Cloudflare tool's overall grade is weighted to take site type into account, so a blog getting a 0 on capabilities and 90+ on the other three dimensions still ends up with a respectable overall score.
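To make the weighting idea concrete, here is a small worked example. The weights are invented purely for illustration; this article does not give Cloudflare's actual weights, and the real tool may combine dimensions differently. The point is only to show how a site-type-dependent weighted average lets a blog with zero capabilities still score well overall.

```python
DIMENSIONS = ("discoverability", "content", "access_control", "capabilities")

# HYPOTHETICAL percentage weights per site type. Not Cloudflare's real numbers.
WEIGHTS_BY_SITE_TYPE = {
    "blog": {"discoverability": 35, "content": 35, "access_control": 25, "capabilities": 5},
    "ecommerce": {"discoverability": 25, "content": 25, "access_control": 25, "capabilities": 25},
}


def overall_score(scores: dict[str, float], site_type: str) -> float:
    """Weighted average of the four dimension scores (each 0-100)."""
    weights = WEIGHTS_BY_SITE_TYPE[site_type]
    weighted_sum = sum(weights[d] * scores[d] for d in DIMENSIONS)
    return weighted_sum / sum(weights.values())


scores = {"discoverability": 90, "content": 90, "access_control": 90, "capabilities": 0}

blog_total = overall_score(scores, "blog")
shop_total = overall_score(scores, "ecommerce")
print(f"Scored as a blog: {blog_total}")
print(f"Scored as an e-commerce site: {shop_total}")
```

Under these made-up weights, the same four dimension scores come out at 85.5 when the site is treated as a blog and 67.5 when it is treated as an e-commerce site, which is the behaviour the paragraph above describes.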

For e-commerce sites, SaaS products, and any site where 'book a thing', 'check stock', or 'search inventory' is a useful agent action, the capabilities dimension is worth taking seriously. The current best path is exposing a small, well-documented JSON API and (where useful) wrapping it as an MCP server. That is genuinely a half-day to multi-day task and outside the scope of a small-site punch-list.
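For a sense of scale, the sketch below shows roughly what a "small JSON API" for a stock check could look like using only the Python standard library. Everything in it is hypothetical: the /api/stock path, the sku parameter, and the in-memory STOCK table stand in for your real catalogue. It is a starting point for thinking about the surface area, not production code, and it does not show the MCP wrapping step.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Hypothetical inventory; a real site would query its own database.
STOCK = {"sku-123": 4, "sku-456": 0}


def check_stock(sku: str) -> tuple[int, dict]:
    """Return an HTTP status and a JSON-serialisable body for one SKU."""
    if sku not in STOCK:
        return 404, {"error": "unknown sku", "sku": sku}
    quantity = STOCK[sku]
    return 200, {"sku": sku, "in_stock": quantity > 0, "quantity": quantity}


class StockHandler(BaseHTTPRequestHandler):
    """Serve GET /api/stock?sku=... as JSON."""

    def do_GET(self):
        url = urlparse(self.path)
        if url.path != "/api/stock":
            status, body = 404, {"error": "not found"}
        else:
            sku = parse_qs(url.query).get("sku", [""])[0]
            status, body = check_stock(sku)
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def make_server(port: int = 8000) -> HTTPServer:
    """Build (but do not start) a local server; call .serve_forever() to run it."""
    return HTTPServer(("127.0.0.1", port), StockHandler)


print(check_stock("sku-123"))
print(check_stock("sku-999"))
```

Keeping the lookup in a plain function (`check_stock`) separate from the HTTP handler is a deliberate choice here: the same function could later be exposed through another interface without rewriting the logic.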

What does a typical UK small-site score look like in 2026?

Most small UK sites land between 35 and 55 out of 100 on first scan, with this pattern:

  • Discoverability: 50-65 - sitemap and robots.txt usually exist but llms.txt is missing.
  • Content: 40-60 - basic schema markup exists but is often incomplete or invalid.
  • Access control: 25-40 - default-allow implicit policy on AI crawlers, no explicit decisions.
  • Capabilities: 0-15 - no APIs, no MCP, no expectation of agent-actionable surfaces.

The fastest way to lift the overall score by 15-20 points in an afternoon is to (1) add llms.txt, (2) add an explicit robots.txt block for the major AI crawlers, (3) sweep the JSON-LD on the top 10 pages for missing or invalid fields. Capabilities can be left at zero for most small sites without penalty.

What is the prioritised punch-list for a small-site owner?

In order of effort-to-impact ratio:

  1. Add /llms.txt (15 min) - immediate +5 to +8 on discoverability. Use the llms.txt spec at llmstxt.org as the template.
  2. Explicit AI-crawler robots.txt (15 min) - immediate +10 to +15 on access control. Give each major crawler its own User-agent group and decide allow / block per crawler.
  3. JSON-LD sweep on top 10 pages (1-2 hours) - immediate +5 to +10 on content. Run each page through Google's Rich Results Test and fix the warnings.
  4. Sitemap clean-up (30 min) - drops stale URLs, redirects, and 404s. +2 to +5 on discoverability.
  5. HTML / alt-text sweep (1 hour) - adds missing image alt text, fixes broken heading hierarchy. +3 to +5 on content.
  6. Capabilities (optional) - skip unless you have an obvious agent-action surface. The score lets you skip with grace.

Total effort: 3-4 hours across items 1 to 5 for a 25-point lift on a typical small UK site. The Agent Readiness Score from Cloudflare gives you the gauge to verify the improvement.

Frequently asked questions

Q01 Is the Agent Readiness Score the same as AEO (Answer Engine Optimisation)?
Closely related, not identical. AEO focuses on getting your content cited inside AI search results and assistant answers - it cares about content structure, source authority, and citation density. The Agent Readiness Score is broader and includes whether AI agents can act on your site (capabilities) in addition to reading it. Most of the discoverability and content dimensions overlap with AEO best practice.
Q02 Do I need to host my site on Cloudflare to get a score?
No. The isitagentready.com tool works on any public website regardless of hosting. Cloudflare obviously benefits from increased adoption of llms.txt and explicit robots.txt because it makes their AI-bot management products more useful, but the tool itself is host-agnostic.
Q03 How often should I re-run the score?
Quarterly is reasonable for a stable content site. After any significant change (new templates, new schema markup, new sections), re-run to confirm the score reflects the change. There is no scoring drift if your site does not change, so monthly is overkill.
Q04 What if I want to block AI crawlers entirely?
Use robots.txt with explicit Disallow lines for each named bot. The score rewards you for being explicit and does not penalise the blocking decision itself, only a lack of clarity about it. Note that some AI crawlers (notably Bytespider in the past) have been reported to ignore robots.txt, so if you want hard guarantees you need IP-level blocks at the CDN or origin.
Q05 Will following the punch-list affect my Google SEO?
Positively or neutrally. llms.txt is independent of Google. Schema markup improvements help Google rich results. Explicit robots.txt entries for AI crawlers do not affect Googlebot. Sitemap clean-up helps Google. HTML and alt-text improvements help Google. There is no downside to the punch-list from a traditional SEO perspective.
Q06 Is there a similar tool that is not from Cloudflare?
Several, though each covers only part of the picture. IndexNow, backed by Bing, is the nearest neighbour for the discoverability dimension, although it is a URL-submission protocol rather than a scoring tool. Schema.org's own validators cover the structured-data side of the content dimension. For access control, there is no single authoritative tool yet - the closest is a manual robots.txt review. The Cloudflare tool's value is bundling all four dimensions into one number.