How it differs from Site Readiness
Site Readiness checks whether a website is technically accessible and understandable to agents and crawlers. It looks at things like robots rules, structured data, page discoverability, snippet controls, accessibility, and agent-facing content quality.

AI Indexing runs separately because it is heavier. It sends page-level probes to external AI or search surfaces and waits for platform responses. A Site Readiness run can complete quickly from crawl and page analysis; an AI Indexing run may take longer because each platform has its own retrieval behavior, rate limits, and response time.
How we measure this
AI Indexing currently checks:

| Platform | How it is measured |
|---|---|
| ChatGPT | Title/domain-scoped web-search probe that checks whether ChatGPT cites the same canonical page. |
| Claude | Brave Search proxy for Claude’s web retrieval surface, using host-scoped slug queries. |
| Gemini | Grounded Gemini search probe that reads the result URLs from Gemini’s search response. |

URLs are normalized before they are compared, so www. or mobile subdomain variants, query strings, fragments, trailing slashes, and percent encoding should not turn the same page into a false miss.
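
The URL variants listed above can be illustrated with a minimal sketch. This is not OpenLens's implementation: the function names, the `VARIANT_PREFIXES` list, and the choice to drop the scheme and the whole query string are assumptions made for illustration only.

```python
from urllib.parse import unquote, urlsplit

# Hypothetical list of subdomain prefixes treated as variants of the same host.
VARIANT_PREFIXES = ("www.", "m.", "mobile.")


def normalize_url(url: str) -> str:
    """Reduce a URL to a comparable "host/path" key."""
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    for prefix in VARIANT_PREFIXES:
        rest = host[len(prefix):]
        # Only strip the prefix if a real domain (with a dot) remains.
        if host.startswith(prefix) and "." in rest:
            host = rest
            break
    # Decode percent-encoding and drop trailing slashes.
    path = unquote(parts.path).rstrip("/")
    # Scheme, query string, and fragment are ignored on purpose (assumption).
    return f"{host}{path}"


def same_page(a: str, b: str) -> bool:
    """True when two URLs normalize to the same key."""
    return normalize_url(a) == normalize_url(b)
```

With this sketch, `same_page("https://www.example.com/blog/post/?utm_source=x#top", "http://m.example.com/blog/post")` returns `True`. Dropping the whole query string is a simplification: it would wrongly merge pages on sites that use query parameters to identify distinct pages.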
Why Claude uses Brave Search
Anthropic does not expose a first-party per-page Claude index API. For Claude, OpenLens uses Brave Search as a retrieval proxy because Claude’s web search behavior has strong public evidence pointing to Brave, and because OpenLens performed independent empirical validation comparing Claude web-search results with Brave Search results across sampled exact-page probes. That validation found the similarity strong enough to use Brave as the closest practical public proxy for Claude retrieval coverage.

Anthropic’s web search tool docs describe Claude web search as a server-side tool that runs searches and returns cited sources. TechCrunch reported that Anthropic added Brave Search to its subprocessor list, that Simon Willison observed matching Claude and Brave citations for the same query, and that Claude’s internal web search schema exposed a BraveSearchParams name. See TechCrunch’s report and Brave’s Search API page.
This is still a proxy. A pass means the exact page is available through Brave Search for the query OpenLens ran. That is the closest practical public signal for Claude retrieval coverage, not a guarantee that every Claude answer will cite the page.
Scores and statuses
Each platform-page probe resolves to one of three product-facing outcomes.
- Indexed means the platform returned or cited the exact normalized page URL.
- Not indexed means the platform completed a grounded probe but did not return or cite that exact canonical page.
- Unknown means OpenLens could not get enough grounded provider evidence to decide. It is not the same thing as not indexed.
- Score is the percent of completed checks that passed, weighted through the same readiness scoring system used elsewhere in OpenLens.
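
The outcomes and score above can be sketched as follows. This is a hedged illustration, not the product's scoring code: the names are hypothetical, the sketch assumes that "completed checks" excludes Unknown results, and it leaves out the readiness weighting entirely because the weights are not described here.

```python
from collections import Counter
from enum import Enum
from typing import Iterable, Optional, Set


class Status(Enum):
    INDEXED = "indexed"
    NOT_INDEXED = "not_indexed"
    UNKNOWN = "unknown"


def resolve_status(probe_grounded: bool, returned_pages: Set[str], target_page: str) -> Status:
    """Map one platform-page probe to an outcome.

    `returned_pages` and `target_page` are assumed to be normalized already.
    """
    if not probe_grounded:
        # Not enough provider evidence to decide; distinct from NOT_INDEXED.
        return Status.UNKNOWN
    return Status.INDEXED if target_page in returned_pages else Status.NOT_INDEXED


def unweighted_score(statuses: Iterable[Status]) -> Optional[float]:
    """Percent of completed checks that passed, without any weighting.

    Assumption: UNKNOWN results do not count as completed checks.
    """
    counts = Counter(statuses)
    completed = counts[Status.INDEXED] + counts[Status.NOT_INDEXED]
    if completed == 0:
        return None  # nothing decidable, so no score
    return 100.0 * counts[Status.INDEXED] / completed
```

Under these assumptions, three Indexed results, one Not indexed, and one Unknown give an unweighted score of 75.0, because the Unknown result is left out of the denominator rather than counted as a failure.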
How accurate are these verdicts
Our verdicts are highly accurate, and we measured that accuracy against a labelled dataset. To check whether an assistant can find your page, we built our own prompts about that page that reliably get the assistant to show it when the page is indexed. We then tested those prompts against a labelled dataset of both popular and un-indexed pages to see how well they work.
- When we mark a page Indexed on ChatGPT or Claude, we are right about 100% of the time. For ChatGPT we only count pages the assistant actually cited, and for Claude we can search its Brave index directly.
- Gemini is the exception. Gemini does not reliably tell us which URLs it retrieved, so we read them from its response, and it can sometimes give a page URL it reconstructed from the site address rather than one it actually found. Treat a Gemini Indexed as slightly lower confidence. We are actively improving this.
- When we mark a page Not indexed, we are right more than 99% of the time based on our labelled dataset. That figure comes from the dataset, so per-site results can vary.
Page scope
Use Pages to crawl to choose how many discovered pages the run should inspect. Use Path prefix to restrict the run to one part of a site, such as /blog or /docs.
The run uses the discovered page list for the project and applies the selected scope before sending platform probes. Scanned pages count toward AI Indexing usage limits even when page details are hidden by plan gating.
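
As a rough sketch of how a page scope like this could be applied to a discovered page list: the function and parameter names below are hypothetical, and both the segment-boundary matching and the order of operations (filter by prefix first, then cap the count) are assumptions rather than documented behavior.

```python
from typing import List
from urllib.parse import urlsplit


def select_pages(discovered: List[str], path_prefix: str, pages_to_crawl: int) -> List[str]:
    """Filter discovered URLs by path prefix, then cap the number of pages."""
    prefix = path_prefix.rstrip("/")

    def in_scope(url: str) -> bool:
        path = urlsplit(url).path or "/"
        if not prefix:
            return True  # no prefix means the whole site is in scope
        # Match on a path-segment boundary so /blog does not match /blogroll.
        return path == prefix or path.startswith(prefix + "/")

    matches = [url for url in discovered if in_scope(url)]
    return matches[:pages_to_crawl]
```

For example, with a prefix of `/blog` and a page count of 2, a list containing `/`, `/blog`, `/blog/a`, `/blogroll`, and `/blog/b` would be reduced to the `/blog` and `/blog/a` pages.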
Monthly AI Indexing page allowances are separate from Site Readiness: Free includes 150 pages a month, Starter includes 1,000 pages per seat each month, and Agency includes 10,000 pages per seat each month. The allowance resets on the 1st of each month, UTC.
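
The allowance arithmetic above can be worked through in a short sketch. The plan keys and function names are hypothetical, and the sketch assumes the Free allowance is flat rather than per seat, which is how the wording above reads.

```python
from datetime import datetime, timezone

# Page allowances taken from the text above; the dictionary keys are hypothetical.
PAGES_PER_MONTH = {"free": 150, "starter": 1_000, "agency": 10_000}


def monthly_allowance(plan: str, seats: int = 1) -> int:
    """Monthly page allowance; Free is treated as flat, paid plans scale per seat."""
    base = PAGES_PER_MONTH[plan]
    return base if plan == "free" else base * seats


def next_reset(now: datetime) -> datetime:
    """First instant of the next calendar month in UTC.

    `now` must be timezone-aware; it is converted to UTC before the calculation.
    """
    now = now.astimezone(timezone.utc)
    if now.month == 12:
        return datetime(now.year + 1, 1, 1, tzinfo=timezone.utc)
    return datetime(now.year, now.month + 1, 1, tzinfo=timezone.utc)
```

Under these assumptions, a Starter workspace with three seats has 3,000 pages a month, and a run started late on 31 December UTC draws from the December allowance, with the next reset at 00:00 UTC on 1 January.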
Free and paid views
Free users can preview the first few page results and see summary counts for the full run. Paid plans reveal more per-page details, evidence, and the basic recommendation attached to each not-indexed or unknown result. The summary still counts every scanned page. If a run says 50 pages were checked, hidden rows are included in the score and platform totals.
Scheduling
Scheduled AI Indexing checks re-run on the configured cadence for the selected project URL. They use the current URL and schedule settings from the AI Indexing page. Scheduling is separate from Site Readiness scheduling. You can run Site Readiness and AI Indexing independently, and one does not block the other.
Cancellation
Cancel stops an in-flight AI Indexing run from continuing to process additional work. Any results already written remain visible as partial output, and the run is marked cancelled. Use cancellation when a run is taking too long, when the wrong project or scope was selected, or when you want to preserve usage for a narrower follow-up run.
How to use it
- Start with the default page count to verify the flow for a project.
- Use a path prefix when you care about one content section, such as blog posts or documentation.
- Compare platform totals first, then expand individual platform rows to inspect the pages that need attention.
- Treat a single not-indexed result as a signal to investigate, not proof that the platform can never retrieve the page.
- Treat unknown results as provider or runtime uncertainty. Re-run before making content changes based on them.
- Re-run after publishing content, changing crawl directives, or improving page metadata.