Methodology v7 · last reviewed May 2026

How I decide whether your page is ready to be cited.

Every audit scores 14 signals on your page against the way 6 AI engines actually pick what to cite. It is built on Google's own AEO guide and the latest 54-experiment meta-analysis from May 2026.

01 · What we measure

Readiness, not citations.

I don't query ChatGPT to ask "is your brand cited?" — that's a different tool. I score your page on the signals each engine uses to decide what's quotable: answer positioning, entity density, freshness, crawler access, format diversity, E-E-A-T. If you score high, you're ready to be cited. If you score low, here's why.

02 · The pipeline

Six steps. One audit.

Discover

Parse robots.txt and sitemap; detect paywalls and JS-heavy rendering. No llms.txt check — Google has confirmed it's not used.

Render

Crawl 5 pages via real Chrome (Cloudflare Browser Rendering). Average crawl: 8–14 seconds.

Extract

Pull markdown, headings, entities, schema, dates, outbound citations. Classify page intent (pricing / how-to / docs / etc.).

Score

14 weighted dimensions, 0–10 each. Hybrid AI-analyst + rule-based scoring. Per-engine re-weighting for ChatGPT, Claude, Gemini, Perplexity.

Fixes

8–12 recommendations, sorted by impact ÷ effort. Each fix names which engine it moves.
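As a rough illustration of the impact ÷ effort ordering, here is a minimal sketch. The `Fix` class, its 1–10 `impact` and `effort` scales, and the sample fixes are all hypothetical; the audit's real estimates are not published here.

```python
from dataclasses import dataclass


@dataclass
class Fix:
    title: str
    impact: float  # hypothetical estimate of score gain, 1-10
    effort: float  # hypothetical estimate of work required, 1-10 (must be > 0)
    engine: str    # engine the fix is expected to move


def rank_fixes(fixes: list[Fix]) -> list[Fix]:
    """Order fixes by impact divided by effort, highest ratio first."""
    return sorted(fixes, key=lambda f: f.impact / f.effort, reverse=True)


# Illustrative data only.
fixes = [
    Fix("Add author bylines", impact=4, effort=2, engine="Claude"),
    Fix("Unblock OAI-SearchBot in robots.txt", impact=9, effort=1, engine="ChatGPT"),
    Fix("Add a comparison table", impact=6, effort=4, engine="Perplexity"),
]

for fix in rank_fixes(fixes):
    print(f"{fix.impact / fix.effort:.1f}  {fix.title} ({fix.engine})")
```

A cheap, high-impact fix such as a robots.txt change rises to the top, while a larger content project with a similar payoff sinks.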

Ship

Persist + render the report. Shareable, exportable, re-runnable with a single click.

03 · 14 dimensions

The fourteen signals I score.

Each dimension is scored 0–10, and the weighted total is normalized to 0–100 for the overall score. The weights shown are the defaults; per-engine scores re-weight them based on what that specific engine prefers. Re-balanced May 2026 using Google's published guidance + Cyrus Shepard's 54-experiment Zyppy meta-analysis.

01 · Semantic completeness · 16%

Does the page cover the full sub-topic graph a user would expect? Pages scoring >8.5/10 here are 4.2× more likely to be cited (Wellows, 2026).

02 · Search rank + fan-out · NEW · May 2026 · 12%

The strongest empirical signal in 2026. Top-10 organic results contain 38% of AI Overview citations; 47% come from below rank #5. AI fans out 5–15 sub-queries per question and pulls from whatever ranks for each.

03 · Entity knowledge · 12%

Count and disambiguate named entities. Pages with 15+ connected entities are 4.8× more likely to be selected (Wellows).

04 · Quotability · 12%

AI engines extract 134–167-word atomic chunks (Wellows 2026, updated from the 50–150 range used through 2024). I find every passage that could be lifted verbatim and flag the gaps.
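One way to picture the length check is the sketch below. It assumes passages are paragraphs separated by blank lines and that a plain whitespace word count is good enough; both are simplifications of mine, and the function name `classify_passages` is hypothetical.

```python
import re

MIN_WORDS, MAX_WORDS = 134, 167  # chunk range quoted above (Wellows 2026)


def classify_passages(markdown: str) -> list[tuple[int, str]]:
    """Split text on blank lines and label each paragraph by word count."""
    results = []
    for para in re.split(r"\n\s*\n", markdown.strip()):
        n = len(para.split())
        if n < MIN_WORDS:
            label = "too short"
        elif n > MAX_WORDS:
            label = "too long"
        else:
            label = "quotable length"
        results.append((n, label))
    return results


# Synthetic example: one 140-word paragraph and one 20-word paragraph.
sample = " ".join(["word"] * 140) + "\n\n" + " ".join(["word"] * 20)
print(classify_passages(sample))
```

Length is only a first filter. Whether a passage stands alone without surrounding context still needs a semantic check, which this sketch does not attempt.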

05 · Answer positioning · 10%

Mirror the buyer's question in your H1, then answer it in 1–2 sentences before expanding. AI extracts from the first ~80 words.
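A crude way to test this on your own page is to check whether the key terms of the answer appear in the first 80 words. The sketch below is a keyword heuristic I am assuming for illustration, not the audit's actual scoring; `answers_early` and its parameters are hypothetical names.

```python
def opening_words(body_text: str, limit: int = 80) -> str:
    """Return the first `limit` whitespace-separated words of the text."""
    return " ".join(body_text.split()[:limit])


def answers_early(body_text: str, key_terms: list[str], limit: int = 80) -> bool:
    """True if every key term appears within the opening `limit` words."""
    opening = opening_words(body_text, limit).lower()
    return all(term.lower() in opening for term in key_terms)


# Synthetic pages: one leads with the answer, one buries it.
good = "An AEO audit scores how ready a page is to be cited. " + "filler " * 200
buried = "filler " * 100 + "An AEO audit scores how ready a page is to be cited."

print(answers_early(good, ["audit", "cited"]))
print(answers_early(buried, ["audit", "cited"]))
```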

06 · Multi-modal richness · NEW · May 2026 · 8%

Pages combining text + images + video + structured data see +239–317% selection rates vs. text-only (Wellows).

07 · Outline quality · 6%

Clean heading hierarchy is the 9th-ranked factor in Shepard's May 2026 meta-analysis. I check hierarchy validity, question-form ratio, and heading-to-word density.
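Two of these checks can be sketched on extracted markdown. The sketch assumes ATX-style `#` headings, treats "valid" as "never jumps more than one level deeper than the previous heading", and counts a heading as question-form if it ends in `?`. Those definitions are my assumptions, and the function names are hypothetical.

```python
import re

# Matches ATX headings such as "## Pricing"; spaces and tabs only, so a match never spans lines.
HEADING_RE = re.compile(r"^(#{1,6})[ \t]+(.*\S)[ \t]*$", re.MULTILINE)


def parse_headings(markdown: str) -> list[tuple[int, str]]:
    """Return (level, text) for every heading, in document order."""
    return [(len(m.group(1)), m.group(2)) for m in HEADING_RE.finditer(markdown)]


def skipped_levels(headings: list[tuple[int, str]]) -> list[tuple[int, str]]:
    """Return headings that jump more than one level deeper than the one before.

    The first heading is compared against level 0, so a page that opens
    with an H2 or deeper is flagged as well.
    """
    problems = []
    prev = 0
    for level, text in headings:
        if level > prev + 1:
            problems.append((level, text))
        prev = level
    return problems


def question_ratio(headings: list[tuple[int, str]]) -> float:
    """Share of headings phrased as questions (ending in '?')."""
    if not headings:
        return 0.0
    return sum(text.endswith("?") for _, text in headings) / len(headings)


doc = """# What is AEO?
## How does scoring work?
#### Deep detail
## Pricing
"""
headings = parse_headings(doc)
print(skipped_levels(headings))
print(question_ratio(headings))
```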

08 · Original data points · NEW · May 2026 · 6%

Pages with 3+ unique data points (original surveys, internal benchmarks) are 4× more likely to be cited. Google's own guide explicitly contrasts commodity content with non-commodity, expert-driven pages.

09 · Crawlability · 6%

Test allow/deny for OAI-SearchBot, GPTBot, ClaudeBot, PerplexityBot, Google-Extended. I deliberately don't check llms.txt: Google says it's not used, and Shepard ranks it 2.0/10.
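You can approximate the allow/deny test yourself with Python's standard-library `urllib.robotparser`. This is a sketch under assumptions: the robots.txt content and the URL are made up, and the standard-library parser applies the first matching rule, so its verdict can differ from how a given crawler actually interprets the file.

```python
from urllib.robotparser import RobotFileParser

AI_CRAWLERS = ["OAI-SearchBot", "GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def crawler_access(robots_txt: str, url: str) -> dict[str, bool]:
    """Report, per crawler token, whether robots.txt permits fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in AI_CRAWLERS}


# Hypothetical robots.txt that blocks GPTBot and allows everyone else.
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

access = crawler_access(sample, "https://example.com/pricing")
for agent, allowed in access.items():
    print(f"{agent:16} {'allow' if allowed else 'deny'}")
```

In this sample only GPTBot is denied; the other four tokens fall through to the `*` group and are allowed.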

10 · Schema stack · 4%

Google: structured data isn't required for AI Overviews. Shepard ranks schema #20/23 with a "typically small" effect. Supplementary, not load-bearing.

11 · Topical cluster · 3%

Orphan pages with no internal links score poorly because AI sees no topic territory. I map the internal link graph and anchor concentration.

12 · E-E-A-T · 2%

Experience, Expertise, Authoritativeness, Trustworthiness. 96% of cited content carries verified E-E-A-T signals (Wellows).

13 · Performance · 2%

LCP, JS ratio, render mode. JS-heavy pages get discounted by engines that don't run JS.

14 · Freshness · 1%

23% of cited content is <30 days old. AIO heavily discounts pricing pages older than 90 days.
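Taken together, the fourteen default weights sum to 100. The sketch below shows one plausible way to combine them: a weighted mean of the 0–10 dimension scores, rescaled to 0–100. The exact formula is my assumption (the text above only says scores are normalized to 0–100), and the dictionary keys are hypothetical identifiers for the dimensions listed above.

```python
DEFAULT_WEIGHTS = {
    "semantic_completeness": 16,
    "search_rank_fan_out": 12,
    "entity_knowledge": 12,
    "quotability": 12,
    "answer_positioning": 10,
    "multi_modal_richness": 8,
    "outline_quality": 6,
    "original_data_points": 6,
    "crawlability": 6,
    "schema_stack": 4,
    "topical_cluster": 3,
    "eeat": 2,
    "performance": 2,
    "freshness": 1,
}


def overall_score(scores: dict[str, float],
                  weights: dict[str, float] = DEFAULT_WEIGHTS) -> float:
    """Weighted mean of 0-10 dimension scores, rescaled to 0-100."""
    total_weight = sum(weights.values())
    weighted = sum(weights[name] * scores[name] for name in weights)
    return 10 * weighted / total_weight


# Illustrative page: perfect everywhere except a stale publish date.
scores = {name: 10 for name in DEFAULT_WEIGHTS}
scores["freshness"] = 0
print(overall_score(scores))
```

Because the function divides by the sum of whatever weights it is given, a per-engine score can be sketched by passing a different `weights` dictionary without re-normalizing by hand.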

—Page intent · pre-score—

Before scoring, I classify the page: pricing / comparison / how-to / definition / listicle / landing / product / docs / blog / homepage. Every fix from there is intent-tailored.

04 · 6 engines

Six engines. Six rule sets.

Only 11% of cited domains overlap between ChatGPT and Perplexity. Google AIO and AI Mode cite the same URLs just 13.7% of the time. Optimizing for one ≠ optimizing for all. Each card shows that engine's top weighted dimensions.

Google AI Overviews

48–60% of searches

Top weights: Search rank + fan-out · Semantic completeness · Quotability
Bias: Top-10 organic still drives 38% of citations. But 47% come from below rank #5: answer-shaped pages outperform rankers.

ChatGPT (Search)

OAI-SearchBot grounding

Top weights: Entity knowledge · Answer positioning · Freshness
Bias: Heavy weight on entity disambiguation (sameAs, Wikidata). Cites <30-day content at 3.2× the rate of older sources.

Claude

Web search + citations

Top weights: Outline quality · E-E-A-T · Original data
Bias: Heavy E-E-A-T preference. Author bylines, named customers, real testimonials. Outbound citations to .gov / .edu lift scores meaningfully.

Perplexity

Live RAG

Top weights: Outbound authority · Multi-modal richness · Search rank
Bias: Cites Reddit at 45% above baseline. Off-page social context matters. Pages with comparison tables outperform pure prose.

Gemini

Google AI Mode

Top weights: Multi-modal richness · Schema · Entity knowledge
Bias: Most multi-modal-aware of the engines. Pages combining text + images + video + schema see +239–317% selection (Wellows).

Bing / Copilot

Bingbot + ChatGPT layer

Top weights: Crawlability · Semantic completeness · Topical cluster
Bias: Strict on robots.txt. Bingbot misconfig = invisible. Beyond crawl, weights overlap with both Google AIO and ChatGPT.

05 · AEO snake oil

Myths I'm ignoring (and you should too).

Six tactics that get sold as AEO requirements. I refuse to score for them. Citations below each.

llms.txt files

Google: "you don't need to create new machine readable files, AI text files, markup, or Markdown to appear in generative AI search." Shepard's May 2026 meta-analysis: dead last, 2.0/10, "no credible evidence of impact." Build the page, not the metadata file.

AI-specific content rewriting

Google: "you don't need to write in a specific way just for generative AI search. AI systems understand synonyms and general meaning." Writing for AI alone violates Google's scaled content abuse policy.

Tiny content chunking

Google: "there's no requirement to break your content into tiny pieces for AI to better understand it." Optimize for humans; AI follows.

Inauthentic mentions

Google: "Seeking inauthentic 'mentions' across the web isn't as helpful as it might seem." Real audience signal beats astroturfing.

Special AEO schema

Google: "there's no special schema.org markup you need to add." Schema ranks #20/23 with a typically small effect (Zyppy, 2026). Useful for rich results. Not load-bearing for AEO.

Domain authority hunting

Link-based DA correlation with AI citations: r=0.18 (weak). Spend the budget on original data instead — that has 4× citation lift.

06 · Questions

Common questions.

Where do the weights come from?

Initial weights from public research (Google's AEO guide, Cyrus Shepard's May 2026 Zyppy meta-analysis, Wellows' ranking-factor breakdown). Re-calibrated monthly using delta data from users who re-run audits after shipping fixes — I measure which dimension changes actually moved per-engine scores.

Why 14 dimensions, not more?

Do I need an llms.txt file?

Is schema required for AI Overviews?

Should I write content specifically for AI?

How accurate are the per-engine scores?

07 · Let's go

Find out if AI is citing you.

Free. 90 seconds. 4 engines.

Run my audit →
