AI Coverage · Standard Operating Procedure
A repeatable procedure for finding which of your pages get cited, why each one does or doesn't, and exactly what to fix. Four parts: classify intent, run the six-step analysis (ending in a findings report), fill the gaps, then track over time.
Why this exists
AI tools tell you which pages they cite or crawl, but never the prompts behind them. Search Console's AI impression data (AI Overviews and AI Mode) and your own server logs, where AI crawlers such as ChatGPT's appear, both confirm pages, not queries. This process works backward from a cited page to the prompts that likely triggered it, so you have something concrete to optimize.
Part 1
Classifying intent is the first decision, every time, before any analysis. Intent usually follows page type: blog and resource pages are informational, service and product pages are commercial. Intent sets the goal: informational pieces aim for a citation (be the trusted source the answer quotes), commercial pieces aim for a recommendation (be named in the response). Each needs different extraction archetypes, and commercial pages are where ROI primarily comes from.
Informational
Goal: citation. Be the trusted source the answer quotes.
Why: a citation lets you shape the messaging in ways that align with your organization's position.
Knowledge-seeking. Look for Quick Facts, FAQ answers, bounded lists, and named statistics.
Commercial
Goal: recommendation. Be named as an option included in the response. This is the primary source of ROI.
Why: drive new business to your website.
Provider- or purchase-seeking. Look for named attributes, criteria lists, fee/process language, and jurisdiction specifics.
Commercial intent isn't one thing. The content a page needs depends on the angle, so match it to the prompt:
| If the prompt is... | Your content must prove... |
|---|---|
| "best [service] in [city]" | Why you're the best there: case results, awards, reviews, named specializations |
| "[service] cost / how much" | Transparent pricing: fee structure, ranges, what's included |
| "[service] near me" | Location and reach: address, service area, hours, local proof |
| "[service] for [situation]" | Fit: you handle that exact situation, with proof |
The rule
Start from the page type. A blog post is almost always informational; a service or product page is commercial. Read the page first, then reconstruct the queries that intent should win (Part 2). A rare hybrid page gets classified by its primary purpose.
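The rule can be written down as a small lookup, which helps when you classify a long page inventory. This is a minimal sketch: the page-type labels and the `classify_intent` helper are hypothetical names, so substitute whatever labels your CMS or sitemap actually uses.

```python
from enum import Enum


class Intent(Enum):
    INFORMATIONAL = "informational"
    COMMERCIAL = "commercial"


# Goal per intent, as stated in Part 1.
GOAL = {Intent.INFORMATIONAL: "citation", Intent.COMMERCIAL: "recommendation"}

# Hypothetical page-type labels (assumption); map your own types here.
PAGE_TYPE_INTENT = {
    "blog": Intent.INFORMATIONAL,
    "resource": Intent.INFORMATIONAL,
    "service": Intent.COMMERCIAL,
    "product": Intent.COMMERCIAL,
}


def classify_intent(page_type, primary_purpose=None):
    """Return the intent for a page type.

    Unknown or hybrid page types fall back to the primary purpose
    supplied by whoever read the page.
    """
    key = page_type.strip().lower()
    if key in PAGE_TYPE_INTENT:
        return PAGE_TYPE_INTENT[key]
    if primary_purpose is not None:
        return primary_purpose
    raise ValueError(f"Read the page and pass primary_purpose for {page_type!r}")
```

A hybrid page is deliberately not guessed: the function raises until a person has read the page and supplied its primary purpose.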
Part 2
The analysis runs the same for both intent types, with different extraction targets per step. Work one confirmed page at a time.
Step 1
Two page-level sources confirm which pages AI engines actually use: Google Search Console's AI impression data (citations in AI Overviews and AI Mode) and your website's own server logs, which show ChatGPT and other AI crawlers fetching your pages. Both tell you the page and whether it was used, but neither gives you the query or prompt. That missing piece is what the rest of this process reconstructs.
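For the server-log side, a short script can list which pages AI crawlers fetched. This is a rough sketch under two assumptions: your logs use the common Combined Log Format, and the user-agent substrings below are the ones you care about. Check each vendor's current documentation before relying on the token list, and note that `ai_crawler_hits` is an illustrative name.

```python
import re
from collections import Counter

# Assumed user-agent substrings; verify against each vendor's documentation.
AI_CRAWLER_TOKENS = (
    "GPTBot",
    "ChatGPT-User",
    "OAI-SearchBot",
    "PerplexityBot",
    "ClaudeBot",
)

# Combined Log Format: "request line" status size "referrer" "user agent".
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def ai_crawler_hits(lines):
    """Count successful (HTTP 200) fetches per (path, crawler token)."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match or match.group("status") != "200":
            continue
        agent = match.group("agent")
        for token in AI_CRAWLER_TOKENS:
            if token in agent:
                hits[(match.group("path"), token)] += 1
                break  # count each request once
    return hits
```

Pass it an open log file (`ai_crawler_hits(open("access.log"))`, path assumed) and the result gives you the confirmed pages to carry into Step 2. It still tells you nothing about prompts.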
Step 2
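Catalog each passage an engine could lift. The archetypes differ by intent: on informational pages, look for Quick Facts, FAQ answers, bounded lists, and named statistics; on commercial pages, look for named attributes, criteria lists, fee/process language, and jurisdiction specifics.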
Then read the page through the goal's lens to judge which prompts it can actually win:
Informational: does it earn a citation?
Commercial: does it earn a recommendation?
Across both, the content has to be information-dense, unique, and supported (examples, facts, awards, original data). Thin or generic content rarely gets cited or recommended, however it's phrased.
Step 3
This is the reverse-engineering step: reconstruct the prompts you never see. For each extraction target, write the queries it could win, applying the specificity rule: specific enough that no .gov, .edu, or aggregator has a clean standalone answer. Build them with the formula:
| Intent | + Entity / topic | + Qualifier | = Candidate query |
|---|---|---|---|
| Informational | the claim or process | circumstance / threshold | "can an IIED claim be filed alongside a PI case in [state]" |
| Commercial | the service | jurisdiction / criterion | "flat-fee estate planning attorney in [city]" |
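The formula in the table is mechanical enough to generate candidates in bulk. The sketch below assumes you keep your own phrasing templates with `{entity}` and `{qualifier}` slots; `candidate_queries` and the templates are illustrative, and the output still needs the specificity rule applied by hand.

```python
from itertools import product


def candidate_queries(templates, entities, qualifiers):
    """Fill each template with every entity/qualifier combination."""
    return [
        template.format(entity=entity, qualifier=qualifier)
        for template, entity, qualifier in product(templates, entities, qualifiers)
    ]


# Reproduces the two rows of the table above.
informational = candidate_queries(
    templates=["can an {entity} be filed alongside a PI case in {qualifier}"],
    entities=["IIED claim"],
    qualifiers=["[state]"],
)
commercial = candidate_queries(
    templates=["{entity} in {qualifier}"],
    entities=["flat-fee estate planning attorney"],
    qualifiers=["[city]"],
)
```

Replace the `[state]` and `[city]` placeholders with the jurisdictions you actually serve before testing the queries in Step 4.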
Step 4
Test each reconstructed prompt on every surface that matters: Google AI Overviews and AI Mode, ChatGPT, Perplexity, and Gemini (plus organic, scored independently). For each, record the outcome against the page's goal: are you cited (informational) or recommended (commercial)? Inclusion and organic rank are independent: an AI citation can come from a page that ranks poorly, and a strong organic page can be absent from AI entirely.
Step 5
For each citation, test three rephrasings: the noun ("victims" to "claimants"), the verb ("win" to "prevail"), and the framing ("how often" to "what percentage"). If the citation holds across at least 2 of 3, it is durable: you own the direction. If it breaks at the first rephrasing, it is fragile: you are winning on surface match rather than direction ownership.
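The durability rule can be applied consistently with a few lines of code. One assumption to flag: the rule only names two cases, so this sketch treats anything that holds on fewer than 2 of the 3 rephrasings as fragile. `classify_durability` is a hypothetical helper.

```python
REPHRASINGS = ("noun", "verb", "framing")


def classify_durability(held):
    """Classify a citation from its three rephrasing tests.

    `held` maps each rephrasing type to True if the citation survived it.
    Assumption: fewer than 2 of 3 survivals counts as fragile.
    """
    missing = [name for name in REPHRASINGS if name not in held]
    if missing:
        raise ValueError(f"Missing rephrasing tests: {missing}")
    survived = sum(bool(held[name]) for name in REPHRASINGS)
    return "durable" if survived >= 2 else "fragile"
```

For example, `classify_durability({"noun": True, "verb": True, "framing": False})` returns `"durable"`.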
Step 6
Compile the results into a report organized by page: for each page, list its reconstructed prompts and the outcome on each platform, scored against the page's goal. Read it by intent (your informational citation rate versus your commercial recommendation rate) so you can see where you stand on each. Every miss is a gap, and every gap is an opportunity to improve the content and increase AI inclusion. Classify each gap by type so you know which fix in Part 3 applies.
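As one possible shape for the report data, the sketch below stores a row per page, prompt, and platform, then computes the inclusion rate per intent and lists the misses. The `Result` fields and function names are assumptions made for illustration; a spreadsheet with the same columns works equally well.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    page: str
    intent: str      # "informational" or "commercial"
    prompt: str
    platform: str
    included: bool   # cited (informational) or recommended (commercial)


def inclusion_rate_by_intent(results):
    """Return the share of tests won for each intent."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for result in results:
        totals[result.intent] += 1
        wins[result.intent] += int(result.included)
    return {intent: wins[intent] / totals[intent] for intent in totals}


def gaps(results):
    """Every miss is a gap to classify and fix."""
    return [result for result in results if not result.included]
```

Grouping the output of `gaps` by `page` gives the per-page view the report calls for.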
Part 3
Every gap from your report has a specific fix, and the fix differs by intent. Match the prescription to the gap type, apply it, then re-test inclusion.
Reposition the lead claim and restructure the FAQ so the heading matches the query phrasing exactly: the question is the query.
Add jurisdiction and location specificity to the phrasing so it matches "[service] in [city]" patterns that institutional pages never address.
Add a Quick Facts summary block and convert prose to bounded lists, so one passage covers the direction without stitching.
Add criteria / selection lists and explicit fee and process language, formatted as extractable lists rather than paragraphs.
Remove context dependency so the answer stands alone, and niche the question so a .gov / .edu source has nothing clean to cite for it.
Replace vague claims with named attributes only a provider can state: specific services, credentials, fees, and locations.
Build your own dedicated section or page that owns the direction completely, with its own named H2 and a single clear extraction target.
Build a dedicated page for that service plus jurisdiction, leading with the named attributes and explicit fee / process specifics.
Part 4
Coverage is a moving target, so monitor it on a cadence rather than treating any single citation as won.
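One simple way to monitor on a cadence is to compare the set of wins from the previous run with the current one. This is a sketch under the assumption that each run is stored as a set of `(page, prompt, platform)` tuples where you were cited or recommended; `coverage_changes` is an illustrative name.

```python
def coverage_changes(previous, current):
    """Compare two runs of included (page, prompt, platform) tuples."""
    return {
        "lost": sorted(previous - current),    # slots to recapture
        "gained": sorted(current - previous),  # new inclusions
        "held": sorted(previous & current),    # still included
    }
```

Anything under `"lost"` goes back through Part 3 as a fresh gap.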
The honest caveat
Citations erode as competitors improve. The structural fixes do not lock in a permanent position; they raise the floor for recapture speed, so when you lose a slot you win it back faster. Treat coverage as maintained, not achieved.