AI Grounding · Observed from the traffic

What ChatGPT does before it answers "best X".

You type one prompt. Before it replies, ChatGPT expands it into several searches, fetches dozens of pages through commercial web-scraping services, throws most away, cites a few, and recommends fewer still. This is every narrowing step, captured from one real query.

Retrieval pipelines

Every fetched page is tagged with the service that retrieved it. The scraped tiers (bright, oxylabs) are what gets cited; the licensed tier (labrador) is the one that gets dropped.
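
As a rough illustration of the tagging described above, the sketch below groups fetched pages by tier and counts how many in each tier were cited. The service tags (bright, oxylabs, labrador) come from the text; the record layout (`url`, `service`, `cited`), the tier mapping, and the sample rows are assumptions made up for the example, not the capture's actual format or data.

```python
from collections import Counter

# Assumed mapping from the service tag seen in the traffic to a tier name.
# The tags are from the article; the dict itself is illustrative.
TIER_BY_SERVICE = {
    "bright": "scraped",
    "oxylabs": "scraped",
    "labrador": "licensed",
}


def tier_of(service):
    """Map a service tag to its tier, or 'unknown' for an unrecognised tag."""
    return TIER_BY_SERVICE.get(service, "unknown")


def tally(pages):
    """Count fetched and cited pages per tier."""
    fetched = Counter()
    cited = Counter()
    for page in pages:
        tier = tier_of(page["service"])
        fetched[tier] += 1
        if page["cited"]:
            cited[tier] += 1
    return {tier: {"fetched": fetched[tier], "cited": cited[tier]} for tier in fetched}


# Made-up records, purely to exercise the function.
sample_pages = [
    {"url": "https://example.com/a", "service": "bright", "cited": True},
    {"url": "https://example.com/b", "service": "oxylabs", "cited": False},
    {"url": "https://example.org/c", "service": "labrador", "cited": False},
]

print(tally(sample_pages))
```

Keeping the tag-to-tier mapping in one dictionary means a newly observed service tag shows up as "unknown" instead of being silently miscounted.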

The funnel

Every narrowing step

Six stages, top to bottom. Each panel shows the headline count; expand it for the underlying data. Panels narrow as the funnel drops, so the fall from 75 fetched to 3 recommended is visible at a glance.

Stage 1 · Fan-out: one prompt becomes several sub-queries

Stage 2 · Search waves: multiple waves, not one round trip

Stage 3 · Candidate pool: distinct pages fetched (deduplicated from rows across waves)

Stage 4 · Filter: pages dropped before the answer was written

Stage 5 · Cited: pages cited (as a share of what was fetched)

Stage 6 · Recommended: businesses named
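
The six headline counts can be derived from a capture in a few lines. The following is a minimal sketch, assuming a capture has already been parsed into a list of sub-queries, a list of fetch rows carrying `wave` and `url` fields, a list of cited URLs, and a list of recommended names. All of those field and parameter names are hypothetical, and the sample values are invented rather than taken from the real capture.

```python
def funnel_counts(sub_queries, fetch_rows, cited_urls, recommended):
    """Compute one headline count per funnel stage.

    fetch_rows may list the same URL in several waves, so the candidate
    pool is the set of distinct URLs rather than the number of rows.
    """
    waves = {row["wave"] for row in fetch_rows}
    pool = {row["url"] for row in fetch_rows}
    cited = pool & set(cited_urls)   # only count citations that were actually fetched
    dropped = pool - cited           # fetched but absent from the answer
    return {
        "sub_queries": len(sub_queries),
        "waves": len(waves),
        "fetch_rows": len(fetch_rows),
        "distinct_pages": len(pool),
        "dropped": len(dropped),
        "cited": len(cited),
        "cited_share": len(cited) / len(pool) if pool else 0.0,
        "recommended": len(recommended),
    }


# Invented sample: 5 fetch rows over 2 waves, 4 distinct pages, 1 cited.
counts = funnel_counts(
    sub_queries=["best x for small teams", "x pricing comparison"],
    fetch_rows=[
        {"wave": 1, "url": "https://example.com/a"},
        {"wave": 1, "url": "https://example.com/b"},
        {"wave": 2, "url": "https://example.com/a"},
        {"wave": 2, "url": "https://example.com/c"},
        {"wave": 2, "url": "https://example.com/d"},
    ],
    cited_urls=["https://example.com/a"],
    recommended=["Example Business"],
)
print(counts)
```

Separating `fetch_rows` from `distinct_pages` mirrors the Stage 3 distinction between rows across waves and the deduplicated pool.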

For those who want to go deeper

The authoritative tier gets fetched, then dropped

Among the pages ChatGPT fetches is a licensed / authoritative tier the traffic labels labrador (major publishers, Wikipedia, academic sources). Across many different prompts, this tier is consistently pulled into the candidate pool and then dropped before the answer is composed. The visible answer is sourced almost entirely from the scraped commercial tier. In this capture, the dropped authoritative pages were:

The composition is intent-matched: a B2B query like this one pulls academic papers; health questions pull medical and news publishers. The outcome (fetched then dropped) is constant; what gets fetched follows the topic.
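
One way to check the fetch-then-drop pattern across many prompts is sketched below. It assumes each capture is a list of page records with `url`, `service`, and `cited` fields (hypothetical names), and it reports the share of captures in which the labrador tier was fetched but none of it was cited. The sample captures are invented to exercise both branches and say nothing about the real traffic.

```python
def licensed_dropped(capture, licensed_tag="labrador"):
    """Return the licensed-tier URLs in one capture that were fetched but not cited."""
    return sorted(
        page["url"]
        for page in capture
        if page["service"] == licensed_tag and not page["cited"]
    )


def fetch_then_drop_rate(captures, licensed_tag="labrador"):
    """Share of captures where the licensed tier was fetched and none of it cited.

    Captures with no licensed-tier pages are excluded from the denominator.
    Returns None when no capture contains the licensed tier at all.
    """
    with_licensed = [
        c for c in captures if any(p["service"] == licensed_tag for p in c)
    ]
    if not with_licensed:
        return None
    fully_dropped = [
        c
        for c in with_licensed
        if not any(p["service"] == licensed_tag and p["cited"] for p in c)
    ]
    return len(fully_dropped) / len(with_licensed)


# Invented captures: the second one cites a licensed page so that the
# rate is not trivially 1.0; the third has no licensed pages at all.
captures = [
    [
        {"url": "https://example.com/a", "service": "bright", "cited": True},
        {"url": "https://example.org/p", "service": "labrador", "cited": False},
    ],
    [
        {"url": "https://example.com/b", "service": "oxylabs", "cited": True},
        {"url": "https://example.org/q", "service": "labrador", "cited": False},
        {"url": "https://example.org/r", "service": "labrador", "cited": True},
    ],
    [
        {"url": "https://example.com/c", "service": "bright", "cited": True},
    ],
]

print(licensed_dropped(captures[0]))
print(fetch_then_drop_rate(captures))
```

A rate near 1.0 over many real captures would support the "consistently dropped" observation. It would not, on its own, say anything about why the tier is dropped.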

Hypothesis, not conclusion

We see the authoritative tier fetched and then dropped, rather than cited, so regularly that we suspect ChatGPT may be using it as an internal grounding step rather than as a source. On this reading, it pulls known-reliable material to anchor its response in a factual region and to sanity-check the claims introduced by the lower-quality scraped pages it actually cites.

We can see that the authoritative tier is fetched and dropped. We cannot see why from the traffic; the ranking logic stays server-side. The grounding-and-validation idea is our best hypothesis, and the kind of thing worth testing rather than asserting.

Read the numbers carefully

Caveats

Stage 6 is parsed from prose, not a data field. The recommended names are extracted from the answer text, so they need human review. This capture is hand-verified.
Numbers are one capture; the shape is repeatable. The exact counts (75, 9, 3) come from this one query. The shape (fan-out, waves, fetch, filter, cite, recommend) and the fetch-then-drop pattern held across many prompts. The grounding explanation for the drop is a hypothesis, not an observed fact.
The pool is what reached the browser. Pages the engine fetched and discarded server-side never appear in the data. The fetched count is a floor on the real work, not a ceiling.
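
To make the first caveat concrete, here is a minimal sketch of pulling candidate business names out of answer prose. It assumes, purely for illustration, that names appear in Markdown bold in the answer text; nothing in the capture guarantees that, and the function and field names are hypothetical. Every extracted name starts out unverified, which is the point: the output is a review queue, not a result.

```python
import re

# Assumption: recommended names are wrapped in Markdown bold (**Name**).
BOLD = re.compile(r"\*\*(.+?)\*\*")


def candidate_names(answer_text):
    """Extract bolded names in order of first appearance, all flagged unverified."""
    seen = []
    for match in BOLD.finditer(answer_text):
        name = match.group(1).strip()
        if name and name not in seen:
            seen.append(name)
    return [{"name": name, "verified": False} for name in seen]


# Invented answer text with a repeated name.
sample_answer = (
    "Top picks: **Acme Co** for price and **Globex** for support. "
    "**Acme Co** also ships fast."
)

for candidate in candidate_names(sample_answer):
    print(candidate)
```

Bold text that is not a business name (a heading, an emphasised phrase) would be picked up too, which is why a human pass over the candidates remains necessary.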