How to improve your brand's visibility in AI search
Get cited in ChatGPT, Perplexity, and Gemini. Brand recognition sets your floor. On-page structure earns the citation. CheckThat data shows which levers actually move the odds.
Most teams treat AI search as one more keyword to chase, tuning the same signals that won them Google rankings. That instinct can backfire because AI citation does not run on the blue-link algorithm, so the old SEO chores alone do not earn it. Brand recognition sets your floor in AI search; valuable content on the page earns you the specific citation.
CheckThat, our AI-visibility data engine, tracks 5,800+ brands across 2.6M+ AI responses. That data shows a clear pattern: pages that get cited have no domain-authority advantage over pages that do not.
Authority is not the lever. The levers are on-page and mostly boring, which is exactly why competitors skip them. Every page should crisply serve signals such as a BreadcrumbList schema, a credentialed author byline, and a real freshness cadence, and AI crawlers need to stay unblocked so they can read those signals.
Let's pull this thread further to see how it plays out, then turn it into pragmatic advice for executing well on AI visibility.
Why AI search visibility is a different game from SEO
AI citation and keyword ranking share some plumbing, but the retrieval logic diverges enough that a page ranking #1 on Google can go uncited by every answer engine. Google rewards keyword relevance and link authority against a query. Answer engines assemble a response from multiple retrieved sources, then decide which of those to attribute. Training data and live retrieval both factor in, and platforms weight the trust signals differently.
So just running the old SEO playbook backfires. Tactics that once signaled relevance, like dense question-headers and FAQ markup, now read to the systems deciding what to cite as thin or over-optimized content. Answer engines cite pages when authors structure claims clearly, back them with sources, and date them in a way a model can attribute.
How the major AI search engines decide what to cite
The biggest architectural fault line is retrieval-augmented generation (RAG) versus training-data synthesis. RAG systems fetch live documents before answering and cite real URLs. Parametric systems answer from training data alone and often produce vague or nonexistent attributions. The four platforms you should care about the most in search split across that line:
- Perplexity: Retrieval-first, triggering live search on every query with numbered citations.
- Google AI Overviews: Grounded in Google's Search index and Knowledge Graph, using query fan-out.
- Google Gemini: Pulls from Google's index with inline URL annotations when the Google Search tool runs. In local and business-recommendation testing, it tends to favor brand-owned websites.
- ChatGPT Search: Defaults to static training data for stable queries, often with zero outbound citations, and switches to live retrieval via Bing only when recency triggers it or when web search is included in the query (which is becoming more common).
Perplexity and Google AI Overviews cite from live retrieval every time, so retrievability and freshness matter most there. ChatGPT citations lean more on brand mentions baked into training data, plus retrievable content for anything time-sensitive.
The levers that move citation odds
This is where a robust source of AI visibility data earns its keep. When we compare cited pages against uncited ones for the same query, the cited set does not win on domain authority. It wins on the page itself. Three on-page signals do most of the work.
These signals help an answer engine resolve what a page is and whether a credible author stands behind it. Old keyword-crawler tricks do not. First, anchor with schema.
Add breadcrumb schema
Breadcrumbs help engines understand where a page sits in your site hierarchy, which supports entity resolution: what the page is about, what category it belongs to, and how it connects to the rest of your content. In the CheckThat data, breadcrumb presence is the single strongest on-page correlate of citation. Google lists Breadcrumb among its supported rich-result schema types and recommends JSON-LD. Schema alone is not a guaranteed lift, so treat breadcrumbs as an entity-resolution aid, not a magic switch.
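To make the markup concrete, here is a minimal Python sketch that generates BreadcrumbList JSON-LD from a page's position in the site hierarchy. The `breadcrumb_jsonld` helper and the example.com URLs are hypothetical, used only for illustration; the property names (`itemListElement`, `ListItem`, `position`, `name`, `item`) come from the schema.org vocabulary.

```python
import json


def breadcrumb_jsonld(trail):
    """Build a schema.org BreadcrumbList from (name, url) pairs, root first."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {
                "@type": "ListItem",
                "position": position,  # 1-based position in the trail
                "name": name,
                "item": url,
            }
            for position, (name, url) in enumerate(trail, start=1)
        ],
    }


# Hypothetical site hierarchy, for illustration only.
trail = [
    ("Home", "https://www.example.com/"),
    ("Guides", "https://www.example.com/guides/"),
    ("AI search visibility", "https://www.example.com/guides/ai-search-visibility/"),
]

markup = breadcrumb_jsonld(trail)
# The printed JSON goes inside a <script type="application/ld+json"> tag on the page.
print(json.dumps(markup, indent=2))
```

Generating the markup from the same data that renders your visible breadcrumb trail is one way to keep the two from drifting apart.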
Start with your highest-value pages: product pages and category hubs, then the comparison and guide content buyers reach during evaluation. Then, there's author credibility.
Use credentialed author bylines
A credentialed author byline gives an answer engine a cleaner attribution path, and it is one of the largest on-page effects in our first-party CheckThat data. Attach a real byline to substantive content, link the author to a profile that establishes their expertise, and mark it up with Person schema tied to your Organization schema. Models weight authorship and credentials when deciding what to trust.
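One way to wire the byline to the brand is a single JSON-LD graph in which the article points at a Person and the Person points at the Organization. The sketch below builds that graph in Python. Every name, URL, and `@id` value is a hypothetical placeholder, and the property choices (`worksFor`, `sameAs`, `author`, `publisher`) are one reasonable arrangement of schema.org terms, not the only valid one.

```python
import json

# All identifiers, names, and URLs below are hypothetical placeholders.
ORG_ID = "https://www.example.com/#organization"
AUTHOR_ID = "https://www.example.com/authors/jane-doe/#person"

graph = {
    "@context": "https://schema.org",
    "@graph": [
        {
            "@type": "Organization",
            "@id": ORG_ID,
            "name": "Example Co",
            "url": "https://www.example.com/",
        },
        {
            "@type": "Person",
            "@id": AUTHOR_ID,
            "name": "Jane Doe",
            "jobTitle": "Head of Research",
            # The on-site profile page that establishes the author's expertise.
            "url": "https://www.example.com/authors/jane-doe/",
            "worksFor": {"@id": ORG_ID},
            # External profiles that identify the same person.
            "sameAs": ["https://www.linkedin.com/in/jane-doe-example"],
        },
        {
            "@type": "Article",
            "headline": "How to improve your brand's visibility in AI search",
            "author": {"@id": AUTHOR_ID},
            "publisher": {"@id": ORG_ID},
        },
    ],
}

print(json.dumps(graph, indent=2))
```

Referencing the Person and Organization by `@id` means you define each entity once and reuse it from every article.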
Your next most important component is, unsurprisingly, freshness.
Signal freshness in your content
Freshness helps retrieval-first systems decide whether a source is current enough to use, and it is the third on-page lever in the CheckThat data. Perplexity and Google AI Overviews favor content that signals recency, because a query fan-out prioritizes current sources. Freshness is more than a publish date. It covers visible dating, updated timestamps, and body language that anchors a claim to a current moment ("as of 2026," "in the latest release"). Update your evergreen pages on a real cadence and surface the update date.
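A freshness cadence is easier to keep when something flags the pages that have slipped past it. The following is a minimal sketch, assuming you can export each page's last substantive update date. The 180-day threshold, the `stale_pages` helper, and the page inventory are all illustrative assumptions, not figures from the CheckThat data.

```python
from datetime import date, timedelta


def stale_pages(pages, today, max_age_days=180):
    """Return URLs whose last update is older than the chosen cadence."""
    cutoff = today - timedelta(days=max_age_days)
    return [url for url, last_updated in pages.items() if last_updated < cutoff]


# Hypothetical inventory: URL path -> date of the last substantive update.
pages = {
    "/guides/ai-search-visibility/": date(2026, 3, 1),
    "/compare/tool-a-vs-tool-b/": date(2025, 6, 15),
    "/pricing/": date(2026, 1, 10),
}

# Pages to refresh, and whose visible update date should change once they are.
print(stale_pages(pages, today=date(2026, 6, 1)))
```

Pick the threshold per content type: a pricing page and an evergreen guide rarely deserve the same cadence.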
There are also some things that can hurt your odds of getting cited.
What moves citation odds down
Two common tactics suppress citation odds, and one legacy trick is now dead weight.
Subscribe walls and hard paywalls hurt. A hard paywall that blocks crawler access removes AI Overview eligibility entirely, because a page Googlebot cannot index cannot be cited. Publishers who grant access through flexible sampling keep their citations. AI Overviews data shows 96.3% of New York Times citations in AI Overviews come from paywalled content granted crawler access, versus zero for content locked behind a hard wall.
Keyword-stuffed question-headers push odds down. Models read high heading density as a flag for thin, over-chunked content. The AirOps report found that matching 3-4 subheadings drops citation probability by 6 percentage points versus matching 0-1, and headings in the 20-39 character range correlate with the highest citation rate.
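If you want to check a draft against the 20-39 character range, a few lines of Python will do it. This is a rough sketch: the `audit_headings` helper is hypothetical, the range is simply the one reported above, and since that finding is a correlation, treat the output as a prompt for review, not a rule.

```python
def audit_headings(headings, low=20, high=39):
    """Split headings into those inside and outside the target length range."""
    in_range = [h for h in headings if low <= len(h) <= high]
    out_of_range = [h for h in headings if not low <= len(h) <= high]
    return in_range, out_of_range


# Headings taken from this article, used here as sample input.
headings = [
    "Add breadcrumb schema",
    "Use credentialed author bylines",
    "How the major AI search engines decide what to cite",
]

in_range, out_of_range = audit_headings(headings)
for heading in out_of_range:
    print(f"{len(heading):3d} chars: {heading}")
```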
Skip FAQ schema as a citation play. Google stopped showing FAQ rich results on May 7, 2026. FAQPage markup stays valid and won't penalize you, but it no longer triggers the SERP feature marketers chased, so it should not anchor your strategy.
It turns out that conversational content that's good for readers is far more important than 'format hacking.'
Optimize content for conversational, question-based queries
Structure gets a page retrieved, but the words on it decide whether it gets lifted. Pages earn citations when they answer a question directly, up front, in language a model can lift verbatim. Put the answer in the first two sentences under a heading, then support it. A paragraph that buries the answer three sentences down gives the model nothing clean to pull.
The other reliable move is to include citable statistics with sources. The KDD 2024 GEO paper showed that adding citations, quotations, and statistics can boost source visibility by up to 40%. Format matters too. Comparison and "best tool for X" pages get cited disproportionately, and those are the pages where buyers form opinions about a category you want to own.
Strong first-party content is only part of the picture, though. The rest lives on pages you do not own.
Build your off-page citation footprint
On-page structure has a ceiling. Answer engines weight sources they do not control more than your own claims about yourself, which makes consistent, positive third-party mentions the highest-leverage off-page signal. An Ahrefs study across 75,000 brands ranked YouTube mentions as the strongest correlate of AI visibility, at roughly 0.737, outperforming every other factor across ChatGPT, AI Mode, and AI Overviews.
Digital PR is how you seed the authoritative sources answer engines pull from, because an earned placement on an outlet a model already retrieves lands your brand in a source it trusts. Concentrate placements where retrieval is dense. For most B2B brands, that means industry publications, credible news outlets, and video mentions. Low-authority guest posts rarely enter the retrieval sets that matter.
And, most importantly, none of this lands if you skip the technical audit.
Technical foundations for AI crawlers
Off-page mentions and on-page structure both assume that a crawler can reach the page. Your citation odds start at zero if AI crawlers cannot reach your content or cannot resolve your brand as a clean entity. Implement Organization and Person schema first. Organization schema establishes your brand as a resolvable entity, and Person schema, linked to each author's external profiles via sameAs, ties your credentialed bylines to real identities.
Get your robots.txt right, because blanket blocking costs more than it protects. Both OpenAI and Google let you opt out of model training while staying retrievable in search, so keep retrieval crawlers open and only block training collection if you must. Keep your NAP data (name, address, phone) consistent across platforms too, because conflicting details cause a model to lose confidence and skip you.
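Before you deploy a robots.txt policy, you can sanity-check it with Python's standard-library parser. The sketch below tests a candidate policy that blocks training collection and leaves retrieval open. The user-agent tokens (GPTBot, OAI-SearchBot, Google-Extended, Googlebot, PerplexityBot) are the ones the vendors publish as we understand them at the time of writing, so confirm them against current vendor documentation. The example.com URL is a placeholder, and Python's parser may not match every crawler's own interpretation of the file.

```python
from urllib.robotparser import RobotFileParser

# Candidate policy: block training collection, keep retrieval crawlers open.
# Assumption: GPTBot and Google-Extended govern training use, while
# OAI-SearchBot and Googlebot govern search retrieval.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://www.example.com/guides/ai-search-visibility/"  # hypothetical page
agents = ("GPTBot", "Google-Extended", "OAI-SearchBot", "Googlebot", "PerplexityBot")
access = {agent: parser.can_fetch(agent, url) for agent in agents}

for agent, allowed in access.items():
    print(f"{agent:16} allowed={allowed}")
```

Run a check like this against your live file whenever someone edits it, because a single stray `Disallow: /` under `User-agent: *` takes every retrieval crawler out with it.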
How reputation and reviews shape AI recommendations
Once an engine can resolve and reach you, reviews shape how it recommends you, and content quality matters far more than review count. A practitioner analysis of roughly 700 local queries measured review content quality correlating with AI search visibility at about 0.71, while review count correlated at only 0.12, close to a 6x difference in correlation strength.
Recency and human responses feed the same signal, so keep responses human and current. Platform priority depends on your business model. In local and business-recommendation testing, Gemini tends to favor brand-owned websites while Perplexity weights niche directories heavily. For local brands, Google Business Profile and Yelp lead. For B2B, your owned pages and industry-relevant third-party sources carry more weight.
Then, you have to measure it.
How to measure your AI search visibility
The levers above only pay off if you can watch them move. Start with prompt-based brand checks: run the buyer-intent queries your prospects ask across ChatGPT, Perplexity, Gemini, and Google AI Overviews. Record whether you appear, how each engine describes you, and which competitors get cited alongside you. Do this on a fixed prompt panel so you can track movement over time.
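A fixed prompt panel only helps if every run is logged the same way. Here is a minimal sketch of such a log, assuming you record results by hand or paste them in from your own tooling. The `PanelResult` fields, the `presence_rate` helper, and the sample rows are hypothetical; nothing here calls an engine's API.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class PanelResult:
    """One check: one prompt, on one engine, on one run date."""
    run_date: str
    engine: str
    prompt: str
    brand_cited: bool
    competitors_cited: list = field(default_factory=list)


def presence_rate(results):
    """Share of checked prompts, per engine, where the brand appeared."""
    hits, totals = defaultdict(int), defaultdict(int)
    for result in results:
        totals[result.engine] += 1
        hits[result.engine] += int(result.brand_cited)
    return {engine: hits[engine] / totals[engine] for engine in totals}


# Hypothetical results from a single monthly run.
results = [
    PanelResult("2026-06-01", "Perplexity", "best crm for startups", True, ["Rival A"]),
    PanelResult("2026-06-01", "Perplexity", "crm with built-in invoicing", False, ["Rival B"]),
    PanelResult("2026-06-01", "ChatGPT", "best crm for startups", True),
]

print(presence_rate(results))
```

Keeping `run_date` on every row lets you compare the same prompt month over month, which is the point of fixing the panel.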
Layer in referral tracking, but know the gaps. GA4 added a native AI Assistant channel group on May 13, 2026, covering ChatGPT, Gemini, and Claude, though it excludes Perplexity and Copilot.
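To cover engines that a built-in channel group leaves out, you can classify referrers yourself from raw session or log data. The sketch below maps referrer hostnames to assistant names. The hostname list is an assumption based on the domains these products use today, and the `classify_referrer` helper is hypothetical, so check both against your own referral reports.

```python
from urllib.parse import urlparse

# Assumed referrer domains; confirm against the referrers you actually see.
AI_REFERRERS = {
    "chatgpt.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "gemini.google.com": "Gemini",
    "copilot.microsoft.com": "Copilot",
    "claude.ai": "Claude",
}


def classify_referrer(referrer_url):
    """Return the assistant name for a referrer URL, or None if it is not one."""
    host = (urlparse(referrer_url).hostname or "").lower()
    for domain, label in AI_REFERRERS.items():
        # Match the domain itself or any subdomain of it.
        if host == domain or host.endswith("." + domain):
            return label
    return None


for referrer in ("https://www.perplexity.ai/search?q=crm", "https://www.google.com/"):
    print(referrer, "->", classify_referrer(referrer))
```

Many AI-driven visits arrive with no referrer at all, so read these counts as a lower bound.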
For a standing benchmark instead of a manual audit, AI visibility, powered by CheckThat, benchmarks where you stand across 172 categories, 5,800+ brands, and 2.6M+ AI responses. It is a freemium way to see your position before you commit to a program.
If you run this as an ongoing program, structure your monitoring across four dimensions: Presence (do you appear), Reputation (how engines characterize you), Perception (the sentiment applied to you), and Influence (whether you shape the category narrative). GrowthOS tracks 5,000 prompts across ChatGPT, Claude, Perplexity, and Google AI Overviews against those four dimensions, which is how a Growth PMM connects a positioning gap to the specific pages and sources driving it.
This wouldn't be a complete piece unless we also talked about some 'not to dos.'
Common mistakes that keep brands out of AI answers
The fastest way to stay uncited is to block the crawlers, then wonder why competitors show up in every answer. The errors that keep otherwise strong brands out:
- Blocking retrieval crawlers to protect content. You lose referral traffic without keeping your brand out of AI answers, and you remove your own pages from the retrieval sets where citations get decided.
- Inconsistent NAP and entity data. Conflicting business details across platforms make engines lose confidence and skip you.
- Thin entity signals. No Organization schema, no author identities, no clear site hierarchy leaves a model unable to resolve who you are or what you're authoritative about.
- Scaled content abuse. Citation systems treat high heading-density, keyword-stuffed pages as SEO spam.
- Hard subscribe walls. AI Overviews cannot cite content Googlebot cannot reach.
Most of these are unforced. Fix crawler access and entity signals first, then content structure, and you clear the floor most competitors are still tripping over.
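The NAP check is simple enough to script. Below is a rough sketch that normalizes name, address, and phone, then flags the listings that disagree with a chosen canonical one. The listings are invented, the helpers are hypothetical, and the normalization is deliberately naive: it assumes 10-digit national phone numbers and does not expand abbreviations such as "St" to "Street".

```python
import re


def _clean(text):
    """Lowercase and collapse punctuation and whitespace into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def normalize_nap(name, address, phone):
    """Reduce a listing to a comparable (name, address, phone) key."""
    digits = re.sub(r"\D", "", phone)
    return _clean(name), _clean(address), digits[-10:]  # drops a leading country code


def inconsistent_listings(listings, canonical):
    """Return platforms whose normalized NAP differs from the canonical one."""
    reference = normalize_nap(*listings[canonical])
    return [
        platform
        for platform, nap in listings.items()
        if normalize_nap(*nap) != reference
    ]


# Hypothetical listings for one business.
listings = {
    "Website": ("Example Co.", "12 Main St, Suite 4", "+1 (555) 010-1234"),
    "Yelp": ("Example Co", "12 Main St., Suite 4", "555-010-1234"),
    "Industry directory": ("Example Company", "12 Main Street Suite 4", "555 010 1234"),
}

print(inconsistent_listings(listings, canonical="Website"))
```

Formatting differences such as punctuation or a country code pass, while a different business name or street spelling gets flagged for a human to reconcile.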
Doing this yourself, and the operated version
The manual job is high leverage and completely doable if you're motivated. Ship BreadcrumbList and Person schema on your money pages, put a credentialed byline on anything substantive, set a freshness cadence you can actually keep, seed a few earned placements where retrieval is dense, then run a fixed prompt panel across the four engines every month and log where you appear and who gets cited next to you. That loop, run carefully and regularly, will move the needle. We know, because we operated it for hundreds of clients!
GrowthOS is the operated version of that loop. It tracks 5,000 prompts against Presence, Reputation, Perception, and Influence, benchmarks you with CheckThat data, and connects each visibility gap back to the specific pages and sources that would close it, so the work above runs on a schedule instead of a good intention. If that is the loop you want running for you, book a demo. Engagements start from $6,000/mo.