How the AI Presence Index Works
This page is designed to be cited. Every decision in our methodology is documented, weighted, and explained. No black boxes.
The Problem With Measuring AI Visibility
Traditional SEO has Domain Authority. Backlink profiles. Rank tracking. Decades of tooling that tells you exactly where you stand.
GEO has nothing. Until now.
When a buyer asks ChatGPT "what's the best video hosting platform for my SaaS?" is your brand in the answer? Are you first, or fifth? Are you described as a top pick, or just mentioned in passing? Is ChatGPT the only platform that knows you, or are you visible everywhere?
The AI Presence Index answers all of these questions with a single number: your score from 0 to 100.
What the Score Actually Measures
The score is not a vanity metric. It is a proxy for one thing: how likely a buyer is to encounter your brand when asking an AI system for a recommendation in your category.
It measures four dimensions:
| Dimension | Weight | What It Captures |
|---|---|---|
| Mention Rate | 30 pts | Are you in the AI consideration set at all? Weighted by how strongly AI recommends you. |
| Position | 30 pts | Are you named first, or fourth? Buyers rarely look past the second recommendation. |
| Sentiment | 20 pts | When AI mentions you, is it positive, neutral, or cautionary? |
| Platform Breadth | 20 pts | Are you visible on all four major AI platforms, or just one? |
Total: 0 to 100.
The 5-Step Pipeline
Step 1: Brand Context Inference
When you enter a brand name or URL, we do not guess. We fetch your live homepage and extract the title, meta description, and page content. These are passed to GPT-4o alongside your brand name.
The model returns a structured profile:
- Specific category — "Video Hosting and Streaming SaaS", not "tech company"
- Ideal customer profile — who the product is built for
- Top 5 competitors — used in competitive prompts
- Key differentiators — used in feature-specific queries
- Primary use cases — used in use-case queries
Wrong context produces wrong prompts, which produce useless results. This is why we fetch the actual website instead of inferring from a brand name.
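To make the shape of that profile concrete, here is a minimal sketch of how the model's output could be validated before any prompts are built. The field names, the `BrandProfile` class, and `parse_profile` are hypothetical illustrations, not the production schema; the only facts taken from this page are the five profile components and the "top 5 competitors" count.

```python
import json
from dataclasses import dataclass

# Hypothetical field names; the page only lists the five profile components.
REQUIRED_FIELDS = ("category", "ideal_customer", "competitors",
                   "differentiators", "use_cases")


@dataclass(frozen=True)
class BrandProfile:
    brand: str
    category: str          # specific, e.g. "Video Hosting and Streaming SaaS"
    ideal_customer: str    # who the product is built for
    competitors: list      # top 5, used in competitor-alternative prompts
    differentiators: list  # used in feature-specific queries
    use_cases: list        # used in use-case queries


def parse_profile(brand, raw_json):
    """Parse the model's JSON reply and reject incomplete profiles."""
    data = json.loads(raw_json)
    missing = [f for f in REQUIRED_FIELDS if not data.get(f)]
    if missing:
        raise ValueError(f"profile is missing fields: {missing}")
    if len(data["competitors"]) != 5:
        raise ValueError("expected exactly 5 competitors")
    return BrandProfile(brand=brand, **{f: data[f] for f in REQUIRED_FIELDS})


# Placeholder values for illustration only.
raw = json.dumps({
    "category": "Video Hosting and Streaming SaaS",
    "ideal_customer": "SaaS product teams",
    "competitors": ["Competitor A", "Competitor B", "Competitor C",
                    "Competitor D", "Competitor E"],
    "differentiators": ["DRM protection"],
    "use_cases": ["online course platforms"],
})
profile = parse_profile("ExampleBrand", raw)
print(profile.category)
```

Rejecting an incomplete profile early fits the point above: a wrong or partial context would otherwise flow into every downstream prompt.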
Step 2: 84 Buyer-Intent Queries
We run 7 competitive prompts across 4 platforms, 3 times each: 7 × 4 × 3 = 84 query executions per brand.
The 7 Competitive Prompts:
| # | Prompt Type | Example |
|---|---|---|
| C1 | Category recommendation | "What are the best video hosting platforms for SaaS products?" |
| C2 | Purchase decision | "I need a video hosting platform. Which one should I choose?" |
| C3 | Market comparison | "Compare the top video hosting tools available right now." |
| C4 | Competitor alternative | "Best alternatives to Vimeo for businesses" |
| C5 | Competitor alternative | "Best alternatives to Wistia for developer teams" |
| C6 | Feature-specific | "Video hosting platform with DRM protection" |
| C7 | Use-case specific | "Video hosting for online course platforms" |
Why 3 Runs Per Query:
AI APIs are non-deterministic, even at temperature 0, which is the setting for all scoring queries. The same prompt can place your brand first in one call and fourth in the next. By running each prompt three times and taking the median result, score variance collapses from plus/minus 8 points to plus/minus 2 to 3 points.
The 4 Platforms:
| Platform | Model | Why It Matters |
|---|---|---|
| ChatGPT | GPT-4o | 900M+ weekly users. The dominant AI assistant. |
| Perplexity | Sonar | AI search with real-time web citations. Growing fast in B2B. |
| Claude | Sonnet | Strong enterprise and technical user base. |
| Gemini | 1.5 Flash | Google ecosystem. Deep integration with Search. |
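The 84-execution count follows directly from the prompt, platform, and run counts above. A small sketch of the enumeration, where `build_query_plan` and the tuple layout are illustrative names rather than the actual implementation:

```python
from itertools import product

PROMPT_IDS = ["C1", "C2", "C3", "C4", "C5", "C6", "C7"]
PLATFORMS = ["ChatGPT", "Perplexity", "Claude", "Gemini"]
RUNS_PER_QUERY = 3


def build_query_plan():
    """Return one (prompt_id, platform, run_number) tuple per execution."""
    runs = range(1, RUNS_PER_QUERY + 1)
    return list(product(PROMPT_IDS, PLATFORMS, runs))


plan = build_query_plan()
combinations = {(prompt, platform) for prompt, platform, _ in plan}
print(len(plan))          # 84 executions
print(len(combinations))  # 28 prompt-platform combinations
```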
Branded Queries — Report Only, Not Scored:
We also run 4 branded queries per platform. These generate verbatim quotes and perception data for your report but do not feed the score. Why? If branded queries counted, any recognisable brand would score well regardless of whether AI actually recommends it to buyers. The score must reflect discovery, not fame.
Step 3: Structured Response Extraction
Each of the 84 competitive query responses is individually analysed by GPT-4o-mini using a strict JSON extraction prompt. For each response, we extract:
- Mentioned — is the brand specifically named?
- Position — in what order does it appear relative to other brands?
- Sentiment — positive, neutral, or negative framing?
- Recommendation Strength — top pick, recommended, mentioned, or mentioned negatively?
- Competitors Found — every other brand named in the same response
- Verbatim Quote — the exact sentence where the brand is first named
After 3 runs of the same prompt, we resolve to a single data point using majority vote for categorical fields and median for position integers.
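The resolution step can be sketched as follows. The dictionary keys and the `resolve_runs` helper are hypothetical, and two details are assumptions this page does not specify: ties in the majority vote fall back to the first value seen, and `median_low` is used so that the resolved position is always an integer that was actually observed.

```python
from collections import Counter
from statistics import median_low


def majority(values):
    """Most common value; on a tie, the first value seen wins (assumption)."""
    return Counter(values).most_common(1)[0][0]


def resolve_runs(runs):
    """Collapse repeated runs of one prompt on one platform into one record."""
    mentioned = majority([r["mentioned"] for r in runs])
    positions = [r["position"] for r in runs if r["position"] is not None]
    return {
        "mentioned": mentioned,
        "sentiment": majority([r["sentiment"] for r in runs]),
        "strength": majority([r["strength"] for r in runs]),
        # Median of the observed positions; None if the brand was not mentioned.
        "position": median_low(positions) if mentioned and positions else None,
    }


runs = [
    {"mentioned": True, "position": 1, "sentiment": "positive", "strength": "top pick"},
    {"mentioned": True, "position": 4, "sentiment": "neutral", "strength": "recommended"},
    {"mentioned": True, "position": 2, "sentiment": "positive", "strength": "recommended"},
]
resolved = resolve_runs(runs)
print(resolved)
```

In this example the outlier position of 4 is discarded by the median, which is the variance-damping effect the three-run design is meant to provide.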
Step 4: Scoring
Mention Rate (0 to 30 points)
How often does your brand appear across the 28 competitive prompt-platform combinations?
Recommendation strength multiplier:
| Strength | Multiplier |
|---|---|
| Top pick | 1.0 |
| Recommended | 0.8 |
| Mentioned | 0.5 |
| Mentioned negatively | 0.1 |
| Not mentioned | 0.0 |
Query intent multiplier:
| Query Type | Multiplier |
|---|---|
| Category recommendation (C1) | 1.5x |
| Purchase decision (C2) | 1.3x |
| Competitor alternatives (C4, C5) | 1.3x |
| Market comparison (C3) | 1.2x |
| Feature-specific (C6) | 1.1x |
| Use-case specific (C7) | 1.0x |
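This page gives the two multiplier tables but not the exact formula that combines them, so the following is only one plausible reading: each of the 28 prompt-platform combinations earns its strength multiplier times its intent multiplier, and the total is normalised against the maximum possible so that "top pick everywhere" yields the full 30 points. The function name and the normalisation are assumptions.

```python
STRENGTH = {
    "top pick": 1.0,
    "recommended": 0.8,
    "mentioned": 0.5,
    "mentioned negatively": 0.1,
    "not mentioned": 0.0,
}
INTENT = {"C1": 1.5, "C2": 1.3, "C3": 1.2, "C4": 1.3,
          "C5": 1.3, "C6": 1.1, "C7": 1.0}
PLATFORMS = ["ChatGPT", "Perplexity", "Claude", "Gemini"]


def mention_rate_points(results, max_points=30):
    """results maps (prompt_id, platform) -> resolved strength label.

    Assumed formula: intent-weighted average of strength, scaled to 30.
    """
    earned = sum(STRENGTH[label] * INTENT[prompt]
                 for (prompt, _platform), label in results.items())
    possible = sum(INTENT[prompt] for (prompt, _platform) in results)
    return max_points * earned / possible if possible else 0.0


all_top = {(p, plat): "top pick" for p in INTENT for plat in PLATFORMS}
all_absent = {(p, plat): "not mentioned" for p in INTENT for plat in PLATFORMS}
print(round(mention_rate_points(all_top), 2))     # upper bound of the range
print(round(mention_rate_points(all_absent), 2))  # lower bound of the range
```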
Position (0 to 30 points)
| Position | Points |
|---|---|
| 1st | 30 |
| 2nd | 22 |
| 3rd | 15 |
| 4th | 8 |
| 5th or later | 3 |
| Not mentioned | 0 |
The drop-off is intentional and steep.
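The table maps a single resolved position to points. How those per-prompt points are combined into the final 0 to 30 value is not stated here; the sketch below assumes a plain average over the prompt-platform combinations, with unmentioned combinations counting as 0. Both function names are illustrative.

```python
POSITION_POINTS = {1: 30, 2: 22, 3: 15, 4: 8}


def position_points(position):
    """Points for one resolved position; None means the brand was not mentioned."""
    if position is None:
        return 0
    if position < 1:
        raise ValueError("positions start at 1")
    return POSITION_POINTS.get(position, 3)  # 5th or later earns 3 points


def position_score(positions):
    """Average the per-combination points (averaging is an assumption)."""
    if not positions:
        return 0.0
    return sum(position_points(p) for p in positions) / len(positions)


print(position_score([1, 2, None, 4]))  # (30 + 22 + 0 + 8) / 4 = 15.0
```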
Sentiment (0 to 20 points)
| Sentiment | Weight |
|---|---|
| Positive | 1.0 |
| Neutral | 0.5 |
| Negative | 0.0 |
Calculated only from prompts where your brand was mentioned.
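As a rough illustration, the sentiment component can be read as the average weight across mentioned prompts, scaled to 20 points. The averaging step is an assumption; only the weights and the "mentioned prompts only" rule come from this page.

```python
SENTIMENT_WEIGHT = {"positive": 1.0, "neutral": 0.5, "negative": 0.0}


def sentiment_score(sentiments, max_points=20):
    """sentiments holds one label per prompt where the brand was mentioned."""
    if not sentiments:
        return 0.0  # never mentioned, so there is no sentiment to score
    total = sum(SENTIMENT_WEIGHT[s] for s in sentiments)
    return max_points * total / len(sentiments)


# Two positive, one neutral, one negative: 20 * (1 + 1 + 0.5 + 0) / 4 = 12.5
print(sentiment_score(["positive", "positive", "neutral", "negative"]))
```

Because unmentioned prompts are excluded, a brand that appears rarely but is always praised can still earn full sentiment points; its low visibility is penalised in the other three dimensions instead.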
Platform Breadth (0 to 20 points)
| Platforms | Score |
|---|---|
| 4/4 | 20 |
| 3/4 | 14 |
| 2/4 | 8 |
| 1/4 | 3 |
| 0/4 | 0 |
A platform counts if it mentions your brand in at least 2 of the 7 competitive prompts (C1 to C7).
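The breadth rule translates into a short lookup. In this sketch the input structure and function name are hypothetical, and each of the prompts C1 to C7 is counted individually toward the threshold of 2.

```python
BREADTH_POINTS = {4: 20, 3: 14, 2: 8, 1: 3, 0: 0}


def platform_breadth_points(mentions_by_platform, min_prompts=2):
    """mentions_by_platform maps platform -> set of prompt ids with a mention.

    Assumes exactly the four platforms listed on this page.
    """
    visible = sum(1 for prompts in mentions_by_platform.values()
                  if len(prompts) >= min_prompts)
    return BREADTH_POINTS[visible]


mentions = {
    "ChatGPT": {"C1", "C2", "C4"},  # counts: 3 prompts
    "Perplexity": {"C1"},           # does not count: only 1 prompt
    "Claude": {"C3", "C7"},         # counts: 2 prompts
    "Gemini": set(),                # does not count
}
print(platform_breadth_points(mentions))  # 2 of 4 platforms -> 8
```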
Step 5: Report Generation
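We generate a 7-section report from all 100 query results (the 84 competitive executions plus 16 branded queries, 4 on each of the 4 platforms):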
- Executive Summary
- Score Breakdown
- Platform Intelligence
- Competitive Landscape
- Brand Perception Analysis
- Prompt Gap Analysis
- Recommendations
Score Interpretation
| Score | Label | What It Means |
|---|---|---|
| 85 to 100 | Dominant | AI consistently recommends you first across most platforms and query types |
| 70 to 84 | Strong | High visibility with gaps in specific query types or platforms |
| 55 to 69 | Moderate | AI knows you but competitors are winning most recommendation moments |
| 40 to 54 | Weak | Rarely appearing in buyer-intent queries |
| 0 to 39 | Invisible | Not in the AI consideration set for your category |
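The bands above map directly to a lookup. A minimal sketch, with `score_label` as an illustrative name:

```python
def score_label(score):
    """Map a 0 to 100 score to its interpretation band."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 85:
        return "Dominant"
    if score >= 70:
        return "Strong"
    if score >= 55:
        return "Moderate"
    if score >= 40:
        return "Weak"
    return "Invisible"


for s in (92, 70, 69, 40, 12):
    print(s, score_label(s))
```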
Benchmarks and Percentiles
Once a category has 8 or more scored profiles, every brand in it receives four benchmarks: Category Rank, Category Percentile, Category Average, and Top Competitor Score.
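A sketch of how these four benchmarks could be derived from a category's scores. The exact definitions are not given on this page, so the following are assumptions: rank is 1 plus the number of higher-scoring brands, percentile is the share of profiles scoring at or below the brand, and the top competitor is the highest-scoring other brand. The function name and sample scores are made up for illustration.

```python
def category_benchmarks(brand, scores, min_profiles=8):
    """scores maps brand name -> latest score within one category."""
    if brand not in scores or len(scores) < min_profiles:
        return None  # not enough scored profiles to benchmark yet
    own = scores[brand]
    others = [s for name, s in scores.items() if name != brand]
    return {
        "rank": 1 + sum(s > own for s in others),
        "percentile": 100 * sum(s <= own for s in scores.values()) / len(scores),
        "average": sum(scores.values()) / len(scores),
        "top_competitor_score": max(others),
    }


# Illustrative scores only.
scores = {"A": 90, "B": 80, "C": 70, "D": 60, "E": 50, "F": 40, "G": 30, "H": 20}
bench = category_benchmarks("C", scores)
print(bench)
```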
Score History
Every time a brand is scored, the result is stored and never overwritten. Profile pages display score history as a trend.
Limitations
| Limitation | Impact | How We Handle It |
|---|---|---|
| AI non-determinism | plus/minus 2 to 3 point variance after 3-run median | Acceptable. Disclosed here. |
| Model updates | Scores shift when models update | Model version logged. Recalibration notices issued. |
| Category inference errors | Wrong category = wrong prompts | Homepage fetched on every run. Lower-confidence runs flagged. |
| Platform API downtime | Fewer than 4 platforms scored | Unavailable platform excluded. Confidence level flagged. |
| Newer brands | Score may not reflect product quality | Score reflects AI training data, not product merit. |
One More Thing
Improving your score requires genuinely improving your AI visibility. There is no shortcut. That is the point. A metric worth having should be hard to fake.
Ready to see where you stand?
Want to improve your score? DerivateX specializes in Citation Engineering for B2B SaaS brands.