How to Audit Your Site for AI Readiness: Step-by-Step Guide and Templates
DIY audit guide with templates—robots.txt, JavaScript rendering, navigation testing, content extraction, scoring, and action planning.
You can audit your site's AI readiness yourself in 2-3 hours using free tools. While platforms like Compass automate this process, understanding the mechanics provides better control over your strategy. This guide walks you through the manual audit process step-by-step, enabling you to identify visibility gaps without special software.
The objective is not just to check for search engine crawlers, but to validate how AI agents discover and interpret your content. AI agents depend on navigation structures and readable HTML in ways traditional bots do not. By following this protocol, you gain a hands-on understanding of your site's AI discoverability.
The audit consists of five phases: quick health checks, a robots.txt review, a JavaScript rendering review, functional navigation testing, and content extractability verification. You will need a spreadsheet to record findings, an AI model with browsing capabilities, and access to your site's code.
Completing this audit equips you with evidence-based insights into where your content is visible and where it is hidden.
Quick Checks
Begin with a 30-minute overview to catch obvious blocking or rendering issues. These checks require no technical setup and reveal immediate barriers to AI access.
Check robots.txt
Open yoursite.com/robots.txt. Look for directives that explicitly block AI crawlers. Common user-agents to check include GPTBot (OpenAI), ClaudeBot (Anthropic), PerplexityBot, and CCBot (Common Crawl). If you see User-agent: * followed by Disallow: /, every crawler is blocked, AI included. If specific bots are disallowed without cause, this limits your AI visibility.
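If you prefer the terminal, a quick sketch like this fetches robots.txt and surfaces any rules mentioning the common AI crawlers (yoursite.com is a placeholder for your own domain):

```bash
# Fetch robots.txt and show rules mentioning common AI crawlers, with context
curl -s "https://yoursite.com/robots.txt" | grep -i -A 2 -E 'gptbot|claudebot|perplexitybot|ccbot|user-agent: \*'
```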
Ask an AI to visit
Open an AI model with browsing capabilities, such as ChatGPT or Claude. Prompt it with: "Visit [your domain] and tell me what you see. What is the primary business purpose?" Compare its response to your actual homepage. If the AI cannot identify your core offering, your navigation or text hierarchy is unclear.
Check key pages directly
Ask the same AI to visit your pricing, contact, and main product pages. Ask it to report the content verbatim rather than summarising. Compare what it reports with what you see in your browser. Major discrepancies suggest JavaScript rendering issues, where content exists visually but is missing from the raw HTML.
Disable JavaScript, refresh key pages
In your browser, press F12 to open Developer Tools. In Chrome, open the Command Menu (Ctrl+Shift+P / Cmd+Shift+P), type "Disable JavaScript", and press Enter. Refresh your key pages. What disappears? Navigation menus, pricing tables, or product information? If content vanishes, it is likely invisible to AI agents that do not fully execute JavaScript.
Page load speed check
Run your core pages through Google PageSpeed Insights. While speed is not a direct AI ranking factor, slow pages affect crawl efficiency. If a page takes more than 3 seconds to load, bots may terminate the crawl before retrieving content.
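PageSpeed Insights gives the full picture, but for a rough spot check of server timing from the terminal, curl's built-in timers are enough. A minimal sketch (the URL is a placeholder):

```bash
# Print time-to-first-byte and total fetch time for a page
curl -s -o /dev/null -w 'TTFB: %{time_starttransfer}s, total: %{time_total}s\n' "https://yoursite.com/pricing"
```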
Record your findings in the template below.
| Check | Result | Notes |
|---|---|---|
| robots.txt allows AI | ✓/✗ | |
| AI understands homepage | ✓/✗ | |
| Pricing visible without JS | ✓/✗ | |
| Contact info in HTML | ✓/✗ | |
| Key pages under 3 seconds | ✓/✗ |
Robots.txt Audit
This 15-minute review ensures your site permissions align with your AI visibility goals. Robots.txt is the gatekeeper; if you block the gate, you block the agent.
Read your robots.txt file line by line. For each AI user-agent (GPTBot, ClaudeBot, PerplexityBot, CCBot), check three things: is it blocked via Disallow: /, is it rate-limited via Crawl-delay, or does it have specific exceptions?
The general recommendation is to allow access unless you have specific IP protection needs. Blocking GPTBot, for example, keeps your content out of OpenAI's training data and out of AI features that draw on that crawl. Note that crawlers follow the most specific matching group: rules under * only govern bots that have no group of their own, so review both the wildcard rules and any bot-specific groups.
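To fill in a status per user-agent for the table below, a grep-based sketch like this can help. It only finds explicit mentions and does not implement full robots.txt group semantics, so read the file yourself for the final call:

```bash
# List any explicit rule group for each AI crawler; absence means it falls back to *
robots=$(curl -s "https://yoursite.com/robots.txt")
for bot in GPTBot ClaudeBot PerplexityBot CCBot; do
  echo "--- $bot ---"
  echo "$robots" | grep -i -A 3 "user-agent:[[:space:]]*$bot" || echo "No specific group (governed by *)"
done
```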
If you find unnecessary blocks, update the file immediately. Crawlers cache robots.txt, so changes can take up to 24 hours to be picked up depending on the crawler.
| User-Agent | Status | Comments |
|---|---|---|
| GPTBot | Allow / Block / Rate-limit | |
| ClaudeBot | Allow / Block / Rate-limit | |
| PerplexityBot | Allow / Block / Rate-limit | |
| Googlebot | Allow / Block / Rate-limit | |
| * (all) | Allow / Block / Rate-limit |
JavaScript Rendering Audit
AI agents vary in their ability to execute JavaScript. Some render pages like a modern browser, while others rely on raw HTML. Critical content must be accessible in the HTML source to guarantee visibility. This audit takes 30-60 minutes depending on site size.
Method 1: Manual browser check
Right-click your page and select "View Page Source" (Ctrl+U). Search (Cmd+F / Ctrl+F) for key terms: "price", "£", "$", "contact", or your product name. If these terms are absent from the raw source but present on the screen, they are JavaScript-loaded.
Method 2: Screaming Frog
Download Screaming Frog SEO Spider (the free tier is limited to 500 URLs, and JavaScript rendering may require a paid licence). Set Configuration → Spider → Rendering to "JavaScript", and under Advanced enable "Store HTML" and "Store Rendered HTML". Crawl your site (15-30 minutes), then export the results and compare raw HTML word counts against rendered HTML word counts.
Calculate the ratio: Rendered ÷ Raw.
- 1.0-1.2x: Good (minimal JS).
- 1.2-2.0x: Caution (moderate JS).
- 2.0x+: High risk (heavy JS).
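The ratio itself is one line of arithmetic once you have the two word counts. A minimal sketch with placeholder values:

```bash
# Rendered-to-raw word count ratio; replace with your exported numbers
raw=500; rendered=1200
awk -v r="$raw" -v d="$rendered" 'BEGIN { printf "Ratio: %.1fx\n", d / r }'
```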
Method 3: Command line
If you are comfortable on the command line, use curl to inspect the raw HTML directly.

```bash
# Word count of the raw HTML (compare against the rendered count)
curl -s "https://yoursite.com/pricing" | wc -w
# Lines mentioning a key term in the raw HTML (0 means it is likely JS-loaded)
curl -s "https://yoursite.com/pricing" | grep -ci "price"
```

If key terms like "price" appear on the rendered page but return zero matches in the raw HTML, they are JS-loaded.
For each critical page, record the data below.
| Page | Entry Point | Raw HTML (words) | Rendered (words) | Ratio | Risk |
|---|---|---|---|---|---|
| /pricing | HTML | 500 | 1200 | 2.4x | HIGH |
| /about | HTML | 800 | 850 | 1.1x | LOW |
| /contact | HTML | 400 | 420 | 1.1x | LOW |
High-risk pages should be restructured to use server-side rendering (SSR) or pre-rendering. This ensures content is available to all crawlers, regardless of JavaScript execution capability.
Navigation Task Audit
Navigation structure is critical for AI discovery. Wayfinder's research involving 3,348 navigation tasks across 269 websites shows that 91% of successful navigation completes within two clicks. Position in the DOM matters more than semantic relevance for initial discovery.
This 60-minute test validates whether agents can find key content realistically.
Step 1: Define tasks
Create a list of 5-10 tasks based on your business model.
| Task ID | Task | Target Content | Success criterion |
|---|---|---|---|
| T1 | Find pricing | /pricing or pricing section | Can state specific prices |
| T2 | Find contact info | Email address | Provides actual email |
| T3 | Find returns policy | /returns or /policies | Summarises return terms |
| T4 | Find [product X] | /products/x | Describes product |
| T5 | Book a demo | /demo or /contact | Finds booking mechanism |
Step 2: Test with AI agent
For each task, open an AI agent with browsing capabilities. Use this prompt structure:
```
Your task: [Task description]
Starting point: [Homepage or specific URL]
Constraint: You can click up to 5 links.
Record every link you click and the pages you visit.

1. What links did you click?
2. Did you find the target content?
3. How many clicks did it take?
```
Step 3: Record results
| Task | Starting Point | Links Clicked | Success? | Clicks | Notes |
|---|---|---|---|---|---|
| T1 (Find pricing) | Homepage | Home > Pricing | ✓ | 1 | Found first click |
| T2 (Find contact) | Homepage | Home > Nav > Contact | ✓ | 2 | Needed footer nav |
| T3 (Find returns) | Homepage | Looped between nav pages | ✗ | 5+ | Couldn't find target |
Step 4: Analyse
Calculate the success rate. 80%+ is good. Check the median clicks; 2 is ideal, 3+ is concerning. Identify which tasks failed and look for patterns. Did the agent loop in the navigation? Did it get stuck on infinite scroll?
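If you keep the results as a small CSV, the success rate and median fall out of a couple of awk one-liners. A sketch assuming a hypothetical nav_results.csv with task, success (1/0), and clicks columns:

```bash
# nav_results.csv format: task,success(1/0),clicks — example data
printf 'T1,1,1\nT2,1,2\nT3,0,5\n' > nav_results.csv

# Success rate across all tasks
awk -F, '{ s += $2; n++ } END { printf "Success rate: %.0f%%\n", 100 * s / n }' nav_results.csv

# Median clicks across successful tasks
awk -F, '$2 == 1 { print $3 }' nav_results.csv | sort -n | \
  awk '{ a[NR] = $1 } END { print "Median clicks:", (NR % 2) ? a[(NR + 1) / 2] : (a[NR / 2] + a[NR / 2 + 1]) / 2 }'
```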
If your site fails these tests, it suggests your link text or structure does not match natural language queries. This mirrors findings where search-first approaches either succeed instantly or fail badly depending on link clarity. Review your internal linking strategy to ensure key pages are reachable within two clicks from the homepage.
Content Extractability Check
AI must not only find content but understand it. This 30-minute check validates information accuracy.
Ask an AI to read back your key pages rather than summarise them. For each page, use this prompt:

```
Visit [URL] and answer these questions:
1. What is this page about?
2. What are the key facts/prices/features?
3. Is any information missing or unclear?
Tell me exactly what you see; don't summarise.
```
Compare the output to your expectation. If the AI misses key info or hallucinates details, investigate the source. Check the earlier JavaScript audit to see if content is hidden in scripts. Check if link text clearly describes the destination.
| Page | AI's Summary | Correct? | Issues |
|---|---|---|---|
| /pricing | States three plans, prices | ✓ | None |
| /about | Gets company name right, misses mission | ✗ | Mission is in hero image (non-text) |
| /products | Lists 2 products, misses third | ✗ | Third product JS-loaded |
If the AI extracts incorrect data, it is likely due to poor HTML structure or missing schema markup. Ensure headings (H1, H2) logically frame the content.
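Schema markup is easy to spot-check from the terminal: most structured data ships as a JSON-LD script tag in the raw HTML. A minimal presence check (the URL is a placeholder):

```bash
# Count lines mentioning JSON-LD in the raw HTML (0 = none found)
curl -s "https://yoursite.com/pricing" | grep -c 'application/ld+json'
```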
Scoring Your Results
Synthesise your audit into an overall readiness score using this framework.
Scoring framework
- Technical foundation (40 points)
  - robots.txt allows AI: 10 points
  - Critical content in HTML (not JS): 15 points
  - Pages load under 3 seconds: 10 points
  - Schema markup present: 5 points
- Navigation (35 points)
  - 80%+ of tasks succeed: 15 points
  - Median clicks ≤2: 10 points
  - No loops detected: 10 points
- Content quality (25 points)
  - AI extracts info correctly: 15 points
  - No hallucinations: 10 points
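To make the tally mechanical, assign each line item its earned points and sum. A minimal sketch with example values, following the rubric above:

```bash
# Technical foundation (40): robots, HTML content, speed, schema
# Navigation (35): task success, median clicks, no loops
# Content quality (25): correct extraction, no hallucinations
robots=10; html_content=15; speed=10; schema=0
nav_success=15; nav_clicks=10; nav_loops=10
extraction=15; hallucinations=10
total=$((robots + html_content + speed + schema + nav_success + nav_clicks + nav_loops + extraction + hallucinations))
echo "AI readiness score: ${total}/100"
```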
Interpretation
- 85-100: AI-ready. No critical issues.
- 70-84: Mostly ready. Fix one or two items.
- 50-69: Needs work. Multiple issues.
- Under 50: Not ready. Significant gaps.
Use this score to prioritise remediation. If you score low on Navigation, review your internal linking. If low on Technical, focus on rendering and robots.txt.
Next Steps
Turn findings into an action plan. Categorise issues by impact and effort.
Priority matrix
For each issue found, assign a category:
- High impact, easy fix: robots.txt, navigation labels, add missing links.
- High impact, hard fix: JavaScript rendering (needs SSR), site restructure.
- Low impact, easy fix: Schema markup, metadata.
- Low impact, hard fix: Deep redesign.
Create action plan
Prioritise high impact/easy fixes first. These are quick wins. Plan high impact/hard fixes next, as these require developer time. Deprioritise low impact items unless they are quick.
| Issue | Priority | Owner | Timeline | Status |
|---|---|---|---|---|
| Allow AI crawlers in robots.txt | High/Easy | Marketing | 1 day | To do |
| Fix pricing page JS loading | High/Hard | Dev | 2 weeks | In progress |
| Clarify navigation labels | High/Easy | Content | 3 days | To do |
Once you have fixed immediate blockers, review the technical AEO checklist for deeper optimisation strategies. You may also wish to revisit navigation tasks after changes to confirm improvements.
For teams that need to scale this process, manual audits become resource-intensive. Compass automates this testing and surfaces issues in minutes. See your audit results and track improvements over time.