Methodology

This is a condensed summary. The full METHODOLOGY.md in the repo is the canonical source.

What we measure

Pattern survival: which JS content patterns survive in bot HTTP responses, what LLMs report seeing when asked to fetch URLs containing those patterns, and how those signals differ across bot classes within a single framework.

Rendering mode (SSR / CSR / SSG) is a 2-3 mode control axis, not the primary variable. The primary axis is JS content pattern.

The 8 patterns

Each pattern is tested across 5 page types in SSR mode. The clean pattern serves as the baseline, and CSR serves as the negative control.

  • clean: No JS-injected content. All content in SSR HTML. Baseline.
  • js-images: <img src> attribute set by client JS after mount (useEffect setter pattern).
  • js-links: Navigation via onClick={navigate} instead of <a href>.
  • click-reveal: Main content body hidden until user clicks "show more".
  • js-fetched: Content fetched from API client-side after mount (price, reviews, recommendations).
  • hash-routing: URLs use fragment #/path instead of real paths.
  • late-loaded: Content rendered after IntersectionObserver fires.
  • mixed: Realistic combination of the above (e.g. product page with js-fetched price + lazy images + click-reveal reviews).
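
To make the pattern definitions concrete, here is a minimal sketch of what a fetcher that does not execute JavaScript can recover from the served HTML. The function name extractStaticSignals, the StaticSignals shape and the regex-based scan are illustrative assumptions, not the test bed's code; a real pipeline would use a proper HTML parser.

```typescript
// Sketch only: a regex scan stands in for a real HTML parser.
interface StaticSignals {
  hrefs: string[];   // non-empty <a href="..."> targets present in the served HTML
  imgSrcs: string[]; // non-empty <img src="..."> values present in the served HTML
}

// Collect every non-empty first capture group of a global regex.
function collectMatches(html: string, re: RegExp): string[] {
  const out: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(html)) !== null) {
    const value = m[1];
    if (value) out.push(value);
  }
  return out;
}

function extractStaticSignals(html: string): StaticSignals {
  return {
    // \s before the attribute name avoids matching data-href / data-src.
    hrefs: collectMatches(html, /<a\b[^>]*\shref="([^"]*)"/gi),
    imgSrcs: collectMatches(html, /<img\b[^>]*\ssrc="([^"]*)"/gi),
  };
}
```

On a clean cell, both lists are populated. On a js-links or js-images cell the served HTML carries no <a href> or <img src> until client JS runs, so both lists come back empty for a non-rendering fetcher.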

Pre-registered hypotheses

Committed to the repository before data collection started:

  1. H1 (sanity check). clean SSR ≡ clean SSG for batch crawlers. If this fails, the methodology is broken.
  2. H2. clean SSR is fully visible to all batch AI crawlers (GPTBot, ClaudeBot, PerplexityBot, Bytespider, CCBot).
  3. H3 (main test). Patterns js-images, js-links, click-reveal, js-fetched, hash-routing and late-loaded leave main content invisible to batch AI crawlers.
  4. H4. Googlebot (and Gemini via shared infrastructure) eventually renders all 8 patterns, with a measurable per-pattern delay vs clean.
  5. H5. On-demand fetch tools (ChatGPT-User, Claude-User, Claude-SearchBot, Perplexity-User, Bing Copilot): behaviour is exploratory, with no strong prior. The split between a provider's batch crawler (GPTBot) and its on-demand counterpart (ChatGPT-User) is the most-watched signal.
  6. H6. Bing rendering of JS patterns is inconsistent. This is the first systematic Bing data since the 2019 Edge crawler note.

Kill criterion: if pattern js-images shows no visibility difference vs clean for batch AI crawlers, the methodology is broken and investigation halts. Vercel/MERJ 2024 establishes a strong baseline that this prediction should hold.
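
The kill criterion can be written down as a small check. This is a sketch under stated assumptions: the CellObservation shape and both function names are hypothetical, and "no visibility difference" is read here as equal visibility rates between the two patterns, which is one possible interpretation rather than the repo's definition.

```typescript
// Hypothetical record shape: one row per (pattern, page type, bot) cell.
type KillPattern = "clean" | "js-images";

interface CellObservation {
  pattern: KillPattern;
  bot: string;             // e.g. "GPTBot"
  contentVisible: boolean; // did the tested content appear in the response the bot received?
}

// Share of observed cells for one pattern in which the content was visible.
function visibilityRate(rows: CellObservation[], pattern: KillPattern): number {
  const cells = rows.filter((r) => r.pattern === pattern);
  if (cells.length === 0) throw new Error(`no observations for ${pattern}`);
  return cells.filter((r) => r.contentVisible).length / cells.length;
}

// Assumption: "no visibility difference" means the two rates are equal.
function killCriterionTriggered(rows: CellObservation[]): boolean {
  return visibilityRate(rows, "clean") === visibilityRate(rows, "js-images");
}
```

The function throws instead of returning a value when either pattern has no observations, so missing data cannot be mistaken for a passed or failed criterion.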

Three-layer bot tracking

Every hit is currently captured by two independent layers (Layers 1 and 2), joined for cross-validation. A third layer is planned but deferred:

  1. Layer 1: Next.js middleware. Every dynamic-route request hitting the test bed worker is logged before the response is sent. Fire-and-forget POST to track.jsseo.dev/api/hit via ctx.waitUntil().
  2. Layer 2: Cloudflare GraphQL Analytics ingester. Every request that Cloudflare logs at the edge is polled into the tracker, including prerendered static routes that bypass the worker. 60s poll lag, dedup via natural-key hash.
  3. Layer 3: client-side JS beacon. Deferred until first findings need a post-hydration signal. The beacon will show which bots actually executed JavaScript on the page: bots that did will fire the beacon, bots that did not will leave no signal. See TRACKER.md for the current implementation status.
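
The Layer 2 dedup step can be sketched as follows. Because successive polls can return overlapping windows, each row is reduced to a hash of its natural key and skipped if that hash was already ingested. The EdgeRow fields, the choice of key columns and the FNV-1a hash are all assumptions made for illustration; the tracker's actual key and hash function are documented in the repo, not here.

```typescript
// Hypothetical shape of one edge log row; the field names are assumptions.
interface EdgeRow {
  ts: string;        // edge timestamp as returned by the poll
  clientIp: string;
  userAgent: string;
  path: string;
}

// 32-bit FNV-1a over UTF-16 code units, returned as hex. Adequate for a sketch;
// a production key would likely use a wider hash to keep collisions negligible.
function fnv1a32(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16);
}

// Natural key: the columns that together identify one request (assumed set).
function naturalKeyHash(row: EdgeRow): string {
  return fnv1a32([row.ts, row.clientIp, row.userAgent, row.path].join("\u001f"));
}

// Return only rows not ingested before, and record their keys in `seen`.
function dropAlreadySeen(rows: EdgeRow[], seen: Set<string>): EdgeRow[] {
  const fresh: EdgeRow[] = [];
  for (const row of rows) {
    const key = naturalKeyHash(row);
    if (!seen.has(key)) {
      seen.add(key);
      fresh.push(row);
    }
  }
  return fresh;
}
```

Calling dropAlreadySeen on two overlapping poll results with the same Set yields each row exactly once, regardless of how far the windows overlap.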

Bot identification

27 bot classes recognised, with three verification paths:

  • rDNS (Google, Bing, Apple): reverse DNS + forward confirmation against the bot's published domain (*.googlebot.com, *.search.msn.com, *.applebot.apple.com).
  • IP-range (OpenAI, Perplexity, Anthropic): CIDR membership check against published JSON manifests refreshed every 6 hours, plus a hardcoded list for Anthropic (whose published source is an HTML page, not JSON).
  • none / auto-unverified: the anthropic-ai and Claude-Web UAs (pre-2024, vendor-deprecated) are flagged untrusted on insert and never reach the verification queue.
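
The IP-range path comes down to a CIDR membership test. Below is a minimal IPv4-only sketch; the function names are hypothetical, and the ranges in the usage note are reserved documentation prefixes (RFC 5737), not any vendor's real ranges. Fetching and refreshing the published manifests is out of scope here.

```typescript
// Convert dotted-quad IPv4 to an unsigned integer in [0, 2^32 - 1].
function ipv4ToInt(ip: string): number {
  const parts = ip.split(".");
  if (parts.length !== 4) throw new Error(`not an IPv4 address: ${ip}`);
  let n = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) throw new Error(`bad octet in: ${ip}`);
    const octet = Number(part);
    if (octet > 255) throw new Error(`octet out of range in: ${ip}`);
    n = n * 256 + octet; // plain arithmetic avoids signed 32-bit bit-shift pitfalls
  }
  return n;
}

// True if `ip` falls inside the IPv4 block `cidr`, e.g. "192.0.2.0/24".
function inCidr(ip: string, cidr: string): boolean {
  const pieces = cidr.split("/");
  const base = pieces[0];
  const bitsText = pieces[1];
  if (pieces.length !== 2 || base === undefined || bitsText === undefined) {
    throw new Error(`not a CIDR block: ${cidr}`);
  }
  if (!/^\d{1,2}$/.test(bitsText)) throw new Error(`bad prefix length: ${cidr}`);
  const bits = Number(bitsText);
  if (bits > 32) throw new Error(`bad prefix length: ${cidr}`);
  const blockSize = Math.pow(2, 32 - bits);
  // Two addresses share a block when they fall into the same blockSize-wide bucket.
  return Math.floor(ipv4ToInt(ip) / blockSize) === Math.floor(ipv4ToInt(base) / blockSize);
}

// True if `ip` matches at least one block from a manifest's list of ranges.
function inAnyCidr(ip: string, cidrs: string[]): boolean {
  return cidrs.some((cidr) => inCidr(ip, cidr));
}
```

For example, inAnyCidr("192.0.2.77", ["198.51.100.0/24", "192.0.2.0/24"]) is true, while inAnyCidr("203.0.113.5", ["192.0.2.0/24"]) is false. IPv6 ranges would need a separate BigInt-based variant.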

HTTP Message Signature instrumentation (RFC 9421): header detection is in place for ChatGPT Agent's signed requests; verification against published key sets is deferred.
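
Header detection, as opposed to verification, can be as small as the sketch below. RFC 9421 defines the Signature and Signature-Input fields; the function name and the plain-object header representation are assumptions for illustration, and nothing here checks that a signature is valid.

```typescript
// Detection only: reports whether both RFC 9421 fields are present.
// It does NOT verify the signature against any key set.
function hasMessageSignatureHeaders(headers: Record<string, string>): boolean {
  // HTTP field names are case-insensitive, so normalise before comparing.
  const names = Object.keys(headers).map((name) => name.toLowerCase());
  return names.indexOf("signature") !== -1 && names.indexOf("signature-input") !== -1;
}
```

A request carrying only one of the two fields is treated as unsigned, since a signature cannot be interpreted without its Signature-Input parameters.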

LLM fetch testing

Parallel to bot crawl tracking, we run periodic prompt-driven fetch tests across 9 LLM surfaces (ChatGPT, Claude.ai, Perplexity, Gemini, Bing Copilot; both web UI and API where available). Class A prompts ask the model to fetch a specific cell URL and describe what it sees. Each response is scored 0-4 (0 = fetch failed, 1 = placeholder only, 2 = partial content, 3 = full but inaccurate, 4 = full and accurate), with binary flags for marker_detected, images_described and structured_data_used.
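
One way to hold a scored response is sketched below. The score labels and the three flag names come from the rubric above; the FetchTestResult shape, the surface and cellUrl fields, and the helper names are hypothetical. The summary is a plain count per score, in keeping with the descriptive framing.

```typescript
// Labels indexed by score, taken from the 0-4 rubric.
const SCORE_LABELS = [
  "fetch failed",
  "placeholder only",
  "partial content",
  "full but inaccurate",
  "full and accurate",
] as const;

type Score = 0 | 1 | 2 | 3 | 4;

// Hypothetical record for one prompt-driven fetch test.
interface FetchTestResult {
  surface: string; // e.g. "ChatGPT (API)"; naming is an assumption
  cellUrl: string;
  score: Score;
  marker_detected: boolean;
  images_described: boolean;
  structured_data_used: boolean;
}

function scoreLabel(score: Score): string {
  return SCORE_LABELS[score];
}

// Descriptive summary: how many responses landed on each score, index = score.
function countsByScore(results: FetchTestResult[]): [number, number, number, number, number] {
  const counts: [number, number, number, number, number] = [0, 0, 0, 0, 0];
  for (const result of results) {
    counts[result.score] += 1;
  }
  return counts;
}
```

Restricting Score to the literal union 0 | 1 | 2 | 3 | 4 lets the compiler reject out-of-range values such as 5 at the point where a result is recorded.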

Cadence: weekly full sweep across all cells × all surfaces, daily focused probes on 24 high-signal cells, manual sampling 5-10 per week in chat UIs to catch divergence between API and chat surfaces.

Descriptive, not inferential

N=1 per cell. Inferential statistics (p-values, confidence intervals) are misleading at this sample size. Reported metrics are descriptive: counts, rates, percentages, ratios. Claims are framed as descriptions of the test instance, not population-level inferences.

Open data, open code

All raw data (SQLite dumps, prompt logs, screenshots) published in the repo under data/. All analysis code in analysis/. Anyone can rerun the analysis from raw data with one command. License: CC0 / CC-BY for data, MIT for code.

See /data for citation guidance and downloadable datasets.

Limitations

  • One framework (Next.js). Generalisation to Vue/Svelte/Solid/Qwik ecosystems requires Phase 2.
  • 8 patterns chosen as common, not exhaustive. Real-world sites use combinations and patterns not catalogued. The mixed cell explicitly tests realistic combos.
  • 2 search engines (Google, Bing). 9 LLM surfaces (large sample, not exhaustive).
  • 12 weeks of data collection. One geographic location for tracker server (Hetzner FSN1).

Results valid for the conditions tested. Extrapolation requires care.

Live data

/dashboard shows the current pattern survival matrix aggregated from the tracker. Updated daily during builds.

/test-bed explains the live research surface and links to specific cells for direct inspection.