How Claude Code Eats the Web

I was curious about how Claude Code handles web fetching. When you fetch a URL, Claude Code shows you the document size—but is that the original HTML file size, or something smaller? My hunch was that it converts HTML to Markdown before putting it into context. Does it actually do that, or does it just dump raw HTML and burn tokens?

So I did what any reasonable person would do: I pointed Claude Code at its own minified cli.js bundle and asked it to figure out how it works. Turns out it's pretty clever.

The Token Problem

Web pages are bloated. A simple documentation page might be 50KB of HTML but only 2KB of actual content. If you're paying per token, that's a problem.
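To put rough numbers on it, using the common rule of thumb of about four characters per token:

```javascript
// Back-of-envelope token math, assuming the ~4 chars/token heuristic.
const approxTokens = (bytes) => Math.round(bytes / 4);

console.log(approxTokens(50_000)); // the raw 50KB page: ~12,500 tokens
console.log(approxTokens(2_000)); // the 2KB of real content: ~500 tokens
```

That's a 25x difference on every fetch, before the model has done any actual work.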

Claude Code solves this in stages.

Stage 1: HTML to Markdown

The first thing that happens is your fetched HTML gets converted to Markdown using Turndown.

Th2.default().turndown(J); // HTML in, Markdown out

This strips all the <div> soup, navigation, scripts, and CSS. What remains is the content structure: headings, paragraphs, code blocks, links.

Already a big win.

Stage 2: The Small Model

Here's where it gets interesting. The Markdown doesn't go straight to Claude. Instead, it gets processed by a "small, fast model" first.

if (K && W.includes("text/markdown") && J.length < Ph2) V = J; // use as-is
else V = await vh2(Q, J, B.signal, G, K); // process with small model

This small model receives your original prompt plus the page content, and extracts only the relevant parts. So if you asked "what are the authentication options?" you get back a focused answer—not the entire page.

The main Claude model only sees this distilled output.
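With the minified names expanded into something readable, the branch amounts to this. The identifier names and function shape are my reconstruction, not the bundle's:

```javascript
const MAX_DIRECT_CHARS = 100_000; // Ph2 in the bundle

// trusted     -> K: host is on the trusted-docs list
// contentType -> W: the response's Content-Type header
// body        -> J: the fetched (and possibly Turndown-converted) content
async function prepareForContext({ trusted, contentType, body, prompt, smallModel }) {
  if (trusted && contentType.includes("text/markdown") && body.length < MAX_DIRECT_CHARS) {
    return body; // fast path: the main model sees the page as-is
  }
  // Small, fast model distills the page down to what the prompt asked for.
  return smallModel(prompt, body);
}
```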

The Trusted Sites Shortcut

There's a hardcoded list of ~80 documentation sites that get special treatment:

  • docs.python.org
  • developer.mozilla.org
  • react.dev
  • kubernetes.io
  • docs.aws.amazon.com
  • etc.
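The check itself is presumably a simple hostname lookup. A sketch, with the list abbreviated to the examples above (the helper name is mine):

```javascript
const TRUSTED_DOC_HOSTS = new Set([
  "docs.python.org",
  "developer.mozilla.org",
  "react.dev",
  "kubernetes.io",
  "docs.aws.amazon.com",
  // ...roughly 75 more in the real bundle
]);

const isTrustedDocHost = (url) => TRUSTED_DOC_HOSTS.has(new URL(url).hostname);

console.log(isTrustedDocHost("https://react.dev/learn")); // true
```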

For these trusted sites, if the server returns Content-Type: text/markdown and the content is under 100k characters, it skips the small model entirely. Direct passthrough.

Ph2 = 1e5; // 100,000 character threshold

Important caveat: this fast path only applies when the server sends Markdown directly. A trusted site returning HTML still goes through Turndown conversion and then the small model.

The small model also uses different prompts based on trust. Trusted sites get a more generous prompt that allows including "relevant details, code examples, and documentation excerpts." Non-trusted sites get a stricter prompt with a 125-character maximum for quotes.

Caching

Results get cached for 15 minutes in an LRU cache with a 50MB limit.

vp5 = 900000; // 15 min TTL
kp5 = 52428800; // 50MB max

Repeated fetches of the same URL are essentially free.
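A minimal sketch of that kind of cache, built from the two constants above. The eviction details (total byte budget, evict-oldest-first) are my assumptions, not verified against the bundle:

```javascript
const TTL_MS = 900_000;       // vp5: 15-minute TTL
const MAX_BYTES = 52_428_800; // kp5: 50 MB budget

class FetchCache {
  #entries = new Map(); // Map preserves insertion order, so oldest comes first
  #bytes = 0;

  get(url, now = Date.now()) {
    const hit = this.#entries.get(url);
    if (!hit) return undefined;
    if (now - hit.at > TTL_MS) { this.#evict(url); return undefined; }
    // Re-insert so this entry becomes the most recently used.
    this.#entries.delete(url);
    this.#entries.set(url, hit);
    return hit.value;
  }

  set(url, value, now = Date.now()) {
    if (this.#entries.has(url)) this.#evict(url);
    this.#entries.set(url, { value, at: now });
    this.#bytes += value.length;
    // Evict least-recently-used entries until under the byte budget.
    while (this.#bytes > MAX_BYTES) this.#evict(this.#entries.keys().next().value);
  }

  #evict(url) {
    const hit = this.#entries.get(url);
    if (hit) { this.#bytes -= hit.value.length; this.#entries.delete(url); }
  }
}
```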

The Full Flow

URL + prompt
    ↓
[Cache hit?] → return cached
    ↓
[Fetch]
    ↓
[HTML?] → Turndown → Markdown
    ↓
[Trusted + Content-Type: text/markdown + <100k?] → use directly
    ↓
[Otherwise] → small model extracts relevant info
    ↓
[Cache result]
    ↓
[Return to Claude]

Should You Serve Markdown to Agents?

Given all this, is it worth serving Markdown instead of HTML if you're building docs that agents will consume?

Short answer: clean content matters more than format.

The HTML-to-Markdown conversion is good enough. And the small model processing might actually help by filtering out noise from messy pages.

But if you're building specifically for agent consumption, Markdown is the path of least resistance:

  1. No conversion step
  2. Predictable output
  3. Potential fast-path if you're on the trusted list
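If you do go that route, the header is the part that matters: the fast path keys off Content-Type: text/markdown. A framework-agnostic sketch of a route handler (the paths and content are hypothetical):

```javascript
// Hypothetical docs routes; any real server framework works the same way.
const pages = new Map([
  ["/docs/auth.md", "# Authentication\n\nPass an API key in the `Authorization` header.\n"],
]);

function serveDoc(path) {
  const body = pages.get(path);
  if (body === undefined) return { status: 404, headers: {}, body: "" };
  return {
    status: 200,
    // This header (plus being on the trusted list and staying under 100k
    // characters) is what lets an agent skip conversion and distillation.
    headers: { "Content-Type": "text/markdown; charset=utf-8" },
    body,
  };
}
```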

The real wins come from semantic structure—proper headings, code blocks, minimal boilerplate. Not the format itself.

Takeaway

Claude Code doesn't just fetch and dump. It:

  1. Converts HTML to Markdown (Turndown)
  2. Optionally processes with a small model to extract relevant info
  3. Caches results for 15 minutes
  4. Has a fast path for trusted documentation sites

Pretty reasonable architecture for keeping token costs down while still giving the model useful context.

A Small Feature Request

One thing I'd love to see: when Claude Code fetches a URL, show both the original document size and the size of what actually ends up in context. Maybe even the token cost.

Right now you see "fetched 847KB" and have no idea if that's what Claude actually receives or if it got compressed down to 12KB after processing. Given all this clever optimization happening behind the scenes, it'd be nice to see it reflected in the UI—both for peace of mind and to better understand what you're paying for.