Why does Claude Code's WebFetch fail on Reddit, X, or paywalled pages?

WebFetch is a server-side HTTP request: it gets the raw HTML the server returns to an unauthenticated client. Reddit's content lives in a React SPA that renders client-side; X requires a logged-in session; paywalled Substacks return only the paywall page. The pages exist, but WebFetch sees only the shell.

How do I get full Reddit/X/Xiaohongshu content into a Claude Code session?

Use a browser extension (like Web2MD) to convert the page in your real browser, then paste the Markdown into Claude Code. The extension reads the rendered DOM in your authenticated session, sidestepping every server-side block. Pasting clean Markdown also costs ~40% fewer tokens than raw HTML.

Can I automate this — let Claude Code trigger Web2MD?

Yes. Web2MD ships an MCP server. Add it to Claude Code's MCP config and Claude Code can call `web2md.convert(url)` as a tool. It runs the conversion through your browser session (via Native Messaging) and returns Markdown, which Claude Code consumes like any other tool output.

Does this work with Claude Code's slash commands and skills?

Yes. Build a /research slash command that takes a URL, calls Web2MD's MCP server, and writes the result to a working file. Or wrap the flow in a skill so any Claude Code instance can run it. The conversion step is the one piece the platform does not own.

What about Claude Code subagents reading web content?

Subagents inherit the parent's tool list, so an Explore or research subagent can call the Web2MD MCP server the same way. This is the canonical 'agent reads the web through your browser session' pattern when running locally.

Is this allowed by Anthropic's usage policy?

Personal-research use of public webpages is normal. Reddit/X/Xiaohongshu content you can already see when logged in is what you are converting, not data you would not otherwise have access to. For commercial pipelines that re-distribute content, check the source site's terms separately from Anthropic's policy.

Claude Code Web Research Workflow: Feed Webpages to Claude Code Without WebFetch's Limits

Claude Code became my main interface for non-trivial coding tasks somewhere around the 4.5 release. The one thing I expected to work that does not is feeding it real webpages for research. The built-in WebFetch tool fails on most of the pages I actually need: Reddit threads, X status pages, paywalled Substacks, Xiaohongshu posts, anything behind a login or on a modern SPA.

The rest of this post is the workflow I use to fill that gap.

Why WebFetch fails on the pages that matter

WebFetch is a server-side HTTP request. Claude Code's tool sends a GET to the URL, parses whatever HTML comes back, and feeds it to the model. That works fine for:

Static documentation pages (MDN, Python docs, GitHub READMEs)
Wikipedia
Plain blog posts on traditional CMSes

It fails on:

Reddit: Content is rendered client-side in a React SPA. WebFetch gets a 200 KB shell of nav, login banners, and the first 1–2 comments, never the full thread.
X (Twitter): SPA + auth-gated. WebFetch lands on a login wall.
Paywalled Substack / paid newsletters: WebFetch sees the paywall, not the article you actually pay for.
Xiaohongshu, WeChat public account articles, Zhihu: SPA + aggressive anti-bot. WebFetch returns empty or a captcha page.
Most modern marketing sites: lazy-loaded content, JavaScript hydration; WebFetch misses the main content.

The pages that matter for real research — niche subreddits, developer threads on X, paid analysis on Substack, Chinese platform content — are exactly the ones WebFetch cannot read.
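To make the "shell" failure concrete, here is a minimal sketch of a heuristic that flags HTML with almost no visible text, which is what an unauthenticated GET typically receives from a client-rendered page. It is not how WebFetch works internally; the names `visible_text` and `looks_like_shell`, the 200-character threshold, and both sample pages are illustrative assumptions.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects text a reader would see, skipping script/style/noscript bodies."""

    SKIPPED = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(html: str) -> str:
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def looks_like_shell(html: str, min_chars: int = 200) -> bool:
    """Heuristic only: very little visible text suggests an unrendered shell."""
    return len(visible_text(html)) < min_chars


# Hypothetical samples: an empty SPA mount point vs. a server-rendered article.
SPA_SHELL = (
    '<html><head><script>window.__DATA__={"a":1}</script></head>'
    '<body><div id="root"></div><noscript>Enable JavaScript</noscript></body></html>'
)
STATIC_PAGE = (
    "<html><body><article>"
    + "<p>Paragraph of real content.</p>" * 20
    + "</article></body></html>"
)
```

A check like this is only useful as a warning signal. A login wall or captcha page can contain plenty of visible text and still not be the content you wanted.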

The workflow that fills the gap

Two pieces:

A browser-side Markdown extractor that reads the actual rendered DOM in your real browser session.
An MCP server that exposes the extractor as a tool Claude Code can call.

Both are what Web2MD ships. The extension converts in the browser, and the MCP server lets Claude Code drive it.

Setup (one time)

Install Web2MD from the Chrome Web Store.
Sign in to Pro (the MCP server is a Pro feature).
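Add the MCP server to your Claude Code config. Claude Code reads MCP servers from .mcp.json in the project root (project scope) or ~/.claude.json (user scope); use whichever file your MCP servers already live in. JSON does not allow comments, so the file should contain only the object below:
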
{
  "mcpServers": {
    "web2md": {
      "command": "npx",
      "args": ["-y", "@web2md/mcp-server"],
      "env": {
        "WEB2MD_API_KEY": "<your key from web2md.org/dashboard>"
      }
    }
  }
}

Restart Claude Code. The tool web2md.convert is now available alongside WebFetch.
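If you already have other MCP servers configured, hand-editing risks clobbering them. The sketch below merges the entry from this post into an existing config without touching other keys. The helper names `add_mcp_server` and `update_config_file` are made up for illustration, and the file path is left to the caller since it depends on your setup.

```python
import json
from pathlib import Path


def add_mcp_server(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of config with entry registered under mcpServers[name]."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, name: str, entry: dict) -> None:
    """Read the JSON file if it exists, merge the entry, and write it back."""
    config = json.loads(path.read_text()) if path.exists() else {}
    merged = add_mcp_server(config, name, entry)
    path.write_text(json.dumps(merged, indent=2) + "\n")


# Values copied from the config shown in this post; the key is a placeholder.
WEB2MD_ENTRY = {
    "command": "npx",
    "args": ["-y", "@web2md/mcp-server"],
    "env": {"WEB2MD_API_KEY": "<your key from web2md.org/dashboard>"},
}
```

Calling `update_config_file(Path("path/to/your/config.json"), "web2md", WEB2MD_ENTRY)` would then add or overwrite only the `web2md` entry.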

Daily use

In any Claude Code session:

> Read https://www.reddit.com/r/ObsidianMD/comments/abc/thread and summarize the top 3 complaints

Claude Code calls web2md.convert(url). Behind the scenes the MCP server speaks Native Messaging to your browser, the extension reads the rendered thread, returns clean Markdown including the full comment tree, and Claude Code summarizes. The same pattern works for X, Xiaohongshu, paywalled content (if you have access), or any other site WebFetch chokes on.
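For a sense of what crosses the wire, MCP tool invocations are JSON-RPC 2.0 messages using the `tools/call` method, and results come back as a list of content blocks. The sketch below builds such a request and unpacks a text result. The tool name `convert` and the `url` argument are assumptions inferred from the `web2md.convert(url)` notation in this post, not a documented schema.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })


def extract_text(response_json: str) -> str:
    """Join the text blocks of a tools/call result; raise if anything failed."""
    message = json.loads(response_json)
    if "error" in message:  # protocol-level failure
        raise RuntimeError(message["error"].get("message", "JSON-RPC error"))
    result = message["result"]
    text = "\n".join(
        block["text"]
        for block in result.get("content", [])
        if block.get("type") == "text"
    )
    if result.get("isError"):  # tool-level failure reported inside the result
        raise RuntimeError(text or "tool reported an error")
    return text


# Hypothetical request for the Reddit thread used in the example prompt.
request = build_tool_call(
    1,
    "convert",  # assumed tool name
    {"url": "https://www.reddit.com/r/ObsidianMD/comments/abc/thread"},  # assumed argument
)
```

You never write these messages yourself in normal use. Claude Code sends them once the server is registered, so this is only meant to show where the URL goes in and where the Markdown comes out.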

Building a /research slash command

For repeated research patterns, wrap the flow in a slash command. In ~/.claude/commands/research.md:

---
description: Read a webpage with Web2MD, summarize the key points, and save raw + summary to ./research/
---

Research the URL passed to this command: $ARGUMENTS

Steps:
1. Call web2md.convert with the URL.
2. Save the full Markdown output to ./research/<slugified-title>.md
3. Write a 5-bullet summary of the main claims.
4. Identify the 3 strongest counterarguments or open questions.
5. Save the summary to ./research/<slugified-title>-summary.md

If web2md.convert fails (extension not running, etc.), fall back to WebFetch
and warn the user that you may have missed dynamic content.

Now /research <url> is a one-line command that produces both the raw archive and a structured analysis. The summary file is your source of truth; the raw .md is there when the summary missed something.
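The command leaves "<slugified-title>" to the model's judgment, which can produce inconsistent filenames across runs. If you would rather pin the naming down, a sketch of one possible rule follows. The functions `slugify` and `research_paths` and the 60-character cap are illustrative choices, not part of Web2MD or Claude Code.

```python
import re
import unicodedata
from pathlib import Path


def slugify(title: str, max_length: int = 60) -> str:
    """Lowercase ASCII slug; non-alphanumeric runs collapse to single hyphens."""
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")
    # Titles with no ASCII letters or digits (e.g. Chinese) end up empty,
    # so fall back to a fixed name instead of writing ".md".
    return slug[:max_length].rstrip("-") or "untitled"


def research_paths(title: str, root: Path = Path("research")) -> tuple[Path, Path]:
    """Return (raw archive path, summary path) matching the command's layout."""
    slug = slugify(title)
    return root / f"{slug}.md", root / f"{slug}-summary.md"
```

Because the ASCII-only rule drops non-Latin titles entirely, for Xiaohongshu or Zhihu pages you would likely want a different fallback, such as a hash of the URL.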

A subagent variant for parallel research

For comparison studies ("compare how 5 different blogs explain X"), spawn parallel subagents:

> Compare how these 5 blog posts explain MCP transport:
>   - https://blog1.example.com/...
>   - https://blog2.example.com/...
>   ...
> Spawn 5 Explore subagents, one per URL. Each calls web2md.convert,
> extracts the 3 main claims, returns a structured summary. Aggregate
> the results into a comparison table.

Each subagent gets clean Markdown, each runs in parallel, and each returns a small structured payload. The parent agent aggregates without ever loading the 5 full pages, so its context holds five short summaries instead of five full documents.
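The fan-out and aggregate shape is easier to see in plain code. This sketch is an analogy, not how Claude Code schedules subagents: `worker` stands in for one subagent (convert the page, extract claims), and only its short return value reaches the aggregation step. All function names here are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def summarize_in_parallel(
    urls: list[str], worker: Callable[[str], list[str]]
) -> dict[str, list[str]]:
    """Run worker(url) for each URL concurrently; keep only the claim lists."""
    with ThreadPoolExecutor(max_workers=len(urls) or 1) as pool:
        results = list(pool.map(worker, urls))  # map preserves input order
    return dict(zip(urls, results))


def comparison_table(claims_by_url: dict[str, list[str]]) -> str:
    """Render one Markdown table row per source, claims joined by semicolons."""
    lines = ["| Source | Main claims |", "| --- | --- |"]
    for url, claims in claims_by_url.items():
        lines.append(f"| {url} | {'; '.join(claims)} |")
    return "\n".join(lines)


def fake_worker(url: str) -> list[str]:
    """Stand-in for a subagent; a real one would read and summarize the page."""
    return [f"claim from {url}"]
```

The point of the design is that the full page text lives and dies inside each worker. The aggregator only ever sees the few lines each worker chose to return.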

The honest limits

What this workflow does NOT do:

Replace a real scraper. If you need to download 10,000 pages programmatically, Web2MD's browser-bound MCP is the wrong shape. Use the REST API for that. It runs server-side, so it only handles pages that load without a logged-in browser session.
Bypass real paywalls. The extension reads what your browser session can see. If you are not subscribed to the Substack, Web2MD reads the paywall, same as WebFetch.
Work in headless CI. The MCP server requires the Chrome extension running in a real browser session. For CI/CD use the REST API instead.

The browser-side approach is for the interactive Claude Code session on your dev machine. That is the workflow this post is about.

Why this matters more than it sounds

Claude Code's core promise is "your AI can drive your dev environment." It already drives the filesystem, your shell, your git. The web is the one input source where the platform handed you a tool (WebFetch) that fails on the pages you actually need. Filling that gap is the difference between "Claude Code can read the web you give it" and "Claude Code can read the web you actually use."

The MCP standard exists exactly for this. Pair Claude Code with a browser-side converter and the workflow is whole.

Install

Web2MD on the Chrome Web Store →

Free tier: 3 conversions/day for the extension. Pro at $9/mo unlocks unlimited conversions, the MCP server, REST API, and bulk export.
