
Claude Code Web Research Workflow: Feed Webpages to Claude Code Without WebFetch's Limits

Zephyr Whimsy · 2026-06-02 · 6 min read

Claude Code became my main interface for non-trivial coding tasks somewhere around the 4.5 release. The one thing I expected to work that does not is feeding it real webpages for research. The built-in WebFetch tool fails on most of the pages I actually need: Reddit threads, X status pages, paywalled Substacks, Xiaohongshu posts, anything behind a login or on a modern SPA.

The rest of this post is the workflow I use to fill that gap.

Why WebFetch fails on the pages that matter

WebFetch is a server-side HTTP request. Claude Code's tool sends a GET to the URL, parses whatever HTML comes back, and feeds it to the model. That works fine for:

  • Static documentation pages (MDN, Python docs, GitHub READMEs)
  • Wikipedia
  • Plain blog posts on traditional CMSes

It fails on:

  • Reddit: Content is rendered client-side in a React SPA. WebFetch gets a 200KB shell of nav, login banners, and the first 1–2 comments — never the full thread.
  • X (Twitter): SPA + auth-gated. WebFetch lands on a login wall.
  • Paywalled Substack / paid newsletters: WebFetch sees the paywall, not the article you actually pay for.
  • Xiaohongshu, WeChat public account articles, Zhihu: SPA + aggressive anti-bot. WebFetch returns empty or a captcha page.
  • Most modern marketing sites: lazy-loaded content, JavaScript hydration; WebFetch misses the main content.
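To make the failure mode concrete, here is a minimal sketch of what any server-side fetcher has to work with. It is not WebFetch's actual implementation; the `VisibleText` parser and the two sample pages are made up for illustration. The point is that a client-rendered page ships its content inside a script, so the readable text in the raw HTML is just the shell.

```python
from html.parser import HTMLParser
from typing import List


class VisibleText(HTMLParser):
    """Collect text nodes, ignoring the bodies of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def visible_text(html: str) -> str:
    """Return the text a reader would see if no JavaScript ever ran."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical SPA shell: the thread only exists as data for client-side rendering.
SPA_SHELL = """
<html><body>
  <nav>Log in to continue</nav>
  <div id="root"></div>
  <script>window.__DATA__ = {"comments": ["the full thread lives here"]};</script>
</body></html>
"""

# Hypothetical static page: the content is already in the HTML.
STATIC_PAGE = "<html><body><article><p>The full article text.</p></article></body></html>"
```

Running `visible_text` on the static page returns the article, while the SPA shell yields only the login banner. A real browser session would execute the script and fill in `#root`, which is the gap the rest of this post is about.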

The pages that matter for real research — niche subreddits, developer threads on X, paid analysis on Substack, Chinese platform content — are exactly the ones WebFetch cannot read.

The workflow that fills the gap

Two pieces:

  1. A browser-side Markdown extractor that reads the actual rendered DOM in your real browser session.
  2. An MCP server that exposes the extractor as a tool Claude Code can call.

Web2MD ships both. The extension converts the page in the browser, and the MCP server lets Claude Code drive it.

Setup (one time)

  1. Install Web2MD from the Chrome Web Store.
  2. Sign in to Pro (the MCP server is a Pro feature).
  3. Add the MCP server to your Claude Code config (~/.claude/claude_code_config.json, or wherever your MCP servers live):
{
  "mcpServers": {
    "web2md": {
      "command": "npx",
      "args": ["-y", "@web2md/mcp-server"],
      "env": {
        "WEB2MD_API_KEY": "<your key from web2md.org/dashboard>"
      }
    }
  }
}

Restart Claude Code. The tool web2md.convert is now available alongside WebFetch.
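If your config already lists other MCP servers, merge the new entry in rather than pasting over the file. A minimal sketch, assuming the config is a JSON object with an `mcpServers` key as shown above; `add_web2md_server` is a hypothetical helper, not something Web2MD provides.

```python
import copy
from typing import Any, Dict


def add_web2md_server(config: Dict[str, Any], api_key: str) -> Dict[str, Any]:
    """Return a copy of the config with the web2md entry added, leaving other servers alone."""
    merged = copy.deepcopy(config)
    servers = merged.setdefault("mcpServers", {})
    servers["web2md"] = {
        "command": "npx",
        "args": ["-y", "@web2md/mcp-server"],
        "env": {"WEB2MD_API_KEY": api_key},
    }
    return merged
```

Load the existing file with `json.load`, pass the result through this function, and write it back with `json.dump(..., indent=2)`.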

Daily use

In any Claude Code session:

> Read https://www.reddit.com/r/ObsidianMD/comments/abc/thread and summarize the top 3 complaints

Claude Code calls web2md.convert(url). Behind the scenes, the MCP server speaks Native Messaging to your browser. The extension reads the rendered thread and returns clean Markdown, including the full comment tree, and Claude Code summarizes it. The same pattern works for X, Xiaohongshu, paywalled content (if you have access), or any other site WebFetch chokes on.
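For the curious, Chrome's Native Messaging wire format is simple: each message is UTF-8 JSON preceded by a 32-bit length in native byte order. The sketch below shows only that framing. The message fields (`action`, `url`) are my own placeholders, not Web2MD's actual schema.

```python
import json
import struct
from typing import Tuple


def encode_message(payload: dict) -> bytes:
    """Frame a JSON payload: 4-byte native-order length, then the UTF-8 body."""
    body = json.dumps(payload).encode("utf-8")
    return struct.pack("=I", len(body)) + body


def decode_message(buffer: bytes) -> Tuple[dict, bytes]:
    """Read one framed message from the buffer and return (message, remaining bytes)."""
    if len(buffer) < 4:
        raise ValueError("incomplete length prefix")
    (length,) = struct.unpack("=I", buffer[:4])
    body = buffer[4:4 + length]
    if len(body) < length:
        raise ValueError("incomplete message body")
    return json.loads(body.decode("utf-8")), buffer[4 + length:]


# Hypothetical request shape, for illustration only.
request = {"action": "convert", "url": "https://example.com/thread"}
frame = encode_message(request)
```

In a real native host these frames travel over the process's stdin and stdout, so the length prefix is what lets each side know where one message ends and the next begins.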

Building a /research slash command

For repeated research patterns, wrap the flow in a slash command. In ~/.claude/commands/research.md:

---
description: Read a webpage with Web2MD, summarize the key points, and save raw + summary to ./research/
---

I need you to research the URL the user provided.

Steps:
1. Call web2md.convert with the URL.
2. Save the full Markdown output to ./research/<slugified-title>.md
3. Write a 5-bullet summary of the main claims.
4. Identify the 3 strongest counterarguments or open questions.
5. Save the summary to ./research/<slugified-title>-summary.md

If web2md.convert fails (extension not running, etc.), fall back to WebFetch
and warn the user that you may have missed dynamic content.

Now /research <url> is a one-line command that produces both the raw archive and a structured analysis. The summary file is your source of truth; the raw .md is there when the summary missed something.
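If it helps to see the command's logic spelled out, here is a rough Python equivalent of the slug, fallback, and save steps. Claude Code does all of this itself from the prompt; every name below (`slugify`, `fetch_markdown`, `save_raw`, and the `convert`/`fallback` callables) is a stand-in I made up for illustration.

```python
import re
from pathlib import Path
from typing import Callable, Tuple


def slugify(title: str) -> str:
    """Lowercase the title and collapse each non-alphanumeric run into one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-") or "untitled"


def fetch_markdown(
    url: str,
    convert: Callable[[str], str],
    fallback: Callable[[str], str],
) -> Tuple[str, bool]:
    """Try the browser-side converter first; return (markdown, used_fallback)."""
    try:
        return convert(url), False
    except Exception:
        # Assumed failure mode: extension not running, browser closed, and so on.
        return fallback(url), True


def save_raw(title: str, markdown: str, out_dir: Path) -> Path:
    """Write the raw archive to <out_dir>/<slug>.md and return the path."""
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{slugify(title)}.md"
    path.write_text(markdown, encoding="utf-8")
    return path
```

The `used_fallback` flag is what drives the warning in the last step of the prompt: when it is true, the saved Markdown may be missing dynamic content.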

A subagent variant for parallel research

For comparison studies ("compare how 5 different blogs explain X"), spawn parallel subagents:

> Compare how these 5 blog posts explain MCP transport:
>   - https://blog1.example.com/...
>   - https://blog2.example.com/...
>   ...
> Spawn 5 Explore subagents, one per URL. Each calls web2md.convert,
> extracts the 3 main claims, returns a structured summary. Aggregate
> the results into a comparison table.

Each subagent gets clean Markdown, runs in parallel, and returns a small structured payload. The parent agent aggregates without ever loading the 5 full pages, so its context window holds only the five summaries rather than five articles' worth of text.
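The same fan-out and aggregate shape, sketched in plain Python under heavy assumptions: threads stand in for subagents, a `convert` callable stands in for web2md.convert, and "the main claims" are crudely approximated as the first few bullet lines. It is only meant to show why the parent stays small.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def summarize(url: str, convert: Callable[[str], str], max_claims: int = 3) -> Dict[str, object]:
    """Worker: load one full page, hand back only a small structured payload."""
    markdown = convert(url)
    # Stand-in for the subagent's judgment: treat top-level bullet lines as claims.
    claims = [line[2:].strip() for line in markdown.splitlines() if line.startswith("- ")]
    return {"url": url, "claims": claims[:max_claims]}


def compare(urls: List[str], convert: Callable[[str], str]) -> str:
    """Parent: one worker per URL, then a Markdown table built from the payloads alone."""
    with ThreadPoolExecutor(max_workers=max(1, len(urls))) as pool:
        payloads = list(pool.map(lambda u: summarize(u, convert), urls))
    rows = ["| URL | Main claims |", "| --- | --- |"]
    for payload in payloads:
        rows.append(f"| {payload['url']} | {'; '.join(payload['claims'])} |")
    return "\n".join(rows)
```

Only `summarize` ever holds a full page. `compare` sees the payload dictionaries and nothing else, which mirrors how the parent agent's context is spared.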

The honest limits

What this workflow does NOT do:

  • Replace a real scraper. If you need to download 10,000 pages programmatically, Web2MD's browser-bound MCP is the wrong shape. Use the REST API for that. It runs server-side, so it only handles pages that can be fetched without a browser session.
  • Bypass real paywalls. The extension reads what your browser session can see. If you are not subscribed to the Substack, Web2MD reads the paywall, same as WebFetch.
  • Work in headless CI. The MCP server requires the Chrome extension running in a real browser session. For CI/CD use the REST API instead.

The browser-side approach is for the interactive Claude Code session on your dev machine. That is the workflow this post is about.

Why this matters more than it sounds

Claude Code's core promise is "your AI can drive your dev environment." It already drives the filesystem, your shell, your git. The web is the one input source where the platform handed you a tool (WebFetch) that fails on the pages you actually need. Filling that gap is the difference between "Claude Code can read the web you give it" and "Claude Code can read the web you actually use."

The MCP standard exists exactly for this. Pair Claude Code with a browser-side converter and the workflow is whole.

Install

Web2MD on the Chrome Web Store →

Free tier: 3 conversions/day for the extension. Pro at $9/mo unlocks unlimited conversions, the MCP server, REST API, and bulk export.
