Overview
The Web2MD CLI lets you convert any URL to clean Markdown directly from your terminal. Pipe output to LLMs, batch-process URL lists, or ingest content into your Obsidian vault, all without opening a browser.
The CLI requires Node.js 18+. Run node -v to check your version.
Installation
No installation is needed; run the CLI with npx.
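Since the CLI depends on Node.js 18+, a setup script may want to verify the requirement before the first run. The following is a minimal sketch: the helpers `node_major` and `node_ok` are hypothetical names, and the package name in the final comment (`web2md`) is an assumption that this page does not confirm.

```shell
#!/bin/sh
# Extract the major version from a `node -v` style string such as "v18.17.0".
# `node_major` is a hypothetical helper, not part of the Web2MD CLI.
node_major() {
  version="${1#v}"          # drop the leading "v"
  echo "${version%%.*}"     # keep everything before the first dot
}

# Succeeds when the given version string satisfies the Node.js 18+ requirement.
node_ok() {
  [ "$(node_major "$1")" -ge 18 ]
}

# Check the installed Node.js version when one is available.
if command -v node >/dev/null 2>&1; then
  if node_ok "$(node -v)"; then
    echo "Node.js $(node -v) is new enough"
  else
    echo "Node.js $(node -v) is too old; 18+ is required" >&2
  fi
else
  echo "Node.js not found on PATH" >&2
fi

# Assumed invocation (package name not confirmed by this page):
#   npx web2md https://example.com
```

Keeping the version parsing in its own function makes the check easy to reuse in CI scripts.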
Modes
Web2MD CLI operates in three modes depending on your configuration:
Local
Default mode. No API key required. Fetches pages and converts them locally. Works for most public websites.
Server
With API key. Set WEB2MD_API_KEY to unlock Reddit, Fandom/Wikia, and other restricted sites that require server-side handling.
Bridge
With --bridge flag. Uses your Chrome extension to fetch JS-rendered or login-protected pages that static fetching cannot reach.
Flags
| Flag | Description |
|---|---|
| `--no-images` | Strip image references from output |
| `--no-links` | Strip hyperlinks from output |
| `--meta` | Add YAML frontmatter (title, source, wordCount, tokenCount, readingTime, date) |
| `--json` | Output as JSON `{ markdown, metadata }` |
| `-o, --output <file>` | Write output to a file |
| `--output-dir <dir>` | Write each URL to a separate `.md` file |
| `--batch <file>` | Read URLs from a file (one per line, `#` = comment) |
| `--vault <dir>` | Obsidian vault mode: saves to `<dir>/raw/` and updates `<dir>/INDEX.md` |
| `--concurrency <n>` | Max parallel fetches (default: 3, max: 20) |
| `--bridge` | Use Chrome extension for JS-rendered or login-protected sites |
| `-q, --quiet` | Suppress progress messages |
Environment variables
| Variable | Description |
|---|---|
| `WEB2MD_API_KEY` | API key (`w2m_xxx`) for Reddit and restricted sites |
| `WEB2MD_API_URL` | Override the API base URL |
| `WEB2MD_EXTENSION_ID` | Override the Chrome extension ID for `--bridge` mode |
Usage examples
Basic conversion
Pipe to an LLM
Save to file
The --meta flag prepends YAML frontmatter with title, source URL, word count, token count, reading time, and date.
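Once a file carries frontmatter, downstream scripts can read individual fields from it. The sketch below is illustrative only: the field names come from the flag description, but the sample values and the exact YAML layout are assumptions, and `frontmatter_field` is a hypothetical helper.

```shell
#!/bin/sh
# Sample file standing in for output saved with --meta.
# Field names follow the flag description; values and layout are hypothetical.
file=$(mktemp)
cat > "$file" <<'EOF'
---
title: Example Article
source: https://example.com/article
wordCount: 1200
tokenCount: 1600
readingTime: 6
date: 2024-01-01
---
# Example Article

Body text.
EOF

# Print the value of one frontmatter field from a Markdown file.
# Only the block between the first pair of --- lines is searched.
frontmatter_field() {
  awk -v key="$2" '
    /^---$/ { fence++; next }              # count the --- delimiters
    fence == 1 && index($0, key ": ") == 1 {
      print substr($0, length(key) + 3)    # text after "key: "
      exit
    }
    fence >= 2 { exit }                    # stop once the frontmatter ends
  ' "$1"
}

frontmatter_field "$file" title
frontmatter_field "$file" tokenCount
```

A field such as tokenCount could then be used, for example, to skip pages that would not fit into a model's context window.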
Batch from file
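Create a file urls.txt with one URL per line; lines starting with # are treated as comments.
Each URL is written to a separate .md file in the ./research directory (the --output-dir target).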
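A URL list in the format described for --batch could be prepared as follows. This is a sketch with placeholder URLs; the count is only a sanity check that mirrors the documented comment rule, and the package name in the final comment (`web2md`) is an assumption.

```shell
#!/bin/sh
# Write a sample URL list: one URL per line, lines starting with # are comments.
# The URLs are placeholders.
cat > urls.txt <<'EOF'
# Research reading list
https://example.com/article-one
https://example.com/article-two
# Reference material
https://example.org/reference
EOF

# Count the lines that are neither comments nor blank, as a sanity check
# before starting the batch.
url_count=$(grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' urls.txt | wc -l | tr -d ' ')
echo "urls.txt lists $url_count URLs"

# Assumed invocation (package name not confirmed by this page):
#   npx web2md --batch urls.txt --output-dir ./research
```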
Obsidian vault ingestion
Vault mode (--vault) saves converted pages to ~/Documents/MyVault/raw/ and updates ~/Documents/MyVault/INDEX.md with links to all converted pages.
Reddit with API key
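Server mode depends on WEB2MD_API_KEY being present in the environment. A wrapper script might check for it before requesting a restricted site. The sketch below assumes keys start with `w2m_`, based on the `w2m_xxx` form shown in the environment variable table; `looks_like_api_key` is a hypothetical helper, and the package name and URL in the final comment are placeholders.

```shell
#!/bin/sh
# Succeeds when the given value looks like a Web2MD API key.
# Only the w2m_ prefix is checked; the rest of the key format is unknown.
looks_like_api_key() {
  case "$1" in
    w2m_?*) return 0 ;;   # prefix plus at least one more character
    *)      return 1 ;;
  esac
}

if looks_like_api_key "${WEB2MD_API_KEY:-}"; then
  echo "WEB2MD_API_KEY is set"
else
  echo "WEB2MD_API_KEY is missing or lacks the w2m_ prefix" >&2
fi

# Assumed invocation (package name and URL are placeholders):
#   WEB2MD_API_KEY=w2m_xxx npx web2md https://www.reddit.com/r/<subreddit>/comments/<id>/
```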
Bridge mode
Use the Chrome extension to handle JS-rendered or login-protected pages.
Bridge mode requires the Web2MD Chrome extension to be installed and Chrome to be running. The CLI communicates with the extension via Chrome’s native messaging protocol.
JSON output
The --json output can be processed with jq.
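As an illustration of consuming the JSON shape, the sketch below runs jq over a sample payload. The top-level keys (markdown, metadata) are documented in the flag table; the metadata fields shown are assumed to mirror the --meta frontmatter, and all values are hypothetical.

```shell
#!/bin/sh
# Sample payload standing in for --json output (values are hypothetical).
json='{"markdown":"# Example\n\nBody text.","metadata":{"title":"Example","source":"https://example.com"}}'

if command -v jq >/dev/null 2>&1; then
  # -r prints raw strings instead of JSON-quoted ones.
  title=$(printf '%s' "$json" | jq -r '.metadata.title')
  first_line=$(printf '%s' "$json" | jq -r '.markdown' | head -n 1)
  echo "title: $title"
  echo "first line: $first_line"
else
  echo "jq is not installed" >&2
fi
```

In real use the payload would come from the CLI on standard input rather than from a shell variable.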
Optimized sites
Web2MD includes built-in adapters for these sites, producing cleaner output than generic conversion:
Sites with optimized support
- Wikipedia — clean article extraction, infobox handling
- arXiv — paper abstracts and metadata
- Hacker News — threads with comments
- GitHub — Issues and Pull Requests
- Stack Overflow — questions and answers
- dev.to — blog posts
- Medium — articles (bypasses paywall preview)
- Substack — newsletter posts
- OpenAI Docs — documentation pages
- Mintlify-based docs — documentation sites built on Mintlify
- Reddit — posts and comments (requires API key)
Common workflows
Feed documentation to an AI agent
Point your agent at ./context for grounded answers.
Build a research corpus
Strip formatting for LLM input
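The flags from the table can be combined to produce text-only output for a model. The following sketch only assembles and prints such a command; it assumes the CLI is invoked as `npx web2md`, which this page does not confirm, and the URL is a placeholder.

```shell
#!/bin/sh
# Build the argument list once so it can be inspected before running.
# Assumption: the CLI is invoked as `npx web2md`; the URL is a placeholder.
url="https://example.com/post"
set -- npx web2md "$url" --no-images --no-links -q

# Print the command instead of running it; run "$@" directly to execute.
cmd="$*"
echo "$cmd"
```

Dropping images and links removes tokens that rarely help a model, and -q keeps progress messages out of the output.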