Overview
The Web2MD CLI lets you convert any URL to clean Markdown directly from your terminal. Pipe output to LLMs, batch-process URL lists, or ingest content into your Obsidian vault, all without opening a browser.
The CLI requires Node.js 18+. Run `node -v` to check your version.
Installation
No installation is needed; run the CLI with npx.
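Since npx runs on the local Node.js runtime, it can help to verify the Node.js 18+ requirement before scripting around the CLI. The following is a minimal sketch, not part of Web2MD; the helper names are hypothetical, and it only assumes that `node -v` prints a string such as `v18.17.0`.

```python
import re
import subprocess

MIN_NODE_MAJOR = 18  # requirement stated in the overview


def parse_node_major(version_output: str) -> int:
    """Extract the major version from `node -v` output such as 'v18.17.0'."""
    match = re.match(r"v?(\d+)\.", version_output.strip())
    if match is None:
        raise ValueError(f"unrecognized Node.js version string: {version_output!r}")
    return int(match.group(1))


def node_is_supported(version_output: str) -> bool:
    """Return True when the reported version satisfies the Node.js 18+ requirement."""
    return parse_node_major(version_output) >= MIN_NODE_MAJOR


def installed_node_version() -> str:
    """Run `node -v` and return its raw output (raises if node is not on PATH)."""
    result = subprocess.run(["node", "-v"], capture_output=True, text=True, check=True)
    return result.stdout
```

Calling `node_is_supported(installed_node_version())` at the top of a wrapper script gives an early, readable failure instead of an error from npx itself.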
Modes
Web2MD CLI operates in three modes depending on your configuration:
Local
Default mode. No API key required. Fetches pages and converts them locally. Works for most public websites.
Server
With API key. Set `WEB2MD_API_KEY` to unlock Reddit, Fandom/Wikia, and other restricted sites that require server-side handling.
Bridge
With the `--bridge` flag. Uses your Chrome extension to fetch JS-rendered or login-protected pages that static fetching cannot reach.
Flags
| Flag | Description |
|---|---|
| `--no-images` | Strip image references from output |
| `--no-links` | Strip hyperlinks from output |
| `--meta` | Add YAML frontmatter (title, source, wordCount, tokenCount, readingTime, date) |
| `--json` | Output as JSON `{ markdown, metadata }` |
| `-o, --output <file>` | Write output to a file |
| `--output-dir <dir>` | Write each URL to a separate .md file |
| `--batch <file>` | Read URLs from a file (one per line, # = comment) |
| `--vault <dir>` | Obsidian vault mode: saves to `<dir>/raw/` and updates `<dir>/INDEX.md` |
| `--concurrency <n>` | Max parallel fetches (default: 3, max: 20) |
| `--bridge` | Use Chrome extension for JS-rendered or login-protected sites |
| `-q, --quiet` | Suppress progress messages |
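When the CLI is driven from a script, the flags above can be assembled into an argument list. The sketch below is one way to do that in Python, under stated assumptions: the launch command `npx web2md` is a placeholder (the real package name is not given here), flags are placed before the URLs, and a lower concurrency bound of 1 is assumed. Only the flag names and the maximum of 20 come from the table.

```python
from typing import List, Optional, Sequence

# Assumption: placeholder launch command; substitute the real package name.
BASE_COMMAND = ["npx", "web2md"]
MAX_CONCURRENCY = 20  # documented maximum (the documented default is 3)


def build_command(
    urls: Sequence[str] = (),
    *,
    no_images: bool = False,
    no_links: bool = False,
    meta: bool = False,
    as_json: bool = False,
    bridge: bool = False,
    quiet: bool = False,
    output: Optional[str] = None,
    output_dir: Optional[str] = None,
    batch: Optional[str] = None,
    vault: Optional[str] = None,
    concurrency: Optional[int] = None,
) -> List[str]:
    """Translate keyword options into the documented command-line flags."""
    # Assumption: values below 1 are rejected; only the upper bound is documented.
    if concurrency is not None and not 1 <= concurrency <= MAX_CONCURRENCY:
        raise ValueError(f"concurrency must be between 1 and {MAX_CONCURRENCY}")

    cmd = list(BASE_COMMAND)
    switches = [
        (no_images, "--no-images"),
        (no_links, "--no-links"),
        (meta, "--meta"),
        (as_json, "--json"),
        (bridge, "--bridge"),
        (quiet, "--quiet"),
    ]
    cmd += [flag for enabled, flag in switches if enabled]

    valued = [
        (output, "--output"),
        (output_dir, "--output-dir"),
        (batch, "--batch"),
        (vault, "--vault"),
        (concurrency, "--concurrency"),
    ]
    for value, flag in valued:
        if value is not None:
            cmd += [flag, str(value)]

    cmd.extend(urls)
    return cmd
```

The resulting list can be passed to `subprocess.run` as is, which avoids shell quoting problems with URLs that contain `&` or `?`.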
Environment variables
| Variable | Description |
|---|---|
| `WEB2MD_API_KEY` | API key (w2m_xxx) for Reddit and restricted sites |
| `WEB2MD_API_URL` | Override the API base URL |
| `WEB2MD_EXTENSION_ID` | Override the Chrome extension ID for `--bridge` mode |
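The relationship between configuration and the three modes can be pictured as a small decision function. This is an illustrative sketch only, not the CLI's actual logic: the function name is hypothetical, and the precedence (an explicit `--bridge` flag winning over an API key) is an assumption that the documentation does not confirm.

```python
from typing import Mapping


def select_mode(bridge_flag: bool, env: Mapping[str, str]) -> str:
    """Return 'bridge', 'server', or 'local' for the given configuration."""
    # Assumption: --bridge takes precedence when both it and an API key are present.
    if bridge_flag:
        return "bridge"
    # A non-empty WEB2MD_API_KEY corresponds to server mode.
    if env.get("WEB2MD_API_KEY", "").strip():
        return "server"
    # Neither flag nor key: the default local mode.
    return "local"
```

Passing `os.environ` as `env` reports the mode that the current shell configuration would suggest.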
Usage examples
Basic conversion
Pipe to an LLM
Save to file
The `--meta` flag prepends YAML frontmatter with title, source URL, word count, token count, reading time, and date.
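A file saved with frontmatter can be split back into metadata and body by downstream tools. The sketch below assumes the conventional layout (a block delimited by `---` lines at the top of the file, with simple `key: value` pairs); the sample values are invented for illustration, and a real YAML parser would be more robust for quoted or multi-line values.

```python
from typing import Dict, Tuple


def split_frontmatter(document: str) -> Tuple[Dict[str, str], str]:
    """Split a Markdown document into (frontmatter fields, body)."""
    lines = document.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, document  # no frontmatter present

    # Find the closing delimiter.
    for end, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            break
    else:
        return {}, document  # opening delimiter without a closing one

    metadata: Dict[str, str] = {}
    for line in lines[1:end]:
        # Split on the first colon only, so URLs in values stay intact.
        key, sep, value = line.partition(":")
        if sep:
            metadata[key.strip()] = value.strip()

    body = "\n".join(lines[end + 1:])
    return metadata, body


# Invented sample; field names follow the --meta description.
SAMPLE = (
    "---\n"
    "title: Example Domain\n"
    "source: https://example.com\n"
    "wordCount: 28\n"
    "---\n"
    "# Example Domain\n"
    "\n"
    "Body text.\n"
)
```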
Batch from file
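Create a file `urls.txt` with one URL per line; lines starting with # are treated as comments.
With `--output-dir ./research`, each URL is written to a separate .md file in the ./research directory.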
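The batch file format (one URL per line, `#` for comments) is simple enough to validate before a long run. The following is a minimal sketch of such a reader; skipping blank lines and treating only whole-line comments as comments are assumptions, chosen so that URLs containing a `#` fragment are left intact.

```python
from typing import List


def parse_url_list(text: str) -> List[str]:
    """Return the URLs from batch-file text, skipping comments and blank lines."""
    urls: List[str] = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Assumption: blank lines are ignored; '#' marks a comment only at line start.
        if not line or line.startswith("#"):
            continue
        urls.append(line)
    return urls
```

For example, `parse_url_list(open("urls.txt", encoding="utf-8").read())` returns the list of URLs a batch run would be expected to process.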
Obsidian vault ingestion
Saves converted pages to ~/Documents/MyVault/raw/ and updates ~/Documents/MyVault/INDEX.md with links to all converted pages.
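The vault layout described here (a `raw/` folder plus an `INDEX.md` at the vault root) can be checked from a script after ingestion. This sketch is hypothetical helper code, not part of Web2MD. The format of the links inside INDEX.md is not documented here, so the check below only looks for each note's file stem anywhere in the index text, which is a rough heuristic.

```python
from pathlib import Path
from typing import List, Tuple, Union


def vault_layout(vault_dir: Union[str, Path]) -> Tuple[Path, Path]:
    """Return (raw notes directory, index file) for a vault root."""
    root = Path(vault_dir).expanduser()
    return root / "raw", root / "INDEX.md"


def unindexed_notes(vault_dir: Union[str, Path]) -> List[str]:
    """List .md files in raw/ whose stem does not appear in INDEX.md."""
    raw_dir, index_file = vault_layout(vault_dir)
    index_text = index_file.read_text(encoding="utf-8") if index_file.exists() else ""
    # Heuristic: a note counts as indexed if its stem occurs in the index text.
    return sorted(p.name for p in raw_dir.glob("*.md") if p.stem not in index_text)
```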
Reddit with API key
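Server mode is enabled through the `WEB2MD_API_KEY` environment variable, so a wrapper script can supply the key per invocation instead of exporting it globally. The sketch below builds such an environment in Python. The helper name is hypothetical, and the `w2m_` prefix check is an assumption based on the `w2m_xxx` pattern shown in the environment variable table.

```python
import os
from typing import Dict, Mapping, Optional


def env_with_api_key(
    api_key: str, base_env: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Return a copy of the environment with WEB2MD_API_KEY set."""
    # Assumption: valid keys follow the documented w2m_xxx pattern.
    if not api_key.startswith("w2m_"):
        raise ValueError("expected an API key of the form w2m_xxx")
    env = dict(os.environ if base_env is None else base_env)
    env["WEB2MD_API_KEY"] = api_key
    return env
```

The returned mapping can be passed as the `env` argument of `subprocess.run` when launching the CLI, which keeps the key out of shell history.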
Bridge mode
Use the Chrome extension to handle JS-rendered or login-protected pages.
Bridge mode requires the Web2MD Chrome extension to be installed and Chrome to be running. The CLI communicates with the extension via Chrome's native messaging protocol.
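For readers unfamiliar with native messaging: Chrome frames each message as UTF-8 JSON preceded by a 32-bit length in native byte order. The sketch below illustrates that framing only. It is not Web2MD's code, the payload shape in the example is invented, and how the CLI connects to the extension's host is not described here.

```python
import json
import struct
from typing import Any, Tuple


def encode_message(payload: Any) -> bytes:
    """Frame a JSON-serializable payload as a native messaging message."""
    body = json.dumps(payload).encode("utf-8")
    # "=I": unsigned 32-bit length in native byte order, standard size.
    return struct.pack("=I", len(body)) + body


def decode_message(data: bytes) -> Tuple[Any, bytes]:
    """Decode one framed message; return (payload, remaining bytes)."""
    if len(data) < 4:
        raise ValueError("incomplete length prefix")
    (length,) = struct.unpack("=I", data[:4])
    body = data[4:4 + length]
    if len(body) < length:
        raise ValueError("truncated message body")
    return json.loads(body.decode("utf-8")), data[4 + length:]
```

Round-tripping a hypothetical request such as `{"url": "https://example.com"}` through `encode_message` and `decode_message` returns the original payload and an empty remainder.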
JSON output
The JSON output can be piped to jq.
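The `--json` output has the documented top-level shape `{ markdown, metadata }`, which scripts can consume directly. The following is a minimal parsing sketch; the function name is hypothetical and the fields inside `metadata` are not relied on, since only the two top-level keys are documented here.

```python
import json
from typing import Any, Dict, Tuple


def parse_cli_json(raw: str) -> Tuple[str, Dict[str, Any]]:
    """Parse --json output into (markdown, metadata)."""
    data = json.loads(raw)
    if not isinstance(data, dict) or "markdown" not in data or "metadata" not in data:
        raise ValueError("expected an object with 'markdown' and 'metadata' keys")
    return data["markdown"], data["metadata"]
```

In a pipeline, `parse_cli_json(sys.stdin.read())` reads the CLI output from standard input.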
Optimized sites
Web2MD includes built-in adapters for these sites, producing cleaner output than generic conversion:
Sites with optimized support
- Wikipedia — clean article extraction, infobox handling
- arXiv — paper abstracts and metadata
- Hacker News — threads with comments
- GitHub — Issues and Pull Requests
- Stack Overflow — questions and answers
- dev.to — blog posts
- Medium — articles (bypasses paywall preview)
- Substack — newsletter posts
- OpenAI Docs — documentation pages
- Mintlify-based docs — documentation sites built on Mintlify
- Reddit — posts and comments (requires API key)
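Site-specific adapters are commonly selected by matching the URL's hostname. The sketch below shows that general technique. It is not Web2MD's implementation, the adapter names are invented, and the domain table covers only sites from the list whose hostnames are unambiguous (Mintlify-based docs, for example, cannot be recognized by hostname alone).

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical adapter names keyed by registrable domain.
ADAPTER_DOMAINS = {
    "wikipedia.org": "wikipedia",
    "arxiv.org": "arxiv",
    "news.ycombinator.com": "hackernews",
    "github.com": "github",
    "stackoverflow.com": "stackoverflow",
    "dev.to": "devto",
    "medium.com": "medium",
    "substack.com": "substack",
    "reddit.com": "reddit",
}


def pick_adapter(url: str) -> Optional[str]:
    """Return the adapter name for a URL, or None for generic conversion."""
    host = (urlsplit(url).hostname or "").lower()
    for domain, adapter in ADAPTER_DOMAINS.items():
        # Match the domain itself or any subdomain, but not lookalike hosts.
        if host == domain or host.endswith("." + domain):
            return adapter
    return None
```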
Common workflows
Feed documentation to an AI agent
Point your agent at ./context for grounded answers.
Build a research corpus
Strip formatting for LLM input
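The effect of `--no-images` and `--no-links` can be approximated on existing Markdown with two regular expressions. This is a conceptual sketch of what stripping means, not the CLI's implementation. It handles only inline `![alt](url)` and `[text](url)` syntax, so reference-style links and URLs containing parentheses are out of scope.

```python
import re

# Inline image: ![alt](target)
IMAGE_PATTERN = re.compile(r"!\[[^\]]*\]\([^)]*\)")
# Inline link: [text](target); the text is captured so it can be kept.
LINK_PATTERN = re.compile(r"\[([^\]]*)\]\([^)]*\)")


def strip_images(markdown: str) -> str:
    """Remove inline image references entirely."""
    return IMAGE_PATTERN.sub("", markdown)


def strip_links(markdown: str) -> str:
    """Replace inline links with their visible text."""
    return LINK_PATTERN.sub(r"\1", markdown)


def strip_formatting(markdown: str) -> str:
    """Strip images first, then links, so image syntax is not half-rewritten."""
    return strip_links(strip_images(markdown))
```

Images are removed before links because the link pattern would otherwise match the bracketed part of an image and leave a stray `!` behind.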