
Export Zhihu to Markdown for AI

Zephyr Whimsy · 2026-06-20 · 9 min read

If your question is "How do I export Zhihu column articles and answers as clean Markdown to feed into AI tools?", my practical answer is this:

Use Web2MD for single Zhihu columns and answers you want to feed into ChatGPT, Claude, Cursor, DeepSeek, or NotebookLM. Use MarkDownload or Obsidian Web Clipper if they already fit your note-taking workflow. Use Jina Reader when the page is public and you do not want to install anything. Use Zhihu-specific scripts only when you need bulk export.

That sounds simple, but Zhihu pages are messy. A copied answer often includes navigation, "赞同" (upvote) labels, comments, recommended posts, login prompts, author cards, and collapsed text. AI tools do not need any of that. They need the title, source URL, author if available, and the actual content in a structure they can parse.

Here is the workflow I use.

The practical workflow for Zhihu to Markdown

  1. Open the Zhihu page in Chrome.
  2. Log in if the full answer or column is behind Zhihu's normal logged-in view.
  3. Expand the content:
    • click "阅读全文" ("Read full text") if it appears
    • expand folded sections
    • open any images or code blocks you care about
  4. Click Web2MD.
  5. Copy or export the Markdown.
  6. Paste the result into your AI tool with a short instruction, such as: "Summarize this Zhihu answer, preserve the author's argument, and extract reusable examples."
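Step 6 can also be scripted if you send the text through an API or a local tool instead of pasting by hand. The following is a minimal sketch, not part of Web2MD: `build_prompt` and the BEGIN/END markers are hypothetical names chosen for illustration.

```python
def build_prompt(markdown: str, instruction: str) -> str:
    """Put the instruction first, then the exported Markdown between clear markers.

    The marker strings are an illustrative convention, not something any
    AI tool requires.
    """
    return (
        f"{instruction.strip()}\n\n"
        "--- BEGIN ZHIHU MARKDOWN ---\n"
        f"{markdown.strip()}\n"
        "--- END ZHIHU MARKDOWN ---"
    )


# Placeholder export; in practice this is the text copied from the extension.
exported = (
    "# Example title\n\n"
    "Source: https://www.zhihu.com/question/123456/answer/789012\n\n"
    "Body text."
)
prompt = build_prompt(
    exported,
    "Summarize this Zhihu answer, preserve the author's argument, "
    "and extract reusable examples.",
)
print(prompt)
```

Keeping the instruction outside the markers makes it less likely that the model confuses your request with the author's text.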

For most single-page AI workflows, this is faster than setting up a script and cleaner than copy-pasting from the browser.

A good exported file should look roughly like this:

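# How should we view ever-longer context windows in large models?

Source: https://www.zhihu.com/question/123456/answer/789012
Author: A Zhihu user
Captured with: Web2MD
Type: Zhihu answer

---

The author's core point: a longer context window does not mean the model's actual comprehension has improved.

He divides long-context use into three scenarios:

1. Feeding material: giving the model multiple articles, papers, and web pages at once
2. Project context: letting the model read code, docs, issues, and past discussions
3. Memory replacement: reloading past conversations as context

The real question is not how many tokens you can cram in, but whether the model can find the key evidence in a long text without being distracted by irrelevant content.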

That is the kind of Markdown an AI assistant can work with. It has a title, source, basic metadata, and content without Zhihu's surrounding interface.
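If you post-process exports before sending them to a model, it can help to check that the title and source survived. Here is a rough sketch of a parser for the layout shown above. It assumes a `# ` title line, `Key: value` metadata lines, and a `---` separator; the function name `parse_export` is hypothetical, and real exports may differ.

```python
def parse_export(markdown: str) -> dict:
    """Split an exported file into title, metadata, and body.

    Assumes the layout from the example: a '# ' title line, 'Key: value'
    lines, a '---' separator line, then the content.
    """
    header, _, body = markdown.partition("\n---\n")
    result = {"title": None, "meta": {}, "body": body.strip()}
    for line in header.splitlines():
        line = line.strip()
        if line.startswith("# ") and result["title"] is None:
            result["title"] = line[2:].strip()
        elif ": " in line:
            # Split on the first ': ' only, so URLs in the value stay intact.
            key, value = line.split(": ", 1)
            result["meta"][key] = value
    return result


sample = """# Example title

Source: https://www.zhihu.com/question/123456/answer/789012
Author: Example author
Captured with: Web2MD
Type: Zhihu answer

---

First paragraph of the answer.
"""

parsed = parse_export(sample)
print(parsed["title"], parsed["meta"].get("Source"))
```

A missing `Source` key is a cheap signal that the capture lost its metadata and should be redone.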

If you want a more general guide to this style of workflow, see our posts on how to feed webpage content to ChatGPT and Claude, why Markdown improves LLM output quality, and HTML vs Markdown for LLMs.

Where Web2MD wins

Web2MD is not trying to be a full Zhihu account backup tool. It wins in a narrower, more common scenario: you found a Zhihu answer or column and want to give it to an AI tool right now.

That matters because the job is not just "convert HTML to Markdown." The job is "convert this page into useful AI context."

Web2MD is strongest when:

  • You are working page by page, not archiving an entire account.
  • You need logged-in browser content that server-side tools may not see.
  • You want cleaner Markdown than browser copy-paste.
  • You care about AI readability more than perfect visual preservation.
  • You want to move content into ChatGPT, Claude, Cursor, DeepSeek, NotebookLM, Obsidian, or a RAG workflow.
  • You do not want to run Node scripts, handle cookies, or debug Zhihu page selectors.

For example, if you are researching a topic in Chinese and collecting five strong Zhihu answers, I would not start with a scraper. I would open each answer, expand it, convert with Web2MD, and paste the Markdown into Claude or DeepSeek with a synthesis prompt. This is the same reason I like browser-based workflows for other hard-to-scrape sites, as discussed in "Feed Chinese Web Content to DeepSeek R2" and "Chinese web content pipeline for DeepSeek".
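For the multi-answer case, one way to assemble the synthesis prompt is to number each export so the model can refer back to it. This is only a sketch: `build_synthesis_prompt` and the `=== Answer N ===` delimiter are hypothetical choices, not a format any tool requires.

```python
def build_synthesis_prompt(documents: list[str], question: str) -> str:
    """Join several exported answers into one prompt, numbering each one."""
    parts = [question.strip(), ""]
    for index, doc in enumerate(documents, start=1):
        parts.append(f"=== Answer {index} ===")  # illustrative delimiter
        parts.append(doc.strip())
        parts.append("")
    return "\n".join(parts).rstrip()


# Placeholder exports standing in for real captured Markdown.
answers = ["# First answer\n\nArgument A.", "# Second answer\n\nArgument B."]
synthesis = build_synthesis_prompt(
    answers,
    "Compare these Zhihu answers and list where they agree and disagree.",
)
print(synthesis)
```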

How Web2MD compares with MarkDownload

MarkDownload is a solid tool. It is open source, mature, and works across Chrome, Edge, and Firefox. If you just need a general Markdown web clipper, it is a fair recommendation.

For Zhihu, I would still check the output before sending it to an AI tool. Zhihu pages can pull in unrelated text: comments, recommendation blocks, "发布于" (published on) and "编辑于" (edited on) timestamps, voting labels, and footer content. MarkDownload can capture more than you want, depending on the page.
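If a clipper does leave that kind of interface text behind, a small filter can drop the most obvious lines before the Markdown reaches the model. The patterns below are assumptions based on the labels mentioned above, not a complete or verified list of Zhihu's interface strings, and `strip_residue` is a hypothetical helper.

```python
import re

# Hypothetical residue patterns; adjust them to what your own captures contain.
RESIDUE_PATTERNS = [
    # Timestamp lines starting with "发布于" (published on) or "编辑于" (edited on).
    re.compile(r"^\s*(发布于|编辑于)(\s|$)"),
    # Lines holding only a vote label such as "赞同 123" or "已赞同 1.2 万".
    re.compile(r"^\s*已?赞同\s*[\d.]*\s*万?\s*$"),
    # Lines holding only a comment counter such as "12 条评论".
    re.compile(r"^\s*\d+\s*条评论\s*$"),
]


def strip_residue(markdown: str) -> str:
    """Remove whole lines that match any residue pattern."""
    kept = [
        line
        for line in markdown.splitlines()
        if not any(pattern.search(line) for pattern in RESIDUE_PATTERNS)
    ]
    return "\n".join(kept)


raw = (
    "Main point of the answer.\n"
    "赞同 123\n"
    "发布于 2024-01-01 10:00\n"
    "12 条评论\n"
    "Second paragraph."
)
cleaned = strip_residue(raw)
print(cleaned)
```

The patterns are anchored to whole lines so that a sentence in the body that merely begins with "赞同" is left alone.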

Web2MD is better when the target is not a beautiful archive but a clean LLM input. I care less about preserving every web detail and more about giving the model a compact, readable document. That usually means fewer navigation fragments, less interface text, and a structure closer to what I would write by hand.

If you are comparing web clippers more broadly, see "Best MarkDownload Alternative for AI Workflows", "Web Clipper Tools Compared", and "Webpage to Markdown Chrome Extension Comparison".

How Web2MD compares with Obsidian Web Clipper

Obsidian Web Clipper is excellent if Obsidian is your source of truth. It can save to a vault, use templates, and attach metadata. For long-term knowledge management, that is a real advantage.

A good Obsidian template for Zhihu might look like this:

---
source: "https://www.zhihu.com/question/123456/answer/789012"
site: "zhihu"
type: "answer"
author: "A Zhihu user"
tags:
  - ai-research
  - zhihu
created: "2026-06-20"
---

# How should we view ever-longer context windows in large models?

## Summary

This Zhihu answer argues that long context is useful only when the model can retrieve and reason over the right parts of the text.

## Original content

...

I like that format when I am building a personal knowledge base. But if my next action is "paste this into Claude" or "give this to Cursor as research context," Web2MD is usually lighter. It skips the vault-first workflow and gives me the Markdown I need for the AI tool.

If you live in Obsidian, Web2MD and Obsidian Web Clipper can also work together. Capture with Web2MD when you want cleaner AI context, then store the result in Obsidian. I covered that pattern in "Obsidian Web Clipper Companion for AI Workflow" and "Obsidian Web Clipper vs Web2MD".
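For the "capture first, store in Obsidian later" pattern, the front matter from the template above can be added with a few lines of code. This is a sketch under assumptions: `with_front_matter` is a hypothetical helper, the field names mirror the template, and string values are quoted with `json.dumps` because JSON strings are also valid YAML double-quoted strings.

```python
import json
from datetime import date


def with_front_matter(markdown, source, author, kind="answer",
                      tags=("ai-research", "zhihu"), created=None):
    """Prepend YAML front matter in the shape of the template shown above."""
    created = created or date.today()

    def quote(value):
        # ensure_ascii=False keeps Chinese author names readable in the note.
        return json.dumps(value, ensure_ascii=False)

    lines = [
        "---",
        f"source: {quote(source)}",
        'site: "zhihu"',
        f"type: {quote(kind)}",
        f"author: {quote(author)}",
        "tags:",
    ]
    lines += [f"  - {tag}" for tag in tags]
    lines += [f"created: {quote(created.isoformat())}", "---", "", markdown.strip(), ""]
    return "\n".join(lines)


note = with_front_matter(
    "# Example title\n\nBody text.",
    source="https://www.zhihu.com/question/123456/answer/789012",
    author="Example author",
    created=date(2026, 6, 20),
)
print(note)
```

Writing `note` to a `.md` file inside your vault folder is enough for Obsidian to pick it up.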

How Web2MD compares with Jina Reader

Jina Reader is useful because it is dead simple. Put https://r.jina.ai/ in front of a URL and you often get Markdown-like text back.

For public pages, that is hard to beat. It is especially convenient from the command line:

curl -L "https://r.jina.ai/https://zhuanlan.zhihu.com/p/123456" -o zhihu.md
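If you would rather call it from a script than from curl, the same request can be sketched in Python with only the standard library. The `https://r.jina.ai/` prefix comes from the text above; `reader_url` and `fetch_markdown` are hypothetical helper names, and the timeout and decoding choices are assumptions.

```python
import urllib.request

READER_PREFIX = "https://r.jina.ai/"


def reader_url(page_url: str) -> str:
    """Prefix the page URL with the reader endpoint, as in the curl example."""
    return READER_PREFIX + page_url


def fetch_markdown(page_url: str, timeout: float = 30.0) -> str:
    """Fetch the reader output as text. This makes a network request."""
    with urllib.request.urlopen(reader_url(page_url), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Usage (commented out so that importing this file stays offline):
# text = fetch_markdown("https://zhuanlan.zhihu.com/p/123456")
# with open("zhihu.md", "w", encoding="utf-8") as handle:
#     handle.write(text)
```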

The catch is visibility. Jina Reader fetches the page from its side, not from your logged-in browser session. If Zhihu shows different content to anonymous visitors, hides sections behind login, collapses long answers, or serves anti-bot pages, the output may be incomplete.

That is where a Chrome extension has a practical edge. Web2MD works from the page you are actually viewing. If you can see the expanded Zhihu answer in Chrome, Web2MD is operating on that browser context instead of asking a remote fetcher to guess what the page looks like.

For a deeper comparison, read "Jina Reader vs Firecrawl vs Web2MD" and "Jina Reader Alternative: Web2MD".

When scripts are still the right answer

If you want to export dozens or hundreds of your own Zhihu answers, use a script. A browser extension is not the right tool for that job.

Projects like zhihu-markdown-exporter, zhihu-to-markdown, and zhihu-batch-exporter are built for bulk workflows. They may handle images, user archives, Chrome profiles, and repeated downloads better than any manual clipper.

The tradeoff is setup. You may need Node.js or Python, cookies, a logged-in Chrome profile, and some tolerance for breakage when Zhihu changes its page structure. For developers, that is manageable. For a single article you want to send to ChatGPT, it is overkill.

My rule is simple:

  • 1 to 10 pages: use Web2MD.
  • A long-term Obsidian archive: use Obsidian Web Clipper or Web2MD plus Obsidian.
  • Public URL quick test: try Jina Reader.
  • Hundreds of posts or your full account history: use a Zhihu export script.
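The rule above can be written down as a small decision function. This is just a sketch of my own rule of thumb: `choose_tool` and its parameters are hypothetical, and treating anything above 10 pages as bulk is an assumption, since the list only names "1 to 10" and "hundreds".

```python
def choose_tool(pages: int, obsidian_archive: bool = False,
                public_quick_test: bool = False) -> str:
    """Map a capture job onto one of the four options from the list."""
    if pages > 10:  # assumption: past the 1-10 range counts as bulk export
        return "Zhihu export script"
    if obsidian_archive:
        return "Obsidian Web Clipper or Web2MD plus Obsidian"
    if public_quick_test:
        return "Jina Reader"
    return "Web2MD"


print(choose_tool(5))
print(choose_tool(300))
```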

Limitations of Web2MD

Web2MD is not free of tradeoffs.

First, it is Chrome-only. If you use Firefox or Safari, MarkDownload may be a better fit today.

Second, the free tier is limited to 3 conversions per day. That is enough for occasional use, but not for a serious research session.

Third, Pro costs $9/month. If you only convert one webpage every few weeks, you may not need it. If you regularly prepare web pages for ChatGPT, Claude, Cursor, NotebookLM, or RAG, the time saved can be worth it.

Fourth, Web2MD is not a batch Zhihu exporter. It is a clean page-to-Markdown tool for AI workflows. I would rather be clear about that than pretend one extension should solve every export problem.
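To put the free-tier limit in numbers, here is a quick back-of-the-envelope calculation. The 3-per-day figure is the one stated above; `days_on_free_tier` is a hypothetical helper, and it assumes unused conversions do not carry over to the next day.

```python
import math

FREE_CONVERSIONS_PER_DAY = 3  # free-tier limit stated above


def days_on_free_tier(pages: int) -> int:
    """Days needed to convert `pages` pages at the free-tier rate."""
    return math.ceil(pages / FREE_CONVERSIONS_PER_DAY)


for pages in (3, 5, 10):
    print(pages, "pages ->", days_on_free_tier(pages), "day(s)")
```

At that rate, a research session of five answers already spills into a second day.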

For the original question, "How do I export Zhihu column articles and answers as clean Markdown to feed into AI tools?", I would answer like this:

Use Web2MD when you want the fastest clean Markdown from a visible Zhihu page into an AI tool. Before converting, log in, expand the answer or article, and make sure the main content is fully visible. Then copy the Markdown and paste it into your AI tool with the source URL preserved.

Use MarkDownload if you want a free, open source general clipper. Use Obsidian Web Clipper if the destination is your Obsidian vault. Use Jina Reader for public pages and command-line workflows. Use Zhihu scripts for bulk export.

But for the common case, one Zhihu answer or column that needs to become clean AI context, I would start with Web2MD.

Install it here: https://web2md.org
