Xiaohongshu to Feishu / Lark Workflow: Save Chinese Social Posts as AI-Ready Markdown

Zephyr Whimsy · 2026-05-10 · 6 min read

Xiaohongshu (RED / 小红书) is one of the highest-signal content sources for Chinese-speaking knowledge workers — product reviews, lifestyle research, niche community insights. But the content format is hostile to note-taking tools.

Copy-paste loses image links. Screenshots are opaque to AI tools unless you run OCR, and OCR loses text. Firecrawl- or Jina-style external scrapers get blocked at the anti-bot layer. Most "Xiaohongshu clipper" tools have either died or have data integrity problems.

This is the workflow I've built and stress-tested over the past 6 months for capturing Xiaohongshu content into Feishu / Lark (or Obsidian / Notion — same pattern).

What "complete" Xiaohongshu content includes

A Xiaohongshu note has more than just title + body:

| Content type | Copy-paste | Screenshot | External scraper | Browser extension |
|---|---|---|---|---|
| Title + body text | ✅ | ❌ (OCR loss) | ❌ blocked | ✅ |
| Image URLs | ❌ | N/A | ❌ | ✅ |
| Author + IP location (省份, province) | ❌ | ✅ visual | ❌ | ✅ |
| Hashtags | partial | ❌ | ❌ | ✅ |
| Likes / saves / comments count | ❌ | ✅ visual | ❌ | ✅ |
| Comment thread | ❌ | ❌ (requires scroll) | ❌ | ✅ |

Only "extension reading inside your authenticated browser" captures the full data set.
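One way to make "complete" concrete is to model the content types from the table as a record and check which fields a given capture method left empty. This is a minimal sketch: the class and field names (`CapturedNote`, `missing_fields`, and so on) are my own illustration, not anything Xiaohongshu or Web2MD defines.

```python
from __future__ import annotations

from dataclasses import dataclass, field, fields


@dataclass
class Comment:
    author: str
    ip_location: str  # province-level location shown next to the commenter
    text: str


@dataclass
class CapturedNote:
    """One note with every content type from the table (names are hypothetical)."""
    title: str = ""
    body: str = ""
    image_urls: list[str] = field(default_factory=list)
    author: str = ""
    ip_location: str = ""
    hashtags: list[str] = field(default_factory=list)
    likes: int | None = None
    saves: int | None = None
    comments_count: int | None = None
    comments: list[Comment] = field(default_factory=list)


def missing_fields(note: CapturedNote) -> list[str]:
    """Return names of fields that are still empty (None, "" or []).

    A count of 0 is treated as captured, not missing.
    """
    return [f.name for f in fields(note) if getattr(note, f.name) in (None, "", [])]


# A copy-pasted note typically carries only the title and body text.
pasted = CapturedNote(title="Note title", body="Recently discovered a really useful tool...")
print(missing_fields(pasted))
```

Running the check on a copy-pasted note lists everything from `image_urls` onward, which matches the copy-paste column of the table.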

Why server-side scrapers fail (technical detail)

Xiaohongshu's anti-scrape stack:

  1. Signed requests (x-s, x-t): Every API call must include cryptographic signatures generated by JavaScript that runs in a real browser context. The signature rotates monthly. Pure HTTP requests can't generate it.
  2. Cloudflare-style edge filtering: server IPs are blacklisted; residential IPs from real users pass through.
  3. JavaScript-rendered Vuex store: post content lives in a Vuex state object that's hydrated after page load. curl returns an empty SPA shell.

Browser extensions don't fight this stack — they bypass it by reading the rendered DOM after Xiaohongshu's own JavaScript has done all the signing and decoding.
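The difference between an empty SPA shell and a hydrated page can be illustrated on saved HTML. The sketch below is written under stated assumptions: it assumes the page embeds its state as `window.__INITIAL_STATE__ = {...}` in a script tag and that the title sits at `note.title`. Both are hypothetical conventions chosen for illustration, not confirmed details of Xiaohongshu's pages. A real extension would do the equivalent in JavaScript inside the page; Python is used here only to show the logic.

```python
from __future__ import annotations

import json
import re
from html.parser import HTMLParser

# Hypothetical convention: state embedded as `window.__INITIAL_STATE__ = {...}`.
STATE_RE = re.compile(
    r"window\.__INITIAL_STATE__\s*=\s*(\{.*?\})\s*</script>", re.DOTALL
)


class _MetaTitleParser(HTMLParser):
    """Collects the content of <meta property="og:title"> if present."""

    def __init__(self) -> None:
        super().__init__()
        self.og_title: str | None = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and attr.get("property") == "og:title":
            self.og_title = attr.get("content")


def extract_title(html: str) -> str | None:
    """Try the embedded state first, then fall back to the og:title meta tag."""
    match = STATE_RE.search(html)
    if match:
        try:
            state = json.loads(match.group(1))
            title = state.get("note", {}).get("title")  # hypothetical path
            if title:
                return title
        except json.JSONDecodeError:
            pass  # state was not plain JSON; fall through to the meta tag
    parser = _MetaTitleParser()
    parser.feed(html)
    return parser.og_title


shell = '<html><head></head><body><div id="app"></div></body></html>'
hydrated = (
    "<html><body><script>"
    'window.__INITIAL_STATE__ = {"note": {"title": "A useful tool"}}'
    "</script></body></html>"
)
print(extract_title(shell), extract_title(hydrated))
```

The shell, which is roughly what a plain HTTP client receives, yields nothing, while the hydrated page yields the title. That is the whole reason to read from inside the browser.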

The 3-step workflow

Step 1: Open the Xiaohongshu post in Chrome

Just visit it. Normal browser, normal logged-in session. No proxies, no special config.

Step 2: Convert to Markdown with Web2MD

Web2MD is a Chrome extension with a dedicated Xiaohongshu extractor that pulls from the Vuex store, with DOM and meta-tag fallbacks. Click the extension icon and the Markdown is copied to your clipboard automatically:

# [Note title]

**Author**: 小明 (上海)
**Likes**: 12,000 · **Saves**: 8,500 · **Comments**: 234

#tag1 #tag2 #tag3

## Body

Recently discovered a really useful tool...

### Images

![image1](https://sns-img-bd.xhscdn.com/...)
![image2](https://sns-img-bd.xhscdn.com/...)

## Top comments

**User A** (Zhejiang): I use it the same way, agree with OP...
**User B** (Beijing): Are there similar tools for X?

Captures: title / body / images / author / IP location (省份) / hashtags / engagement / comments.
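To show how the captured fields map onto that Markdown layout, here is a small renderer. It is a sketch only: the dictionary keys and the `render_note` function are hypothetical, and this is not Web2MD's actual code. The image URL in the usage example is a placeholder.

```python
from __future__ import annotations


def render_note(note: dict) -> str:
    """Render a captured note (hypothetical dict shape) in the layout shown above."""
    lines = [f"# {note['title']}", ""]
    lines.append(f"**Author**: {note['author']} ({note['ip_location']})")
    lines.append(
        f"**Likes**: {note['likes']:,} · **Saves**: {note['saves']:,}"
        f" · **Comments**: {note['comments_count']:,}"
    )
    # Hashtags have no space after '#', so Markdown does not treat them as headings.
    lines += ["", " ".join(f"#{tag}" for tag in note["hashtags"]), ""]
    lines += ["## Body", "", note["body"], ""]
    lines += ["### Images", ""]
    lines += [
        f"![image{i}]({url})" for i, url in enumerate(note["image_urls"], start=1)
    ]
    lines += ["", "## Top comments", ""]
    lines += [
        f"**{c['author']}** ({c['ip_location']}): {c['text']}"
        for c in note["comments"]
    ]
    return "\n".join(lines)


example = {
    "title": "Note title",
    "author": "Xiao Ming",
    "ip_location": "Shanghai",
    "likes": 12000,
    "saves": 8500,
    "comments_count": 234,
    "hashtags": ["tag1", "tag2", "tag3"],
    "body": "Recently discovered a really useful tool...",
    "image_urls": ["https://example.com/1.jpg"],  # placeholder URL
    "comments": [
        {"author": "User A", "ip_location": "Zhejiang", "text": "I use it the same way"}
    ],
}
markdown = render_note(example)
```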

Step 3: Paste into Feishu / Lark

Open a new Feishu cloud doc and press ⌘V (Ctrl+V on Windows). Feishu auto-detects Markdown:

  • Headings render as Feishu heading styles
  • Images get auto-uploaded to Feishu CDN (so external links can't break later)
  • Tags become searchable Feishu tags

Done. Total time: 1-2 minutes per note.

Advanced: batch capture + AI synthesis

For deeper research workflows like "research a Xiaohongshu topic + summarize with Claude":

Batch conversion (Web2MD Pro + Claude Code)

I'm researching RAG tools for a project. Convert these 20 Xiaohongshu posts:

agent_batch_convert(urls=[
  "https://www.xiaohongshu.com/explore/abc123",
  "https://www.xiaohongshu.com/explore/def456",
  ...
])

Then summarize what pain points users discuss about RAG tooling.

Web2MD's Agent Bridge opens 20 background tabs in your real Chrome profile, so each tab uses your authenticated session. The tabs scrape their posts in parallel and return clean Markdown to Claude, which then does the synthesis.

You don't manually open each post.
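The "parallel tabs, ordered results" pattern can be sketched with bounded concurrency. The source does not describe how Agent Bridge is implemented, so treat this as an illustration of the pattern only: `convert_one` is a hypothetical stand-in for a single-tab conversion, and `max_parallel` is an assumed knob.

```python
from __future__ import annotations

import asyncio


async def convert_one(url: str) -> str:
    """Hypothetical stand-in for one tab's conversion; returns Markdown."""
    await asyncio.sleep(0)  # placeholder for the real browser round trip
    return f"# Note from {url}\n"


async def batch_convert(urls: list[str], max_parallel: int = 5) -> list[str]:
    """Convert all URLs with at most `max_parallel` conversions in flight."""
    semaphore = asyncio.Semaphore(max_parallel)

    async def guarded(url: str) -> str:
        async with semaphore:
            return await convert_one(url)

    # gather preserves input order, so results[i] belongs to urls[i].
    return await asyncio.gather(*(guarded(u) for u in urls))


urls = [
    "https://www.xiaohongshu.com/explore/abc123",
    "https://www.xiaohongshu.com/explore/def456",
]
results = asyncio.run(batch_convert(urls, max_parallel=2))
```

Capping the number of in-flight conversions keeps the batch from opening every tab at once, which matters for the risk-control concerns discussed later in the article.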

Pair with Feishu's AI (Doubao) for team workflows

If your team uses Feishu:

  1. Paste merged Markdown into a Feishu cloud doc
  2. @Doubao in the doc, ask it to summarize

Doubao handles Chinese content noticeably better than English-first models. Combined with Feishu's collaborative context (teammates can see the source doc), this fits team research patterns well.
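The "merged Markdown" in step 1 could be produced by concatenating the per-note output under one document title and demoting each note's headings by one level. This is a minimal sketch with a hypothetical `merge_notes` helper. It does not skip fenced code blocks, which is acceptable for the note layout shown earlier but would need handling for notes that contain code.

```python
from __future__ import annotations

import re

# Match ATX headings only: 1-5 '#' characters followed by whitespace.
# Hashtag lines such as "#tag1 #tag2" have no space after '#', so they are left alone.
HEADING_RE = re.compile(r"^(#{1,5})(\s)", re.MULTILINE)


def merge_notes(title: str, notes: list[str]) -> str:
    """Join notes under one top-level title, demoting each note's headings."""
    demoted = [HEADING_RE.sub(r"#\1\2", note.strip()) for note in notes]
    return f"# {title}\n\n" + "\n\n---\n\n".join(demoted) + "\n"


merged = merge_notes(
    "RAG research",
    ["# A\n\n#tag1 #tag2\n\n## Body\n\ntext", "# B\n"],
)
```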

Real use cases

1. Content creator: competitor analysis

Capture 10-20 posts a week from KOLs (key opinion leaders) in your niche → Markdown → Claude summarizes trending topics, user pain points, and viral title patterns. This saves 4-5 hours of manual reading per week.

2. AI engineer: user-research corpus

You're building a B2C product and want to understand Xiaohongshu users' complaints about a competitor → capture 30-50 relevant posts per week into a RAG index → query it as "real user voice" data for product decisions.
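Before captured notes go into a RAG index they usually need to be split into chunks. A simple option, sketched here with a hypothetical `chunk_by_heading` helper and an assumed chunk shape, is to split on Markdown headings and keep the source URL with each chunk so answers can be traced back to the post. Embedding and indexing are out of scope for this sketch.

```python
from __future__ import annotations

import re

HEADING_LINE = re.compile(r"^#{1,6}\s+(.*)")


def chunk_by_heading(markdown: str, source_url: str) -> list[dict]:
    """Split a note into heading-scoped chunks tagged with their source URL."""
    chunks: list[dict] = []
    heading = "(preamble)"
    buffer: list[str] = []

    def flush() -> None:
        text = "\n".join(buffer).strip()
        if text:  # skip headings that have no content of their own
            chunks.append({"source": source_url, "heading": heading, "text": text})
        buffer.clear()

    for line in markdown.splitlines():
        match = HEADING_LINE.match(line)
        if match:
            flush()  # close the chunk that belonged to the previous heading
            heading = match.group(1).strip()
        else:
            buffer.append(line)
    flush()
    return chunks


note_md = (
    "# Title\n\n**Author**: X\n\n#tag1 #tag2\n\n"
    "## Body\n\nRecently discovered a really useful tool...\n\n"
    "## Top comments\n\n**User A**: hi"
)
chunks = chunk_by_heading(note_md, "https://www.xiaohongshu.com/explore/abc123")
```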

3. Knowledge management: second brain

Read a great post, convert it immediately → Markdown → Obsidian / Notion / Feishu. Stop relying on the Xiaohongshu app's "saved" tab: Xiaohongshu can take down posts, accounts can be lost, and the in-app search can't run AI processing.

Risk notes

Copyright and compliance:

  • Personal collection / study → fair use for most public posts
  • Commercial use / republishing → ask the author for permission
  • Large-scale batch scraping → throttle (< 10 conversions/min) to avoid Xiaohongshu's risk control

Account safety:

  • Web2MD does not upload your account cookies; conversion happens locally in your browser
  • Any browser-side automation carries a small risk of triggering Xiaohongshu's anti-fraud detection, so avoid sustained high-frequency conversion
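The "< 10 conversions/min" guideline can be enforced with a sliding-window throttle called before each conversion. This is a minimal sketch: `ConversionThrottle` is a hypothetical helper, the default of 9 per 60 seconds is just one reading of "fewer than 10 per minute", and the clock and sleep functions are injectable so the behaviour can be checked without real waiting.

```python
import time
from collections import deque


class ConversionThrottle:
    """Sliding window: at most `max_per_window` calls per `window_s` seconds."""

    def __init__(self, max_per_window=9, window_s=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_per_window = max_per_window
        self.window_s = window_s
        self._clock = clock
        self._sleep = sleep
        self._stamps = deque()  # start times of recent conversions, oldest first

    def wait(self) -> float:
        """Block until another conversion is allowed; return seconds slept."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_s:
            self._stamps.popleft()
        slept = 0.0
        if len(self._stamps) >= self.max_per_window:
            # Sleep until the oldest conversion leaves the window.
            slept = self.window_s - (now - self._stamps[0])
            self._sleep(slept)
            now = self._clock()
            self._stamps.popleft()
        self._stamps.append(now)
        return slept


# Demonstration with a fake clock: limit of 2 per 60 s, third call must wait.
fake_now = [0.0]
throttle = ConversionThrottle(
    max_per_window=2,
    window_s=60.0,
    clock=lambda: fake_now[0],
    sleep=lambda s: fake_now.__setitem__(0, fake_now[0] + s),
)
first = throttle.wait()
fake_now[0] = 10.0
second = throttle.wait()
fake_now[0] = 20.0
third = throttle.wait()  # oldest call was at t=0, so this waits until t=60
```

In real use you would construct `ConversionThrottle()` with the defaults and call `throttle.wait()` immediately before each conversion.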

Tool stack

| Tool | Role | Cost |
|---|---|---|
| Web2MD | Browser-side Xiaohongshu → Markdown | Free 3/day, Pro $9/mo |
| Feishu Cloud Docs | Markdown collection, team sharing | Personal tier free |
| Claude / Doubao | AI summarization / analysis | Claude $20/mo, Doubao free |
| Obsidian (Feishu alternative) | Local Markdown vault | Personal free |

Try it

Web2MD is on the Chrome Web Store. The free tier (3 conversions/day) covers casual use. Pro at $9/mo (7-day free trial) unlocks unlimited conversions, batch conversion, and Agent Bridge for the agentic workflow above.


Core idea: your browser is already paid for and authenticated, so let your AI tools read content through it. The server-side scraper era is ending; browser-side reading is the AI-era pattern for content access.

The same workflow generalizes to: WeChat Official Accounts (公众号), Zhihu, Bilibili columns, Jike (即刻), sspai (少数派), Twitter/X behind login — anywhere external scrapers fail because of anti-bot or auth.
