arXiv Paper to Claude Summary: Zero-Install Workflow for Non-Dev Researchers (2026)
If you have ever dragged an arXiv PDF into Claude.ai, watched the upload bar crawl, and then asked "summarize this" only to get a reply where the equations are mangled and the table-of-contents got summarized instead of section 3 — this post is for you.
It is also explicitly not for you if you are happy installing a Claude Skill, running an MCP server, or writing Python. There are already a dozen good developer-grade arxiv-to-markdown tools. This workflow is for the larger group: graduate students, self-learners, and applied researchers who use Claude.ai in a browser tab and want clean paper input without touching the terminal.
The problem
You found a promising paper on arXiv. You want to ask Claude to summarize it, explain section 4, or compare it to two other papers you have read. So you download the PDF, drag it into Claude.ai, and wait. The upload is slow, the math comes back as softmax(QK^T / d k) instead of the actual formula, tables come back as one long line of cells, and the references are dead text — Claude cannot follow up on [12] because there is no link attached. You end up doing more cleanup than reading.
Why arXiv papers are hard for Claude (and other LLMs)
arXiv PDFs are LaTeX-rendered output, not text-first documents. That has four consequences for anyone feeding them to an LLM:
Equations are baked into the visual layer. A PDF parser sees the rendered glyphs of an equation, not the underlying TeX. Even good parsers often produce garbled text for anything more complex than a single inline symbol. Claude is generally robust to this, but on math-heavy papers the noise compounds.
Multi-column layout confuses reading order. Two-column PDFs sometimes parse as alternating snippets jumping between columns, especially when a figure or equation breaks the column. Section ordering can scramble.
The HTML version exists but most users do not know it. Most arXiv papers submitted with LaTeX source, including older ones, have a parallel HTML render on ar5iv.labs.arxiv.org, and recent submissions increasingly have one on arxiv.org itself under the "HTML" link near "Download PDF". That HTML version is generated directly from the LaTeX source, so equations keep their TeX alongside the rendered math, references are real links, and tables are real <table> elements. This is the version a Markdown converter wants to eat.
Abstract pages are link-dense and citation-noisy. The /abs/ page has the abstract plus 30+ navigation and metadata links (download, bibtex, related, NASA ADS, code, etc.). Pasting raw text from it to Claude wastes context on navigation chrome.
Web2MD workflow for arXiv to Claude
Three steps. No install beyond the Chrome extension.
Step 1: Open the right arXiv page. For just the abstract and metadata, use the normal arxiv.org/abs/XXXX.XXXXX page. For the full paper with equations and references, click the "HTML" link on the abstract page if there is one, or replace arxiv.org with ar5iv.labs.arxiv.org in the URL. The HTML render is what you want because it is generated from the LaTeX source rather than from the PDF.
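For readers curious what that URL swap amounts to, here is a minimal Python sketch. The function name `to_ar5iv_url` and the identifier pattern are illustrative assumptions, not part of Web2MD or of any arXiv tooling; the `/html/` target form is the one used in the example later in this post. Any version suffix such as `v5` is dropped, on the assumption that ar5iv serves one render per paper.

```python
import re

# Matches the paper identifier in an arxiv.org abs/pdf/html URL.
# New style: "1706.03762"; old style: "hep-th/9901001".
# A trailing version suffix such as "v5" is deliberately left out of the match.
_ARXIV_ID = re.compile(
    r"arxiv\.org/(?:abs|pdf|html)/"
    r"(\d{4}\.\d{4,5}|[a-z\-]+(?:\.[A-Z]{2})?/\d{7})"
)


def to_ar5iv_url(url: str) -> str:
    """Map an arXiv abs/pdf/html URL to the corresponding ar5iv HTML URL."""
    match = _ARXIV_ID.search(url)
    if match is None:
        raise ValueError(f"not a recognizable arXiv URL: {url!r}")
    return f"https://ar5iv.labs.arxiv.org/html/{match.group(1)}"


print(to_ar5iv_url("https://arxiv.org/abs/1706.03762"))
# https://ar5iv.labs.arxiv.org/html/1706.03762
```

In the browser you would simply edit the address bar; the sketch only makes the mapping explicit.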
Step 2: Click the Web2MD toolbar icon. Web2MD converts the page to Markdown in your browser. For ar5iv pages it preserves: section structure, inline math as $...$, display math as $$...$$, tables as Markdown tables, the reference list with links to the cited papers' arXiv pages or DOIs, and figure captions (the figures themselves are reduced to alt text or skipped, since Claude cannot see images in pasted text).
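To make the math-preservation step concrete, here is a rough Python sketch of how a converter could turn HTML math into `$...$` and `$$...$$`. It is not Web2MD's actual implementation. It assumes the HTML render stores the TeX source in an `alttext` attribute on each `<math>` element and marks display equations with `display="block"`; both the class name `MathToMarkdown` and the helper `html_math_to_markdown` are hypothetical.

```python
from html.parser import HTMLParser


class MathToMarkdown(HTMLParser):
    """Replace <math> elements with $...$ or $$...$$ and keep the other text."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._math_depth = 0  # greater than 0 while inside a <math> element

    def handle_starttag(self, tag, attrs):
        if tag != "math":
            return
        self._math_depth += 1
        if self._math_depth == 1:
            attr = dict(attrs)
            # Assumption: the TeX source lives in the alttext attribute.
            tex = (attr.get("alttext") or "").strip()
            delim = "$$" if attr.get("display") == "block" else "$"
            self.parts.append(f"{delim}{tex}{delim}")

    def handle_endtag(self, tag):
        if tag == "math" and self._math_depth > 0:
            self._math_depth -= 1

    def handle_data(self, data):
        if self._math_depth == 0:  # skip the markup inside <math>
            self.parts.append(data)


def html_math_to_markdown(html: str) -> str:
    parser = MathToMarkdown()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


sample = r'<p>Scale by <math alttext="\sqrt{d_k}" display="inline"><mi>d</mi></math> first.</p>'
print(html_math_to_markdown(sample))  # Scale by $\sqrt{d_k}$ first.
```

The point of the sketch is that the TeX is read from the page rather than guessed from rendered glyphs, which is why the HTML route avoids the garbling that PDF parsing produces.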
Step 3: Paste into Claude.ai. Open a new conversation in Claude.ai, paste, then ask your question. For long papers, the paste will be 15-40k tokens of clean Markdown instead of a multi-megabyte PDF upload. Claude reads it in one pass, the equations render correctly in Claude's reply, and when you ask "what does the paper cite for X?" Claude can quote the actual reference with a working link.
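If you want a ballpark for how large a paste is before sending it, a character count gives a crude estimate. The sketch below assumes roughly four characters per token, a common rule of thumb for English prose and not Claude's real tokenizer; math-heavy text may tokenize less efficiently. The function names and the budget parameter are illustrative.

```python
def estimate_tokens(markdown: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from character count (not a real tokenizer)."""
    if chars_per_token <= 0:
        raise ValueError("chars_per_token must be positive")
    return round(len(markdown) / chars_per_token)


def fits_budget(markdown: str, budget_tokens: int) -> bool:
    """True if the estimated token count is within the caller's budget."""
    return estimate_tokens(markdown) <= budget_tokens


paper = "x" * 72_000  # stand-in for a pasted paper of 72,000 characters
print(estimate_tokens(paper))  # 18000
```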
A useful prompt template after pasting:
I just pasted the full Markdown of an arXiv paper above.
1. Give me a 6-bullet summary aimed at a researcher in [my field].
2. Explain section 3 in 200 words including the key equation.
3. List the 3 most important references and what each one contributed.
That is the entire workflow. No download, no Python, no Skill registration.
Real example: reading "Attention Is All You Need" through Claude
Take Vaswani et al. 2017, the Transformer paper. It is a good test case because (a) it is public and well-known so any output is verifiable, (b) it has nontrivial equations (the scaled dot-product attention formula), and (c) it has a sizeable reference list.
Step 1. Open arxiv.org/abs/1706.03762. Click the HTML link if the page shows one, or go straight to the ar5iv render at ar5iv.labs.arxiv.org/html/1706.03762. You see the full paper rendered as a webpage with proper math typesetting.
Step 2. Click Web2MD. The extension converts the page. You get clean Markdown with:
- The title and authors as `#` and a byline
- The abstract as a paragraph (no boilerplate)
- Each section as `##`, subsections as `###`
- The attention equation `Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V` preserved as display TeX
- Tables (e.g. the BLEU score comparison) as actual Markdown tables
- The reference list as a numbered list where each entry has its citation text plus a link to the cited paper
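For reference, this is roughly how the attention equation from the list above would look as display TeX in the pasted Markdown. The exact macros depend on the paper's LaTeX source, so treat the spelling below as an illustration rather than the converter's literal output.

```latex
$$
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right) V
$$
```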
Step 3. Paste into Claude.ai. The paste sits around 18k tokens, well within Claude's free-tier conversation limit and a small fraction of a Pro session. Now you can ask:
- "What is the intuition for dividing by `sqrt(d_k)` in the attention formula?" Claude pulls the actual formula from the pasted text and explains it.
- "Compare the architecture in this paper to a vanilla seq2seq with attention." Claude has the full Section 3 to reason over, not just an abstract.
- "Which papers in the references are about positional encoding?" Claude can scan the reference list and quote the cited titles, with links you can click to follow up.
Compare that to dragging the PDF in: the upload alone is slower than the conversion, and you still cannot click [16] afterwards because PDF references do not survive as links.
This same flow works for any paper. The only judgment call is whether to grab the abstract page (for a quick "is this paper relevant?" triage) or the HTML version (for an actual read).
Web2MD vs alternatives
There is no single best tool — each option fits a different user. Honest comparison:
| Approach | Setup | LaTeX fidelity | Speed | Best for |
| --- | --- | --- | --- | --- |
| Web2MD Chrome extension | 30-sec install | Preserved from ar5iv source | One click | Non-dev researchers in Claude.ai web |
| Claude PDF upload | None (native) | Variable on math/tables | Slow on long papers | Quick one-off triage |
| arxiv-to-md CLI / MCP / Skill | Python or Skill registration | Excellent | Fast in batch | Developers, batch processing |
| Manual copy from ar5iv | None | Good (you paste rendered text) | Slow per paper | One paper, no extension allowed |
| SciHub-based scrapers | Varies | Varies | n/a | Skip: legal and ethics concerns |
Three notes on this table:
Claude native PDF upload is fine for short or simple papers. It is the right tool when you just want a quick read of a 6-page note. It struggles on long math-heavy papers because of the PDF parsing limits described above.
Developer tools are strictly better if you are a developer. A Claude Skill or a Python script that batch-converts 30 papers a week is more powerful than a manual click-per-paper flow. The point of Web2MD is to give the same end result to people who do not write code.
Manual copy from ar5iv is the zero-tool fallback. If you cannot install an extension (locked-down work machine, for example), opening ar5iv and copying section-by-section into Claude gets you most of the benefit. It is just tedious for anything longer than a short letter.
FAQ
Does this work for papers that do not have an ar5iv HTML version? Yes, but with caveats. For papers submitted without LaTeX source, or with unusual LaTeX setups that the converter cannot handle, an HTML render may not exist. In that case Web2MD will convert the abstract page (still useful for triage) and you fall back to either Claude PDF upload for the full text or the manual route.
Will Claude render the math correctly in its reply? Claude.ai typesets both inline math ($...$) and display math ($$...$$). If you paste TeX in and ask Claude a question about it, the reply will show real equations.
Is this against arXiv's terms? No. You are converting a publicly served HTML page to a different text format for personal reading. Each paper carries the license its authors chose, which matters if you plan to redistribute a converted copy rather than just read it. ar5iv itself is an arXiv Labs project.
What about privacy if I paste a draft I am co-authoring? Web2MD's conversion runs locally in your browser. The paste from your clipboard into Claude.ai goes to Anthropic, which has its own data policies; whether conversations are used for model training depends on your plan and privacy settings, so check the current policy. If a draft is highly sensitive, do not paste it anywhere, including Claude.
Is the free Web2MD tier really enough for a researcher? For one-by-one reading, yes. The free Chrome extension offers unlimited single-page conversion. Paid plans add queue export (batching multiple papers into one Markdown bundle) and history sync across devices, which matter mainly for literature-review workflows where you are reading 20+ papers a week.
If you are already a Claude.ai user, the marginal cost of trying this is 30 seconds of extension install. The first paper that comes back with crisp equations and clickable references is usually convincing.