llm-for-zotero: A Research Agent System for your Zotero Library

Zotero 7 Zotero 8 Zotero 9 Using Zotero Plugin Template AGPL v3 Latest release GitHub Stars GitHub Downloads Buy Me A Coffee Personal website

📖 中文版使用手册

LLM for Zotero logo: a brain icon merged with the Zotero shield

llm-for-zotero brings Large Language Models into the Zotero reader, so you can ask questions, summarize papers, inspect figures, compare sources, and save notes without leaving your library. It works with standard API providers, local OpenAI-compatible models, WebChat, Codex App Server, and Claude Code.

Screenshot of the llm-for-zotero sidebar inside the Zotero PDF reader

Second screenshot of the llm-for-zotero assistant inside Zotero

Grounded Paper Chat

Ask questions about any open PDF and jump from model citations back to the source passage.

Multi-Provider Support

OpenAI, Anthropic, Gemini, DeepSeek, Moonshot, local OpenAI-compatible models, and more.

WebChat

Use ChatGPT or DeepSeek through the browser with Sync for Zotero when you do not want a provider API key.

Agent Mode (Beta)

An autonomous agent that manages your library, runs terminal commands, and accesses local files — with approval before changes are applied.

MinerU PDF Parsing

Use cloud MinerU or a local mineru-api server for high-fidelity parsing that preserves tables, equations, figures, and complex layouts.

Standalone Window

Open the LLM Assistant in its own dedicated window with a full-sized chat interface and conversation history sidebar.

File-Based Notes

Save research notes as Markdown files in Obsidian, Logseq, or any local notes folder with metadata, citations, and extracted figures.

Customizable Skills

8 built-in skills guide the agent's workflows for common tasks. Create your own with simple Markdown files.

Codex & Claude Code

Use Codex through the local app-server runtime, or run experimental Claude Code conversations through a local bridge.


Quick Start

  1. Download the latest .xpi file from the Releases page.
  2. In Zotero, open ToolsAdd-ons → gear icon → Install Add-on From File, then select the .xpi.
  3. Restart Zotero.
  4. Open Preferencesllm-for-zotero, choose a provider, enter the base URL, key, and model, then click Test Connection.
  5. Open a PDF in Zotero and click the LLM Assistant icon in the right-hand toolbar.

If you do not want to use a provider API key, start with WebChat or Codex App Server.

Requirements

Choose Your Setup

Goal Recommended path API key required?
Use OpenAI, Gemini, DeepSeek, Moonshot, or another provider Configure an API provider in Zotero preferences Yes
Use a local model Connect any OpenAI-compatible local HTTP API Usually no
Use ChatGPT or DeepSeek in the browser WebChat with the Sync for Zotero extension No
Use Codex models with ChatGPT Plus Codex App Server No separate API key
Use Claude Code inside Zotero Claude Code bridge Claude Code auth
Improve PDF extraction for tables, equations, and figures MinerU PDF parsing Personal MinerU key recommended

What’s New

Thanks to @jianghao-zhang and @boltma for major contributions to the Codex App Server, Claude Code, and file upload workflows.


Installation

  1. Download the latest .xpi release Go to the Releases Page and download the latest .xpi file.

  2. Install the add-on in Zotero Open Zotero → ToolsAdd-ons → click the gear icon → Install Add-on From File → select the .xpi file.

  3. Restart Zotero Restart Zotero to complete the installation. The plugin automatically checks for future updates on startup.

Tip
The plugin supports Zotero 7, Zotero 8, and Zotero 9. Make sure you are running a compatible version.

Configuration

Open Preferences → navigate to the llm-for-zotero tab.

  1. Select your Provider (e.g. OpenAI, Gemini, DeepSeek).
  2. Paste your API Base URL, secret key, and model name.
  3. Click Test Connection to verify.

Provider and model configuration

Supported Providers & Protocols

Preset providers include OpenAI, Gemini, Anthropic, MiniMax, GLM, DeepSeek, Grok, Qwen, Kimi, and GitHub Copilot. You can also add any customized OpenAI-compatible HTTP endpoint, including Ollama, LM Studio, vLLM, or a remote proxy.

The plugin natively supports these provider protocols:

Protocol Description Main capabilities
responses_api OpenAI-style Responses APIs Streaming, tool calls, file uploads, multimodal inputs, reasoning
openai_chat_compat OpenAI-compatible chat/completions APIs Tool calls and multimodal inputs without direct file upload
anthropic_messages Anthropic Messages API Streaming, tool calls, multimodal inputs
gemini_native Google Gemini API Streaming, tool calls, multimodal inputs
codex_responses Codex App Server / Codex Auth (Legacy) Codex conversations for ChatGPT Plus subscribers without a separate API key
web_sync WebChat bridge for ChatGPT / DeepSeek Browser-extension relay without provider API keys

Supported Models (Examples)

API URL Model Reasoning Levels Notes
https://api.openai.com/v1/responses gpt-5.4 default, low, medium, high, xhigh PDF uploads supported
https://api.openai.com/v1/responses gpt-5.4-pro medium, high, xhigh PDF uploads supported
https://api.deepseek.com/v1 deepseek-chat default  
https://api.deepseek.com/anthropic deepseek-v4-flash default  
https://generativelanguage.googleapis.com gemini-3-pro-preview low, high  
https://generativelanguage.googleapis.com gemini-2.5-flash medium  
https://generativelanguage.googleapis.com gemini-2.5-pro default, low, high  
https://api.moonshot.ai/v1 kimi-k2.5 default  

You can also set up multiple providers, each with multiple models for different tasks, such as a multimodal model for figures and a text model for summaries. Cross-check answers across models when you want broader coverage.

Advanced: Reasoning Levels & Hyperparameters

You can set different reasoning levels per model in the conversation panel, such as default, low, medium, high, and xhigh, depending on model support. You can also adjust hyperparameters like temperature and max_tokens_output for more creative or more deterministic responses.


Usage Guide

  1. Open any PDF in the Zotero reader.
  2. Click the LLM Assistant icon in the right-hand toolbar to open the sidebar.
  3. Type a question such as “What is the main conclusion of this paper?”

On the first message, the model loads the full paper content as context. Follow-up questions use focused retrieval from the same paper, keeping the conversation fast and relevant.

Conversation Modes

The plugin supports multiple conversation contexts:

Mode Description
Paper conversation Chat about a specific open PDF. Context is drawn from that paper.
Global conversation Library-wide chat, not tied to any specific paper.
Note conversation Chat while editing a Zotero note, with note content as context.

WebChat Mode Requirements

⚠️⚠️⚠️

Important: keep WebChat connected

WebChat mode requires the browser tab to stay open and the Sync for Zotero extension to remain active. During a request, keep the browser and Zotero in the same desktop session, avoid minimizing or background-suspending the WebChat tab, and check that the green connection dot is visible.

  • Do not close the WebChat tab while a request is running.
  • Do not disable or pause the Sync for Zotero browser extension.
  • WebChat currently supports paper conversations only; library-wide conversations are not supported yet.

Interface Controls


Grounded Answers with Citation Navigation

One-click jump from an AI citation to the paper source

When you ask a question, the model generates answers grounded in the paper’s content. Citation labels stay conservative until page locations are verified; click a verified citation or quote-based citation to jump back to the matching Zotero passage.


Paper Summarization

Instant paper summary in the sidebar

Get a concise summary of any paper in seconds. The summary is generated from the full text of the open PDF, and you can customize the prompt to focus on methodology, results, implications, or anything else.


Selected Text Explanation

Selected text being explained by the model

Select any complex paragraph or technical term in the PDF and ask the model to explain it. You can add up to 5 pieces of context from the paper or previous answers to refine the explanation.

An optional pop-up automatically suggests adding selected text to the chat. This can be disabled in settings if you prefer manual control.


Figure Interpretation

Screenshot-based figure interpretation

Take a screenshot of any figure, chart, or diagram and ask the model to interpret it. The plugin supports up to 10 screenshots at a time.

Tip
For best results with figures, use a multimodal model (e.g. GPT-4o, Gemini Pro, Claude) that supports image inputs.

Cross-Paper Comparison

Cross-paper comparison using the slash command

Open multiple papers in different tabs and compare them side by side. Type / in the chat input to cite another open paper as additional context. You can reference up to 10 papers in a single conversation, enabling rich cross-paper analysis.


External Document Upload

External file upload for additional context

Upload documents from your local drive as additional context. Supported formats include PDF, DOCX, PPTX, TXT, and Markdown.


Save to Notes

Model answers saved to Zotero notes

Save any answer or selected text directly to your Zotero notes with one click. This integrates seamlessly with your existing note-taking workflow — no copy-pasting required.


Conversation History & Export

Conversation export to Zotero notes

Conversations are automatically saved locally and associated with the paper you’re reading. You can:


Custom Quick-Action Presets

Custom quick-action preset configuration

Customize quick-action presets to match your research workflow. Built-in presets include:


Standalone Window Mode

Open the LLM Assistant in its own dedicated window, separate from the Zotero reader sidebar. The standalone window provides a full-sized chat interface with a collapsible conversation history panel on the left.

How to open

Method Action
Keyboard shortcut Ctrl+Shift+L (macOS: Cmd+Shift+L)

Features

While the standalone window is open, the reader sidebar panels show a placeholder with options to Focus Window (bring the standalone window to the foreground) or Close Window & Return Here (close standalone and restore the sidebar).


File-Based Notes

Beyond Zotero’s built-in notes, the agent can save research notes as Markdown files in any local directory you choose. The plugin is not tied to any specific note-taking app: point it at an Obsidian vault, a Logseq graph, or a plain folder of .md files.

Configuration

Open Preferencesllm-for-zotero and scroll to the Notes Directory section.

Notes Directory settings

Setting Description Example
Nickname How you refer to this directory in chat; the agent recognizes the name when you mention it Obsidian, Logseq
Notes Directory Path Absolute path to the root directory where notes are saved /Users/me/MyVault
Default Folder Default subfolder for new notes; the agent can write elsewhere if you ask it to Logs
Attachments Folder Folder for copied figures and images, relative to the directory root Logs/imgs

Click Test Write Access to verify the plugin can write to your directory.

How it works

Ask the agent to write a note using the nickname you configured, for example “Summarize this paper and save it to Obsidian” or “Log this to my Logseq”. The agent will:

  1. Gather content from the paper, including metadata, summary, key points, and figures when available.
  2. Compose a Markdown note following the write-note skill.
  3. Add YAML frontmatter matching the write-note template: title, created, tags, citekey, doi, and journal; author information stays in the note body.
  4. Copy figures from MinerU-parsed PDFs into the attachments folder when requested.
  5. Write the note to {notes_directory}/{default_folder}/{title}.md.

If you want to keep notes inside Zotero, the agent can also write to internal item notes with the write-note skill. Ask it to “save a note for this paper” without mentioning an external directory.

Zotero Notes vs. File-Based Notes

Zotero internal note

A Zotero paper note rendered in Obsidian

Notes use Pandoc citation syntax ([@citekey]), compatible with Obsidian’s Zotero Integration and Pandoc plugins, as well as most Markdown readers.

Tip
Note templates and figure-embedding rules live in the write-note skill. Open the Standalone Window, then the Skills portal, to customize them.

Agent Mode (Beta)

Important
Agent Mode is disabled by default. Enable it in Preferences, then toggle Agent (beta) in the context bar.

When enabled, the LLM becomes an autonomous agent that can read, search, and write within your Zotero library. Read tools run directly; write tools route through confirmation cards and stay undoable.

Long agent runs are cache-aware. The plugin keeps stable Zotero context and previously read evidence separate from the changing chat transcript, tracks which papers and passages have already been inspected, and automatically compacts old turns when the model context fills up. Follow-up questions can reuse grounded evidence when it is still relevant, while the agent reads again when the needed source or coverage layer is missing.

Library & PDF Reading Tools

These tools let the agent explore your library, PDFs, attachments, and scholarly sources without modifying anything.

Tool Description
query_library Discover Zotero items and collections: search or list any item type, filter by author, year, collection, item type, or tag, browse the collection tree, find related papers, and detect duplicates
read_library Read structured item state for one or more items: metadata, notes, annotations, attachments, and collection membership
read_paper Read text content from a PDF, either opening sections by default or specific section indexes, with up to 20 papers per call
search_paper Find evidence in papers via a question and return ranked relevant passages, with up to 10 papers per call
view_pdf_pages Render PDF pages as images for visual analysis, by question, by page number, or by capturing the currently visible page
read_attachment Read any Zotero attachment by ID, including HTML snapshots, text files, and images, or send the whole file to the model
search_literature_online Search live scholarly sources such as CrossRef and Semantic Scholar for metadata, recommendations, references, and citations

Library Write Tools

All write tools require human confirmation before changes take effect.

Tool Description
apply_tags Add or remove tags on one or more papers
update_metadata Update metadata fields such as title, authors, DOI, journal, or abstract
move_to_collection Add or remove papers from collections
manage_collections Create or delete collections
manage_attachments Delete, rename, or re-link broken attachment file paths
merge_items Merge duplicates: keep the master item, move children from the others, and trash the rest
trash_items Move items to the trash
import_identifiers Import papers by DOI, ISBN, arXiv ID, or URL
import_local_files Import local files into Zotero; Zotero auto-fetches metadata for recognized PDFs
edit_current_note Edit the active Zotero note or create a new one using plain text, Markdown, or HTML
undo_last_action Undo the most recent approved write action in this conversation

Filesystem & Scripting Tools

The agent includes system-level tools for local files, scripts, and Zotero runtime automation.

Tool Description
file_io Read or write files on the local filesystem, including text and image files, with offset and length support for partial reads
run_command Run a shell command on your local machine (zsh on macOS, bash on Linux, cmd.exe on Windows) for analysis scripts and CLI tools
zotero_script Execute JavaScript inside Zotero’s runtime; use read mode for bulk data and write mode for custom mutations

Example use cases:

Important
Terminal and file access tools require confirmation before execution. The agent will show you the command or file operation it wants to perform, and you must approve it before it runs.

Built-in Actions

The agent provides high-level actions for common library workflows. These chain multiple tools together automatically.

Action What it does
Audit Library Scan your library or a collection for incomplete metadata, missing PDFs, missing tags, and other gaps; optionally save the report as a Zotero note
Auto-Tag Suggest tags for the current paper, selected papers, selected collections, or the whole library, then open an editable batch tag-review dialog
Complete Metadata Audit targeted papers for missing bibliographic fields, fetch canonical metadata, and open one review card for the proposed updates
Discover Related Find related papers from recommendations, references, or citations
Organize Unfiled Find unfiled items and organize them into collections via an interactive review workflow
Literature Review Launch the guided literature review workflow
Library Statistics Summarize library or collection statistics such as item types, years, authors, journals, collections, tags, annotations, and growth over time

MCP Server

The plugin runs a built-in Model Context Protocol (MCP) server, allowing external AI agents and tools to interact with your Zotero library programmatically.

This means you can connect any MCP-compatible AI agent (e.g. Claude Desktop, Cursor, custom agents) to your Zotero library and use all the tools listed above.

Agent Demos

Multi-step workflow

The agent can chain multiple tools together to accomplish complex tasks, such as finding a paper, reading its metadata, searching for related work, and writing a summary note.

Multi-step agent workflow

Agent finding related papers

Apply tags automatically

Agent applying tags to a paper

Write a note

Agent writing a note

Safety & Confirmation

All write operations go through a human-in-the-loop confirmation workflow:


Skills

Skills management portal

Skills are customizable guidance files that shape how the agent approaches different types of requests. Each skill is a Markdown file with regex trigger patterns: when your message matches a skill’s patterns, its instructions are automatically injected into the agent’s system prompt, guiding it to use the most efficient tools and workflows for the task at hand.

Note
Skills require Agent Mode to be enabled. They have no effect in standard chat mode.

Built-in Skills

The plugin ships with 8 built-in skills covering common research workflows. They are automatically copied to your skills folder on first run.

Skill Triggers on What it guides the agent to do
simple-paper-qa General questions about a paper, such as summaries, findings, authors, or TLDR requests Read the paper once and answer immediately, avoiding unnecessary retrieval calls
evidence-based-qa Questions about specific methods, results, data, or claims Read first, then use targeted search_paper retrieval for specific evidence
analyze-figures References to figures, tables, or diagrams by number Use MinerU-cached images when available and send images directly to the model
compare-papers Requests to compare or contrast multiple papers Batch paper reads and then retrieve focused evidence for comparison points
library-analysis Requests to summarize, analyze, or audit your library Use efficient scripting to iterate library items instead of paginating through context
literature-review Requests for a literature review or research synthesis Discover papers, deep-read the most relevant few, and synthesize thematically
write-note Requests to write reading notes as Zotero notes or Markdown files in your notes directory Compose notes with metadata, Pandoc citations, and optional figure copying
import-cited-reference Requests to import papers cited in the current PDF Extract references and import the selected cited papers into Zotero

How Skills Work

  1. When you send a message in Agent Mode, the plugin tests your text against every skill’s match patterns.
  2. If any pattern in a skill matches (OR semantics), that skill’s instruction is injected into the agent’s system prompt for that request.
  3. Multiple skills can activate simultaneously if your message matches more than one.
  4. The agent uses these instructions as guidance for tool selection and workflow — they teach the agent how to approach a task, not what tasks it can do.

Creating Custom Skills

  1. Open the Standalone Window (Ctrl+Shift+L / macOS: Cmd+Shift+L).
  2. Click the Skills icon in the top toolbar to open the Skills portal.
  3. Click the ”+ New skill” button to create a template file.
  4. The template opens in your default text editor. Edit the three key parts:
---
id: my-custom-skill
match: /your regex pattern here/i
match: /another trigger pattern/i
---

Instructions for the agent when this skill matches.
Describe the workflow, which tools to prefer, and any constraints.
  1. Save the file. The skill is loaded immediately — no restart needed.

Skill file format:

Field Required Description
id Yes Unique identifier for the skill
match Yes (at least one) Regex pattern with optional flags (i, g, m, etc.). Repeatable — multiple match lines use OR semantics
Instruction body Yes Markdown text after the closing ---. Injected into the agent’s system prompt when the skill matches

Managing Skills

Tip
You can share custom skills with others by exchanging .md files. Drop a skill file into your skills folder and it will be picked up on the next plugin startup or after creating/deleting any skill in the portal.

WebChat Setup (ChatGPT & DeepSeek Web Sync)

WebChat mode sends your questions to chatgpt.com and deepseek.com through a browser extension, then streams responses back into Zotero. It is useful when you want ChatGPT or DeepSeek web access without a provider API key.

Animation showing WebChat mode connected to chatgpt.com

Prerequisites

Step-by-step setup

1. Download the browser extension:

Go to github.com/yilewang/sync-for-zoteroReleases, download the latest extension.zip, and unzip it to a folder on your computer.

2. Install the extension (sideload):

3. Configure the plugin:

In Zotero → Preferencesllm-for-zotero:

Setting Value
Auth Mode WebChat
Model chatgpt.com or chat.deepseek.com

4. Start chatting:

Open a ChatGPT or DeepSeek tab in your browser and keep it open. In Zotero, the plugin panel shows a WebChat indicator with a connection dot (green = connected, red = not detected). Type a question and send.

WebChat features

Important
WebChat mode requires a browser tab to stay open with the Sync for Zotero extension active. Keep the browser and Zotero in the same desktop session, avoid minimizing or backgrounding the active WebChat tab during a request, and watch for the green connection dot. WebChat currently supports paper chat only; library chat is not supported yet.

Technical Notes


Codex Setup (ChatGPT Plus Subscribers)

If you have a ChatGPT Plus subscription, you can use Codex models in the plugin without a separate API key by signing in through the Codex CLI.

New users should choose Codex App Server from the Agent tab. The older Codex Auth (Legacy) path remains available for existing users, but is planned for future deprecation after app-server validation.

Special thanks to @jianghao-zhang for contributing the original Codex Auth integration, and to @boltma for designing the Codex App Server integration.

Step-by-step setup

1. Install the Codex CLI (one-time):

# macOS / Linux (requires Node.js 18+)
npm install -g @openai/codex

# macOS alternative (no Node.js needed)
brew install --cask codex

On Windows, install Codex from PowerShell or Command Prompt rather than WSL, so Zotero MCP can use the Windows-local loopback connection.

2. Log in with your ChatGPT account:

codex login

A browser window opens — sign in with your ChatGPT Plus account. Credentials are saved to ~/.codex/auth.json.

3. Enable Codex App Server in Zotero:

Open Zotero → Preferencesllm-for-zoteroAgent tab:

Setting Recommended value
Enable Codex App Server integration On
Model e.g. gpt-5.4
Reasoning auto, low, medium, high, or xhigh

Click Test connection to verify that Zotero can launch codex app-server, then click the Codex button in the chat header to enter the Codex conversation system.

Codex App Server and Claude Code are mutually exclusive runtime modes in the Agent tab. Disable one before enabling the other.

Existing users who need the old path can open the AI Providers tab, choose Codex Auth (Legacy), keep API URL https://chatgpt.com/backend-api/codex/responses, and use the same Codex model name, for example gpt-5.5.

Recommended Codex App Server configuration

Codex Auth (Legacy) Technical Notes


Claude Code Setup (Experimental)

Claude Code mode runs Claude Code as a separate conversation system inside Zotero. It reuses the familiar sidebar and standalone-window UI, but keeps its own conversation history, paper / open scope state, model/reasoning settings, permission semantics, slash commands, and project skills.

Under development
Claude Code mode currently does not support native Zotero API operations from Claude Code. Use the built-in Agent Mode for native Zotero library tools such as structured item reads, note edits, tagging, metadata updates, and imports. Native Zotero API support for Claude Code is planned for a later release.

Prerequisites

1. Install and Verify Claude Code

Install Claude Code using Anthropic’s official instructions, then run:

claude

Complete any login or authentication prompts in Claude Code before continuing.

2. Start the Zotero Claude Bridge

Claude Code mode depends on the companion bridge repo cc-llm4zotero-adapter. The bridge does not replace Claude Code; it connects Zotero to your local Claude Code runtime.

git clone https://github.com/jianghao-zhang/cc-llm4zotero-adapter.git
cd cc-llm4zotero-adapter
npm install
npm run build
npm run serve:bridge

In another terminal, check that the bridge is alive:

curl -fsS http://127.0.0.1:19787/healthz

For macOS users who want the bridge to run in the background, install the LaunchAgent from the adapter repo:

./scripts/install-macos-daemon.sh

Useful bridge daemon commands:

npm run daemon:status
npm run daemon:start
npm run daemon:stop
npm run daemon:restart
npm run daemon:uninstall

If Claude Code mode stops responding, restart the bridge and re-check /healthz. A passing /healthz check only proves that the adapter is running; it does not prove that the underlying claude CLI is installed, authenticated, or correctly configured.

3. Enable Claude Code inside Zotero

Open Zotero → Preferencesllm-for-zoteroAgent tab.

Setting Recommended value
Enable Claude Code integration On
Bridge URL http://127.0.0.1:19787
Claude Config Source default — user + project + local
Permission Mode safe
Default Model sonnet
Default Reasoning auto

Keep Claude Config Source on default unless you already understand Claude Code settings layers. In default, Claude Code can use your normal user settings plus Zotero-managed project and per-conversation local settings.

After enabling the integration, click the Claude Code button in the chat header to enter Claude Code mode. The Claude conversation system is separate from upstream chat and the built-in agent, so switching modes opens the matching conversation history instead of mixing transcripts.

4. Prepare Claude Project Skills and Commands

Zotero creates a Claude runtime root under your home directory, usually shaped like:

~/Zotero/agent-runtime/profile-.../

Inside that runtime root, shared Claude project assets live in:

CLAUDE.md
.claude/settings.json
.claude/skills/
.claude/commands/

Each Claude conversation also gets its own local .claude folder under the runtime scopes/ tree, so per-conversation overrides do not leak into other chats. You can add shared Claude skills manually under .claude/skills/ or .claude/commands/, but the easiest path is usually to ask Claude Code to create or install the skill in the Zotero project-level Claude config.

Non-Anthropic Claude Code Setups

The Zotero UI exposes opus, sonnet, and haiku as capability tiers. They do not require Anthropic-hosted models specifically. If you route Claude Code through a compatible provider layer or proxy, configure that in Claude Code itself; Zotero only selects the tier and forwards the request to the bridge.


MinerU PDF Parsing

MinerU is an advanced PDF parsing engine that extracts high-fidelity Markdown from PDFs while preserving tables, equations, figures, and complex layouts that standard text extraction often mangles.

Screenshot showing MinerU PDF parsing results in the plugin

Parsed results are cached locally and reused in later conversations. When Auto-parse newly added items is enabled, newly added PDF attachments are sent to MinerU as they enter the Zotero library. If auto-parse is off, you can still parse selected or filtered PDFs from the Manage Files panel.

The MinerU cache is designed for AI, not as a second human PDF reader. Zotero stays the place where you read, annotate, and manage the original PDF. MinerU creates structured sidecar material that models can use: clean Markdown, section ranges, page hints, tables, equations, and extracted figure assets. This keeps the original Zotero UI mostly untouched while giving the assistant much better paper context than raw PDF text extraction.

How to Enable MinerU

  1. Open Zotero → Preferencesllm-for-zotero.
  2. Find the MinerU section and check Enable MinerU.
  3. Keep cloud mode enabled, or check Use local MinerU server for local mode.
  4. For cloud mode, optionally enter your own MinerU API key.
  5. For local mode, run a self-hosted mineru-api server and keep the default base URL (http://127.0.0.1:8000) unless your server uses a different address.
  6. To parse new imports automatically, check Auto-parse newly added items. Then add or import a PDF into your Zotero library. The plugin will parse newly added PDF attachments with MinerU and cache the result for future conversations.

Using Your Own API Key

MinerU can start without an API key through the built-in API, but a personal key is strongly recommended. The built-in API may no longer be supported after June 1, 2026.

To get a free personal key:

  1. Go to mineru.net and create an account.
  2. Generate an API key in your account settings.
  3. Paste the key into Zotero → Preferencesllm-for-zoteroMinerU.
  4. Click Test Connection to verify.

When a personal API key is provided, the plugin calls the MinerU API directly at https://mineru.net/api/v4.

Using a Local MinerU Server

Local MinerU server support was contributed by @renyong18 in PR #152.

Local mode sends PDFs to a self-hosted mineru-api server through POST /file_parse and stores the returned ZIP output in the same local cache format as cloud parsing. The default base URL is http://127.0.0.1:8000.

Prerequisites for local mode:

  1. Install MinerU and run mineru-api; see the MinerU docs for installation.
  2. Make sure required models are downloaded. mineru-api lazy-loads on first request, so the first parse after starting the server or switching backend can take noticeably longer than steady state.

You can pick a Backend in the local section:

Test Connection checks that the server process responds at /health; it does not guarantee that all models are warmed up.

With the default 127.0.0.1 address, PDFs stay on your machine. If you change the base URL to a LAN or remote server, PDFs are sent to that server.

Pause / cancel limitation: mineru-api exposes no cancel or DELETE endpoint, only POST /file_parse, POST /tasks, GET /tasks/{id}, GET /tasks/{id}/result, and GET /health. When you click Pause, the plugin stops the queue and aborts the HTTP wait, but the parse already running on the server keeps executing until it finishes, so GPU/CPU work will not stop immediately. Restart the mineru-api process yourself if you need to abort immediately, such as when switching backend without waiting.

Managing MinerU Caches

The MinerU preferences tab includes a Manage Files panel for maintaining parsed PDF caches:

Automatic cache management is event-driven. The plugin watches Zotero item additions, finds PDF attachments on regular items or standalone PDFs, and waits briefly before processing so Zotero can finish importing the file. If the Zotero item exists but the PDF file path or parent attachment list is not ready yet, the queue retries after short delays instead of failing immediately. Deleted attachments are removed from the queue, and already cached PDFs are skipped.

The queue respects the same filters as bulk parsing. It skips PDFs that already have a local MinerU cache or an available synced package, avoids duplicate queue entries, and shows live status through the MinerU dots: ready, processing, or failed. A normal metadata edit does not repeatedly reparse a finished PDF; modify events are mainly used to recover in-progress, failed, or file-readiness cases.

When a parse succeeds, the plugin writes a cache under Zotero’s data directory in llm-for-zotero-mineru/<attachmentId>/. The canonical files include full.md, manifest.json, content_list.json, extracted assets such as images, and _llm_source.json provenance. The manifest is built for AI access: it maps section titles to character ranges, page hints, and section-level figures so the agent can read the relevant slice of full.md instead of loading the whole paper for every question.

After writing a MinerU cache, the plugin clears stale in-memory text and embedding caches for that PDF. The next question can then use MinerU-quality chunks and regenerate retrieval data from the better parsed text.

Advanced parsing filters can skip files before automatic or bulk parsing:

If Sync MinerU cache with Zotero file sync is enabled, the plugin can create companion ZIP attachments containing full.md, manifest.json, content_list.json, and extracted assets. Sync is optional and default-off. Existing local caches sync only when you request it from the MinerU tab, and synced packages can restore a missing local cache when needed. The repair path validates package metadata and content hashes, prunes duplicate packages for the same source PDF, removes orphaned local caches, and restores usable local cache folders from synced ZIP packages.


Privacy and Data Flow


Troubleshooting

Symptom What to check
Test Connection fails Confirm the base URL, API key, model name, and provider protocol.
The assistant cannot see a paper Reopen the PDF tab, then send a new message so the plugin can rebuild context.
WebChat shows a red dot Keep a ChatGPT or DeepSeek tab open and confirm the Sync for Zotero extension is loaded.
Codex App Server fails Run codex login, confirm codex is on PATH, then click Test connection again.
Claude Code mode hangs Restart the bridge and check curl -fsS http://127.0.0.1:19787/healthz.
MinerU parsing fails Add a personal MinerU API key for cloud mode, or confirm your local mineru-api server responds at /health, then retry Test Connection.

For bugs or unclear failures, please open an issue.


Roadmap


Frequently Asked Questions

Is it free to use?

Yes, the plugin is completely free and open source (AGPL v3). You only pay for API calls to your chosen provider. With Codex App Server, ChatGPT Plus subscribers can use their existing subscription without a separate API key.

Does it work with local models?

Yes. As long as the local model provides an OpenAI-compatible HTTP API, you can connect it by entering the appropriate API Base URL and key in settings.

Is my data used to train models?

The plugin does not train models. Data handling depends on the backend you choose: your configured API provider, local model, WebChat, Codex, Claude Code, or MinerU.

How do I report a bug or request a feature?

Please open an issue on GitHub.


Contributing & Support

Contributions are welcome! Whether it’s bug reports, feature requests, or pull requests — feel free to open an issue or submit a PR on GitHub.

If you find this plugin helpful, consider:


Star History

Star History Chart