ArXiv MCP Server
MCP Server | Free
Search and read arXiv academic papers and abstracts via MCP.
Capabilities (9 decomposed)
arXiv paper search with category and date filtering
Medium confidence
Queries the arXiv API with structured filters for subject categories, date ranges, and keywords, returning paginated results with metadata (title, authors, abstract, publication date). Implements async HTTP requests to arXiv's REST API with configurable result limits and sorting options, enabling AI assistants to discover relevant papers programmatically without manual web browsing.
Implements MCP-native search tool that wraps arXiv's REST API with structured category and date filtering, allowing AI assistants to invoke searches as native tools rather than requiring web scraping or manual API calls. Uses async/await patterns for non-blocking I/O during paper discovery.
Simpler than building custom web scrapers and more reliable than regex-based parsing because it uses arXiv's official API; integrates directly into MCP protocol for seamless AI assistant access without additional HTTP client setup.
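A minimal sketch of how such a filtered query could be assembled, assuming the public arXiv API endpoint and its `all:`, `cat:`, and `submittedDate:` query fields. The helper name `build_search_url` and its parameters are hypothetical and are not taken from the server's source.

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"  # public arXiv API endpoint


def build_search_url(keywords, category=None, date_from=None, date_to=None,
                     start=0, max_results=10):
    """Build an arXiv API query URL (hypothetical helper, not the server's code).

    date_from and date_to are strings in YYYYMMDD form.
    """
    clauses = [f'all:"{keywords}"']
    if category:
        clauses.append(f"cat:{category}")
    if date_from and date_to:
        # arXiv expects YYYYMMDDHHMM bounds on submittedDate.
        clauses.append(f"submittedDate:[{date_from}0000 TO {date_to}2359]")
    params = {
        "search_query": " AND ".join(clauses),
        "start": start,                # pagination offset
        "max_results": max_results,    # page size
        "sortBy": "submittedDate",
        "sortOrder": "descending",
    }
    return f"{ARXIV_API}?{urlencode(params)}"
```

The resulting URL can be fetched with any HTTP client; the response is an Atom feed that still has to be parsed into title, author, and abstract fields.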
PDF-to-markdown paper conversion with local caching
Medium confidence
Downloads papers from arXiv as PDFs, converts them to markdown for LLM-friendly processing, and stores the converted papers locally to avoid redundant downloads and API calls. Uses a PDF extraction library (likely PyPDF2 or pdfplumber) to parse document structure, attempting to preserve sections, equations, and references in the markdown output. A local storage layer caches papers by arXiv ID, enabling fast retrieval on subsequent reads.
Implements a two-stage paper retrieval system: download-once-cache-forever pattern with automatic PDF-to-markdown conversion, allowing MCP clients to treat papers as queryable text resources rather than binary blobs. Caching layer is transparent to the caller — subsequent requests for the same paper ID return cached markdown without re-downloading.
More efficient than naive approaches that re-download papers on every access; better for LLM processing than raw PDFs because markdown is token-efficient and structurally clearer than binary PDF content.
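The download-once pattern described above can be sketched as follows. This is an illustration under assumptions: `get_paper_markdown` is a hypothetical name, the cache layout (one `.md` file per arXiv ID) is assumed, and the actual download and PDF conversion are passed in as a callable rather than implemented here.

```python
from pathlib import Path
from typing import Callable


def get_paper_markdown(arxiv_id: str, cache_dir: Path,
                       fetch_and_convert: Callable[[str], str]) -> str:
    """Return markdown for arxiv_id, converting only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Old-style IDs such as "hep-th/9901001" contain a slash; flatten it.
    cached = cache_dir / f"{arxiv_id.replace('/', '_')}.md"
    if cached.exists():
        return cached.read_text(encoding="utf-8")  # cache hit: no network
    markdown = fetch_and_convert(arxiv_id)  # download PDF and convert (injected)
    cached.write_text(markdown, encoding="utf-8")
    return markdown
```

Injecting the conversion step keeps the caching logic independent of whichever PDF library is used.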
local paper inventory listing with metadata indexing
Medium confidence
Scans the local paper cache directory, indexes all downloaded papers by arXiv ID, and returns a structured list of available papers with metadata (title, authors, abstract, download date, file size). Implements a filesystem-based inventory system that reads paper metadata from cached files or maintains a separate index file, enabling quick enumeration of the local research library without querying arXiv.
Provides a lightweight filesystem-based inventory system that mirrors the local paper cache, enabling quick enumeration without network I/O. Metadata is extracted from cached paper files or stored in a companion index file, so listing n cached papers costs n local file reads rather than n network requests.
Faster than querying arXiv for paper metadata because it operates entirely on local disk; enables offline-first workflows where the research library is self-contained and does not require network connectivity.
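As a rough illustration of a disk-only listing, the sketch below assumes one `<arxiv_id>.md` file per cached paper and derives the download date from the file's modification time. Both are assumptions, and `list_cached_papers` is a hypothetical name. Title and author metadata would need a companion index and are omitted.

```python
from datetime import datetime, timezone
from pathlib import Path


def list_cached_papers(cache_dir: Path) -> list[dict]:
    """List cached papers from disk only (no network access)."""
    papers = []
    for path in sorted(cache_dir.glob("*.md")):
        stat = path.stat()
        papers.append({
            "arxiv_id": path.stem,  # "2301.12345.md" -> "2301.12345"
            "size_bytes": stat.st_size,
            "downloaded": datetime.fromtimestamp(
                stat.st_mtime, tz=timezone.utc).isoformat(),
        })
    return papers
```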
paper content retrieval with structured reading interface
Medium confidence
Retrieves the full text of a previously downloaded paper from the local cache and returns it as markdown-formatted content, optionally with section-level metadata (headings, abstract, introduction, conclusion). Implements a read operation that maps arXiv IDs to cached markdown files and parses the markdown structure to enable section-aware access. Supports both full-paper retrieval and section-specific queries (e.g., 'return only the abstract and conclusion').
Implements a structured reading interface that treats papers as queryable documents with section-level granularity, rather than monolithic text blobs. Parses markdown heading structure to enable section-aware retrieval, allowing LLM agents to request specific parts of papers (e.g., 'get the abstract and methodology') without loading the entire document.
More flexible than simple file-read operations because it understands paper structure; enables context-aware paper analysis where agents can request relevant sections rather than blindly loading full papers that may exceed context limits.
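Section-aware retrieval of this kind could look like the sketch below, which splits cached markdown on ATX headings (`#` through `######`). The function names are hypothetical. The approach is deliberately simple: it would misread `#` lines inside fenced code blocks, and repeated heading names are merged.

```python
import re

HEADING = re.compile(r"^#{1,6}\s+(.*\S)\s*$")


def split_sections(markdown: str) -> dict[str, str]:
    """Map lower-cased heading text to the body under that heading."""
    sections: dict[str, list[str]] = {}
    current = "preamble"  # text that appears before the first heading
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            current = match.group(1).lower()
            sections.setdefault(current, [])
        else:
            sections.setdefault(current, []).append(line)
    return {name: "\n".join(body).strip() for name, body in sections.items()}


def read_sections(markdown: str, wanted: list[str]) -> dict[str, str]:
    """Return only the requested sections that exist in the paper."""
    sections = split_sections(markdown)
    return {name: sections[name.lower()]
            for name in wanted if name.lower() in sections}
```

An agent that only needs the abstract and conclusion can then call `read_sections(text, ["Abstract", "Conclusion"])` instead of loading the full paper.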
MCP protocol tool registration and request routing
Medium confidence
Implements the Model Context Protocol (MCP) server specification, registering the four paper management tools (search, download, list, read) as callable MCP tools and routing incoming tool-call requests from AI assistants to the appropriate handler functions. Uses MCP's tool schema system to define input/output types and validation rules, enabling type-safe tool invocation from Claude, other LLMs, or MCP-compatible clients. Handles async request/response cycles and error propagation according to MCP specification.
Implements full MCP server compliance with tool schema registration, async request handling, and error propagation. Tools are registered with structured schemas that define input parameters, output types, and descriptions, enabling AI assistants to understand and invoke tools with type safety. Uses stdio transport for communication, making it compatible with Claude and other MCP clients.
More standardized than custom HTTP APIs because it uses the MCP protocol, enabling integration with Claude and other MCP-compatible tools without custom client code; it also provides schema-based input validation that a custom REST API would have to implement by hand.
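To make the registration-and-routing idea concrete, here is a library-free sketch. It does not use the MCP SDK; the registry, the `tool` decorator, and `call_tool` are hypothetical stand-ins that only mimic the shape of a tool definition (name, description, JSON-Schema-style `inputSchema`) and a dispatch step.

```python
import asyncio
from typing import Any, Awaitable, Callable

Handler = Callable[[dict[str, Any]], Awaitable[Any]]
TOOLS: dict[str, dict[str, Any]] = {}  # tool name -> spec and handler


def tool(name: str, description: str, input_schema: dict[str, Any]):
    """Register an async handler under a tool name with an input schema."""
    def decorator(handler: Handler) -> Handler:
        TOOLS[name] = {"description": description,
                       "inputSchema": input_schema,
                       "handler": handler}
        return handler
    return decorator


@tool("search_papers", "Search arXiv by keyword.",
      {"type": "object",
       "properties": {"query": {"type": "string"}},
       "required": ["query"]})
async def search_papers(args: dict[str, Any]) -> Any:
    return {"query": args["query"], "results": []}  # placeholder result


async def call_tool(name: str, args: dict[str, Any]) -> Any:
    """Route a tool call to its handler after a minimal required-field check."""
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    spec = TOOLS[name]
    missing = [k for k in spec["inputSchema"].get("required", [])
               if k not in args]
    if missing:
        raise ValueError(f"missing arguments: {missing}")
    return await spec["handler"](args)
```

A real server would delegate schema validation and transport (stdio) to an MCP library; the sketch only shows why a declared schema lets bad calls be rejected before a handler runs.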
deep paper analysis prompt workflow
Medium confidence
Provides a structured prompt template for comprehensive paper analysis that guides AI assistants through a multi-step workflow: abstract extraction, methodology review, results interpretation, and key findings synthesis. Implements a prompt system that chains multiple analysis steps, with context management to handle long papers by breaking them into sections. The prompt includes structured output formatting (JSON or markdown) to make analysis results machine-readable and suitable for downstream processing.
Implements a multi-step analysis prompt that breaks paper reading into discrete stages (abstract → methodology → results → synthesis), with context management to handle papers that exceed LLM context limits. Prompt is registered as an MCP resource, making it accessible to AI assistants as a reusable workflow template rather than a one-off instruction.
More systematic than ad-hoc prompting because it enforces a consistent analysis structure; enables reproducible paper analysis across multiple papers and researchers, making it suitable for building research knowledge bases.
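A staged template along these lines might be assembled as in the sketch below. The stage wording, the `ANALYSIS_STEPS` list, and `build_analysis_prompt` are illustrative assumptions, not the server's actual prompt text.

```python
ANALYSIS_STEPS = [
    ("abstract", "Summarize the abstract in two or three sentences."),
    ("methodology", "Describe the methods and any key assumptions."),
    ("results", "Interpret the main results and their limitations."),
    ("synthesis", "List the key findings and open questions."),
]


def build_analysis_prompt(arxiv_id: str, output_format: str = "markdown") -> str:
    """Assemble a staged analysis prompt for one paper (illustrative wording)."""
    if output_format not in {"markdown", "json"}:
        raise ValueError("output_format must be 'markdown' or 'json'")
    lines = [f"Analyze arXiv paper {arxiv_id} in the following order."]
    for number, (stage, instruction) in enumerate(ANALYSIS_STEPS, start=1):
        lines.append(f"{number}. [{stage}] {instruction}")
    lines.append("If the paper is too long, read it section by section.")
    lines.append(f"Return the analysis as {output_format}.")
    return "\n".join(lines)
```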
async HTTP request handling with error recovery
Medium confidence
Implements async/await patterns for non-blocking I/O during arXiv API calls and PDF downloads, using Python's asyncio library to handle multiple concurrent requests without blocking the MCP server. Includes retry logic with exponential backoff for transient failures (network timeouts, rate limits), timeout handling to prevent hanging requests, and structured error propagation to MCP clients. Manages connection pooling to reuse HTTP connections across multiple requests.
Uses Python asyncio for non-blocking I/O, allowing the MCP server to handle multiple concurrent paper operations without spawning threads or processes. Implements exponential backoff retry logic that respects arXiv rate limits while recovering from transient failures. Connection pooling reuses HTTP connections across requests, reducing overhead.
More efficient than synchronous HTTP calls because it doesn't block the event loop during network I/O; enables the MCP server to handle multiple concurrent clients without thread management overhead.
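The retry behaviour described above can be sketched with the standard library alone. The helper name `with_retries`, its defaults, and the choice of which exceptions count as transient are assumptions made for illustration; connection pooling is left to the HTTP client and not shown.

```python
import asyncio
import random


async def with_retries(make_request, *, attempts=4, base_delay=1.0,
                       timeout=30.0,
                       retry_on=(asyncio.TimeoutError, ConnectionError)):
    """Await make_request() with a timeout, retrying transient failures."""
    for attempt in range(attempts):
        try:
            # wait_for cancels the request if it exceeds the timeout.
            return await asyncio.wait_for(make_request(), timeout=timeout)
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: propagate to the caller
            # Exponential backoff (1x, 2x, 4x ... base_delay) plus up to 10% jitter.
            delay = base_delay * (2 ** attempt) * (1 + random.random() * 0.1)
            await asyncio.sleep(delay)
```

`make_request` is a zero-argument callable returning a fresh coroutine on each call, so every retry issues a new request rather than re-awaiting a finished one.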
resource-based paper metadata caching
Medium confidence
Implements MCP's resource system to expose downloaded papers and their metadata as queryable resources, enabling AI assistants to reference papers by URI (e.g., 'arxiv://2301.12345') and access metadata without repeated tool calls. Caches paper metadata (title, authors, abstract, download date) in memory or on disk, reducing lookup latency for frequently accessed papers. Resources are registered with the MCP server and can be subscribed to for change notifications.
Leverages MCP's resource system to expose papers as first-class resources with URIs, enabling AI assistants to reference papers by identifier rather than re-invoking search or download tools. Metadata is cached in memory or on disk, reducing lookup latency for frequently accessed papers. Resources can be subscribed to for change notifications, enabling reactive workflows.
More efficient than repeated tool calls because resources are cached and referenced by URI; enables AI assistants to maintain paper context across multiple turns without re-fetching metadata.
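A small sketch of URI-keyed metadata caching, assuming the `arxiv://<id>` form given in the example above. `parse_paper_uri` and `MetadataCache` are hypothetical names, and the loader that actually produces metadata is injected.

```python
from typing import Any, Callable

SCHEME = "arxiv://"


def parse_paper_uri(uri: str) -> str:
    """Extract the arXiv ID from a URI such as 'arxiv://2301.12345'."""
    if not uri.startswith(SCHEME) or len(uri) == len(SCHEME):
        raise ValueError(f"not a paper URI: {uri!r}")
    return uri[len(SCHEME):]


class MetadataCache:
    """In-memory cache that loads each paper's metadata at most once."""

    def __init__(self, loader: Callable[[str], dict[str, Any]]):
        self._loader = loader  # e.g. reads an index file (assumed)
        self._items: dict[str, dict[str, Any]] = {}

    def get(self, uri: str) -> dict[str, Any]:
        arxiv_id = parse_paper_uri(uri)
        if arxiv_id not in self._items:
            self._items[arxiv_id] = self._loader(arxiv_id)
        return self._items[arxiv_id]
```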
configuration management for storage paths and API settings
Medium confidence
Provides a configuration system for customizing the local paper storage directory, arXiv API settings (base URL, timeout, rate limits), and MCP server settings (stdio vs. HTTP transport, logging level). Configuration is loaded from environment variables, config files (YAML/JSON), or command-line arguments, with sensible defaults for all settings. Enables users to customize the server behavior without modifying source code.
Implements a multi-source configuration system that supports environment variables, config files, and command-line arguments with a clear precedence order. Provides sensible defaults for all settings, enabling the server to run without configuration while allowing customization for advanced use cases.
More flexible than hardcoded settings because it supports multiple configuration sources; enables the server to be deployed in different environments without code changes.
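The text does not say which source wins when settings conflict, so the sketch below assumes one common order: defaults, then config file, then environment variables, then command-line arguments. The setting names, default values, and the `ARXIV_MCP_` prefix are all hypothetical.

```python
import os

# Hypothetical defaults; the real server's setting names may differ.
DEFAULTS = {"storage_path": "./papers", "timeout_seconds": 30, "log_level": "INFO"}
ENV_PREFIX = "ARXIV_MCP_"  # assumed prefix, e.g. ARXIV_MCP_LOG_LEVEL


def load_config(file_values=None, env=None, cli_values=None):
    """Merge settings with assumed precedence: defaults < file < env < CLI."""
    env = os.environ if env is None else env
    config = dict(DEFAULTS)
    config.update(file_values or {})  # values parsed from a YAML/JSON file
    for key, default in DEFAULTS.items():
        raw = env.get(ENV_PREFIX + key.upper())
        if raw is not None:
            config[key] = type(default)(raw)  # coerce "10" -> 10 for int settings
    # CLI flags that were not given arrive as None and are skipped.
    config.update({k: v for k, v in (cli_values or {}).items() if v is not None})
    return config
```

With no file, environment, or CLI input, `load_config(env={})` returns the defaults unchanged, which matches the claim that the server can run without configuration.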
Capabilities are decomposed by AI analysis. Each maps to specific user intents and improves with match feedback.
Related Artifacts (sharing capabilities)
Artifacts that share capabilities with ArXiv MCP Server, ranked by overlap. Discovered automatically through the match graph.
arxiv-mcp-server
A Model Context Protocol server for searching and analyzing arXiv papers
daily-arXiv-ai-enhanced
Automatically crawls arXiv papers daily and summarizes them using AI. Presents the summaries using GitHub Pages.
alphaXiv
Discuss, discover, and read arXiv papers.
Paper Search
Search and download academic papers from arXiv, PubMed, bioRxiv, medRxiv, Google Scholar, Semantic Scholar, and IACR. Fetch PDFs and extract full text to accelerate literature reviews. Get consistent metadata for easier filtering, citation, and analysis.
Latex MCP Server
MCP server to compile LaTeX, download/organize/read cited papers, run visualization scripts, and add figures/tables to LaTeX.
Best For
- ✓AI researchers building literature discovery agents
- ✓Academic writing assistants that need to cite recent work
- ✓Teams building research-aware LLM applications
- ✓Researchers running local LLM agents that need persistent paper storage
- ✓Teams building offline-capable research tools
- ✓Cost-conscious projects that want to minimize repeated API calls
- ✓Researchers maintaining a persistent local paper library
- ✓Offline-first research tools that need to work without network access
Known Limitations
- ⚠arXiv API rate limits apply (arXiv's API terms of use ask for no more than one request every three seconds)
- ⚠Search results limited to arXiv's indexing — does not cover papers from other repositories (IEEE, ACM, etc.)
- ⚠Date filtering operates on submission date, not publication date, which may differ by months
- ⚠No full-text search capability — only searches metadata (title, abstract, authors)
- ⚠PDF-to-markdown conversion may lose formatting (tables, figures, complex equations) — output is best-effort text extraction
- ⚠Large papers (100+ pages) produce markdown files that may exceed LLM context windows (requires chunking or summarization)
About
Community MCP server for arXiv academic paper repository. Provides tools to search papers by topic, read abstracts and metadata, download PDFs, and query recent submissions in specific categories.