What can arxiv-mcp-server do?

arxiv paper full-text search with query parsing, paper metadata extraction and structured formatting, mcp tool registration and schema definition, query parameter filtering and advanced search syntax translation, pagination and result batching for large result sets, error handling and api resilience with graceful degradation, context-aware paper recommendation based on search history, abstract summarization and key insight extraction

arxiv-mcp-server

MCP Server · Free

A Model Context Protocol server for searching and analyzing arXiv papers

Open Source

8 capabilities

Best for: arxiv paper full-text search with query parsing, paper metadata extraction and structured formatting, mcp tool registration and schema definition
Type: MCP Server · Free
Score: 43/100
Best alternative: AWS MCP Servers
Agent-compatible: Yes — MCP protocol

Capabilities (8 decomposed)

arxiv paper full-text search with query parsing

Medium confidence

Implements an MCP tool interface for querying arXiv's REST API, with support for advanced search syntax (author, title, category filters, date ranges). Parses natural language queries into arXiv API query strings, handles pagination, and returns structured metadata including abstracts, authors, publication dates, and PDF URLs. Uses an HTTP client to call arXiv's public API endpoint, which requires no authentication.
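The request side of this capability can be sketched in a few lines. This is a minimal illustration, not the server's actual code: the helper name build_search_url is hypothetical, while the endpoint and the search_query, start, and max_results parameters are those of the public arXiv API.

```python
from urllib.parse import urlencode

# Public arXiv API query endpoint (no authentication needed).
ARXIV_API_URL = "http://export.arxiv.org/api/query"


def build_search_url(search_query: str, start: int = 0, max_results: int = 10) -> str:
    """Build an arXiv API request URL (hypothetical helper for illustration)."""
    if start < 0 or max_results < 1:
        raise ValueError("start must be >= 0 and max_results must be >= 1")
    params = {
        "search_query": search_query,  # arXiv query syntax, e.g. au:, ti:, cat:
        "start": start,                # zero-based offset into the result list
        "max_results": max_results,    # number of entries to return
    }
    # urlencode percent-encodes quotes and colons and turns spaces into '+'.
    return f"{ARXIV_API_URL}?{urlencode(params)}"


# "Jane Doe" is a made-up author used only for the example.
url = build_search_url('au:"Jane Doe" AND cat:cs.LG', max_results=5)
print(url)
```

Keeping URL construction separate from the HTTP call makes the query logic testable without network access.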

Solves for

Search for papers by specific authors across all arXiv categories

Find papers published in a date range matching keyword criteria

Discover papers in specific arXiv categories (cs.AI, physics.quant-ph, etc.)

Get structured metadata for papers to feed into downstream analysis

Best for

Researchers building LLM-powered literature review tools

AI agents that need to autonomously search academic papers

Claude/GPT users wanting integrated arXiv access without leaving their chat interface

Requires

Python 3.8+

MCP client (Claude Desktop, Cline, or compatible tool)

Network access to arXiv.org API endpoint

Limitations

arXiv asks API clients to make no more than one request every 3 seconds; no built-in backoff/retry logic visible

Results per request are governed by the arXiv API's max_results parameter (default 10), so larger result sets require pagination

No full-text search of paper contents — only metadata (title, abstract, authors)

What makes it unique

Exposes arXiv search as an MCP tool callable by Claude/GPT, enabling LLMs to autonomously discover papers without context switching; integrates query parsing to translate natural language into arXiv's advanced search syntax

vs alternatives

Tighter integration with LLM workflows than direct arXiv API calls, and more discoverable than browser-based search for AI agents

paper metadata extraction and structured formatting

Medium confidence

Parses arXiv API responses (Atom XML feeds) and extracts key metadata fields (title, authors, abstract, publication date, categories, PDF URL, arXiv ID) into a consistent structured format. Formats results for readability in chat contexts, handling multi-author lists, category hierarchies, and URL encoding. Implements field mapping to normalize arXiv's native response schema into a developer-friendly output structure.
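The field mapping described above might look roughly like the following sketch. It assumes the raw Atom feed is parsed directly with the standard library; the function names, the output keys, and the sample feed (including its placeholder ID 0000.00000v1) are made up for illustration and are not taken from the server's code.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom namespace used by arXiv feeds

# Made-up feed in the general shape of an arXiv API response.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://arxiv.org/abs/0000.00000v1</id>
    <published>2023-01-15T12:00:00Z</published>
    <title>An  Example
      Title</title>
    <summary>  A made-up abstract. </summary>
    <author><name>A. Author</name></author>
    <author><name>B. Author</name></author>
    <category term="cs.AI"/>
    <link title="pdf" href="http://arxiv.org/pdf/0000.00000v1" type="application/pdf"/>
  </entry>
</feed>"""


def _clean(text):
    """Collapse the line breaks and repeated spaces that feeds often contain."""
    return " ".join((text or "").split())


def parse_entry(entry):
    """Map one Atom <entry> to a flat dict (hypothetical output schema)."""
    abs_url = _clean(entry.findtext(ATOM + "id"))
    pdf_url = next(
        (link.get("href") for link in entry.findall(ATOM + "link")
         if link.get("title") == "pdf"),
        None,
    )
    return {
        "arxiv_id": abs_url.rsplit("/abs/", 1)[-1],
        "title": _clean(entry.findtext(ATOM + "title")),
        "abstract": _clean(entry.findtext(ATOM + "summary")),
        "authors": [_clean(a.findtext(ATOM + "name"))
                    for a in entry.findall(ATOM + "author")],
        "published": _clean(entry.findtext(ATOM + "published")),
        "categories": [c.get("term") for c in entry.findall(ATOM + "category")],
        "pdf_url": pdf_url,
    }


def parse_feed(xml_text):
    """Parse a whole feed into a list of paper dicts."""
    root = ET.fromstring(xml_text)
    return [parse_entry(e) for e in root.findall(ATOM + "entry")]


papers = parse_feed(SAMPLE_FEED)
print(papers[0]["title"], papers[0]["authors"])
```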

Solves for

Extract structured paper metadata to feed into RAG or knowledge base systems

Format paper results for human-readable display in chat interfaces

Build citation data structures from arXiv metadata

Collect paper URLs and IDs for batch processing or archival

Best for

Developers building knowledge management systems on top of arXiv

Teams creating literature review automation pipelines

LLM application builders needing normalized paper data structures

Requires

Python 3.8+

arXiv API response (Atom XML format)

MCP server runtime

Limitations

Only extracts metadata available via arXiv API — no full-text content extraction

Author affiliations not included in standard arXiv API response

Citation counts and impact metrics not available (requires external sources like Semantic Scholar)

What makes it unique

Normalizes arXiv's native API response into a consistent schema optimized for LLM consumption, with special handling for multi-author lists and category hierarchies that are common in academic papers

vs alternatives

More structured than raw arXiv API responses and more accessible to LLMs than unformatted text, enabling downstream agents to reliably parse and act on paper metadata

mcp tool registration and schema definition

Medium confidence

Registers arXiv search and retrieval functions as MCP tools with JSON Schema definitions that describe input parameters (query, filters, result limits) and output structure. Implements the MCP protocol's tool interface, allowing Claude, Cline, and other MCP clients to discover available tools, understand their parameters, and invoke them with proper type validation. Handles tool invocation routing and response serialization back to the MCP client.
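As an illustration of what such a registration carries, the sketch below shows a tool definition in the shape MCP uses (name, description, and a JSON Schema under inputSchema) plus a deliberately small argument check. The tool name search_papers, its parameters, and the validate_arguments helper are assumptions made for this example; a real server would normally rely on the MCP SDK and a full JSON Schema validator.

```python
import json

# Hypothetical tool definition in the shape MCP clients receive from tools/list.
SEARCH_TOOL = {
    "name": "search_papers",
    "description": "Search arXiv papers by keyword with optional filters.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "query": {"type": "string"},
            "max_results": {"type": "integer", "minimum": 1, "maximum": 100},
            "categories": {"type": "array", "items": {"type": "string"}},
        },
        "required": ["query"],
    },
}

_PYTHON_TYPES = {"string": str, "integer": int, "array": list}


def validate_arguments(tool, arguments):
    """Check arguments against a small subset of JSON Schema; return error strings."""
    schema = tool["inputSchema"]
    properties = schema["properties"]
    errors = [
        f"missing required field: {name}"
        for name in schema.get("required", [])
        if name not in arguments
    ]
    for name, value in arguments.items():
        spec = properties.get(name)
        if spec is None:
            errors.append(f"unknown field: {name}")
            continue
        expected = _PYTHON_TYPES[spec["type"]]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append(f"{name}: expected {spec['type']}")
            continue
        if "minimum" in spec and value < spec["minimum"]:
            errors.append(f"{name}: below minimum {spec['minimum']}")
        if "maximum" in spec and value > spec["maximum"]:
            errors.append(f"{name}: above maximum {spec['maximum']}")
    return errors


print(json.dumps(SEARCH_TOOL, indent=2))
print(validate_arguments(SEARCH_TOOL, {"max_results": 0}))
```

Because the schema is plain data, the same dictionary can serve both discovery (what the client sees) and validation (what the server enforces), which reduces the drift risk noted under Limitations.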

Solves for

Enable Claude Desktop and other MCP clients to discover and call arXiv search functions

Define parameter schemas so LLMs understand what search filters are available

Integrate arXiv search seamlessly into multi-tool agent workflows

Allow non-technical users to access arXiv search through their LLM chat interface

Best for

Claude Desktop users wanting native arXiv integration

Developers building multi-tool MCP agent systems

Teams standardizing on MCP for LLM tool orchestration

Requires

Python 3.8+

MCP SDK for Python

MCP-compatible client (Claude Desktop 0.1.0+, Cline, etc.)

Limitations

MCP protocol overhead adds ~50-100ms latency per tool call

Tool schemas must be manually maintained in sync with backend implementation

No built-in tool result caching — each invocation hits arXiv API

What makes it unique

Implements full MCP protocol compliance for tool registration, including JSON Schema validation and proper error handling, enabling seamless integration with Claude and other MCP clients without custom adapters

vs alternatives

More standardized than custom API wrappers and more discoverable than direct function calls, allowing LLMs to autonomously understand and invoke arXiv search without hardcoded instructions

query parameter filtering and advanced search syntax translation

Medium confidence

Translates natural language search queries into arXiv's advanced search syntax, supporting filters for author names, paper titles, publication date ranges, and arXiv categories (cs.AI, physics.quant-ph, etc.). Implements parameter validation and escaping to prevent API errors, handles multi-value filters (e.g., multiple authors OR'd together), and constructs properly formatted query strings for the arXiv API. Supports both simple keyword search and complex boolean queries.
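One way to picture the translation step is a small builder that turns structured filters into an arXiv search_query string. The function name, its parameters, and the validation rules below are hypothetical; the au:, cat:, and all: field prefixes, the AND/OR operators, and the submittedDate:[... TO ...] range form are arXiv API query syntax.

```python
import re
from datetime import date

# Loose shape check for category codes such as cs.AI or hep-th (an assumption,
# not an official grammar).
_CATEGORY_RE = re.compile(r"^[a-z-]+(\.[A-Za-z-]+)?$")


def _quote(value):
    """Wrap a value in double quotes, dropping embedded quotes that would break the query."""
    return '"' + value.replace('"', "").strip() + '"'


def build_search_query(keywords=None, authors=(), categories=(),
                       submitted_from=None, submitted_to=None):
    """Combine filters with AND; multiple authors or categories are OR'd together."""
    clauses = []
    if keywords:
        clauses.append(f"all:{_quote(keywords)}")  # quoted, so treated as a phrase
    if authors:
        clauses.append("(" + " OR ".join(f"au:{_quote(a)}" for a in authors) + ")")
    if categories:
        for cat in categories:
            if not _CATEGORY_RE.match(cat):
                raise ValueError(f"invalid category: {cat!r}")
        clauses.append("(" + " OR ".join(f"cat:{c}" for c in categories) + ")")
    if submitted_from or submitted_to:
        # Range bounds use the YYYYMMDDHHMM form; open ends get wide defaults.
        lo = (submitted_from or date(1991, 1, 1)).strftime("%Y%m%d") + "0000"
        hi = (submitted_to or date.today()).strftime("%Y%m%d") + "2359"
        clauses.append(f"submittedDate:[{lo} TO {hi}]")
    if not clauses:
        raise ValueError("at least one filter is required")
    return " AND ".join(clauses)


query = build_search_query(
    authors=["Jane Doe", "John Roe"],  # made-up names
    categories=["cs.AI"],
    submitted_from=date(2023, 1, 1),
    submitted_to=date(2023, 12, 31),
)
print(query)
```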

Solves for

Search for papers by specific author names without knowing arXiv's query syntax

Filter papers by publication date range (e.g., 'papers from 2023')

Restrict search to specific arXiv categories (e.g., 'machine learning papers only')

Combine multiple filters into a single query (author AND category AND date range)

Best for

Non-technical researchers using Claude to search arXiv

LLM agents that need to construct complex arXiv queries programmatically

Literature review tools that need to support advanced filtering

Requires

Python 3.8+

Understanding of arXiv's query syntax (or reliance on LLM to translate)

arXiv API access

Limitations

arXiv query syntax has quirks (e.g., author names must match exactly); no fuzzy matching

Date filtering limited to arXiv's submission date, not publication date

No support for proximity search or phrase matching within abstracts

What makes it unique

Abstracts arXiv's non-intuitive query syntax from users, allowing natural language filter specifications that are automatically translated into valid arXiv API queries with proper escaping and validation

vs alternatives

More user-friendly than requiring users to learn arXiv's query syntax directly, and more robust than naive string concatenation which can produce malformed queries

pagination and result batching for large result sets

Medium confidence

Implements pagination logic to handle the arXiv API's per-request result limits (10 results per request by default, adjustable through max_results), allowing users to retrieve large result sets across multiple API calls. Manages offset/limit parameters, accumulates results across batches, and provides mechanisms to control result count (e.g., 'get top 50 papers'). Handles empty result sets and API errors gracefully without losing previously fetched results.
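The offset/limit bookkeeping can be sketched as a generator that pulls pages until the requested count is reached. Everything here is illustrative: paginate and fetch_page are hypothetical names, and the page fetcher is injected so the loop can be exercised without touching the network. The 3-second default pause follows arXiv's guidance for consecutive API calls.

```python
import time
from typing import Callable, Iterator, List


def paginate(fetch_page: Callable[[int, int], List[dict]],
             total: int,
             page_size: int = 100,
             delay_seconds: float = 3.0,
             sleep: Callable[[float], None] = time.sleep) -> Iterator[dict]:
    """Yield up to `total` items, requesting `page_size` items per call."""
    fetched = 0
    while fetched < total:
        size = min(page_size, total - fetched)
        page = fetch_page(fetched, size)  # (start offset, max results)
        if not page:
            return  # source exhausted
        batch = page[:size]
        yield from batch
        fetched += len(batch)
        if len(page) < size:
            return  # short page: nothing further to fetch
        if fetched < total:
            sleep(delay_seconds)  # pause between consecutive requests


# Stand-in data source so the example runs offline.
DATA = [{"id": i} for i in range(250)]
calls = []


def fake_fetch(start, size):
    calls.append((start, size))
    return DATA[start:start + size]


results = list(paginate(fake_fetch, total=120, page_size=50, sleep=lambda s: None))
print(len(results), calls)
```

Because results are yielded as they arrive, a caller that stops early never triggers the remaining requests, and items already received survive a later failure.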

Solves for

Retrieve more than 100 papers from a single search query

Batch-process large result sets without overwhelming the API

Control memory usage by limiting result count for large searches

Implement result streaming for long-running searches

Best for

Researchers conducting comprehensive literature reviews

Batch processing pipelines that need to fetch thousands of papers

Memory-constrained environments (e.g., edge devices, serverless functions)

Requires

Python 3.8+

arXiv API access

Patience for a rate-limited API (one request every 3 seconds)

Limitations

arXiv's requested pacing (one request every 3 seconds) makes large batch fetches slow: 300 papers at the default 10 per request means 30 requests, or roughly 90 seconds

No built-in result caching between paginated requests

Pagination state not persisted — if connection drops, must restart from beginning

What makes it unique

Transparently handles arXiv's pagination constraints within the MCP tool interface, allowing users to request arbitrary result counts without manually managing offset/limit parameters

vs alternatives

Simpler than manually constructing paginated API calls, and more efficient than fetching all results upfront which can exceed memory limits

error handling and api resilience with graceful degradation

Medium confidence

Implements error handling for common arXiv API failures (rate limiting, timeouts, malformed queries, network errors) with appropriate HTTP status code interpretation and user-friendly error messages. Provides retry logic with exponential backoff for transient failures, validates input parameters before API calls to prevent unnecessary requests, and returns partial results when possible rather than failing completely. Logs errors for debugging while maintaining MCP protocol compliance.
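Retry with exponential backoff, as described above, can be sketched as follows. This is a generic pattern under stated assumptions rather than the server's implementation: TransientError, retry_with_backoff, and the default delays are hypothetical, and the sleep and jitter functions are injectable so the demo runs instantly.

```python
import random
import time


class TransientError(Exception):
    """Stand-in for failures worth retrying (timeouts, rate-limit responses)."""


def retry_with_backoff(operation, max_attempts=4, base_delay=1.0, max_delay=30.0,
                       sleep=time.sleep, jitter=random.random):
    """Call operation(); on TransientError wait 1x, 2x, 4x... base_delay and retry."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Random jitter spreads out retries from concurrent clients.
            sleep(delay + jitter() * base_delay)


# Demo: an operation that fails twice, then succeeds.
attempts = {"count": 0}


def flaky_search():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransientError("simulated rate limit")
    return "ok"


waits = []
result = retry_with_backoff(flaky_search, sleep=waits.append, jitter=lambda: 0.0)
print(result, waits)
```

Only errors classified as transient are retried; a malformed query should fail immediately, since repeating it cannot succeed.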

Solves for

Handle rate-limit errors gracefully without crashing the MCP server

Retry failed searches automatically without user intervention

Provide clear error messages when queries are malformed

Maintain service availability during arXiv API outages

Best for

Production LLM applications that need reliable arXiv access

Long-running research agents that may encounter transient API failures

Teams deploying arXiv-MCP in multi-user environments with shared rate limits

Requires

Python 3.8+

HTTP client library (requests or similar)

MCP server runtime

Limitations

Retry logic may still fail if arXiv API is down for extended periods

No circuit breaker pattern — will continue retrying even during prolonged outages

Error messages may be cryptic if arXiv API returns non-standard error responses

What makes it unique

Implements MCP-aware error handling that preserves protocol compliance while providing retry logic and graceful degradation, ensuring the server remains responsive even when arXiv API is unreliable

vs alternatives

More robust than naive API calls that fail immediately on errors, and more transparent than silent failures that leave users confused about why searches aren't working

context-aware paper recommendation based on search history

Medium confidence

Tracks previous search queries and results within an MCP session, using this history to inform subsequent searches and recommendations. Analyzes patterns in user searches (e.g., frequently searched authors, categories, keywords) and suggests related papers or refined queries. Implements lightweight session state management to maintain search context across multiple tool invocations without requiring external persistence.
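A lightweight, in-memory version of this session state might be sketched as below. The SearchHistory class, its method names, and the refinement heuristic (narrowing the last query to the most frequent category) are all assumptions made for illustration; the listing does not describe the actual recommendation logic.

```python
from collections import Counter
from typing import Dict, List, Optional


class SearchHistory:
    """Session-scoped record of queries and what their results had in common."""

    def __init__(self) -> None:
        self.queries: List[str] = []
        self.category_counts: Counter = Counter()
        self.author_counts: Counter = Counter()

    def record(self, query: str, results: List[Dict]) -> None:
        """Store a query and tally the categories and authors of its results."""
        self.queries.append(query)
        for paper in results:
            self.category_counts.update(paper.get("categories", []))
            self.author_counts.update(paper.get("authors", []))

    def top_categories(self, n: int = 3) -> List[str]:
        return [name for name, _ in self.category_counts.most_common(n)]

    def top_authors(self, n: int = 3) -> List[str]:
        return [name for name, _ in self.author_counts.most_common(n)]

    def suggest_refinement(self) -> Optional[str]:
        """Narrow the most recent query to the most frequently seen category."""
        if not self.queries or not self.category_counts:
            return None
        return f"({self.queries[-1]}) AND cat:{self.top_categories(1)[0]}"


history = SearchHistory()
history.record("all:diffusion", [
    {"categories": ["cs.LG", "cs.CV"], "authors": ["A. Author"]},
    {"categories": ["cs.LG"], "authors": ["A. Author", "B. Author"]},
])
print(history.top_categories(1), history.suggest_refinement())
```

State held this way disappears when the process exits, which matches the limitation that history is lost on server restart.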

Solves for

Get paper recommendations based on previous searches in the same session

Refine searches based on patterns in search history

Discover related papers without explicitly specifying new search terms

Build a coherent research narrative across multiple searches

Best for

Researchers conducting iterative literature reviews

LLM agents building comprehensive knowledge bases on specific topics

Interactive research sessions where context matters

Requires

Python 3.8+

MCP server with session management

Multiple tool invocations within same session

Limitations

Session state lost when MCP server restarts — no persistent history

Recommendation logic is heuristic-based, not ML-powered — may miss relevant papers

No cross-session learning — each new session starts with blank history

What makes it unique

Maintains lightweight session-scoped context of search history within the MCP server, enabling recommendations and query refinement without requiring external knowledge bases or persistent storage

vs alternatives

More contextual than stateless API calls, and simpler than full RAG systems while still providing some recommendation capability

abstract summarization and key insight extraction

Medium confidence

Processes paper abstracts returned from arXiv searches and extracts key insights, research questions, and methodologies using pattern matching and NLP heuristics. Generates concise summaries suitable for quick scanning by researchers, highlighting novel contributions and relevance to search context. Integrates with Claude's native capabilities when available, delegating summarization to the LLM client rather than implementing custom NLP.
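For the heuristic fallback path, a pattern-matching summarizer could be as simple as the sketch below: prefer sentences containing contribution cue phrases, otherwise fall back to the opening sentence. The cue list, function names, and sample abstract are invented for this example and are not drawn from the server's code.

```python
import re

# Phrases that often introduce a paper's main contribution (illustrative list).
CUE_PHRASES = ("we propose", "we present", "we introduce", "we show", "this paper")


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    normalized = " ".join(text.split())
    return [s for s in re.split(r"(?<=[.!?])\s+", normalized) if s]


def summarize_abstract(abstract, max_sentences=1):
    """Return cue-phrase sentences if any exist, else the leading sentences."""
    sentences = split_sentences(abstract)
    cued = [s for s in sentences if any(cue in s.lower() for cue in CUE_PHRASES)]
    return " ".join((cued or sentences)[:max_sentences])


abstract = (
    "Graph models are widely used. We propose a faster training method. "
    "Results improve on two benchmarks."
)
print(summarize_abstract(abstract))
```

A splitter this naive will mis-split abbreviations such as "e.g." or "et al.", which is one concrete way the heuristics can miss nuance.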

Solves for

Quickly scan paper abstracts to assess relevance without reading full papers

Extract research questions and methodologies from abstracts

Generate one-sentence summaries of papers for literature review notes

Identify papers with novel contributions vs. incremental work

Best for

Researchers conducting rapid literature reviews

LLM agents filtering large result sets for relevance

Teams building automated research summaries

Requires

Python 3.8+

arXiv paper metadata (abstract field)

Optional: Claude API for LLM-powered summarization

Limitations

Summarization quality depends on abstract quality — some papers have vague abstracts

Pattern matching heuristics may miss nuanced contributions

No access to full paper text — can only summarize abstracts

What makes it unique

Delegates summarization to Claude when available (leveraging the LLM client's capabilities) while providing fallback heuristic-based extraction, avoiding redundant LLM calls and keeping the MCP server lightweight

vs alternatives

More efficient than requiring separate LLM calls for each abstract, and more intelligent than simple keyword extraction

Capabilities are decomposed by AI analysis. Each maps to specific user intents and improves with match feedback.

Related Artifacts (sharing capabilities)

Artifacts that share capabilities with arxiv-mcp-server, ranked by overlap. Discovered automatically through the match graph.

MCP Server · 26

@seacolour/openalex-mcp-server-tool

MCP server for querying OpenAlex papers

structured paper metadata extraction and filtering

mcp tool schema registration and invocation routing

openalex paper search via mcp protocol

3 shared capabilities

MCP Server · 26

arxiv-paper

MCP server: arxiv-paper

arxiv paper metadata extraction

arxiv paper retrieval via mcp

2 shared capabilities

MCP Server · 52

Paper Search

Search and download academic papers from arXiv, PubMed, bioRxiv, medRxiv, Google Scholar, Semantic Scholar, and IACR. Fetch PDFs and extract full text to accelerate literature reviews. Get consistent metadata for easier filtering, citation, and analysis.

multi-source academic paper search with unified query interface

mcp protocol integration for llm agent tool calling

2 shared capabilities

MCP Server · 26

scholarmcp

MCP server: scholarmcp

mcp-tool-schema-exposure-for-academic-queries

full-text-search-with-advanced-filtering

2 shared capabilities

MCP Server · 29

BGPT MCP API

Search scientific papers with raw experimental data extracted from full-text studies. Returns methods, results, quality scores, and 25+ metadata fields per paper. 50 free searches, then $0.01/result with an API key.

structured scientific paper search

metadata extraction from studies

2 shared capabilities

MCP Server · 23

arxiv-retrive

MCP server: arxiv-retrive

keyword-based search for arxiv papers

arxiv metadata extraction

2 shared capabilities

Best For

✓Researchers building LLM-powered literature review tools
✓AI agents that need to autonomously search academic papers
✓Claude/GPT users wanting integrated arXiv access without leaving their chat interface
✓Developers building knowledge management systems on top of arXiv
✓Teams creating literature review automation pipelines
✓LLM application builders needing normalized paper data structures
✓Claude Desktop users wanting native arXiv integration
✓Developers building multi-tool MCP agent systems

Known Limitations

⚠arXiv asks API clients to make no more than one request every 3 seconds; no built-in backoff/retry logic visible
⚠Results per request are governed by the arXiv API's max_results parameter (default 10), so larger result sets require pagination
⚠No full-text search of paper contents — only metadata (title, abstract, authors)
⚠Depends on arXiv API availability; no fallback or caching mechanism
⚠Only extracts metadata available via arXiv API — no full-text content extraction
⚠Author affiliations not included in standard arXiv API response

Requirements

Python 3.8+
MCP-compatible client (Claude Desktop 0.1.0+, Cline, or compatible tool)
Network access to arXiv.org API endpoint
No API key required (public API)
arXiv API response (Atom XML format)
MCP server runtime
MCP SDK for Python

Input / Output

Accepts: natural language query string, structured filter parameters (author, category, date range), Atom XML response from arXiv API, MCP tool invocation with JSON parameters, result limit (integer), page size (integer), previous search history (implicit), paper abstract (text)

Produces: JSON structured data with paper metadata, plain text summaries of results, JSON structured metadata, formatted text for display, MCP tool result (JSON or text), arXiv API query string, validated parameter dictionary, list of paper metadata objects, pagination metadata (total count, current offset), error message (string), partial results (if available), retry metadata, recommended papers, suggested refined queries, search context summary, summary (text), key insights (list of strings), relevance score (float 0-1)

UnfragileRank

Adoption: 52% (25% weight)

Quality: 26% (25% weight)

Ecosystem: 60% (15% weight)

Match Graph: 25% (23% weight)

Freshness: 75% (12% weight)

UnfragileRank is computed from adoption signals, documentation quality, ecosystem connectivity, match graph feedback, and freshness. No artifact can pay for a higher rank.
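The component percentages and weights listed for this artifact reproduce the 43/100 score shown on the page if the rank is taken to be a plain weighted sum. That formula is an assumption (the page does not state how the components are combined), and the names below are illustrative.

```python
# (component value in percent, weight in percent), as listed on this page
COMPONENTS = {
    "adoption": (52, 25),
    "quality": (26, 25),
    "ecosystem": (60, 15),
    "match_graph": (25, 23),
    "freshness": (75, 12),
}


def weighted_score(components):
    """Weighted sum of component values; weights are expected to total 100."""
    total_weight = sum(weight for _, weight in components.values())
    if total_weight != 100:
        raise ValueError(f"weights sum to {total_weight}, expected 100")
    return sum(value * weight for value, weight in components.values()) / 100


score = weighted_score(COMPONENTS)
print(score, round(score))  # prints: 43.25 43
```

Term by term this is 13 + 6.5 + 9 + 5.75 + 9 = 43.25, which rounds to the displayed 43.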

Repository Details

Stars: 2,645

Forks: 214

Language: Python

License: Apache-2.0

Topics: ai, arxiv, claude-ai, gpt, llm, mcp-server, model-context-protocol, papers, python, research

Last commit: Apr 26, 2026

About

A Model Context Protocol server for searching and analyzing arXiv papers

Alternatives to arxiv-mcp-server

AWS MCP Servers · 59 · MCP Server

AWS Labs' official MCP suite: docs, CDK, Bedrock KB, cost, Lambda and more as agent tools.

Zapier MCP · 62 · MCP Server

Zapier's hosted MCP: 8,000+ app integrations exposed as allowlisted agent tools.

Hugging Face MCP Server · 61 · MCP Server

Official Hugging Face MCP: search models/datasets/Spaces/papers and call Spaces as tools.

Atlassian Remote MCP Server · 61 · MCP Server

Atlassian's official hosted MCP: Jira + Confluence with OAuth, permission-bounded agent access.

Data Sources

mcp registry
