XHS-Downloader
MCP Server · Free
XiaoHongShu (RedNote) link extraction and work collection tool: extracts links to an account's published, bookmarked, liked, and album works; extracts work and user links from search results; collects XiaoHongShu work information; extracts work download URLs; downloads work media files.
Capabilities (15 decomposed)
xiaohongshu work url parsing and metadata extraction
Medium confidence
Parses XiaoHongShu (RedNote) work URLs to extract structured metadata including post ID, author information, caption text, image/video URLs, and engagement metrics. Uses HTTP request interception with cookie-based authentication to bypass platform anti-scraping measures and retrieve JSON API responses from XHS endpoints, then deserializes and normalizes the response into a standardized work object with media asset references.
Implements cookie-based session authentication with automatic refresh logic and XHS-specific JSON API endpoint targeting, rather than HTML parsing or Selenium-based browser automation, enabling 10-50x faster extraction with lower resource overhead
Faster and more reliable than browser automation tools (Selenium, Puppeteer) because it directly calls XHS JSON APIs after cookie authentication, avoiding DOM parsing and browser overhead
watermark-free media download with format conversion
Medium confidence
Downloads image and video files from XiaoHongShu work URLs and removes platform watermarks by fetching clean media assets directly from XHS CDN endpoints. Supports batch downloading with customizable file naming patterns (template-based: {work_id}_{index}_{timestamp}), automatic format conversion (MP4 video codec normalization, JPEG/PNG image optimization), and resumable downloads with partial file recovery using HTTP range requests.
Implements a dedicated Download Manager class with resumable HTTP range request support and FFmpeg-based codec normalization, rather than simple file.write() operations, enabling recovery from network interruptions and guaranteed output format compatibility
More robust than generic download tools because it handles XHS-specific CDN authentication, implements resumable downloads with partial file tracking, and automatically normalizes video codecs for cross-platform compatibility
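The resumable-download piece reduces to deciding where a partial file left off and asking the server for the remainder. A minimal sketch, assuming a local partial file and a server that honours range requests (real code must also verify the 206 Partial Content response before appending):

```python
import os

def resume_headers(path: str) -> dict:
    """Build the HTTP Range header that resumes a partially downloaded file.

    Sketch of the range-request recovery described above; if no partial file
    exists, an empty header dict means 'download from the start'.
    """
    if os.path.exists(path) and os.path.getsize(path) > 0:
        return {"Range": f"bytes={os.path.getsize(path)}-"}
    return {}
```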
sqlite-based download history and work metadata persistence
Medium confidence
Stores all downloaded works, extracted links, and search results in a SQLite database with tables for works (work_id, title, author, media_urls, download_status), downloads (download_id, work_id, timestamp, file_paths), and searches (search_query, result_count, timestamp). Implements deduplication logic to prevent re-downloading the same work, tracks download status (pending, completed, failed), and enables querying download history by date range, author, or content type. Database schema includes indexes on frequently-queried columns (work_id, timestamp) for performance.
Implements SQLite schema with deduplication indexes and download status tracking, enabling efficient duplicate detection and resumable downloads, rather than simple file-based logging
More reliable than file-based logging because it provides structured querying, deduplication, and transactional consistency, enabling complex analysis and preventing accidental re-downloads
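The dedup-by-primary-key idea can be sketched with an illustrative schema. Column names here are modelled on the description above, not taken from the project's actual database:

```python
import sqlite3

# Illustrative schema; the project's real tables and columns may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS works (
    work_id         TEXT PRIMARY KEY,  -- primary key doubles as the dedup index
    title           TEXT,
    author          TEXT,
    download_status TEXT DEFAULT 'pending'
);
CREATE INDEX IF NOT EXISTS idx_works_status ON works(download_status);
"""

def record_work(conn: sqlite3.Connection, work_id: str, title: str, author: str) -> bool:
    """Insert a work row; return False if it was already recorded (duplicate)."""
    try:
        with conn:  # transactional: commit on success, roll back on error
            conn.execute(
                "INSERT INTO works (work_id, title, author) VALUES (?, ?, ?)",
                (work_id, title, author),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

A failed `record_work` is exactly the "already downloaded, skip it" signal the batch pipeline needs.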
cookie-based authentication with automatic session refresh
Medium confidence
Manages XiaoHongShu session authentication by storing and refreshing cookies in a persistent cookie jar. Reads cookies from browser storage (via browser extension or manual export) or accepts cookies as configuration input. Implements automatic cookie refresh logic that detects expired sessions (HTTP 401 responses) and attempts to refresh cookies using stored refresh tokens or re-authentication flow. Validates cookie freshness before each request and logs authentication failures for debugging.
Implements automatic cookie refresh detection (HTTP 401 response handling) with fallback re-authentication flow, rather than requiring manual cookie updates, enabling long-running processes without user intervention
More reliable than manual cookie management because it automatically detects and refreshes expired sessions, reducing authentication failures and enabling unattended operation
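The refresh-on-expiry pattern is a small wrapper around the request call. In this sketch, `request` and `refresh` are hypothetical stand-ins for the project's real session layer; `request()` is assumed to return a `(status, body)` pair:

```python
def with_refresh(request, refresh, max_attempts: int = 2):
    """Call request(); on a 401, invoke refresh() to renew the session and retry.

    Sketch of the automatic-refresh behaviour described above; both callables
    are illustrative, not the project's actual API.
    """
    status, body = request()
    for _ in range(max_attempts - 1):
        if status != 401:
            break
        refresh()
        status, body = request()
    return status, body
```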
customizable file naming and folder organization with template variables
Medium confidence
Supports template-based file naming and folder organization using variable substitution. Naming templates can include variables like {work_id}, {author}, {title}, {timestamp}, {index} which are replaced with actual values from work metadata. Implements folder structure templates (e.g., {author}/{timestamp}/{work_id}) for organizing downloads into hierarchical directories. Validates template syntax and provides default templates for common use cases (flat structure, author-based organization, date-based organization).
Implements variable substitution with metadata-driven template expansion and automatic special character sanitization, rather than fixed naming schemes, enabling flexible organization without code changes
More flexible than tools with fixed naming schemes because it supports arbitrary folder hierarchies and file naming patterns, enabling users to organize downloads according to their own preferences
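The substitution-plus-sanitization step can be sketched as below. Variable names like `{author}` and `{work_id}` follow the description above; the sanitization character set is an assumption (Windows-unsafe filename characters), and unknown placeholders are left untouched rather than erased:

```python
import re

def render_name(template: str, meta: dict) -> str:
    """Expand {variable} placeholders from work metadata, sanitizing values
    for safe use in file and folder names. Illustrative sketch only."""
    def sanitize(value) -> str:
        # Replace characters that are invalid in file names on common platforms.
        return re.sub(r'[\\/:*?"<>|]', "_", str(value)).strip()
    return re.sub(
        r"\{(\w+)\}",
        lambda m: sanitize(meta[m.group(1)]) if m.group(1) in meta else m.group(0),
        template,
    )
```

Note that sanitization applies only to substituted values, so `/` separators written in the folder template itself survive intact.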
batch processing with rate limiting and error recovery
Medium confidence
Supports batch downloading of multiple XHS URLs with configurable rate limiting to avoid triggering XHS anti-scraping measures. Implements exponential backoff retry logic for failed downloads (retry up to 3 times with increasing delays), tracks download progress across the batch, and provides detailed error reports for failed items. Rate limiting is configurable (requests per second, delay between downloads) and can be adjusted based on observed XHS response patterns.
Implements exponential backoff retry logic with configurable rate limiting and detailed error tracking, rather than simple sequential processing, enabling robust batch operations that recover from transient failures
More reliable than simple batch scripts because it automatically retries failed downloads, implements rate limiting to avoid IP blocking, and provides detailed error reports for debugging
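The "increasing delays" schedule is standard exponential backoff. A minimal sketch, with constants chosen to match the "up to 3 retries" description; the project's actual base delay and multiplier are assumptions:

```python
def backoff_delays(retries: int = 3, base: float = 1.0, factor: float = 2.0) -> list:
    """Delay schedule for exponential-backoff retries:
    base, base*factor, base*factor**2, ...
    Constants are illustrative, not the project's actual settings."""
    return [base * factor ** attempt for attempt in range(retries)]
```

Production retry loops usually add random jitter on top of this schedule so parallel workers do not retry in lockstep.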
configuration management with settings.json persistence and validation
Medium confidence
Manages all user-configurable parameters through a settings.json file with schema validation and default values. Supports configuration hierarchy: command-line arguments override settings.json, which overrides built-in defaults. Implements configuration validation (type checking, range validation for numeric fields, enum validation for choice fields) and provides clear error messages for invalid configurations. Automatically migrates settings.json schema when application version changes, preserving user settings while adding new fields.
Implements configuration hierarchy (CLI args > settings.json > defaults) with schema validation and automatic migration, rather than hard-coded defaults, enabling flexible configuration without code changes
More maintainable than tools with hard-coded configuration because it supports persistent settings, command-line overrides, and automatic schema migration, reducing user friction and supporting multiple deployment scenarios
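The three-layer hierarchy reduces to an ordered merge. A sketch, assuming `None` means "not supplied" at a given layer; the real loader also validates types and migrates old schemas:

```python
def resolve_config(defaults: dict, settings: dict, cli_args: dict) -> dict:
    """Merge configuration layers: CLI arguments override settings.json,
    which overrides built-in defaults. Illustrative sketch only."""
    merged = dict(defaults)
    for layer in (settings, cli_args):
        # Later layers win; None-valued keys are treated as 'not given'.
        merged.update({k: v for k, v in layer.items() if v is not None})
    return merged
```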
user profile link extraction and work collection aggregation
Medium confidence
Extracts and aggregates work links from XiaoHongShu user profiles across multiple collection types: published works, bookmarked/saved posts, liked posts, and custom albums. Uses paginated API requests to the XHS user profile endpoint with cursor-based pagination, iterating through all available pages to build a complete inventory of work URLs. Stores extracted links in SQLite database with metadata (collection type, extraction timestamp, user ID) for deduplication and tracking.
Implements cursor-based pagination state management with SQLite deduplication tracking, rather than simple list accumulation, enabling recovery from interruptions and prevention of duplicate URL extraction across multiple runs
More complete than manual profile browsing because it automatically handles pagination across all work collections and stores results persistently, avoiding manual copy-paste and enabling batch processing of multiple profiles
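The cursor-pagination loop can be sketched as below. Here `fetch_page(cursor)` is a hypothetical callable returning `(urls, next_cursor, has_more)`, modelled on typical XHS-style paginated responses; the real project issues authenticated HTTP requests at that point and persists the dedup set in SQLite rather than in memory:

```python
def collect_work_urls(fetch_page, cursor: str = ""):
    """Drain a cursor-paginated endpoint, dropping duplicates across pages.

    fetch_page(cursor) -> (urls, next_cursor, has_more) is an assumed
    signature, not the project's actual API."""
    seen, ordered = set(), []
    has_more = True
    while has_more:
        urls, cursor, has_more = fetch_page(cursor)
        for url in urls:
            if url not in seen:
                seen.add(url)
                ordered.append(url)
    return ordered
```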
search result link extraction and filtering
Medium confidence
Executes XiaoHongShu search queries and extracts work and user links from paginated search results. Sends search requests to XHS search API endpoint with query parameters (keyword, filters, sort order), processes paginated JSON responses containing work and user cards, and extracts URLs with optional filtering by content type, engagement metrics, or publication date. Results are stored in SQLite with search metadata for reproducibility.
Implements search result pagination and stores search metadata (query, timestamp, result count) in SQLite, enabling reproducible search result tracking and trend analysis over time
More systematic than manual search browsing because it automates pagination, stores results persistently, and enables filtering and analysis of search trends across multiple queries and time periods
multi-interface request routing and execution mode dispatch
Medium confidence
Implements a single entry point (main.py) that dispatches execution to five distinct user interfaces based on command-line arguments and configuration: Terminal UI (TUI) for interactive use, CLI for single-command automation, Browser UserScript for in-browser convenience, REST API Server for programmatic integration, and MCP Server for AI assistant integration. Each interface converges on the core XHS class, which coordinates content extraction, download, and storage operations through a shared processing pipeline.
Implements a unified architecture where five distinct interfaces (TUI, CLI, UserScript, REST API, MCP) all converge on a single XHS core class, rather than maintaining separate codebases, enabling consistent behavior and simplified maintenance across all deployment modes
More flexible than single-interface tools because it supports interactive, scripted, browser-based, programmatic, and AI-integrated workflows from the same codebase, reducing deployment complexity and enabling seamless switching between use cases
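The converging-interfaces architecture amounts to a small dispatcher over a shared core. Mode names and handler methods below are illustrative, not the project's actual main.py wiring; the UserScript interface is browser-side and reaches the core through the API server rather than through a local mode:

```python
def dispatch(mode: str, core):
    """Route an execution mode onto the shared core object.

    Sketch of the single-entry-point pattern described above; handler names
    are hypothetical."""
    handlers = {
        "tui": core.run_tui,
        "cli": core.run_cli,
        "server": core.run_api,
        "mcp": core.run_mcp,
    }
    if mode not in handlers:
        raise SystemExit(f"unknown mode: {mode}")
    return handlers[mode]()
```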
terminal user interface (tui) with clipboard monitoring and interactive settings
Medium confidence
Provides an interactive terminal-based UI built with the Textual framework that monitors the system clipboard for XHS URLs, displays real-time download progress, and allows users to configure settings without editing JSON files. Implements a multi-panel layout with URL input field, download queue display, progress bars, and settings editor. Clipboard monitoring runs in a background thread that detects new XHS URLs and automatically queues them for download when enabled.
Implements background clipboard monitoring thread with Textual event loop integration, enabling real-time URL detection and automatic queue management without blocking the UI, rather than polling-based clipboard checks that freeze the interface
More user-friendly than CLI-only tools because it provides visual feedback, real-time progress tracking, and interactive settings management without requiring JSON file editing or command-line knowledge
command-line interface (cli) with selective media index specification
Medium confidence
Provides a command-line interface for single-command downloads with support for specifying which images/videos to download from a multi-media post. Accepts XHS URL as argument, optional image index range (e.g., --images 1-3 to download only images 1-3), and configuration overrides (--output-dir, --naming-template). Executes download synchronously and exits with status code indicating success/failure, suitable for shell scripts and CI/CD pipelines.
Implements selective media index parsing with range syntax support (1-3, 1,2,3) and validates indices against actual post media count before download, rather than blindly accepting any index specification
More scriptable than interactive tools because it runs to completion with a meaningful exit status code and supports configuration overrides, enabling seamless integration into shell scripts and CI/CD pipelines
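The range-syntax parsing with validation against the post's media count can be sketched as below; the `1-3` / `1,3,5` syntax follows the CLI description above, while the exact flag name and error behaviour are assumptions:

```python
def parse_indices(spec: str, total: int) -> list:
    """Parse a selective-media spec such as '1-3' or '1,3,5' into sorted,
    de-duplicated 1-based indices, validated against the media count.
    Illustrative sketch of the index handling described above."""
    indices = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            lo, hi = (int(x) for x in part.split("-", 1))
            indices.update(range(lo, hi + 1))
        elif part:
            indices.add(int(part))
    out_of_range = [i for i in indices if not 1 <= i <= total]
    if out_of_range:
        raise ValueError(
            f"indices out of range for {total} media items: {sorted(out_of_range)}"
        )
    return sorted(indices)
```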
browser userscript integration with dual-mode operation
Medium confidence
Provides a Tampermonkey/Greasemonkey UserScript that runs in the browser and enables in-browser downloads or server-push mode. In standalone mode, the script uses browser APIs (Fetch API, Blob) to download files directly to the user's Downloads folder. In server-push mode (script_server=true in config), the script detects XHS URLs on the page, extracts work IDs, and sends download requests to a running XHS-Downloader instance via HTTP POST, offloading processing to the server while providing browser-side convenience.
Implements dual-mode operation where the same UserScript can function standalone (browser-based downloads) or server-push (task delegation), with configuration-driven mode selection, rather than requiring separate scripts for each mode
More convenient than CLI/TUI tools because it integrates directly into the browser workflow, enabling one-click downloads without switching windows or opening terminals
rest api server with json request/response protocol
Medium confidence
Exposes XHS-Downloader functionality as a REST API server running on port 5556 with the FastAPI framework. Provides endpoints for work detail retrieval (/xhs/detail), download submission (/xhs/download), and download status polling (/xhs/status). Accepts JSON request bodies with XHS URLs and configuration parameters, processes requests asynchronously using a task queue, and returns JSON responses with download status, file paths, and error messages. Supports CORS for cross-origin requests from web applications.
Implements FastAPI-based REST API with asynchronous task queue and CORS support, rather than simple HTTP server, enabling concurrent request handling and cross-origin web application integration
More scalable than CLI/TUI tools because it supports concurrent requests, enables programmatic integration with web applications, and can be deployed as a containerized microservice
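The submit-then-poll protocol behind /xhs/download and /xhs/status can be modelled with a small in-memory state machine. This is a deliberate simplification: the real server runs tasks asynchronously under FastAPI, and the endpoint names come from the description above while the class and field names here are hypothetical:

```python
import uuid

class DownloadTaskQueue:
    """In-memory sketch of the submit/poll pattern: pending -> completed."""

    def __init__(self):
        self._tasks = {}

    def submit(self, url: str) -> str:
        """Register a download request and return its polling handle."""
        task_id = uuid.uuid4().hex
        self._tasks[task_id] = {"url": url, "status": "pending", "files": []}
        return task_id

    def finish(self, task_id: str, files: list) -> None:
        """Mark a task completed with its resulting file paths."""
        self._tasks[task_id].update(status="completed", files=files)

    def status(self, task_id: str) -> dict:
        """Return task state; unknown IDs answer with status 'unknown'."""
        return self._tasks.get(task_id, {"status": "unknown"})
```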
model context protocol (mcp) server integration for ai assistants
Medium confidence
Exposes XHS-Downloader as an MCP Server running on port 5556, enabling AI assistants (Claude, ChatGPT with plugins, etc.) to call XHS-Downloader functions as tools. Implements MCP protocol handlers for work extraction, download submission, and status checking. AI assistants can invoke these tools within their reasoning loops, enabling autonomous content extraction and download workflows orchestrated by the AI model.
Implements MCP Server protocol handlers that expose XHS-Downloader as callable tools for AI assistants, enabling autonomous content extraction within AI reasoning loops, rather than requiring manual user invocation
Enables AI-driven automation that CLI/TUI/API tools cannot achieve; AI assistants can autonomously decide when to extract content, analyze results, and adapt workflows based on outcomes
Capabilities are decomposed by AI analysis. Each maps to specific user intents and improves with match feedback.
Related Artifacts (sharing capabilities)
Artifacts that share capabilities with XHS-Downloader, ranked by overlap. Discovered automatically through the match graph.
xiaohongshu-mcp
MCP for xiaohongshu.com
img2dataset
Easily turn a set of image urls to an image dataset
Agent-Reach
Give your AI agent eyes to see the entire internet. Read & search Twitter, Reddit, YouTube, GitHub, Bilibili, XiaoHongShu — one CLI, zero API fees.
local-deep-research
Local Deep Research achieves ~95% on SimpleQA benchmark (tested with GPT-4.1-mini). Supports local and cloud LLMs (Ollama, Google, Anthropic, ...). Searches 10+ sources - arXiv, PubMed, web, and your private documents. Everything Local & Encrypted.
mcp-smart-crawler
A command-line tool acting as an MCP (ModelContextProtocol) server, using Playwright to crawl web content for AI models.
Best For
- ✓Developers building content aggregation tools targeting XHS
- ✓Data analysts collecting social media metrics from Chinese platforms
- ✓Teams building content backup or migration pipelines from XHS to other platforms
- ✓Content creators archiving their own XHS posts
- ✓Researchers collecting media datasets and analyzing content trends from XHS
Known Limitations
- ⚠Requires valid XHS session cookies; authentication fails if cookies expire or are revoked
- ⚠Rate-limited by XHS servers; batch extraction of 100+ URLs may trigger temporary IP blocks
- ⚠Cannot extract private/deleted posts or content from suspended accounts
- ⚠Metadata structure may change with XHS platform updates, requiring code maintenance
- ⚠Video codec conversion adds 30-120 seconds per video depending on resolution and duration
- ⚠XHS CDN may serve region-locked content; downloads may fail from certain geographic locations
Repository Details
Last commit: Apr 21, 2026