datagouv-mcp
MCP Server · Free
Official data.gouv.fr Model Context Protocol (MCP) server that allows AI chatbots to search, explore, and analyze datasets from the French national Open Data platform, directly through conversation.
Capabilities (12 decomposed)
keyword-based dataset discovery via catalog search
Medium confidence
Exposes the data.gouv.fr API v1 GET /1/datasets/ endpoint through an MCP tool that accepts free-text search queries and returns paginated dataset metadata (title, description, organization, tags, update frequency). Implements client-side pagination and result ranking to surface the most relevant datasets from France's national open data catalog without requiring users to manually navigate the web interface.
Directly wraps data.gouv.fr's native search API through MCP protocol, enabling conversational dataset discovery without web scraping or custom indexing — the server acts as a thin, read-only proxy that preserves the platform's native ranking and filtering logic.
Unlike generic web search or manual catalog browsing, this provides structured, ranked results from the authoritative French government data platform with guaranteed freshness and official metadata.
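The search capability described above can be sketched roughly as follows. This is a minimal illustration, not the server's actual code: the base URL, the `q`/`page`/`page_size` parameter names, and the response field names (`data`, `organization`, `tags`) are assumptions about the data.gouv.fr API v1, and both function names are hypothetical.

```python
from urllib.parse import urlencode

# Assumed base URL for the data.gouv.fr API v1; adjust for your environment.
API_BASE = "https://www.data.gouv.fr/api/1"


def build_search_url(query: str, page: int = 1, page_size: int = 20) -> str:
    """Build a dataset search URL for a free-text query (hypothetical helper)."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    params = {"q": query, "page": page, "page_size": page_size}
    return f"{API_BASE}/datasets/?{urlencode(params)}"


def summarize_results(payload: dict) -> list[dict]:
    """Keep only the fields an agent needs from one page of search results."""
    summaries = []
    for item in payload.get("data", []):
        # Some datasets are published by individuals, so the organization may be null.
        org = item.get("organization") or {}
        summaries.append({
            "id": item.get("id"),
            "title": item.get("title"),
            "organization": org.get("name"),
            "tags": item.get("tags", []),
        })
    return summaries
```

Trimming each result to a handful of fields keeps the tool output small, which matters when the response is fed back into a chat context.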
full-dataset metadata retrieval with resource inventory
Medium confidence
Fetches complete metadata for a single dataset by ID from data.gouv.fr API v1 GET /1/datasets/{id}/, returning title, description, organization, tags, creation/update timestamps, license, and a complete inventory of all associated resources (files). Uses a single API call per dataset to avoid N+1 queries and provides structured output suitable for downstream resource selection or analysis planning.
Provides a single atomic call to retrieve complete dataset context including all resources, avoiding the need for separate API calls per resource and enabling AI agents to make informed decisions about which files to query or download.
More efficient than iterating through individual resource endpoints; returns the full dataset graph in one call, reducing latency and simplifying agent planning logic compared to sequential resource lookups.
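As a rough sketch of how a resource inventory could be derived from a single dataset payload, the snippet below flattens the embedded resource list. The field names (`resources`, `id`, `title`, `format`, `url`) are assumptions about the response shape, and the helper names are hypothetical.

```python
def resource_inventory(dataset: dict) -> list[dict]:
    """Flatten the resources embedded in one dataset payload (assumed shape)."""
    inventory = []
    for res in dataset.get("resources", []):
        inventory.append({
            "id": res.get("id"),
            "title": res.get("title"),
            # Normalize the format so later comparisons are case-insensitive.
            "format": (res.get("format") or "").lower(),
            "url": res.get("url"),
        })
    return inventory


def resources_with_format(inventory: list[dict], fmt: str) -> list[dict]:
    """Select resources of one format, e.g. to pick CSV files for querying."""
    wanted = fmt.lower()
    return [res for res in inventory if res["format"] == wanted]
```

Because the inventory comes from the one dataset response, an agent can choose a file without issuing a further request per resource.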
docker containerization and cloud-ready deployment
Medium confidence
Provides a Dockerfile and Docker Compose configuration for containerized deployment, enabling the MCP server to run in Kubernetes, Docker Swarm, or any container orchestration platform. The container exposes port 8000 (HTTP) and includes health check configuration (GET /health endpoint) for orchestrator integration. Supports environment variable configuration for API endpoints, logging levels, and other runtime parameters, enabling deployment across development, staging, and production environments without code changes.
Provides production-ready Docker configuration with health check integration and environment variable support, enabling seamless deployment to any container orchestration platform without modification — the server is stateless and horizontally scalable.
Ready-to-deploy container image reduces operational overhead compared to manual installation; stateless design enables horizontal scaling and zero-downtime updates.
environment-driven configuration and multi-instance deployment
Medium confidence
Centralizes all runtime configuration (API endpoints, logging levels, server port, CORS settings, etc.) in environment variables, enabling the same Docker image or Python process to run in different environments without code changes. Configuration is loaded at startup via a dedicated configuration module that validates and provides defaults. Supports multi-instance deployments where each instance can be configured independently via environment variables, enabling load-balanced and highly-available setups.
Uses environment variables for all configuration, enabling the same codebase and Docker image to run in any environment without modification — this is a cloud-native best practice (12-factor app methodology).
Simpler and more portable than configuration files or hardcoded settings; integrates seamlessly with container orchestration platforms (Kubernetes, Docker Swarm) that manage environment variables.
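A configuration module of the kind described above might look like the sketch below. The variable names `DATAGOUV_API_BASE_URL`, `MCP_PORT`, and `LOG_LEVEL` are hypothetical, chosen only for illustration; the default port 8000 follows the listing, and the default API URL is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

VALID_LOG_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}


@dataclass(frozen=True)
class Settings:
    api_base_url: str
    port: int
    log_level: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from environment variables, validating and applying defaults.

    All variable names here are hypothetical, not the server's real ones.
    """
    source = os.environ if env is None else env

    port_raw = source.get("MCP_PORT", "8000")
    if not port_raw.isdigit() or not 0 < int(port_raw) < 65536:
        raise ValueError(f"MCP_PORT must be a valid TCP port, got {port_raw!r}")

    log_level = source.get("LOG_LEVEL", "INFO").upper()
    if log_level not in VALID_LOG_LEVELS:
        raise ValueError(f"LOG_LEVEL must be one of {sorted(VALID_LOG_LEVELS)}")

    base_url = source.get("DATAGOUV_API_BASE_URL", "https://www.data.gouv.fr/api")
    return Settings(api_base_url=base_url.rstrip("/"), port=int(port_raw), log_level=log_level)
```

Accepting an explicit mapping makes the loader easy to test without touching the real process environment, and failing at startup on a bad value avoids a half-configured instance behind a load balancer.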
resource-level metadata and tabular api availability detection
Medium confidence
Queries data.gouv.fr API v2 GET /2/datasets/resources/{id}/ to retrieve detailed metadata for a single file/resource, including format (CSV, XLSX, JSON, etc.), file size, MIME type, and critically, whether the resource supports the Tabular API (a data.gouv.fr feature enabling row-level querying without full download). Returns structured metadata that allows agents to decide between streaming/parsing (for unsupported formats) or direct tabular queries (for supported formats).
Explicitly surfaces Tabular API availability as a first-class capability, enabling agents to make intelligent routing decisions between direct querying and download-then-parse workflows — this is unique to data.gouv.fr's architecture and not exposed by generic data APIs.
Provides format-aware capability detection that generic file metadata APIs lack; allows agents to optimize for latency and bandwidth by choosing the most efficient access pattern per resource.
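The routing decision described above could be expressed as a small function like the one below. The metadata key `tabular_api_available` is a hypothetical name for whatever flag the server derives from the resource metadata, and the set of tabular formats simply mirrors the CSV/XLSX formats mentioned in this listing.

```python
# Formats this listing says the Tabular API can serve.
TABULAR_FORMATS = {"csv", "xlsx"}


def choose_access_path(resource: dict) -> str:
    """Return 'tabular_query' or 'download_and_parse' for one resource.

    `tabular_api_available` is a hypothetical key, used here for illustration.
    """
    fmt = (resource.get("format") or "").lower()
    if resource.get("tabular_api_available") and fmt in TABULAR_FORMATS:
        return "tabular_query"
    return "download_and_parse"
```

Defaulting to the download path when the flag is missing keeps the agent functional for resources whose metadata is incomplete.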
paginated row-level data querying via tabular api
Medium confidence
Executes structured queries against CSV and XLSX resources using data.gouv.fr's Tabular API, supporting row filtering, column selection, sorting, and pagination. Implements client-side parameter validation and result streaming to handle large datasets within practical limits (respects data.gouv.fr rate limits and payload size constraints). Queries are executed without downloading the entire file, enabling efficient exploration of large datasets within a single conversation turn.
Leverages data.gouv.fr's native Tabular API to enable server-side filtering and pagination without full file download, reducing bandwidth and latency compared to download-then-filter approaches — the MCP server translates natural query parameters into Tabular API calls.
More efficient than downloading entire CSV files for exploration; supports server-side filtering and pagination that generic file download APIs do not provide, enabling interactive data exploration at scale.
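To make the translation from query parameters to a Tabular API call concrete, here is a hedged sketch. The base URL, the `/resources/{id}/data/` path, the `column__exact` and `column__sort` parameter conventions, and the page-size cap are all assumptions about the Tabular API rather than confirmed details of this server; check them against the official documentation before relying on them.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumed values, not taken from the server's source.
TABULAR_BASE = "https://tabular-api.data.gouv.fr/api"
MAX_PAGE_SIZE = 50  # hypothetical client-side cap


def build_tabular_url(
    resource_id: str,
    filters: Optional[dict] = None,
    sort_by: Optional[str] = None,
    descending: bool = False,
    page: int = 1,
    page_size: int = 20,
) -> str:
    """Validate parameters client-side and build a row-level query URL."""
    if page < 1:
        raise ValueError("page must be >= 1")
    if not 1 <= page_size <= MAX_PAGE_SIZE:
        raise ValueError(f"page_size must be between 1 and {MAX_PAGE_SIZE}")

    params = [("page", page), ("page_size", page_size)]
    # Exact-match filters, one query parameter per column.
    for column, value in (filters or {}).items():
        params.append((f"{column}__exact", value))
    if sort_by:
        params.append((f"{sort_by}__sort", "desc" if descending else "asc"))
    return f"{TABULAR_BASE}/resources/{resource_id}/data/?{urlencode(params)}"
```

Validating `page` and `page_size` before sending the request surfaces mistakes as clear errors to the agent instead of opaque HTTP failures.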
large-file streaming and format-agnostic parsing
Medium confidence
Downloads and parses CSV, XLSX, JSON, and other resource formats that do not support the Tabular API, streaming the file to avoid memory exhaustion and applying format-specific parsers (csv.DictReader for CSV, openpyxl for XLSX, json.load for JSON). Implements chunked reading and result truncation to respect practical limits on response size within MCP protocol constraints. Enables agents to access data from any format without requiring external download tools.
Implements streaming and chunked parsing to handle large files without loading entire datasets into memory, with format-specific parsers (csv.DictReader, openpyxl, json.load) that preserve data types and structure — this is distinct from naive download-and-parse approaches that fail on large files.
Supports format-agnostic parsing with streaming to handle files larger than available memory; more robust than generic HTTP download tools because it applies format-specific parsing logic and respects MCP payload constraints.
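For the CSV case, streaming with truncation can be sketched using only the standard library. The function below consumes any iterable of text lines (for example, lines decoded from a streamed HTTP response) and stops after a row limit; the function name, the `max_rows` default, and the shape of the returned dictionary are illustrative assumptions.

```python
import csv
from itertools import islice
from typing import Iterable


def parse_csv_rows(lines: Iterable[str], max_rows: int = 100) -> dict:
    """Parse at most `max_rows` rows lazily, without reading the whole file.

    Note that csv.DictReader yields every value as a string; any type
    conversion would have to happen in a later step.
    """
    reader = csv.DictReader(lines)
    rows = list(islice(reader, max_rows))
    # Peek one more row to report whether the output was cut short.
    truncated = next(reader, None) is not None
    return {
        "columns": list(reader.fieldnames or []),
        "rows": rows,
        "truncated": truncated,
    }
```

Because `csv.DictReader` pulls lines on demand, only the rows actually returned (plus one lookahead row) are ever parsed, regardless of file size.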
dataservice discovery and metadata retrieval
Medium confidence
Queries data.gouv.fr's dataservice catalog (API endpoints, web services, and data APIs exposed by organizations) via dedicated MCP tools that search and retrieve dataservice metadata. Enables agents to discover and understand available APIs and services without manual catalog browsing, returning service descriptions, endpoints, and usage documentation. Complements dataset discovery by surfacing programmatic access methods.
Exposes data.gouv.fr's dataservice catalog as a first-class MCP tool, enabling agents to discover and reason about APIs and web services in addition to static datasets — most data discovery tools focus only on datasets and ignore programmatic access methods.
Provides unified discovery of both datasets and dataservices through a single MCP interface, whereas typical data portals require separate browsing for static files vs. APIs.
platform metrics and usage statistics retrieval
Medium confidence
Exposes a dedicated MCP tool that retrieves aggregate metrics about the data.gouv.fr platform, including total dataset count, organization count, resource count, and platform-wide usage statistics. Provides agents with context about the scale and activity of the platform without requiring manual inspection of the website. Useful for generating summaries or understanding data availability trends.
Provides platform-level metrics as a dedicated MCP tool, enabling agents to contextualize individual datasets within the broader ecosystem — most data discovery tools do not expose platform statistics.
Allows agents to generate informed summaries about data availability without requiring external analytics queries or manual website inspection.
mcp protocol bridging via streamable http transport
Medium confidence
Implements the Model Context Protocol (MCP) server specification using FastMCP framework with Streamable HTTP transport (POST /mcp endpoint), enabling any MCP-compatible client (ChatGPT, Claude Desktop, Gemini, Cursor, VS Code, etc.) to invoke data.gouv.fr tools through a standardized interface. Handles MCP request/response serialization, tool schema registration, and error handling transparently. The server exposes exactly two HTTP endpoints: /mcp for tool invocation and /health for liveness probes.
Implements MCP server using FastMCP framework with Streamable HTTP transport, providing a lightweight, stateless bridge between any MCP-compatible client and data.gouv.fr APIs — the three-layer architecture (main.py → tools/ → helpers/) ensures clean separation between protocol handling, tool logic, and API client code.
Standardizes on MCP protocol rather than building custom integrations for each client; enables any MCP-compatible tool (ChatGPT, Claude, Cursor, etc.) to access data.gouv.fr without client-specific code.
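For readers unfamiliar with what travels over the /mcp endpoint, the sketch below builds the JSON-RPC 2.0 message an MCP client sends to invoke a tool. The `tools/call` method and the `name`/`arguments` parameters come from the MCP specification; the tool name `search_datasets` and its `query` argument are hypothetical, since this listing does not name the server's tools.

```python
import json
from itertools import count

# Monotonic request ids, as JSON-RPC requires a unique id per in-flight request.
_request_ids = count(1)


def tool_call_request(tool_name: str, arguments: dict) -> dict:
    """Build the JSON-RPC 2.0 body for an MCP `tools/call` request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# `search_datasets` is a hypothetical tool name used only for illustration.
body = json.dumps(tool_call_request("search_datasets", {"query": "qualité de l'air"}))
```

In practice a client would POST this body to /mcp after completing the protocol's initialization handshake; MCP client libraries handle both steps, so hand-built requests are mainly useful for debugging.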
three-layer architecture with strict dependency isolation
Medium confidence
Organizes the codebase into three strictly separated layers with one-way dependency flow: main.py (entry point and ASGI middleware) → tools/ (MCP tool implementations) → helpers/ (external API clients). Each layer has a single responsibility: entry point creates the FastMCP instance and runs uvicorn; tools implement MCP tool logic; helpers wrap external HTTP APIs with typed async functions. This architecture enables independent testing, easy addition of new tools, and clear separation between protocol handling and business logic.
Enforces strict one-way dependency flow (main.py → tools/ → helpers/) with no circular imports or cross-layer coupling, making the codebase highly modular and testable — this is a deliberate architectural choice documented in .cursor/rules/developer-notes.md.
Cleaner and more maintainable than monolithic or loosely-coupled architectures; enables independent testing of tools and API clients without mocking the entire MCP layer.
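The one-way dependency flow can be illustrated in a single file, with each layer reduced to one function. Everything here is a simplified stand-in: the helper returns canned data instead of making an HTTP call, the function names are hypothetical, and the real entry point would create a FastMCP instance and run uvicorn rather than print a result.

```python
import asyncio
from typing import Awaitable, Callable


# helpers/ layer: wraps an external API behind a typed async function.
async def fetch_dataset(dataset_id: str) -> dict:
    """Stand-in for an async HTTP call; returns canned data in this sketch."""
    await asyncio.sleep(0)
    return {"id": dataset_id, "title": "Example dataset", "resources": []}


# tools/ layer: depends only on helpers and formats output for the client.
async def get_dataset_info(
    dataset_id: str,
    fetch: Callable[[str], Awaitable[dict]] = fetch_dataset,
) -> str:
    data = await fetch(dataset_id)
    return f"{data['title']} ({data['id']}): {len(data['resources'])} resource(s)"


# main.py layer: depends only on tools; here it just runs one tool.
def main() -> None:
    print(asyncio.run(get_dataset_info("abc123")))
```

Since the tool only sees the helper through a function signature, a test can pass a fake `fetch` and exercise the tool logic without any network access or MCP machinery.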
centralized logging and observability
Medium confidence
Implements a single shared logger named 'datagouv_mcp' created in main.py and referenced by name across all modules, providing centralized log aggregation and debugging. Logs are emitted at appropriate levels (DEBUG for API calls, INFO for tool invocations, ERROR for failures) and can be redirected to stdout, files, or external logging services via standard Python logging configuration. Enables operators to monitor server health and troubleshoot issues without instrumenting individual modules.
Uses a single named logger ('datagouv_mcp') created at startup and referenced by name across all modules, enabling centralized configuration and aggregation without dependency injection or global state — this is a standard Python logging pattern that simplifies configuration.
Simpler and more maintainable than per-module loggers or global logging state; integrates seamlessly with standard Python logging infrastructure and external logging services.
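The shared-logger pattern relies on the fact that `logging.getLogger(name)` always returns the same object for a given name. A minimal sketch follows, in which the logger name matches the listing but the function names and log format are assumptions.

```python
import logging

LOGGER_NAME = "datagouv_mcp"


def configure_logging(level: str = "INFO") -> logging.Logger:
    """Configure the shared logger once at startup (as main.py would)."""
    logger = logging.getLogger(LOGGER_NAME)
    logger.setLevel(level.upper())
    # Guard against attaching duplicate handlers if called more than once.
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger


def tool_logger() -> logging.Logger:
    """What any other module does: look the logger up by name, no imports from main."""
    return logging.getLogger(LOGGER_NAME)
```

Modules in the tools and helpers layers can therefore log without importing anything from the entry point, which keeps the dependency flow one-way.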
Capabilities are decomposed by AI analysis. Each maps to specific user intents and improves with match feedback.
Related Artifacts (sharing capabilities)
Artifacts that share capabilities with datagouv-mcp, ranked by overlap. Discovered automatically through the match graph.
Wand Enterprise
Revolutionize business with AI-driven collaboration and data...
OpenMetadata
OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.
pesoz
Dataset by Kthera. 5,82,735 downloads.
Qatalog
Centralize real-time data access, enhance decision-making...
resona
Semantic embeddings and vector search - find concepts that resonate
Supervisely
Enterprise computer vision platform for teams.
Best For
- ✓Data analysts building research queries in AI chatbots
- ✓Non-technical users exploring French open data programmatically
- ✓Teams building data discovery workflows into LLM agents
- ✓Data engineers evaluating dataset fitness before integration
- ✓Researchers verifying dataset provenance and licensing
- ✓LLM agents planning multi-step data workflows
- ✓DevOps teams deploying the server to cloud platforms
- ✓Organizations running containerized AI infrastructure
Known Limitations
- ⚠Search is keyword-based only — no semantic/vector search across dataset descriptions
- ⚠Results are limited to data.gouv.fr catalog; does not federate with other European open data portals
- ⚠Pagination is client-side; large result sets require multiple sequential API calls
- ⚠Returns metadata only — does not preview actual data rows or sample values
- ⚠Resource list includes URLs but does not validate whether files are accessible or up-to-date
- ⚠No support for filtering or sorting resources within a dataset
UnfragileRank
UnfragileRank is computed from adoption signals, documentation quality, ecosystem connectivity, match graph feedback, and freshness. No artifact can pay for a higher rank.
Repository Details
Last commit: Apr 20, 2026