mcp-standardized browser control via stdio transport
Exposes Chrome DevTools capabilities through the Model Context Protocol (MCP) over the stdio transport, so AI agents can invoke browser operations as structured tool calls. The server uses a single-threaded execution model with mutex-based synchronization: concurrent tool invocations run one at a time, which prevents race conditions and keeps browser state transitions deterministic. Each tool declares a typed parameter schema in the standard MCP format, which the agent's model fills in from the user's intent, and responses are returned as compact JSON to limit token usage for the LLM.
Unique: Implements MCP as the primary integration layer rather than REST/WebSocket APIs, with Mutex-based single-threaded execution ensuring deterministic state management across concurrent agent requests. Directly exposes Chrome DevTools Protocol (CDP) capabilities through standardized MCP tool schemas, eliminating custom integration code per AI platform.
vs alternatives: Provides agent-agnostic browser control through the MCP standard (vs Puppeteer's Node.js-only SDK or Playwright's language-specific bindings), so MCP clients such as Claude, Gemini, and Cursor can use it without platform-specific adapters.
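To make the mutex-serialized execution model concrete, here is a minimal Python sketch. It is not the server's code: the `ToolServer` class and the tool names `navigate` and `click` are hypothetical, and the browser work is replaced by a short sleep. The request shape follows MCP's JSON-RPC 2.0 `tools/call` method, and the framing follows the stdio convention of one JSON message per line.

```python
import asyncio
import json


class ToolServer:
    """Toy stand-in for an MCP server that runs one tool call at a time."""

    def __init__(self):
        self._lock = asyncio.Lock()  # the mutex guarding browser state
        self.log = []                # order in which calls start and end

    async def call_tool(self, request):
        name = request["params"]["name"]
        async with self._lock:       # later callers wait here, in FIFO order
            self.log.append("start:" + name)
            await asyncio.sleep(0.01)  # placeholder for real browser work
            self.log.append("end:" + name)
        return {
            "jsonrpc": "2.0",
            "id": request["id"],
            "result": {"content": [{"type": "text", "text": name + " done"}]},
        }


def make_request(request_id, tool, arguments):
    """Build a JSON-RPC 2.0 `tools/call` request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


def encode_stdio(message):
    """Frame one message for stdio: a single line of JSON plus a newline."""
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


async def demo():
    server = ToolServer()
    requests = [
        make_request(1, "navigate", {"url": "https://example.com"}),  # hypothetical tool
        make_request(2, "click", {"selector": "#submit"}),            # hypothetical tool
    ]
    # Both calls are submitted concurrently; the lock forces them to run in turn.
    await asyncio.gather(*(server.call_tool(r) for r in requests))
    return server.log
```

Running `asyncio.run(demo())` shows that the second call does not start until the first has finished, even though both were submitted together.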
multi-strategy browser connection and lifecycle management
Supports three browser connection strategies, selected via CLI arguments: launching a new instance, auto-connecting to an existing instance, and connecting over the HTTP debug protocol. Lifecycle management is automatic and covers headless mode, isolated profiles, and custom user data directories. The ensureBrowserLaunched() and ensureBrowserConnected() methods handle connection establishment, validation, and recovery without requiring manual browser startup. The connection strategy is determined at server initialization and persists for the server's lifetime, which supports both managed and unmanaged browser scenarios.
Unique: Implements three distinct connection strategies (launch, auto-connect, HTTP debug) as first-class patterns rather than ad-hoc options, with automatic discovery of existing Chrome instances via user data directory scanning. Decouples browser lifecycle from MCP server lifecycle, enabling both managed (server launches browser) and unmanaged (server attaches to existing) scenarios.
vs alternatives: Offers more flexible connection strategies than Puppeteer's default launch-only approach, and provides auto-discovery of existing Chrome instances without requiring manual URL configuration, reducing setup friction for agent developers.
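The sketch below illustrates how a connection strategy might be chosen once from CLI arguments and then held fixed. Every flag name here (`--browser-url`, `--auto-connect`, `--headless`, `--isolated`, `--user-data-dir`) is an assumption made for the example, not the server's documented CLI, and `ConnectionPlan` is a hypothetical type.

```python
import argparse
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)  # frozen: the plan cannot change after startup
class ConnectionPlan:
    strategy: str  # "launch", "auto_connect", or "http_debug"
    browser_url: Optional[str] = None
    headless: bool = False
    isolated: bool = False
    user_data_dir: Optional[str] = None


def build_parser():
    # All flag names are hypothetical and chosen only for this example.
    parser = argparse.ArgumentParser(prog="browser-mcp-sketch")
    parser.add_argument("--browser-url")
    parser.add_argument("--auto-connect", action="store_true")
    parser.add_argument("--headless", action="store_true")
    parser.add_argument("--isolated", action="store_true")
    parser.add_argument("--user-data-dir")
    return parser


def plan_connection(argv):
    """Pick exactly one strategy; an explicit URL wins over auto-connect."""
    args = build_parser().parse_args(argv)
    if args.browser_url:
        # Unmanaged: attach to a browser that exposes a debug endpoint.
        return ConnectionPlan("http_debug", browser_url=args.browser_url)
    if args.auto_connect:
        # Unmanaged: look for a running browser via its user data directory.
        return ConnectionPlan("auto_connect", user_data_dir=args.user_data_dir)
    # Managed: the server launches and owns the browser.
    return ConnectionPlan(
        "launch",
        headless=args.headless,
        isolated=args.isolated,
        user_data_dir=args.user_data_dir,
    )
```

For example, `plan_connection(["--headless", "--isolated"])` yields a `launch` plan, while passing `--browser-url` yields an `http_debug` plan regardless of the other flags.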
cookie and storage management
Reads, sets, and deletes cookies, localStorage, and sessionStorage for the current page and its domain. The system uses the Chrome DevTools Protocol's Storage domain for cookies and the Runtime domain for web storage (localStorage and sessionStorage), which it reads and writes by evaluating script in the page. Neither web storage area is purely in-memory: localStorage persists across browser sessions, while sessionStorage lasts for the lifetime of the tab. Storage operations are scoped to the current page's origin, preventing cross-origin access. This enables agents to manage authentication state, test storage-dependent behavior, and clear state between test cases.
Unique: Provides unified access to cookies, localStorage, and sessionStorage via the Chrome DevTools Protocol, so agents can manage all storage types without calling separate APIs or writing custom JavaScript themselves.
vs alternatives: Offers transparent storage management (vs Puppeteer's JavaScript-based localStorage access), enabling agents to set cookies and manage session state without custom code, improving reliability for authentication-dependent workflows.
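As an illustration of the two CDP paths described for storage, the sketch below only builds the protocol messages; it does not open a connection or send anything. `Storage.getCookies`, `Storage.setCookies`, `Storage.clearCookies`, and `Runtime.evaluate` are CDP methods, while the helper names and the choice to route web storage through `Runtime.evaluate` are assumptions based on the description above.

```python
import itertools
import json

_ids = itertools.count(1)  # CDP commands carry a caller-chosen numeric id


def cdp(method, params=None):
    """Build one CDP command message (not sent anywhere in this sketch)."""
    return {"id": next(_ids), "method": method, "params": params or {}}


def get_cookies():
    return cdp("Storage.getCookies")


def set_cookie(name, value, domain, path="/"):
    cookie = {"name": name, "value": value, "domain": domain, "path": path}
    return cdp("Storage.setCookies", {"cookies": [cookie]})


def clear_cookies():
    return cdp("Storage.clearCookies")


def web_storage_set(area, key, value):
    """Write to localStorage or sessionStorage by evaluating script in the page."""
    if area not in ("localStorage", "sessionStorage"):
        raise ValueError("unknown storage area: " + area)
    # json.dumps yields a safely quoted JavaScript string literal.
    expression = "{}.setItem({}, {})".format(area, json.dumps(key), json.dumps(value))
    return cdp("Runtime.evaluate", {"expression": expression, "returnByValue": True})


def web_storage_get(area, key):
    """Read one key; the page evaluates the expression in its own origin."""
    if area not in ("localStorage", "sessionStorage"):
        raise ValueError("unknown storage area: " + area)
    expression = "{}.getItem({})".format(area, json.dumps(key))
    return cdp("Runtime.evaluate", {"expression": expression, "returnByValue": True})
```

Because the web storage expressions run inside the page, they see only the storage of the page's own origin, which matches the scoping described above.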
viewport and scroll management
Manages viewport size, scroll position, and page dimensions. The system uses Chrome DevTools Protocol's Emulation domain to set viewport size and the Runtime domain to control scroll position via window.scrollTo(). Viewport changes trigger page reflow and may affect responsive design behavior. Scroll operations enable agents to access content below the fold and verify lazy-loading behavior.
Unique: Provides both viewport resizing (via Emulation domain) and scroll control (via Runtime domain) in a single tool, enabling agents to manage page dimensions and scroll position without separate API calls.
vs alternatives: Offers viewport resizing as a tool call (vs Puppeteer's setViewport, which is applied per page in script), enabling agents to test responsive design across breakpoints, although persistent multi-viewport testing requires separate server instances.
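The following sketch shows what the two underlying CDP commands could look like, plus a small helper for walking a long page one viewport at a time. `Emulation.setDeviceMetricsOverride` and `Runtime.evaluate` are CDP methods; the helper names and `scroll_offsets` are hypothetical, and nothing here talks to a browser.

```python
def set_viewport(width, height, device_scale_factor=1.0, mobile=False):
    """Build the CDP command that overrides the viewport size."""
    if width <= 0 or height <= 0:
        raise ValueError("viewport dimensions must be positive")
    return {
        "method": "Emulation.setDeviceMetricsOverride",
        "params": {
            "width": width,
            "height": height,
            "deviceScaleFactor": device_scale_factor,
            "mobile": mobile,
        },
    }


def scroll_to(x, y):
    """Build the CDP command that scrolls by evaluating window.scrollTo()."""
    expression = "window.scrollTo({}, {})".format(int(x), int(y))
    return {"method": "Runtime.evaluate", "params": {"expression": expression}}


def scroll_offsets(page_height, viewport_height):
    """Y offsets that cover the page one viewport at a time, ending at the bottom."""
    if viewport_height <= 0:
        raise ValueError("viewport height must be positive")
    max_y = max(page_height - viewport_height, 0)  # furthest scrollable offset
    offsets = list(range(0, max_y, viewport_height))
    offsets.append(max_y)  # always include the final position
    return offsets
```

An agent checking lazy-loaded content could issue `scroll_to(0, y)` for each `y` in `scroll_offsets(page_height, viewport_height)` and take a snapshot after each step.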
wait and synchronization primitives
Provides blocking wait operations for page state changes (navigation, element visibility, network idle, custom conditions). The system uses the Chrome DevTools Protocol's Page and Network domains to detect state changes, with configurable timeouts and polling intervals. A wait operation blocks the agent until the condition is met or the timeout is exceeded, so agents can synchronize with asynchronous page behavior without writing their own polling logic.
Unique: Provides multiple wait primitives (navigation, element, networkIdle, custom) via Chrome DevTools Protocol, enabling agents to synchronize with different types of page state changes without custom polling logic.
vs alternatives: Exposes navigation, element, network-idle, and custom-expression waits as tool calls, giving agents the kind of synchronization that Puppeteer provides in script through waitForNavigation, waitForSelector, waitForNetworkIdle, and waitForFunction, without explicit polling code.
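A wait primitive with a timeout and polling interval can be sketched as below. This is an illustration under stated assumptions, not the server's implementation: `wait_for`, `WaitTimeout`, and `NetworkIdleTracker` are hypothetical names, and the idle rule (no in-flight requests for a quiet period) is one common definition of network idle. The event names are those of the CDP Network domain.

```python
import time


class WaitTimeout(Exception):
    """Raised when a condition is not met before the deadline."""


def wait_for(condition, timeout_s=5.0, interval_s=0.05,
             clock=time.monotonic, sleep=time.sleep):
    """Poll `condition` until it returns True; return the elapsed seconds."""
    start = clock()
    deadline = start + timeout_s
    while True:
        if condition():
            return clock() - start
        if clock() >= deadline:
            raise WaitTimeout("condition not met within {}s".format(timeout_s))
        sleep(interval_s)


class NetworkIdleTracker:
    """Tracks in-flight requests from CDP Network events."""

    def __init__(self, quiet_s=0.5, clock=time.monotonic):
        self._inflight = set()
        self._quiet_s = quiet_s
        self._clock = clock
        self._last_activity = clock()

    def on_event(self, method, request_id):
        if method == "Network.requestWillBeSent":
            self._inflight.add(request_id)
        elif method in ("Network.loadingFinished", "Network.loadingFailed"):
            self._inflight.discard(request_id)
        self._last_activity = self._clock()  # any event resets the quiet timer

    def is_idle(self):
        quiet_for = self._clock() - self._last_activity
        return not self._inflight and quiet_for >= self._quiet_s
```

Passing `tracker.is_idle` as the condition to `wait_for` gives a network-idle wait; the `clock` and `sleep` parameters exist so the logic can be exercised with a fake clock.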
error handling and state recovery
Implements graceful error handling for failed operations (selector resolution, navigation timeouts, network errors) with detailed error messages and recovery suggestions. The system catches exceptions from Chrome DevTools Protocol operations and returns structured error responses with error type, message, and context. Failed operations do not crash the server or corrupt browser state, enabling agents to handle errors and retry with different approaches.
Unique: Implements structured error handling with detailed error types and recovery context, enabling agents to understand failure reasons and retry with different approaches, rather than generic exception propagation.
vs alternatives: Provides more detailed error information than Puppeteer's exception handling (includes error type, context, recovery suggestions), enabling agents to implement intelligent retry logic and error recovery strategies.
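One way to picture the structured error responses described for this capability is the sketch below. The exception classes, the suggestion texts, and the payload field names (`errorType`, `context`, `suggestion`) are assumptions made for illustration; only the outer result shape, a `content` list plus an `isError` flag, follows the MCP tool result format.

```python
import json


class SelectorNotFound(Exception):
    """Hypothetical error: no element matched the selector."""


class NavigationTimeout(Exception):
    """Hypothetical error: the page did not finish loading in time."""


# Hypothetical recovery hints, keyed by exception class name.
SUGGESTIONS = {
    "SelectorNotFound": "Take a fresh snapshot and pick an element that exists.",
    "NavigationTimeout": "Retry with a longer timeout or check the URL.",
}


def run_tool(tool, operation, **context):
    """Run `operation`; convert any exception into a structured tool result."""
    try:
        value = operation()
    except Exception as exc:  # the server keeps running after a failed call
        error_type = type(exc).__name__
        payload = {
            "tool": tool,
            "errorType": error_type,
            "message": str(exc),
            "context": context,
            "suggestion": SUGGESTIONS.get(error_type, "Inspect the page state and retry."),
        }
        return {"isError": True, "content": [{"type": "text", "text": json.dumps(payload)}]}
    return {"isError": False, "content": [{"type": "text", "text": json.dumps(value)}]}
```

An agent receiving a result with `isError` set can parse the text payload, read the error type and suggestion, and decide whether to retry with different arguments.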
accessibility snapshot capture and dom state extraction
Captures structured accessibility trees and DOM snapshots from the current page, extracting semantic information about interactive elements, text content, and page structure in a format optimized for LLM reasoning. The system uses the Chrome DevTools Protocol's Accessibility domain to build a tree representation of the page, filtering for user-visible elements and computing bounding boxes for spatial reasoning. Snapshots are serialized as JSON with element IDs, roles, labels, and coordinates, enabling agents to understand page structure without visual rendering.
Unique: Leverages the Chrome DevTools Protocol's Accessibility domain to extract semantic trees rather than parsing raw HTML or screenshots, providing structured element metadata (roles, labels, coordinates) optimized for LLM reasoning without visual processing overhead.
vs alternatives: Provides semantic accessibility information (vs Puppeteer's raw DOM queries or Playwright's visual locators), enabling agents to reason about page structure without screenshots or visual analysis, reducing token consumption and improving reasoning accuracy.
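The filtering step can be sketched on data shaped like the nodes returned by the CDP command `Accessibility.getFullAXTree`, where each node carries `nodeId`, `ignored`, and `role`/`name` objects with a `value` field. The sample nodes, the `INTERACTIVE_ROLES` set, and the output keys are made up for the example, and bounding boxes are left out because they would need a separate lookup.

```python
# Roles kept even when the node has no accessible name (illustrative choice).
INTERACTIVE_ROLES = {"button", "link", "textbox", "checkbox", "combobox"}


def summarize_ax_tree(nodes):
    """Reduce raw accessibility nodes to a compact list an LLM can read."""
    summary = []
    for node in nodes:
        if node.get("ignored"):
            continue  # not exposed to assistive technology
        role = node.get("role", {}).get("value", "")
        name = node.get("name", {}).get("value", "")
        if not name and role not in INTERACTIVE_ROLES:
            continue  # unnamed structural nodes add tokens but little meaning
        summary.append({"uid": node["nodeId"], "role": role, "name": name})
    return summary


SAMPLE_NODES = [  # invented data in the shape of CDP AXNode objects
    {"nodeId": "1", "ignored": False,
     "role": {"type": "role", "value": "heading"},
     "name": {"type": "computedString", "value": "Checkout"}},
    {"nodeId": "2", "ignored": True,
     "role": {"type": "role", "value": "generic"}},
    {"nodeId": "3", "ignored": False,
     "role": {"type": "role", "value": "generic"},
     "name": {"type": "computedString", "value": ""}},
    {"nodeId": "4", "ignored": False,
     "role": {"type": "role", "value": "button"},
     "name": {"type": "computedString", "value": "Pay now"}},
    {"nodeId": "5", "ignored": False,
     "role": {"type": "role", "value": "textbox"},
     "name": {"type": "computedString", "value": ""}},
]
```

Calling `summarize_ax_tree(SAMPLE_NODES)` keeps the heading, the button, and the unnamed textbox, and drops the ignored and unnamed generic nodes.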
performance tracing and metrics analysis with devtools integration
Captures Chrome DevTools performance traces (CPU, memory, network, rendering) and analyzes them using chrome-devtools-frontend components to extract high-level metrics such as Largest Contentful Paint (LCP), First Input Delay (FID), Cumulative Layout Shift (CLS), and memory usage. The system records traces during page load or user interactions, then parses the trace data to compute performance insights without requiring external APM tools. Traces are formatted as structured JSON with timeline events, metric summaries, and bottleneck identification for agent-driven performance optimization.
Unique: Integrates chrome-devtools-frontend for trace analysis rather than relying on raw CDP trace data, enabling high-level metric extraction (LCP, FID, CLS) and bottleneck identification without custom parsing logic. Provides token-optimized summaries of trace data for LLM consumption.
vs alternatives: Offers deeper performance insights than the basic timing data usually read through Puppeteer (for example, the browser's Navigation Timing API), and provides structured metric extraction without external APM tools or cloud dependencies, enabling offline performance analysis.
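To show the idea of turning raw trace events into a single metric, the sketch below derives an LCP value from a list of events. It is a simplified stand-in for the chrome-devtools-frontend analysis, not a reproduction of it. It assumes the event names `navigationStart` and `largestContentfulPaint::Candidate` as they commonly appear in Chrome traces, relies on trace timestamps (`ts`) being in microseconds, and uses invented sample data.

```python
def lcp_ms(trace_events):
    """Return LCP in milliseconds relative to navigation start, or None."""
    nav_starts = [e["ts"] for e in trace_events if e.get("name") == "navigationStart"]
    candidates = [e["ts"] for e in trace_events
                  if e.get("name") == "largestContentfulPaint::Candidate"]
    if not nav_starts or not candidates:
        return None
    start = min(nav_starts)  # simplification: treat the trace as one navigation
    after_start = [ts for ts in candidates if ts >= start]
    if not after_start:
        return None
    # The last candidate emitted is the final LCP for the page load.
    return (max(after_start) - start) / 1000.0  # microseconds to milliseconds


SAMPLE_TRACE = [  # invented events; real traces contain many more fields
    {"name": "navigationStart", "ts": 1_000_000},
    {"name": "firstContentfulPaint", "ts": 1_600_000},
    {"name": "largestContentfulPaint::Candidate", "ts": 1_800_000},
    {"name": "largestContentfulPaint::Candidate", "ts": 2_450_000},
]
```

A production analysis also has to separate frames and multiple navigations, which this sketch deliberately ignores.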