codebase-memory-mcp
MCP Server · Free
High-performance code intelligence MCP server. Indexes codebases into a persistent knowledge graph — an average repo in milliseconds. 66 languages, sub-ms queries, 99% fewer tokens. Single static binary, zero dependencies.
Capabilities (15 decomposed)
multi-language ast parsing and entity extraction with tree-sitter
Medium confidence
Parses source code in 66 languages using tree-sitter grammar bindings (vendored C components) to extract structural entities: function/method definitions, class hierarchies, variable declarations, imports, and type annotations. The parsing engine operates as the first pass in a 7-pass indexing pipeline, converting raw source text into an intermediate AST representation that feeds downstream semantic analysis. Uses tree-sitter's incremental parsing to avoid re-parsing unchanged file regions during incremental reindexing.
Uses vendored tree-sitter C bindings compiled into a single static binary, enabling 66-language support without external dependencies or grammar downloads. Integrates incremental parsing to avoid re-parsing unchanged regions during content-hash-based reindexing, achieving ~4× faster incremental updates than full-scan approaches.
Supports 66 languages in a single binary with zero external dependencies, whereas LSP-based approaches require per-language server installations, and regex-based tools are limited to 5-10 languages with poor structural accuracy.
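The first pass is implemented in C against vendored tree-sitter grammars. As an illustrative sketch of that kind of structural extraction (not the actual implementation), Python's built-in `ast` module can pull the same entity kinds out of Python source; the sample class and functions below are made up:

```python
import ast

source = """
class Greeter:
    def greet(self, name):
        return f"hello {name}"

def main():
    print(Greeter().greet("world"))
"""

# Walk the syntax tree and record (kind, name, line) for each definition,
# roughly what Pass 1 of the indexing pipeline would store as graph nodes.
entities = []
for node in ast.walk(ast.parse(source)):
    if isinstance(node, ast.ClassDef):
        entities.append(("class", node.name, node.lineno))
    elif isinstance(node, ast.FunctionDef):
        entities.append(("function", node.name, node.lineno))

print(entities)
```

Tree-sitter does the same job language-agnostically (and incrementally), which is what makes a single 66-language pass possible.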
persistent sqlite knowledge graph with cypher query engine
Medium confidence
Builds and maintains a queryable knowledge graph stored in SQLite WAL mode at ~/.cache/codebase-memory-mcp/codebase-memory.db. The graph schema models code entities (functions, classes, modules) as nodes and relationships (calls, inheritance, imports, type references) as edges. Exposes a Cypher query engine (src/store/store.c) for graph traversal, enabling sub-millisecond queries for structural patterns like 'find all callers of function X' or 'trace inheritance chain for class Y'. Supports incremental updates via content-hash-based change detection — only modified files trigger re-parsing and graph updates.
Implements a Cypher query engine in C within a single static binary, achieving sub-millisecond query latency on graphs with thousands of nodes. Uses content-hash-based incremental indexing to detect file changes and update only affected graph regions, enabling ~4× faster re-indexing than full-scan approaches. Stores graph in SQLite WAL mode for ACID compliance and concurrent read access.
Delivers sub-millisecond Cypher queries on local graphs without network latency, whereas cloud-based code intelligence services (GitHub Copilot, Tabnine) incur 100-500ms round-trip latency and require sending code to external servers.
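The listing does not publish the actual schema, but a graph-in-SQLite design typically reduces relationships to rows in an edge table, so a "find all callers" query becomes one indexed lookup. A hypothetical miniature (table layout, relationship names, and function names are all assumptions):

```python
import sqlite3

# Hypothetical edge-table schema: one row per relationship.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE edges (src TEXT, rel TEXT, dst TEXT)")
con.execute("CREATE INDEX idx_rel_dst ON edges (rel, dst)")
con.executemany("INSERT INTO edges VALUES (?, ?, ?)", [
    ("main",       "CALLS", "parse_file"),
    ("index_repo", "CALLS", "parse_file"),
    ("parse_file", "CALLS", "read_bytes"),
])

# Equivalent in spirit to the Cypher query:
#   MATCH (c)-[:CALLS]->(f {name: "parse_file"}) RETURN c
callers = sorted(row[0] for row in con.execute(
    "SELECT src FROM edges WHERE rel = 'CALLS' AND dst = ?", ("parse_file",)))
print(callers)
```

Because the lookup hits a local index rather than a network service, sub-millisecond latency on small-to-medium graphs is plausible.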
community detection and architectural clustering
Medium confidence
Performs community detection on the code graph to identify clusters of related entities (functions, classes, modules) that form logical architectural components. The indexing pipeline (Pass 6) uses graph clustering algorithms to group entities based on call frequency, shared dependencies, and module boundaries. Results are stored in the graph as 'BELONGS_TO_COMMUNITY' relationships, queryable via tools like 'find_communities' and 'find_community_members'. Useful for understanding codebase architecture, identifying tightly coupled components, and visualizing system structure.
Uses graph clustering algorithms on the call graph to automatically identify architectural components without manual configuration or domain knowledge. Results are stored in the graph for efficient querying and visualization.
Automatic community detection requires no manual configuration or domain knowledge, whereas manual architecture documentation is often outdated. Faster and more objective than manual architectural analysis.
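The listing does not name the clustering algorithm, so as a crude stand-in, connected components over an undirected view of the call graph already illustrates how BELONGS_TO_COMMUNITY groupings can fall out of graph structure alone (module names below are invented):

```python
def communities(edges):
    """Group nodes into connected components via union-find.
    A deliberately simple stand-in for real community detection."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)  # merge the two components

    groups = {}
    for node in parent:
        groups.setdefault(find(node), set()).add(node)
    return list(groups.values())

clusters = communities([("auth", "db"), ("db", "pool"), ("ui", "render")])
print(clusters)
```

Production community detection (e.g. modularity-based methods) additionally weights edges by call frequency and shared dependencies, as the description mentions.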
test coverage mapping and test-to-code linking
Medium confidence
Identifies test functions and links them to the code they test by analyzing test file naming conventions, test decorators, and assertion patterns. The indexing pipeline (Pass 7) detects test functions (e.g., functions starting with 'test_', methods in classes ending with 'Test', functions decorated with @test or @pytest.mark) and attempts to link them to the functions they test based on naming patterns and call graph analysis. Results are stored in the graph as 'TESTS' relationships, queryable via tools like 'find_tests_for_function' and 'find_tested_functions'.
Automatically links test functions to code under test using naming patterns and call graph analysis, without requiring explicit test annotations or coverage instrumentation. Works across multiple testing frameworks (pytest, unittest, Jest, Go testing, etc.) in a single indexing pass.
Automatic test linking requires no instrumentation or coverage tools, whereas coverage tools (pytest-cov, Istanbul) require test execution and only measure line coverage. Faster than manual test discovery and works for untested code.
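The naming-convention half of this heuristic is easy to picture. A minimal sketch (call-graph analysis omitted; the function names are made up):

```python
def link_tests_by_name(functions):
    """Link test_<name> functions to <name> when <name> is also defined,
    mirroring the naming-pattern part of the TESTS-relationship heuristic."""
    defined = set(functions)
    links = {}
    for fn in functions:
        target = fn[len("test_"):]
        if fn.startswith("test_") and target in defined:
            links[fn] = target
    return links

links = link_tests_by_name(["parse_file", "test_parse_file", "test_orphan"])
print(links)
```

Note that `test_orphan` links to nothing; the description says call-graph analysis is used as a second signal for such cases.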
file content access and code snippet retrieval
Medium confidence
Provides direct access to source code files and code snippets via tools like 'get_file_content' and 'get_code_snippet'. Supports retrieving entire files or specific line ranges, with optional syntax highlighting and context expansion. Useful for AI agents that need to read actual code after identifying relevant functions via graph queries. Integrates with graph queries to provide seamless navigation from structural queries (find_callers) to actual code inspection.
Provides direct file access integrated with graph queries, enabling seamless navigation from structural queries (find_callers) to actual code inspection. Supports line-range retrieval and context expansion for efficient code reading.
Integrated file access eliminates separate file reading steps and enables efficient context expansion, whereas separate file reading tools require manual path construction and context management.
configuration file and dependency link detection
Medium confidence
Detects references to configuration files, environment variables, and external dependencies by analyzing code patterns, imports, and config file references. The indexing pipeline (Pass 5) identifies config file paths (e.g., 'config.yaml', 'settings.json'), environment variable references (e.g., 'os.getenv("DATABASE_URL")'), and external dependencies (e.g., 'import requests', 'require("express")') and links them to the code that references them. Results are stored in the graph as 'REFERENCES_CONFIG', 'USES_ENV_VAR', and 'DEPENDS_ON' relationships.
Automatically detects configuration file, environment variable, and dependency references using pattern matching and AST analysis, linking them to code locations in the graph. Works across multiple languages and frameworks without requiring explicit annotations.
Automatic detection of config and dependency references requires no manual configuration, whereas dependency analysis tools (npm audit, pip-audit) only check for known vulnerabilities and don't link to code locations. Faster than manual dependency tracking.
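A rough analogue of the pattern-matching side of Pass 5 (the real pass also uses AST analysis; the regexes and sample source here are assumptions, not the server's actual patterns):

```python
import re

# Spot os.getenv("...") references and plain import statements.
ENV_RE = re.compile(r'os\.getenv\(\s*["\']([A-Z0-9_]+)["\']')
IMPORT_RE = re.compile(r'^\s*import\s+(\w+)', re.MULTILINE)

code = '''
import requests
url = os.getenv("API_URL")
token = os.getenv("API_TOKEN", "")
'''

env_vars = ENV_RE.findall(code)   # candidate USES_ENV_VAR edges
deps = IMPORT_RE.findall(code)    # candidate DEPENDS_ON edges
print(env_vars, deps)
```

Each match would become an edge from the referencing code location to a config/env/dependency node in the graph.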
polyglot codebase indexing with language-specific semantics
Medium confidence
Indexes codebases containing multiple programming languages (Python, Go, TypeScript, Rust, Java, C++, C#, Kotlin, Lua, Haskell, OCaml, Swift, Dart, MATLAB, Lean 4, Wolfram, and 48 more) in a single unified indexing pass. Each language is parsed using language-specific tree-sitter grammars, and semantic analysis (call resolution, type inference, HTTP route detection) is adapted to each language's semantics. Results are stored in a unified graph that enables cross-language queries (e.g., 'find all Python functions that call Go functions').
Indexes 66 languages in a single unified graph with language-specific semantic analysis, enabling cross-language queries without separate per-language tools. Each language's semantics (Python type hints, Go explicit types, TypeScript annotations) are respected in a unified indexing pipeline.
Single unified indexing pass for 66 languages eliminates the need for per-language tool setup, whereas LSP-based approaches require separate server configuration for each language. Cross-language queries are impossible with language-specific tools.
7-pass semantic indexing pipeline with call resolution and type inference
Medium confidence
Executes a multi-stage indexing pipeline (src/pipeline/pipeline.c) that progressively enriches the graph: Pass 1 extracts structure (definitions, imports), Pass 2 resolves calls to their definitions, Pass 3 infers types and inheritance, Pass 4 detects HTTP links and routes, Pass 5 identifies config file references, Pass 6 performs community detection (clustering related entities), Pass 7 indexes test coverage. Each pass operates on the graph built by previous passes, enabling sophisticated analyses like 'find all functions that handle HTTP POST requests' or 'identify dead code by tracing reachability from entry points'. Type inference uses language-specific heuristics (e.g., Python type hints, Go explicit types, TypeScript annotations) to build a best-effort type map.
Implements a 7-pass pipeline that progressively enriches the graph with semantic information (calls, types, HTTP routes, communities, tests) in a single indexing run. Each pass operates on the graph state from previous passes, enabling sophisticated cross-cutting analyses without re-parsing. Uses language-specific heuristics for call resolution and type inference, adapting to each language's semantics (Python type hints, Go explicit types, TypeScript annotations).
Provides call resolution and type inference in a single indexing pass without requiring LSP servers or language-specific analysis tools, whereas LSP-based approaches require per-language server setup and multiple round-trips for semantic information.
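The pass-over-a-shared-graph idea can be sketched in a few lines. This is not the real pipeline (which has 7 passes and lives in src/pipeline/pipeline.c); the file contents and pass bodies below are invented to show how each pass consumes the state built by earlier ones:

```python
def pass_structure(files, graph):
    # Pass 1 analogue: register every definition as a node.
    for info in files.values():
        graph["nodes"].update(info["defs"])

def pass_calls(files, graph):
    # Pass 2 analogue: keep only call edges whose endpoints were
    # registered by the previous pass (unresolved calls are dropped).
    for info in files.values():
        for src, dst in info["calls"]:
            if src in graph["nodes"] and dst in graph["nodes"]:
                graph["edges"].add((src, "CALLS", dst))

def run_pipeline(files, passes):
    graph = {"nodes": set(), "edges": set()}
    for p in passes:  # each pass enriches the graph built so far
        p(files, graph)
    return graph

files = {"app.py": {"defs": ["main", "helper"],
                    "calls": [("main", "helper"), ("main", "undefined_fn")]}}
graph = run_pipeline(files, [pass_structure, pass_calls])
print(graph["edges"])
```

The ordering matters: call resolution can only filter against nodes that structure extraction has already produced, which is why the passes form a pipeline rather than independent scans.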
incremental reindexing with content-hash change detection
Medium confidence
Detects file changes using content hashing (comparing SHA-256 hashes of file contents) and re-parses only modified files during incremental reindexing. The file watcher (src/pipeline/pipeline_incremental.c) polls the filesystem with adaptive intervals (5-60 seconds) and triggers incremental re-indexing when changes are detected. This approach avoids full-codebase re-parsing on every change, achieving ~4× faster reindexing than scanning all files. The graph is updated in-place, preserving query performance for unchanged portions of the codebase.
Uses content-hash-based change detection (SHA-256 comparison) instead of filesystem watchers or timestamps, enabling reliable detection of actual code changes without false positives from build artifacts or temporary files. Adaptive polling intervals (5-60s) balance freshness with CPU overhead. Achieves ~4× faster reindexing than full-scan approaches by re-parsing only modified files.
Content-hash detection is more reliable than filesystem timestamps (which can be unreliable across network mounts) and more efficient than full-codebase re-parsing, whereas LSP-based approaches require per-language server integration and may miss cross-language dependencies.
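The core of content-hash change detection is a hash comparison per file, as described above. A minimal sketch (the helper name and in-memory file map are assumptions for illustration):

```python
import hashlib

def changed_files(contents, prev_hashes):
    """Compare SHA-256 content hashes against the previous snapshot.
    `contents` maps path -> raw bytes; returns (changed paths, new hashes)."""
    changed, hashes = [], {}
    for path, data in sorted(contents.items()):
        digest = hashlib.sha256(data).hexdigest()
        hashes[path] = digest
        if prev_hashes.get(path) != digest:
            changed.append(path)  # new or modified: re-parse only this file
    return changed, hashes

prev = {"a.py": hashlib.sha256(b"def f(): pass\n").hexdigest()}
changed, _ = changed_files({"a.py": b"def f(): pass\n",
                            "b.py": b"def g(): pass\n"}, prev)
print(changed)
```

A `touch` that rewrites identical bytes produces the same digest and triggers no re-parse, which is the reliability advantage over timestamp-based detection.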
mcp tool exposure with stdio transport and cli fallback
Medium confidence
Exposes 14 code intelligence tools via the Model Context Protocol (MCP) specification, communicating with MCP clients (Claude Code, Cursor, Windsurf, Gemini CLI, VS Code, Zed) over stdio transport. The MCP server (src/mcp/mcp.c) implements a single-threaded event loop that parses incoming JSON-RPC requests, routes them to tool handlers, and returns structured results. Each tool maps to a specific graph query or code access operation (e.g., 'find_callers', 'find_callees', 'get_file_content'). Additionally, exposes all tools via CLI mode (codebase-memory-mcp cli <tool_name>) for scripting and testing without an MCP client.
Implements MCP server in C with a single-threaded event loop using yyjson for fast JSON parsing, enabling low-latency tool calls from MCP clients. Dual-mode exposure (MCP + CLI) allows integration with AI agents and scripting without requiring separate adapters. Single static binary with zero dependencies simplifies deployment to any MCP-compatible client.
Native MCP integration eliminates the need for custom plugins or adapters, whereas REST API approaches require additional HTTP server infrastructure and introduce network latency. CLI mode enables scripting without MCP client setup, whereas LSP-based approaches require language-specific server configuration.
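MCP tool invocation is JSON-RPC 2.0 over stdio, using the spec's `tools/call` method. The shape of a request a client would write to the server's stdin is sketched below; the argument key `"function"` is a guess, so consult the server's published tool schema for the real parameter names:

```python
import json

# A tools/call request as defined by the MCP specification (JSON-RPC 2.0).
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "find_callers",                    # tool from this server
        "arguments": {"function": "parse_file"},   # key is an assumption
    },
}
frame = json.dumps(request)   # one serialized message on the stdio stream
decoded = json.loads(frame)
print(decoded["params"]["name"])
```

The CLI mode described above wraps the same tool handlers, so scripts can skip the JSON-RPC framing entirely.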
graph visualization and interactive exploration ui
Medium confidence
Provides a web-based graph visualization UI (docs/index.html) that renders the indexed knowledge graph as an interactive node-link diagram. Users can click nodes to expand relationships, search for entities by name, and explore call graphs, inheritance hierarchies, and dependency chains visually. The UI queries the SQLite graph via the MCP tools, enabling real-time exploration without re-indexing. Useful for understanding codebase structure, identifying architectural patterns, and communicating code organization to team members.
Provides a lightweight web-based graph visualization that queries the local SQLite graph via MCP tools, enabling interactive exploration without external services or graph databases. Renders call graphs, inheritance hierarchies, and dependency chains in a single unified interface.
Local graph visualization eliminates dependency on cloud-based visualization services (which require uploading code) and provides instant rendering without network latency, whereas GitHub's dependency graph requires cloud hosting and Graphviz-based tools require manual graph generation.
project-level code intelligence queries (find_callers, find_callees, trace_calls)
Medium confidence
Exposes graph query tools that answer project-level structural questions by traversing the knowledge graph. 'find_callers' returns all functions/methods that call a given function; 'find_callees' returns all functions called by a given function; 'trace_calls' returns the full call path from a source function to a target function. These queries operate on the pre-built graph, returning results in milliseconds without re-parsing. Supports filtering by language, module, or file path to narrow results in large codebases.
Executes call graph queries on the pre-built SQLite graph in sub-millisecond time, returning precise structural results without re-parsing or file I/O. Supports filtering by language, module, and file path to narrow results in polyglot codebases. Handles name collisions by returning all matching functions with their locations.
Sub-millisecond call graph queries are 100-1000× faster than grep-based approaches and more accurate than LSP-based tools, which require per-language server setup. Handles polyglot codebases in a single query, whereas language-specific tools require multiple queries.
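Of the three tools, trace_calls is the least obvious: it is a path search over CALLS edges. A breadth-first sketch that returns a shortest call path (edge data and function names are made up; the server's actual traversal strategy is not documented here):

```python
from collections import deque

def trace_calls(edges, src, dst):
    """Shortest call path from src to dst over directed CALLS edges (BFS)."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
    queue, seen = deque([[src]]), {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in adj.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # dst is not reachable from src

path = trace_calls([("main", "load"), ("load", "parse"), ("parse", "emit")],
                   "main", "emit")
print(path)
```

find_callers and find_callees are the degenerate cases: a single-hop traversal inbound or outbound from one node.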
http route and handler mapping with pattern detection
Medium confidence
Detects HTTP routes and their handler functions by analyzing route definitions (decorators, function calls, config files) across frameworks (Express, FastAPI, Django, Spring, etc.). The indexing pipeline (Pass 4) uses pattern matching and AST analysis to identify route declarations (e.g., @app.route('/api/users'), router.get('/users/:id')) and link them to their handler functions. Supports extracting route parameters, HTTP methods, and middleware chains. Results are stored in the graph and queryable via tools like 'find_routes' and 'find_route_handlers'.
Uses AST analysis and pattern matching to detect HTTP routes across multiple frameworks (Express, FastAPI, Django, Spring, etc.) in a single indexing pass, without requiring framework-specific plugins. Links routes to handler functions in the graph, enabling queries like 'find all handlers for POST /api/users' without manual mapping.
Framework-agnostic route detection works across polyglot codebases without per-framework setup, whereas framework-specific tools (Swagger/OpenAPI generators) require explicit annotations and don't work for undocumented APIs. Faster than manual API exploration or grep-based searching.
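For one framework style, the decorator-pattern half of route detection looks roughly like this (the regex and sample source are assumptions; the real pass uses AST analysis and covers many frameworks):

```python
import re

# Match a Flask-style @app.route decorator and capture the path plus the
# name of the handler function defined immediately after it.
ROUTE_RE = re.compile(
    r"@app\.route\(\s*['\"]([^'\"]+)['\"][^)]*\)\s*def\s+(\w+)")

src = """@app.route('/api/users', methods=['POST'])
def create_user(request):
    pass
"""

routes = ROUTE_RE.findall(src)  # [(path, handler_name), ...]
print(routes)
```

Each (path, handler) pair becomes a route node linked to its handler function in the graph, which is what makes queries like "find all handlers for POST /api/users" answerable.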
type hierarchy and inheritance chain resolution
Medium confidence
Resolves type relationships and inheritance hierarchies by analyzing class definitions, interface implementations, and type annotations. The indexing pipeline (Pass 3) uses language-specific heuristics to identify parent classes, implemented interfaces, and type parameters. Results are stored in the graph as 'INHERITS_FROM' and 'IMPLEMENTS' relationships, queryable via tools like 'find_subclasses', 'find_implementations', and 'trace_inheritance'. Supports both single and multiple inheritance, generic types, and mixins (where applicable).
Resolves type hierarchies across 66 languages using language-specific heuristics (Python type hints, Go explicit types, TypeScript annotations, Java class declarations), storing results in the graph for sub-millisecond queries. Handles single and multiple inheritance, generic types, and mixins without requiring external type checkers.
Language-agnostic type hierarchy resolution works across polyglot codebases in a single query, whereas LSP-based approaches require per-language server setup and TypeScript-specific tools only work for TypeScript. Faster than manual inheritance chain exploration.
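Once INHERITS_FROM edges are in the graph, trace_inheritance reduces to walking parent pointers. A sketch limited to single inheritance for brevity (class names invented; the indexer also models interfaces and mixins):

```python
def trace_inheritance(parent_of, cls):
    """Walk INHERITS_FROM edges from a class up to the root."""
    chain = [cls]
    while cls in parent_of:
        cls = parent_of[cls]
        chain.append(cls)
    return chain

chain = trace_inheritance({"JsonStore": "Store", "Store": "object"}, "JsonStore")
print(chain)
```

find_subclasses is the reverse traversal: follow INHERITS_FROM edges inbound instead of outbound.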
dead code and reachability analysis
Medium confidence
Identifies unreachable code by performing reachability analysis from entry points (main functions, exported APIs, test entry points). The indexing pipeline uses graph traversal to mark all functions reachable from entry points, identifying functions with no incoming edges as potentially dead code. Supports filtering by module, file path, or function type (private vs. public) to reduce false positives. Results are queryable via tools like 'find_dead_code' and 'find_unreachable_functions'.
Performs reachability analysis on the pre-built call graph by traversing from identified entry points, marking all reachable functions and identifying unreachable ones. Supports filtering by module and function type to reduce false positives from exported APIs and test fixtures.
Graph-based reachability analysis is more accurate than regex-based dead code detection and faster than manual code review. Handles polyglot codebases in a single analysis, whereas language-specific linters require per-language setup.
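The reachability step described above can be sketched as a graph traversal from the entry points, with everything unvisited reported as potentially dead (function names below are made up for illustration):

```python
def find_dead_code(edges, entry_points, all_functions):
    """Mark everything reachable from entry points over CALLS edges;
    whatever remains is reported as potentially dead."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, []).append(b)
    reachable, stack = set(entry_points), list(entry_points)
    while stack:
        fn = stack.pop()
        for callee in adj.get(fn, []):
            if callee not in reachable:
                reachable.add(callee)
                stack.append(callee)
    return sorted(set(all_functions) - reachable)

dead = find_dead_code(
    edges=[("main", "serve"), ("serve", "render"), ("legacy", "render")],
    entry_points=["main"],
    all_functions=["main", "serve", "render", "legacy", "unused_util"],
)
print(dead)
```

Note that `legacy` is flagged even though it calls reachable code; only reachability *from entry points* counts, which is exactly why entry-point selection (exported APIs, test entry points) drives the false-positive rate.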
Capabilities are decomposed by AI analysis. Each maps to specific user intents and improves with match feedback.
Related Artifacts (sharing capabilities)
Artifacts that share capabilities with codebase-memory-mcp, ranked by overlap. Discovered automatically through the match graph.
code-review-graph
Local knowledge graph for Claude Code. Builds a persistent map of your codebase so Claude reads only what matters — 6.8× fewer tokens on reviews and up to 49× on daily coding tasks.
CodeGraphContext
An MCP server plus a CLI tool that indexes local code into a graph database to provide context to AI assistants.
Scaffold
Scaffold is a Retrieval-Augmented Generation (RAG) system designed for structural understanding of large codebases. It transforms your source code into a living knowledge graph, allowing for precise, context-aware interactions that go far beyond simple file retrieval.
code-index-mcp
A Model Context Protocol (MCP) server that helps large language models index, search, and analyze code repositories with minimal setup
Repo Map
🐧 🪟 🍎 - An MCP server (and command-line tool) to provide a dynamic map of chat-related files from the repository with their function prototypes and related files in order of relevance. Based on the "Repo Map" functionality in Aider.chat
Sourcerer
MCP for semantic code search & navigation that reduces token waste
Best For
- ✓AI agents analyzing polyglot codebases (Python + Go + TypeScript stacks)
- ✓Teams using Claude Code, Cursor, or Windsurf that need language-agnostic code intelligence
- ✓Developers building code analysis tools that require structural understanding across 66+ languages
- ✓AI agents that need repeated structural queries on the same codebase
- ✓Teams using Claude Code or Cursor that want sub-millisecond response times for code intelligence
- ✓Developers building impact analysis tools (what breaks if I change this function?)
- ✓Architects and tech leads understanding codebase organization
- ✓Teams performing large-scale refactoring and needing to understand component boundaries
Known Limitations
- ⚠Tree-sitter grammars may have edge cases with non-standard or legacy syntax variants
- ⚠Parsing latency scales with file size; very large files (>100KB) may add milliseconds per file
- ⚠Type inference is best-effort and language-dependent — dynamically typed languages (Python, JavaScript) have less precise type information than statically typed ones
- ⚠Graph schema is optimized for structural queries; semantic queries (what does this code do?) still require LLM analysis
- ⚠SQLite WAL mode adds ~5-10MB overhead per indexed codebase; very large graphs (>1GB) may experience slower traversal
- ⚠Cypher query engine is read-only; no mutation support for dynamic graph updates during agent execution
Repository Details
Last commit: Apr 18, 2026