Semgrep CLI
CLI Tool
Free
AI-powered static analysis for security.
Capabilities
13 decomposed
pattern-based code vulnerability detection across 30+ languages
Medium confidence
Semgrep-core (the OCaml engine) performs AST-based pattern matching against user-defined or curated rules to identify security vulnerabilities, code anti-patterns, and compliance violations. The engine parses source code into abstract syntax trees using tree-sitter and custom parsers, then matches patterns written in Semgrep's YAML-based rule syntax against the AST structure. Matching on structure rather than on raw text reduces false positives and keeps rule behavior consistent across languages.
Uses tree-sitter-based AST parsing with language-specific custom parsers for 30+ languages, enabling structural pattern matching that understands code semantics (function scope, variable binding, control flow) rather than relying on regex or token-based matching. The hybrid Python-OCaml architecture delegates computationally intensive matching to OCaml while maintaining a flexible Python CLI for workflow orchestration.
More accurate than regex-based scanning because it matches against AST structure rather than raw text; more flexible than signature-based scanners because rules can express complex syntactic patterns; lighter-weight than deep-analysis commercial tools (Coverity, Fortify) while still catching many real vulnerabilities.
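The difference between text matching and structural matching can be sketched in a few lines of Python. This is an analogy only: it uses Python's standard `ast` module as a stand-in for Semgrep's OCaml engine, and the helper names `regex_hits` and `ast_hits` are hypothetical.

```python
import ast
import re

SOURCE = '''\
# eval(user_input) is mentioned in this comment only
message = "never call eval(x) on input"
result = eval(user_input)
'''

def regex_hits(source, name):
    """Line numbers where the text 'name(' appears anywhere, including comments and strings."""
    pattern = re.compile(rf"\b{re.escape(name)}\(")
    return [i for i, line in enumerate(source.splitlines(), 1) if pattern.search(line)]

def ast_hits(source, name):
    """Line numbers where 'name' is actually called, according to the parsed syntax tree."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == name
    )

print(regex_hits(SOURCE, "eval"))  # comment, string, and real call all match
print(ast_hits(SOURCE, "eval"))    # only the real call matches
```

The regex reports three lines while the tree walk reports one, which is the kind of false-positive reduction the structural approach is meant to provide.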
dataflow and taint analysis for cross-function vulnerability chaining
Medium confidence
Semgrep's taint analysis detects vulnerability chains where untrusted input reaches a dangerous sink; tracking data flow across function boundaries requires the Pro Engine. The system constructs a dataflow graph by analyzing variable assignments, function parameters, return values, and object field mutations across the codebase. It identifies sources (user input, external data), sinks (SQL queries, command execution, file writes), and sanitizers (validation functions) to determine whether tainted data can reach a dangerous operation without proper sanitization.
Implements interprocedural taint analysis by constructing a dataflow graph from AST analysis, tracking variable bindings and function call chains to determine whether untrusted data can reach dangerous sinks. According to Semgrep's published figures, the Pro Engine reduces false positives by about 25% and increases true positives by about 250% compared to single-function pattern matching, because it confirms actual reachability rather than just pattern presence.
More precise than pattern-only matching (which flags all SQL queries regardless of input source) and faster than full symbolic execution tools because it uses lightweight dataflow analysis rather than constraint solving.
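The source, sink, and sanitizer idea can be illustrated with a deliberately small Python sketch. It handles only straight-line, single-function code and is far simpler than interprocedural analysis; the sets `SOURCES`, `SANITIZERS`, and `SINKS` and all function names are assumptions made for the example, not Semgrep internals.

```python
import ast

SOURCES = {"input"}      # calls whose result is treated as untrusted
SANITIZERS = {"quote"}   # e.g. shlex.quote
SINKS = {"system"}       # e.g. os.system

def call_name(node):
    """Trailing name of a call target: f(...) or module.f(...)."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute):
        return func.attr
    return None

def is_tainted(expr, tainted):
    """True if the expression may carry untrusted data."""
    if isinstance(expr, ast.Call):
        name = call_name(expr)
        if name in SANITIZERS:
            return False
        if name in SOURCES:
            return True
        return any(is_tainted(arg, tainted) for arg in expr.args)
    if isinstance(expr, ast.Name):
        return expr.id in tainted
    return any(is_tainted(child, tainted) for child in ast.iter_child_nodes(expr))

def find_tainted_sinks(source):
    """Line numbers of sink calls that receive tainted arguments."""
    tainted, findings = set(), []
    for stmt in ast.parse(source).body:  # top-level statements, in order
        if isinstance(stmt, ast.Assign) and isinstance(stmt.targets[0], ast.Name):
            target = stmt.targets[0].id
            if is_tainted(stmt.value, tainted):
                tainted.add(target)
            else:
                tainted.discard(target)
        for node in ast.walk(stmt):
            if isinstance(node, ast.Call) and call_name(node) in SINKS:
                if any(is_tainted(arg, tainted) for arg in node.args):
                    findings.append(node.lineno)
    return findings

CODE = '''\
import os, shlex
name = input()
safe = shlex.quote(name)
os.system("echo " + safe)
os.system("echo " + name)
'''
print(find_tainted_sinks(CODE))
```

Only the last `os.system` call is reported, because the earlier one receives the sanitized value. A pattern-only rule would flag both calls.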
language-specific parser support with graceful error handling
Medium confidence
Semgrep includes language-specific parsers (built on tree-sitter and custom OCaml implementations) for 30+ programming languages. Each parser converts source code into an AST that the pattern matching engine can analyze. The system implements graceful error handling where parse errors in individual files do not stop the scan; instead, errors are logged and scanning continues on other files. This enables Semgrep to scan heterogeneous codebases with mixed languages and syntax variations without failing on unparseable code.
Implements language-specific parsers using tree-sitter (for most languages) and custom OCaml implementations (for performance-critical languages), with graceful error handling that allows scanning to continue even if individual files fail to parse. This architecture enables Semgrep to support 30+ languages without requiring language-specific scanning tools.
More comprehensive language support than language-specific tools (like Pylint for Python or ESLint for JavaScript) because it handles multiple languages in a single tool; more robust than regex-based tools because it parses code into AST structure.
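The "log the error and keep scanning" behavior can be sketched as follows. Python's `ast.parse` stands in for a language parser here, and `parse_all` is a hypothetical helper rather than anything in Semgrep.

```python
import ast

def parse_all(sources):
    """Parse every file; collect parse errors instead of aborting the run."""
    parsed, errors = {}, []
    for path, text in sources.items():
        try:
            parsed[path] = ast.parse(text, filename=path)
        except SyntaxError as exc:
            # Record the failure and continue with the remaining files.
            errors.append((path, exc.lineno, exc.msg))
    return parsed, errors

sources = {
    "ok.py": "x = 1\n",
    "broken.py": "def f(:\n",
    "also_ok.py": "y = 2\n",
}
parsed, errors = parse_all(sources)
print(sorted(parsed))             # files that parsed successfully
print([e[0] for e in errors])     # files that were skipped
```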
mcp (model context protocol) server for ai-assisted code analysis
Medium confidence
Semgrep includes an MCP server implementation that exposes scanning capabilities to AI models and LLM-based tools. The MCP server allows AI assistants to invoke Semgrep scans, retrieve findings, and analyze code patterns programmatically. This enables integration with AI-powered code review tools, automated remediation assistants, and LLM-based security analysis workflows. The server implements standard MCP protocols for tool invocation and result streaming.
Implements an MCP server that exposes Semgrep scanning capabilities to AI models and LLM-based tools, enabling integration with AI-powered code review and remediation workflows. The server implements standard MCP protocols for tool invocation, allowing AI assistants to invoke Semgrep scans and analyze findings programmatically.
Enables AI-assisted code analysis by exposing Semgrep as an MCP tool; more integrated than separate AI and scanning tools because findings are directly available to AI models for reasoning and remediation.
token and position tracking for precise finding location reporting
Medium confidence
Semgrep's OCaml engine tracks token positions and source locations during AST parsing and pattern matching, enabling precise reporting of finding locations (file, line, column, character offset). The system maintains a mapping between AST nodes and their source positions, allowing findings to be reported with exact character ranges. This enables IDE integration, inline code comments, and precise highlighting in web interfaces. The position tracking is implemented at the parser level and maintained through the entire analysis pipeline.
Maintains token and position tracking throughout the OCaml analysis pipeline, enabling precise character-level location reporting for findings. This architecture enables IDE integration, inline code highlighting, and automated remediation by providing exact token ranges rather than just line numbers.
More precise than tools reporting only line numbers because it provides character offsets; enables better IDE integration and automated fixes because exact token ranges are available.
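The relationship between line/column positions and character offsets can be shown with a short sketch. It assumes 1-based lines and columns and 0-based offsets; the helper names are hypothetical and the code is not taken from Semgrep.

```python
import bisect

def line_starts(text):
    """Offsets at which each line begins."""
    starts = [0]
    for i, ch in enumerate(text):
        if ch == "\n":
            starts.append(i + 1)
    return starts

def to_offset(text, line, col):
    """Convert a 1-based (line, col) pair to a 0-based character offset."""
    return line_starts(text)[line - 1] + (col - 1)

def to_line_col(text, offset):
    """Convert a 0-based character offset back to a 1-based (line, col) pair."""
    starts = line_starts(text)
    line = bisect.bisect_right(starts, offset)
    return line, offset - starts[line - 1] + 1

text = "import os\nos.system(cmd)\n"
start = to_offset(text, 2, 1)
end = to_offset(text, 2, 10)
print(text[start:end])           # exact matched range
print(to_line_col(text, start))
```

With both representations available, an editor can highlight the exact token range and an autofix can replace `text[start:end]` without touching the rest of the line.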
multi-language rule definition and custom rule authoring
Medium confidence
Semgrep provides a YAML-based domain-specific language (DSL) for expressing code patterns that work across multiple programming languages. Rules are defined in YAML with a pattern syntax that abstracts away many language-specific details (e.g., a call pattern such as `foo(...)` can match in Python, JavaScript, and Go when the rule lists those languages). The pysemgrep CLI parses rule files, validates syntax, and passes compiled rules to semgrep-core for matching. Users can write custom rules targeting their codebase, organization standards, or specific vulnerability patterns without modifying the core engine.
Provides a language-agnostic YAML-based DSL that abstracts away language-specific syntax details, allowing a single rule to match equivalent patterns across Python, JavaScript, Go, Java, and 25+ other languages. Rules are compiled to an intermediate representation that semgrep-core interprets, enabling rapid rule iteration without recompiling the core engine.
More accessible than writing custom checkers in OCaml or C++ (as required by Clang Static Analyzer or Coverity) and more expressive than regex-based tools because rules can reference AST structure and semantic relationships.
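For orientation, the shape of a simple rule can be written out as a Python dictionary mirroring the YAML fields (`id`, `pattern`, `message`, `languages`, `severity`). The checker below is a hypothetical, much-simplified stand-in for the validation the CLI performs; it ignores taint-mode rules and other advanced forms.

```python
REQUIRED_KEYS = {"id", "message", "severity", "languages"}
PATTERN_KEYS = {"pattern", "patterns", "pattern-either", "pattern-regex"}

def rule_problems(rule):
    """Return a list of problems found in a rule mapping (empty if none)."""
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - set(rule))]
    if not PATTERN_KEYS & set(rule):
        problems.append("missing a pattern operator")
    return problems

# Equivalent to one entry under the top-level `rules:` list of a YAML rule file.
NO_EVAL_RULE = {
    "id": "no-eval",
    "pattern": "eval(...)",          # `...` matches any arguments
    "message": "Avoid eval() on dynamic input.",
    "languages": ["python", "javascript"],
    "severity": "ERROR",
}

print(rule_problems(NO_EVAL_RULE))
print(rule_problems({"id": "incomplete"}))
```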
ci/cd pipeline integration with policy enforcement and finding triage
Medium confidence
The `semgrep ci` command integrates Semgrep into CI/CD workflows by scanning code, uploading findings to semgrep.dev, comparing against baseline scans, and enforcing organization-wide policies. The Python CLI (pysemgrep) orchestrates the workflow: it authenticates to Semgrep AppSec Platform using API tokens, fetches organization-specific rules and policies, runs the OCaml scanning engine, and reports results. The system can block CI builds based on policy rules (e.g., 'fail if critical vulnerabilities detected'), automatically triage findings based on organization rules, and track finding status across commits.
Implements a hybrid local-remote workflow where the OCaml scanning engine runs locally while policy management and finding triage happen server-side via the semgrep.dev API. Because scanning runs locally and findings rather than the source tree are uploaded, organizations can manage policies centrally while limiting how much code leaves the build environment. The system tracks finding status across commits, enabling developers to see remediation progress.
More flexible than relying only on a hosting platform's built-in code scanning because it supports custom rules and cross-language patterns; more integrated than standalone SAST tools because it provides built-in CI/CD orchestration and finding management.
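The build-blocking decision reduces to mapping findings to an exit code. A minimal sketch, assuming each finding carries a `severity` field and that the policy blocks on a configurable set of severities (both assumptions for illustration; real policies are managed on the platform side):

```python
BLOCKING_SEVERITIES = {"ERROR"}  # hypothetical policy: block only on ERROR findings

def exit_code(findings, blocking=BLOCKING_SEVERITIES):
    """Return 1 (fail the build) if any finding has a blocking severity, else 0."""
    return 1 if any(f["severity"] in blocking for f in findings) else 0

findings = [
    {"check_id": "no-eval", "severity": "ERROR"},
    {"check_id": "todo-comment", "severity": "INFO"},
]
print(exit_code(findings))                       # blocking finding present
print(exit_code([f for f in findings if f["severity"] == "INFO"]))
```

A CI job would pass this value to `sys.exit` so that the pipeline step fails when the policy is violated.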
incremental scanning with baseline comparison and delta reporting
Medium confidence
Semgrep supports incremental scanning mode where it compares current scan results against a baseline (previous commit or main branch) to report only new or changed findings. The Python CLI manages baseline storage and comparison logic: it fetches the previous scan's JSON output, compares rule matches by file path and line number, and reports only findings that are new, moved, or changed in severity. This reduces noise in CI/CD by surfacing only actionable changes rather than all findings in the codebase.
Implements baseline comparison at the Python CLI layer by storing and comparing JSON scan results, enabling incremental reporting without requiring the OCaml engine to maintain state. This design allows flexible baseline sources (local files, semgrep.dev API, git history) while keeping the core scanning engine stateless.
Simpler than tools requiring full codebase re-analysis (like some SAST tools) because it compares results rather than re-running analysis; more practical than git-diff-based filtering because it handles line number shifts and can detect moved findings.
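One way to compare two result sets while tolerating line-number shifts is to fingerprint each finding by rule, path, and matched text rather than by line number. The sketch below assumes a flattened finding record with `check_id`, `path`, `line`, and `lines` keys; that shape and the function names are illustrative, not Semgrep's actual comparison logic.

```python
def fingerprint(finding):
    """Identity of a finding that survives line-number shifts."""
    return (finding["check_id"], finding["path"], finding["lines"].strip())

def new_findings(baseline, current):
    """Findings present in the current scan but not in the baseline."""
    seen = {fingerprint(f) for f in baseline}
    return [f for f in current if fingerprint(f) not in seen]

baseline = [
    {"check_id": "no-eval", "path": "app.py", "line": 3, "lines": "result = eval(x)"},
]
current = [
    # Same finding, pushed down four lines and re-indented by an unrelated edit.
    {"check_id": "no-eval", "path": "app.py", "line": 7, "lines": "    result = eval(x)"},
    {"check_id": "no-exec", "path": "app.py", "line": 9, "lines": "exec(y)"},
]
print([f["check_id"] for f in new_findings(baseline, current)])
```

Only the `no-exec` finding is reported, since the moved `no-eval` finding has the same fingerprint as its baseline counterpart.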
secrets detection with semantic validation and entropy analysis
Medium confidence
Semgrep includes built-in rules for detecting hardcoded secrets (API keys, passwords, tokens, private keys) using pattern matching combined with entropy analysis and semantic validation. The system matches common secret patterns (e.g., 'aws_access_key_id = ...', 'password: ...') and validates candidates using entropy scoring and format-specific checks (e.g., verifying AWS key format, checking if a string is a valid JWT). This reduces false positives compared to simple regex matching by confirming that detected patterns actually look like valid secrets.
Combines pattern matching with entropy analysis and format-specific validation to reduce false positives in secrets detection. The system uses Semgrep's rule language to express secret patterns (e.g., 'variable assignment with high-entropy value') and validates candidates against known secret formats (AWS key structure, JWT format, RSA key headers), enabling more accurate detection than regex-only tools.
More accurate than simple regex-based tools (like git-secrets) because it validates secret format and entropy; more flexible than signature-based scanners because it can detect custom secret patterns via rule authoring.
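Entropy scoring is usually based on Shannon entropy over the characters of a candidate string. A minimal sketch follows; the length and entropy thresholds in `looks_like_secret` are arbitrary illustrative values, not Semgrep's.

```python
import math
from collections import Counter

def shannon_entropy(value):
    """Shannon entropy of the string, in bits per character."""
    if not value:
        return 0.0
    total = len(value)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(value).values()
    )

def looks_like_secret(value, min_length=20, threshold=3.5):
    """Heuristic: long enough and random-looking enough to be a credential."""
    return len(value) >= min_length and shannon_entropy(value) >= threshold

print(shannon_entropy("abcd"))                                  # four equally likely symbols
print(looks_like_secret("abcdefghijklmnopqrstuvwxyz012345"))    # long and high-entropy
print(looks_like_secret("aaaaaaaaaaaaaaaaaaaaaaaa"))            # long but repetitive
```

Entropy alone still flags hashes and UUIDs, which is why the format-specific checks described above are applied as a second filter.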
supply chain vulnerability scanning with reachability analysis
Medium confidence
Semgrep AppSec Platform includes supply chain scanning that detects vulnerable dependencies and determines if the vulnerability is actually reachable from application code. The system scans dependency manifests (package.json, requirements.txt, go.mod, pom.xml, etc.), identifies known vulnerable versions, and uses taint analysis to determine if the vulnerable function is actually called from application code. This reduces alert fatigue by filtering out vulnerabilities in unused dependencies or unreachable code paths.
Combines dependency scanning with reachability analysis to determine if vulnerable functions are actually called from application code. This two-stage approach reduces false positives by filtering out vulnerabilities in unused dependencies or unreachable code paths, enabling teams to prioritize remediation based on actual risk.
More precise than dependency-only scanners (such as Dependabot) because it performs reachability analysis to confirm actual impact; more integrated than standalone SCA tools because it uses the same OCaml engine and rule infrastructure as code scanning.
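At its simplest, reachability is a graph search from application entry points to the vulnerable function. The sketch below uses a hand-written call graph and breadth-first search; the graph, the function names in it, and `is_reachable` are all hypothetical, and a real analysis would derive the graph from the code.

```python
from collections import deque

def is_reachable(call_graph, entry_points, target):
    """Breadth-first search over caller -> callees edges."""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        function = queue.popleft()
        if function == target:
            return True
        for callee in call_graph.get(function, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return False

call_graph = {
    "app.main": ["app.load_config", "yamlib.safe_load"],
    "app.load_config": ["yamlib.safe_load"],
    "yamlib.safe_load": [],
    "yamlib.unsafe_load": [],  # the vulnerable function, never called by the app
}
print(is_reachable(call_graph, ["app.main"], "yamlib.unsafe_load"))
print(is_reachable(call_graph, ["app.main"], "yamlib.safe_load"))
```

In this example the dependency is installed and vulnerable, but the advisory could be deprioritized because no path leads from `app.main` to the affected function.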
multi-format output and ci/cd tool integration (sarif, json, csv)
Medium confidence
Semgrep outputs findings in multiple formats to integrate with CI/CD tools and reporting systems. The Python CLI supports JSON (for programmatic processing), SARIF (for GitHub Code Scanning, GitLab SAST, Azure DevOps), and human-readable text; CSV for spreadsheet analysis can be derived from the JSON output. The output formatting layer (in pysemgrep) transforms the OCaml engine's internal finding representation into the requested format, including metadata like rule ID, severity, CWE, and remediation guidance.
Implements output formatting at the Python CLI layer, enabling flexible format conversion without modifying the OCaml core engine. The system supports SARIF (standardized for code scanning tools) and JSON (for programmatic processing, including conversion to CSV for reporting), allowing Semgrep to integrate with diverse CI/CD ecosystems.
More flexible than single-format tools because it supports multiple output formats; more standardized than custom JSON schemas because SARIF output enables integration with GitHub Code Scanning and other SARIF-compatible tools.
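Turning JSON findings into a CSV report takes only the standard library. The sketch below assumes the JSON report has a top-level `results` list whose entries carry `check_id`, `path`, `start.line`, and an `extra` object with `severity` and `message`; the sample report is made up for the example and `findings_to_csv` is a hypothetical helper.

```python
import csv
import io
import json

def findings_to_csv(report_json):
    """Flatten a JSON findings report into CSV text."""
    report = json.loads(report_json)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["rule_id", "path", "line", "severity", "message"])
    for result in report.get("results", []):
        extra = result.get("extra", {})
        writer.writerow([
            result["check_id"],
            result["path"],
            result["start"]["line"],
            extra.get("severity", ""),
            extra.get("message", ""),
        ])
    return out.getvalue()

SAMPLE_REPORT = json.dumps({
    "results": [{
        "check_id": "no-eval",
        "path": "app.py",
        "start": {"line": 3, "col": 10, "offset": 42},
        "end": {"line": 3, "col": 26, "offset": 58},
        "extra": {"severity": "ERROR", "message": "Avoid eval"},
    }],
    "errors": [],
})
print(findings_to_csv(SAMPLE_REPORT))
```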
configuration resolution and rule discovery from multiple sources
Medium confidence
Semgrep's configuration resolver (pysemgrep) discovers and loads rules from multiple sources: local .semgrep.yml files, Semgrep Registry (curated rules), organization policies (from semgrep.dev), and command-line arguments. The resolver implements a precedence system where local rules override registry rules, and explicit CLI arguments override all defaults. It validates rule syntax, checks for conflicts, and reports errors if rules cannot be loaded. This enables flexible rule management from ad-hoc local testing to organization-wide policy enforcement.
Implements a multi-source configuration resolver that merges rules from local files, Semgrep Registry, and organization policies with a clear precedence system. The resolver validates rule syntax and reports conflicts, enabling flexible rule management from ad-hoc testing to organization-wide enforcement without requiring code changes.
More flexible than single-source rule systems because it supports local, registry, and organization-level rules; more integrated than external rule management because rules are resolved at CLI runtime rather than requiring separate configuration steps.
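A precedence system of this kind can be modeled as an ordered merge keyed by rule ID, where later sources win. This is a sketch of the general idea under that assumption; `resolve_rules` and the sample rules are hypothetical and do not reflect how pysemgrep actually merges configurations.

```python
def resolve_rules(*sources):
    """Merge rule lists by ID; sources listed later override earlier ones."""
    merged = {}
    for source in sources:
        for rule in source:
            merged[rule["id"]] = rule
    return list(merged.values())

registry_rules = [
    {"id": "no-eval", "severity": "WARNING"},
    {"id": "no-exec", "severity": "ERROR"},
]
local_rules = [
    {"id": "no-eval", "severity": "ERROR"},  # local override of the registry rule
]

# Lowest precedence first: registry, then local files.
for rule in resolve_rules(registry_rules, local_rules):
    print(rule["id"], rule["severity"])
```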
performance optimization with parallel scanning and caching
Medium confidence
Semgrep optimizes scanning performance through parallel file processing and result caching. The OCaml engine processes multiple files concurrently using parallel workers (the job count is configurable with `--jobs`), and the Python CLI implements caching of parse trees and rule compilation results. For large codebases, distributing work across CPU cores lets Semgrep scan thousands of files quickly. The system also supports incremental scanning where only changed files are re-scanned, further reducing overhead in CI/CD workflows.
Combines OCaml-level parallel file processing with Python-level caching of parse trees and rule compilation results. The hybrid architecture enables fast scanning of large codebases by distributing work across CPU cores while maintaining a flexible Python CLI for workflow orchestration and caching management.
Faster than single-threaded SAST tools because it parallelizes file processing; more efficient than tools requiring full re-analysis because it caches parse trees and rule compilation across runs.
Capabilities are decomposed by AI analysis. Each maps to specific user intents and improves with match feedback.
Related Artifacts
sharing capabilities
Artifacts that share capabilities with Semgrep CLI, ranked by overlap. Discovered automatically through the match graph.
UseTusk
AI-powered tool for automated bug detection and smart...
drift
Codebase intelligence for AI. Detects patterns & conventions + remembers decisions across sessions. MCP server for any IDE. Offline CLI.
GitHub Copilot X
AI-powered software developer
Ellipsis
(Previously BitBuilder) "Automated code reviews and bug fixes"
VSGuard
Add proactive OWASP ASVS security guidance to coding AI agents to write secure code from the start. Scan code for cybersecurity vulnerabilities across multiple languages and receive clear findings with remediation steps. Generate secure fixes with ASVS-mapped guidance and ready-to-use examples.
Kwaipilot: KAT-Coder-Pro V2
KAT-Coder-Pro V2 is the latest high-performance model in KwaiKAT’s KAT-Coder series, designed for complex enterprise-grade software engineering and SaaS integration. It builds on the agentic coding strengths of earlier versions,...
Best For
- ✓Security teams conducting code audits and vulnerability assessments
- ✓DevSecOps engineers integrating static analysis into CI/CD pipelines
- ✓Individual developers scanning their own code during development
- ✓Security teams requiring deep vulnerability analysis beyond pattern matching
- ✓Organizations using Semgrep AppSec Platform with Pro Engine subscription
- ✓Teams building custom rules that need to express data dependency relationships
- ✓Polyglot teams using multiple programming languages
- ✓Organizations with legacy code containing syntax variations or non-standard constructs
Known Limitations
- ⚠Community Edition limited to single-function pattern matching; cross-function analysis requires Pro Engine
- ⚠Pattern matching accuracy depends on rule quality; false positives possible with overly broad patterns
- ⚠No semantic understanding of business logic; cannot detect logic flaws or authorization bypass without explicit patterns
- ⚠Performance degrades on very large codebases (>1M LOC) without incremental scanning
- ⚠Cross-function dataflow and taint analysis is a Pro Engine feature; not available in Community Edition
- ⚠Cross-file analysis limited to explicitly imported modules; dynamic imports not fully supported
About
Lightweight static analysis tool for finding bugs, detecting security vulnerabilities, and enforcing code standards. Uses pattern-matching with AI-powered rules across 30+ languages.