faiss-cpu
A library for efficient similarity search and clustering of dense vectors.
Capabilities (12 decomposed)
dense-vector similarity search with multiple index types
Medium confidence: Implements approximate nearest neighbor (ANN) search over dense vector spaces using multiple indexing strategies (flat, IVF, HNSW, PQ) that trade off speed, memory, and accuracy. Quantization and hierarchical clustering techniques enable sub-linear search time on billion-scale datasets without keeping full-precision vectors in memory. Supports both exact and approximate search modes with configurable recall-vs-speed tradeoffs.
Provides a unified C++ API with Python bindings supporting 10+ index types (flat, IVF, HNSW, PQ, OPQ, LSH, etc.) with index selection heuristics, whereas competitors like Annoy or Hnswlib typically specialize in a single index type. Uses product quantization with learned codebooks for extreme compression (e.g., a 128-dimensional float32 vector shrinks from 512 bytes to 8-16 bytes of codes), enabling billion-scale search on commodity hardware.
Faster than Annoy for billion-scale datasets due to IVF partitioning and product quantization; more flexible than Hnswlib which only implements HNSW; more memory-efficient than Milvus for CPU-only deployments since it's a pure library without server overhead.
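A minimal sketch of the core add/search loop, assuming random float32 data (dimensions, sizes, and k are illustrative):

```python
import numpy as np
import faiss

d = 64                                             # vector dimensionality (illustrative)
xb = np.random.rand(10_000, d).astype("float32")   # database vectors
xq = np.random.rand(5, d).astype("float32")        # query vectors

index = faiss.IndexFlatL2(d)             # exact L2 search, no training needed
index.add(xb)                            # store all vectors
D, I = index.search(xq, 4)               # distances and ids of the 4 nearest neighbors
```

Swapping IndexFlatL2 for an approximate index (IVF, HNSW, PQ) keeps the same add/search interface while trading exactness for speed and memory.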
inverted-file index construction with clustering
Medium confidence: Builds IVF (Inverted File) indices by partitioning the vector space into Voronoi cells using k-means clustering, then storing vectors in inverted lists keyed by their nearest cluster centroid. During search, only vectors in the nprobe closest clusters are examined, reducing search cost from O(N) to roughly O(nlist + nprobe·N/nlist): one pass over the centroids plus a scan of nprobe inverted lists. Supports training on a subset of the data and adding vectors incrementally to pre-trained indices.
Implements k-means clustering with Faiss-specific optimizations like batch k-means and GPU-accelerated centroid updates (in GPU version), plus automatic handling of empty clusters and centroid reassignment. Integrates clustering directly into the search index rather than as a separate preprocessing step, enabling joint optimization of cluster quality and search performance.
More efficient than scikit-learn's k-means for large-scale vector clustering because it uses batch updates and avoids dense distance matrix computation; tighter integration with search than standalone clustering libraries, enabling co-optimization of index structure.
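A sketch of the IVF workflow; nlist, nprobe, and the training subset size are illustrative choices, not recommendations:

```python
import numpy as np
import faiss

d, nlist = 64, 100
xb = np.random.rand(100_000, d).astype("float32")

quantizer = faiss.IndexFlatL2(d)                 # holds the k-means centroids
index = faiss.IndexIVFFlat(quantizer, d, nlist)
index.train(xb[:20_000])                         # k-means on a training subset
index.add(xb)                                    # vectors land in inverted lists
index.nprobe = 8                                 # clusters visited per query
D, I = index.search(xb[:5], 4)
```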
range search and threshold-based retrieval
Medium confidence: Retrieves all vectors within a specified distance threshold (radius search) rather than the top-K nearest neighbors. Useful for clustering, outlier detection, and similarity thresholding. Supports both exact and approximate range search with configurable recall tradeoffs.
Supports range search across most index families (flat, IVF, and others) with automatic result collection and threshold-based filtering, in both exact and approximate modes.
More flexible than top-K search for applications with similarity thresholds; enables variable-sized result sets appropriate for clustering and anomaly detection.
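A sketch of range search on a flat index; the radius is an illustrative squared-L2 threshold:

```python
import numpy as np
import faiss

d = 32
xb = np.random.rand(1_000, d).astype("float32")
xq = np.random.rand(3, d).astype("float32")

index = faiss.IndexFlatL2(d)
index.add(xb)
lims, D, I = index.range_search(xq, 2.5)   # all neighbors with squared L2 < 2.5
# hits for query i are I[lims[i]:lims[i+1]]; result counts vary per query
for i in range(len(xq)):
    print(f"query {i}: {lims[i + 1] - lims[i]} hits within radius")
```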
index cloning and copying
Medium confidence: Creates independent copies of trained indices, enabling parallel experimentation or index modification without affecting the original. clone_index produces a deep copy with independent data; where copying is unnecessary, plain Python references provide shared access. Useful for A/B testing different index configurations or maintaining multiple versions.
Provides explicit deep-copy semantics via clone_index, enabling flexible index management strategies without accidental shared state.
More efficient than retraining indices for A/B testing; enables parallel access without external synchronization.
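A sketch of cloning for an A/B parameter test, assuming an IVF index and illustrative sizes:

```python
import numpy as np
import faiss

d = 16
xb = np.random.rand(1_000, d).astype("float32")
quantizer = faiss.IndexFlatL2(d)
a = faiss.IndexIVFFlat(quantizer, d, 16)
a.train(xb)
a.add(xb)

b = faiss.clone_index(a)   # deep, independent copy of the trained index
b.nprobe = 8               # tune the copy for an A/B recall comparison
assert a.nprobe == 1       # the original's default settings are untouched
```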
product-quantization vector compression
Medium confidence: Compresses high-dimensional vectors into compact codes by decomposing the vector space into M subspaces, quantizing each subspace independently to K centroids, and storing only the centroid indices (typically 8-16 bits per subspace). Enables distance computation in compressed space using lookup tables, reducing memory footprint by 10-100x while maintaining approximate search accuracy. Supports both PQ (product quantization) and OPQ (optimized PQ with a learned rotation).
Implements both standard PQ and OPQ (with learned rotation) in a unified API, plus asymmetric distance computation (ADC) where queries remain in float space while database vectors are quantized, improving accuracy. Provides lookup table acceleration for distance computation, enabling 10-100x speedup vs naive quantized distance computation.
More memory-efficient than storing full float32 vectors and faster than post-hoc quantization approaches; OPQ variant outperforms standard PQ by learning optimal subspace decomposition, whereas competitors like Annoy use fixed random projections.
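A sketch of PQ compression, with an optional OPQ rotation composed in front; M and nbits are illustrative:

```python
import numpy as np
import faiss

d, M, nbits = 64, 8, 8                  # 8 subspaces x 8 bits = 8 bytes per vector
xb = np.random.rand(50_000, d).astype("float32")

pq_index = faiss.IndexPQ(d, M, nbits)
pq_index.train(xb)                      # learn per-subspace codebooks
pq_index.add(xb)                        # stored as 8-byte codes vs 256 bytes raw
D, I = pq_index.search(xb[:5], 4)       # asymmetric distances via lookup tables

# OPQ variant: learn a rotation that improves the subspace decomposition
opq = faiss.OPQMatrix(d, M)
opq_index = faiss.IndexPreTransform(opq, faiss.IndexPQ(d, M, nbits))
opq_index.train(xb)                     # trains the rotation and the codebooks
```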
hierarchical-navigable-small-world graph indexing
Medium confidence: Builds HNSW (Hierarchical Navigable Small World) indices by constructing a multi-layer graph where each layer is a navigable small-world network with logarithmic diameter. Search navigates from top layers (sparse, long-range connections) to bottom layers (dense, local connections), achieving O(log N) search complexity. Supports incremental insertion of new vectors without retraining, making it suitable for streaming workloads.
Implements HNSW with Faiss-specific extensions including batch insertion, configurable layer assignment, and composition with other Faiss components (e.g., HNSW+PQ for memory-efficient dynamic indexing). Exposes the efSearch parameter for query-time recall tuning without index reconstruction.
More memory-efficient than Hnswlib (the reference implementation) due to tighter C++ integration; supports composition with quantization (HNSW+PQ) whereas Hnswlib doesn't, enabling billion-scale dynamic indexing on CPU.
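A sketch of HNSW construction and query-time tuning; M and the ef values are illustrative:

```python
import numpy as np
import faiss

d, M = 64, 32                            # M = graph connectivity per node
xb = np.random.rand(50_000, d).astype("float32")

index = faiss.IndexHNSWFlat(d, M)        # no separate training phase
index.hnsw.efConstruction = 200          # build-time quality/speed tradeoff
index.add(xb)                            # vectors can keep arriving incrementally
index.hnsw.efSearch = 64                 # query-time recall knob
D, I = index.search(xb[:5], 10)
```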
composite-index chaining with automatic routing
Medium confidence: Composes multiple index components into pipelines (e.g., OPQ→IVF→PQ) where a coarse stage (IVF or HNSW) narrows the candidate set and a quantization stage (PQ) encodes the stored vectors, with queries routed through the stages automatically. Ships pre-built compositions such as IndexIVFPQ and IndexHNSWPQ, and allows fine-grained control over probing and refinement parameters.
Provides pre-built composite index classes (IndexIVFPQ, IndexHNSWPQ) that handle parameter passing and result routing between stages, eliminating manual pipeline orchestration. Custom pipelines can be assembled from factory strings via index_factory or by wrapping vector transforms around indices with IndexPreTransform.
More convenient than manually chaining indices because parameter tuning and result routing are handled automatically; more flexible than single-index approaches because it enables joint optimization of filtering and refinement stages.
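A sketch of assembling a composite pipeline from a factory string; the string and training-set size are illustrative:

```python
import numpy as np
import faiss

d = 128
xb = np.random.rand(200_000, d).astype("float32")

# OPQ rotation -> IVF coarse partitioning -> PQ-compressed storage,
# assembled in one call instead of wiring the stages by hand
index = faiss.index_factory(d, "OPQ16,IVF256,PQ16")
index.train(xb[:50_000])
index.add(xb)

# ParameterSpace reaches through the wrapper layers to set nprobe
faiss.ParameterSpace().set_index_parameter(index, "nprobe", 16)
D, I = index.search(xb[:5], 10)
```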
batch vector addition with automatic index updates
Medium confidence: Adds multiple vectors to an index in batches, updating internal data structures (cluster assignments, quantization codes, graph connections) without full index reconstruction. Supports both exact indices (flat, IVF) and approximate indices (HNSW, PQ) with index-specific update semantics; additions are applied synchronously, and optional 64-bit IDs can be attached for later lookup or removal.
Provides index-type-specific batch insertion logic that preserves index structure (e.g., HNSW graph updates, IVF cluster assignments) without full reconstruction. Supports optional vector ID assignment for tracking and deletion.
More efficient than rebuilding indices from scratch for each batch; more flexible than append-only indices because it maintains search quality through structural updates.
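A sketch of batched ingestion with caller-assigned IDs, using the IndexIDMap wrapper; batch sizes are illustrative:

```python
import numpy as np
import faiss

d = 64
index = faiss.IndexIDMap(faiss.IndexFlatL2(d))    # wrap to control vector ids

for start in range(0, 100_000, 10_000):           # stream vectors in batches
    xb = np.random.rand(10_000, d).astype("float32")
    ids = np.arange(start, start + 10_000, dtype="int64")
    index.add_with_ids(xb, ids)                   # these ids come back from search()

print(index.ntotal)                               # 100000
```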
index serialization and persistence
Medium confidence: Serializes trained indices to disk in a binary format optimized for fast loading, preserving all internal structures (cluster centroids, quantization codebooks, graph connections). Supports both complete index serialization and partial serialization (e.g., codebooks only). Enables index sharing across processes and machines via file transfer or network protocols.
Provides efficient binary serialization that preserves all index metadata and structures without requiring retraining. Supports partial serialization (e.g., saving only quantization codebooks) for memory-efficient loading.
Faster loading than retraining indices from scratch; more compact than JSON serialization due to binary format.
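A sketch of both persistence paths, to a file and to an in-memory buffer:

```python
import numpy as np
import faiss

d = 32
index = faiss.IndexFlatL2(d)
index.add(np.random.rand(1_000, d).astype("float32"))

faiss.write_index(index, "vectors.faiss")         # binary dump of the full index
restored = faiss.read_index("vectors.faiss")      # reload without retraining

buf = faiss.serialize_index(index)                # uint8 numpy buffer instead of a file,
restored2 = faiss.deserialize_index(buf)          # e.g., for shipping over a network
```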
distance metric selection and custom metrics
Medium confidence: Supports multiple distance metrics (L2 Euclidean, inner product, L1, Linf, and others) selected via the metric_type parameter; cosine similarity is obtained by L2-normalizing vectors and searching with inner product, and Hamming distance applies to the binary index family. Metrics are used consistently across index types and search operations, enabling metric-specific optimizations (e.g., SIMD acceleration for L2 distance).
Provides a unified metric interface across index types with metric-specific SIMD kernels (e.g., AVX2 for L2 distance). A set of additional built-in metrics (Canberra, Bray-Curtis, Jensen-Shannon, etc.) is available, primarily on flat indices via the C++ API.
More flexible than libraries with a small fixed metric set (e.g., Annoy's angular, Euclidean, Manhattan, Hamming, and dot-product metrics); more performant than generic metric implementations due to SIMD acceleration.
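A sketch of cosine similarity via the normalize-then-inner-product idiom:

```python
import numpy as np
import faiss

d = 64
xb = np.random.rand(10_000, d).astype("float32")
xq = np.random.rand(5, d).astype("float32")

faiss.normalize_L2(xb)                    # in-place L2 normalization
faiss.normalize_L2(xq)
index = faiss.IndexFlatIP(d)              # inner-product metric
index.add(xb)
D, I = index.search(xq, 4)                # D now holds cosine similarities
```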
k-means clustering with batch updates
Medium confidence: Implements k-means clustering optimized for large-scale vector data using batch updates, random-subsample initialization, and configurable iteration and convergence settings. Used internally for IVF index training but also exposed as a standalone API (faiss.Kmeans). Supports GPU acceleration (in the GPU package) and multi-threaded CPU execution.
Implements batch k-means with Faiss-specific optimizations including efficient distance computation via BLAS, multi-threaded centroid updates, and automatic handling of empty clusters. Tightly integrated with IVF indexing for joint optimization.
Faster than scikit-learn's k-means for large-scale clustering due to batch updates and optimized distance computation; more integrated with search than standalone clustering libraries.
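A sketch of the standalone clustering API; k and niter are illustrative:

```python
import numpy as np
import faiss

d, k = 64, 256
x = np.random.rand(100_000, d).astype("float32")

kmeans = faiss.Kmeans(d, k, niter=20)
kmeans.train(x)                            # batch k-means over all points
print(kmeans.centroids.shape)              # (256, 64)

# assign each point to its nearest centroid via the internal flat index
D, assignments = kmeans.index.search(x, 1)
```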
vector normalization and preprocessing
Medium confidence: Provides utilities for normalizing vectors (L2/unit normalization) and applying transformations (PCA, whitening) before indexing. Supports both in-memory and streaming preprocessing. Transformations can be applied consistently to both training and query vectors.
Provides IndexPreTransform API for composing preprocessing transformations with indices, enabling automatic application of normalization/PCA during search without manual pipeline orchestration.
More integrated with search than standalone preprocessing libraries; enables joint optimization of preprocessing and indexing.
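A sketch of composing PCA with an index so queries are transformed automatically; dimensions are illustrative:

```python
import numpy as np
import faiss

d_in, d_out = 128, 32
xb = np.random.rand(50_000, d_in).astype("float32")

pca = faiss.PCAMatrix(d_in, d_out)                # linear map: 128 -> 32 dims
index = faiss.IndexPreTransform(pca, faiss.IndexFlatL2(d_out))
index.train(xb)                                   # trains PCA, then the sub-index
index.add(xb)                                     # PCA applied on add
D, I = index.search(xb[:5], 4)                    # and on every query
```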
Capabilities are decomposed by AI analysis. Each maps to specific user intents and improves with match feedback.
Related Artifacts (sharing capabilities)
Artifacts that share capabilities with faiss-cpu, ranked by overlap. Discovered automatically through the match graph.
pgvector
Vector search for PostgreSQL — HNSW indexes, similarity queries in SQL, use existing Postgres.
lancedb
Developer-friendly OSS embedded retrieval library for multimodal AI. Search More; Manage Less.
zvec
A lightweight, lightning-fast, in-process vector database
Qdrant
Rust-based vector search engine — fast, payload filtering, quantization, horizontal scaling.
databend
Data Agent Ready Warehouse: One for Analytics, Search, AI, Python Sandbox — rebuilt from scratch. Unified architecture on your S3.
RediSearch
A query and indexing engine for Redis, providing secondary indexing, full-text search, vector similarity search and aggregations.
Best For
- ✓ ML engineers building semantic search or recommendation systems
- ✓ Teams implementing RAG pipelines with large embedding collections
- ✓ Researchers prototyping ANN algorithms and index structures
- ✓ Production systems requiring sub-millisecond latency on billion-scale vector datasets
- ✓ Production systems with streaming vector ingestion where retraining is expensive
- ✓ Applications requiring tunable recall-vs-latency tradeoffs
- ✓ Teams with limited GPU resources needing CPU-based indexing
- ✓ Applications with similarity thresholds rather than fixed top-K requirements
Known Limitations
- ⚠ Approximate indices (IVF, HNSW, PQ) have configurable but non-zero recall loss; exact search requires flat indices, which don't scale beyond roughly 100M vectors
- ⚠ Index construction is CPU-bound and can take hours for billion-scale datasets; trained structures (centroids, codebooks) cannot be updated incrementally for most index types, only new vectors can be added
- ⚠ Quantization (PQ, OPQ) reduces vector precision, requiring careful tuning of codebook size and training-set size
- ⚠ No built-in distributed indexing; scaling across multiple machines requires manual sharding or external orchestration
- ⚠ The faiss-cpu package has no GPU acceleration; GPU operations require the separate faiss-gpu package with CUDA dependencies
- ⚠ k-means training is sensitive to initialization and may converge to local optima; requires multiple restarts or careful seed selection