multi-format document ingestion with automatic parsing and metadata attachment
Accepts 22+ file formats (PDF, DOCX, XLSX, PNG, EML, etc.) and URLs via SDK, automatically parses content into structured text, applies configurable chunking strategies, and attaches custom metadata per document. The ingestion pipeline processes files asynchronously with job status tracking, enabling bulk document onboarding without blocking application flow. Supports multimodal content including images, graphs, and tables with native extraction capabilities.
Unique: Supports 22+ file formats with native multimodal extraction (images, graphs, tables) in a single unified pipeline, unlike competitors that require separate OCR or table-extraction services. Metadata attachment at ingestion time enables downstream filtering without post-processing, and asynchronous job tracking prevents blocking on large document batches.
vs alternatives: Broader format support and native multimodal handling than Pinecone or Weaviate, which require external parsing; simpler than building custom ETL pipelines with LangChain or LlamaIndex.
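The chunking and metadata-attachment steps described above can be illustrated with a minimal sketch. It is not the product's SDK: `Chunk`, `chunk_document`, and the `size`/`overlap` parameters are hypothetical names chosen for illustration, and fixed-size character windows are only one of many possible chunking strategies.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    # Hypothetical container: one piece of parsed text plus its metadata.
    text: str
    metadata: dict = field(default_factory=dict)


def chunk_document(text: str, metadata: dict, size: int = 200, overlap: int = 50) -> list[Chunk]:
    """Split text into fixed-size character windows that overlap by `overlap`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        window = text[start:start + size]
        # Copy the document-level metadata onto the chunk and record its offset,
        # so later filtering can work on chunks without re-reading the document.
        chunks.append(Chunk(window, {**metadata, "offset": start}))
        if start + size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks


doc = "x" * 450  # stand-in for parsed document text
chunks = chunk_document(doc, {"source": "report.pdf", "team": "finance"})
```

Because each chunk carries a copy of the document metadata, a filter such as `team == "finance"` can be applied directly at retrieval time.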
semantic search with metadata filtering and reranking
Converts user queries into vector embeddings and performs similarity search across indexed documents, optionally filtering results by metadata predicates before retrieval. A reranking layer (algorithm unspecified) refines result precision after initial semantic matching. Supports hybrid search combining semantic and traditional retrieval mechanisms, though the hybrid implementation details are undocumented. Returns ranked results with relevance scores and source attribution.
Unique: Integrates metadata filtering at the retrieval stage (not post-processing), enabling efficient subset-before-rank patterns. Reranking layer is built-in rather than requiring external services, and local deployment eliminates cloud latency for real-time search applications.
vs alternatives: Can be faster than cloud-only solutions (Pinecone, Weaviate SaaS) for latency-sensitive applications when deployed locally; more integrated than LangChain/LlamaIndex, which require manual reranking orchestration.
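The subset-before-rank pattern mentioned above can be sketched in a few lines. This is an illustration only: the `search` function, the record layout, and the exact-match `where` filter are assumptions, and the reranking step is omitted because its algorithm is not documented.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec, index, where, top_k=2):
    # Hypothetical helper, not a real SDK call.
    # 1) Filter first: keep records whose metadata matches every key in `where`.
    candidates = [
        r for r in index
        if all(r["metadata"].get(k) == v for k, v in where.items())
    ]
    # 2) Then rank only the surviving candidates by similarity to the query.
    scored = [(cosine(query_vec, r["vector"]), r["id"]) for r in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


index = [
    {"id": "a", "vector": [1.0, 0.0], "metadata": {"lang": "en"}},
    {"id": "b", "vector": [0.9, 0.1], "metadata": {"lang": "de"}},
    {"id": "c", "vector": [0.0, 1.0], "metadata": {"lang": "en"}},
]
results = search([1.0, 0.0], index, where={"lang": "en"})
```

Record `b` is close to the query but is never scored, because the metadata predicate removes it before ranking.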
observability and logging for debugging and monitoring
Provides logging and observability features for tracking ingestion progress, search performance, RAG generation quality, and system errors. Logs include request/response traces, latency metrics, token usage, and error details. Observability data is accessible via API and an optional dashboard for monitoring system health, identifying bottlenecks, and debugging issues. Supports integration with external monitoring platforms (Datadog, New Relic, etc.).
Unique: Built-in observability for RAG-specific metrics (generation quality, hallucination detection, token usage) rather than generic application monitoring. Integration with external platforms enables centralized monitoring across heterogeneous systems.
vs alternatives: More integrated than generic application monitoring (Datadog, New Relic), which lacks RAG-specific insights; simpler than building custom logging infrastructure; enables proactive quality monitoring that cloud-only services don't provide.
tiered pricing with usage-based scaling (free, pro, enterprise)
Offers three pricing tiers with different feature sets and usage limits: Free tier (1,000 pages, 10,000 retrievals/month, no connectors), Pro tier ($49/month, 10,000 pages included, unlimited retrievals, per-connector charges), and Enterprise tier (custom pricing, BYOC/self-hosted, unlimited pages, custom features). Usage is measured in 'pages' (1,000 characters = 1 page) rather than documents, enabling predictable cost scaling. Connector costs ($100/month each on Pro) are separate from base subscription.
Unique: Page-based pricing (1,000 characters = 1 page) is more granular than document-based pricing, enabling cost predictability for variable-sized documents. Separate connector costs enable transparent pricing for multi-source setups. Free tier provides meaningful evaluation capability (1,000 pages) without credit card.
vs alternatives: More transparent than Pinecone or Weaviate (which use opaque 'pod' or 'vector' pricing); more flexible than fixed per-document pricing; simpler cost estimation than token-based pricing models.
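The page arithmetic above can be worked through with a small estimator. The prices and the 1,000-characters-per-page rule come from the listing; rounding each document up to whole pages is an assumption (the listing does not say how partial pages are counted), and the function names are hypothetical.

```python
import math

PAGE_CHARS = 1_000          # 1,000 characters = 1 page (from the listing)
PRO_BASE_USD = 49           # Pro tier base price per month
PRO_CONNECTOR_USD = 100     # per connector, per month, on Pro
PRO_INCLUDED_PAGES = 10_000


def pages_for(char_counts):
    # Assumption: each document is rounded up to whole pages.
    return sum(math.ceil(n / PAGE_CHARS) for n in char_counts)


def pro_monthly_usd(connectors):
    # Base subscription plus separate per-connector charges.
    return PRO_BASE_USD + PRO_CONNECTOR_USD * connectors


docs = [2_500, 12_000, 999]              # character counts of three documents
total_pages = pages_for(docs)            # 3 + 12 + 1 pages
cost = pro_monthly_usd(connectors=2)     # 49 + 2 * 100
within_plan = total_pages <= PRO_INCLUDED_PAGES
```

Under these assumptions the three documents use 16 pages, and a Pro subscription with two connectors costs $249 per month.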
simple rag (retrieval-augmented generation) with automatic citation
Chains semantic search results directly into an LLM prompt, grounding generated responses in retrieved documents. Automatically tracks and attributes citations to source documents, enabling end-users to inspect the evidence backing each answer. Supports pluggable LLM providers (OpenAI, Anthropic, Google, xAI, Azure, Cohere, Qwen, Mistral, DeepSeek) via configuration, abstracting provider-specific APIs. Reduces hallucinations by constraining generation to indexed knowledge.
Unique: Automatic citation tracking is built-in rather than requiring post-processing or custom prompt engineering. Multi-provider LLM abstraction (nine providers listed) eliminates vendor lock-in and enables A/B testing across models without code changes. Local deployment option reduces latency for real-time RAG applications.
vs alternatives: Simpler than LangChain/LlamaIndex RAG chains (no manual retrieval orchestration); more transparent than vanilla LLMs due to automatic citations; faster than cloud-only RAG services due to local deployment option.
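One common way to ground a response and attribute citations, as described above, is to number the retrieved passages in the prompt and map bracketed markers in the answer back to their sources. The sketch below assumes that approach; it is not the product's documented mechanism, and `build_prompt`, `cited_sources`, and the passage layout are hypothetical.

```python
import re


def build_prompt(question, passages):
    # Number each retrieved passage so the model can cite it as [1], [2], ...
    context = "\n".join(f"[{i}] {p['text']}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the numbered sources and cite them like [1].\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )


def cited_sources(answer, passages):
    # Map citation markers in the answer back to source documents,
    # ignoring markers that do not correspond to a retrieved passage.
    ids = sorted({int(n) for n in re.findall(r"\[(\d+)\]", answer)})
    return [passages[i - 1]["source"] for i in ids if 1 <= i <= len(passages)]


passages = [
    {"text": "Refunds are issued within 14 days.", "source": "policy.pdf"},
    {"text": "Support is open on weekdays.", "source": "faq.docx"},
]
prompt = build_prompt("How long do refunds take?", passages)
answer = "Refunds take up to 14 days [1]."  # stand-in for a model response
sources = cited_sources(answer, passages)
```

The `prompt` string would be sent to whichever LLM provider is configured; the model call is left out so the example stays provider-neutral.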
agentic rag with multi-hop reasoning and planning
Extends simple RAG with AI-driven planning and multi-hop retrieval, enabling the system to decompose complex queries into sub-questions, retrieve relevant documents iteratively, and reason across multiple sources. Integrates with Vercel's AI SDK for agent orchestration, allowing the LLM to decide when to search, what to search for, and how to synthesize results. Supports custom tool definitions and agentic reasoning loops without manual prompt engineering.
Unique: Integrates agentic reasoning directly into RAG pipeline via AI SDK, eliminating manual orchestration of retrieval loops. Supports autonomous decision-making about what to retrieve and when, rather than static top-k retrieval. Built-in planning layer decomposes complex queries without custom prompt engineering.
vs alternatives: More integrated than LangChain/LlamaIndex agent patterns (less boilerplate); more autonomous than simple RAG; supports multi-provider LLMs, unlike some agent frameworks tied to specific models.
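The multi-hop loop described above (plan a sub-question, retrieve, repeat until the planner stops) can be sketched as follows. In a real deployment the planner would be an LLM call; here `toy_planner` is a scripted stand-in, and every name in the block is hypothetical rather than part of the AI SDK or the product.

```python
def multi_hop(question, plan_next, retrieve, max_hops=3):
    """Collect evidence by repeatedly asking the planner for a sub-question."""
    evidence = []
    for _ in range(max_hops):
        sub_question = plan_next(question, evidence)
        if sub_question is None:
            break  # the planner decided it has enough evidence
        evidence.extend(retrieve(sub_question))
    return evidence


# Toy corpus and retriever standing in for semantic search.
CORPUS = {
    "ceo of acme": "Acme's CEO is Dana Lee.",
    "dana lee university": "Dana Lee studied at Example University.",
}


def toy_retrieve(sub_question):
    return [CORPUS[sub_question]] if sub_question in CORPUS else []


def toy_planner(question, evidence):
    # Scripted stand-in for an LLM planner: one sub-question per hop.
    steps = ["ceo of acme", "dana lee university"]
    return steps[len(evidence)] if len(evidence) < len(steps) else None


evidence = multi_hop("Where did the CEO of Acme study?", toy_planner, toy_retrieve)
```

The `max_hops` cap bounds the loop even if a planner never returns `None`, which matters when the planner is a model rather than a script.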
connector-based document synchronization from external sources
Automatically syncs documents from external data sources (Google Drive, SharePoint, Notion) into Agentset namespaces via pre-built connectors. Handles authentication, incremental updates, and metadata extraction from source systems. Connectors are billed individually on the Pro tier ($100/month each) and let organizations maintain live links between source systems and RAG indexes without manual re-ingestion. Webhook events notify downstream systems of sync completion.
Unique: Pre-built connectors for major enterprise platforms (Google Drive, SharePoint, Notion) eliminate custom integration work. Webhook-driven event system enables downstream automation without polling. Metadata extraction from source systems preserves organizational context (ownership, timestamps, folder hierarchy).
vs alternatives: Simpler than building custom LangChain/LlamaIndex loaders for each source; more integrated than generic ETL tools (Zapier, Make), which lack RAG-specific optimizations; faster than manual document uploads for large repositories.
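Incremental updates, as mentioned above, usually come down to comparing what the source system reports with what is already indexed. The sketch below assumes each document has a version token (for example a revision id or modification timestamp); how the actual connectors detect changes is not documented, and `plan_sync` is a hypothetical name.

```python
def plan_sync(remote, indexed):
    """Compare two {doc_id: version_token} maps and decide what to change.

    Returns (to_upsert, to_delete): documents that are new or changed at the
    source, and documents that no longer exist at the source.
    """
    to_upsert = sorted(d for d, version in remote.items() if indexed.get(d) != version)
    to_delete = sorted(d for d in indexed if d not in remote)
    return to_upsert, to_delete


remote = {"plan.docx": "v3", "notes.md": "v1", "budget.xlsx": "v1"}   # source system
indexed = {"plan.docx": "v2", "notes.md": "v1", "old-memo.pdf": "v5"}  # current index
to_upsert, to_delete = plan_sync(remote, indexed)
```

Only `budget.xlsx` (new) and `plan.docx` (changed) are re-ingested, `old-memo.pdf` is removed, and the unchanged `notes.md` is skipped.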
customizable chat interface with feedback collection
Generates shareable preview links to chat interfaces for RAG responses, enabling end-users to interact with grounded answers without accessing the backend system. Interfaces are customizable (branding, instructions, model selection) and collect user feedback (thumbs up/down, comments) for quality monitoring and model improvement. Feedback data is stored and accessible via API for analytics and fine-tuning workflows.
Unique: Built-in feedback collection and analytics eliminate the need for external survey tools or custom logging. Customizable interface enables white-label deployments without forking code. Preview links provide secure, time-limited access without requiring backend API exposure.
vs alternatives: Simpler than building custom chat UIs with LangChain/LlamaIndex; more integrated feedback loop than generic analytics tools; faster deployment than custom Streamlit or Next.js chat applications.