ONNX-Based Local Ranking and Quality Scoring

1

mcp-memory-service | MCP Server | 50/100

via “onnx-based-local-ranking-and-quality-scoring”

Open-source persistent memory for AI agent pipelines (LangGraph, CrewAI, AutoGen) and Claude. REST API + knowledge graph + autonomous consolidation.

Unique: Uses ONNX-based re-ranking (cross-encoder models) to improve search quality without external APIs, combining semantic similarity with metadata-based quality signals. Supports async scoring to avoid blocking retrieval operations, enabling real-time search with background quality improvements.

vs others: Avoids the per-request cost and network round-trip of a hosted reranker such as the Cohere Rerank API because it runs locally; goes beyond re-ranking on BM25 scores alone because it uses neural cross-encoder models trained on relevance judgments.
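The pattern described above (rank immediately on similarity, then refine once a background quality score arrives) can be sketched as follows. This is a minimal illustration, not mcp-memory-service's actual code: `Memory`, `cross_encoder_score`, `combined`, and the 0.3 weight are all hypothetical, and a simple token-overlap function stands in for the local ONNX cross-encoder so the example runs without a model file.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    similarity: float     # semantic similarity from the vector search
    quality: float = 0.5  # neutral prior until background scoring finishes


def cross_encoder_score(query: str, text: str) -> float:
    """Hypothetical stand-in for a local ONNX cross-encoder.

    Returns the fraction of query tokens found in the text, in [0, 1].
    A real deployment would run a model here instead.
    """
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def combined(m: Memory, weight: float = 0.3) -> float:
    # Blend similarity with the quality signal (weight is an assumption).
    return (1 - weight) * m.similarity + weight * m.quality


def rank(memories):
    return sorted(memories, key=combined, reverse=True)


async def score_in_background(query, memories):
    for m in memories:
        # Run the potentially slow scorer in a worker thread so the
        # event loop stays free to serve other requests.
        m.quality = await asyncio.to_thread(cross_encoder_score, query, m.text)


async def search(query, memories):
    # Schedule scoring, then return the similarity-based ranking at once.
    task = asyncio.create_task(score_in_background(query, memories))
    return rank(memories), task


async def demo():
    memories = [
        Memory("notes about grocery shopping", similarity=0.62),
        Memory("onnx reranking setup notes", similarity=0.60),
    ]
    immediate, task = await search("onnx reranking", memories)
    await task  # a real caller would not wait; later searches see the scores
    refreshed = rank(memories)
    return [m.text for m in immediate], [m.text for m in refreshed]
```

Running `asyncio.run(demo())` returns the initial ordering and the ordering after background scoring has updated the `quality` fields; the first result changes once the scores are in.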

2

vespa | MCP Server | 50/100

via “multi-phase ranking with onnx model integration”

AI + Data, online. https://vespa.ai

Unique: Executes ONNX models natively on content nodes during query processing without external model serving infrastructure, with ranking expressions compiled to optimized native code. This eliminates the network latency of calling external ML services and enables batched inference across candidate results.

vs others: Faster than calling external model serving APIs (Triton, KServe) because ONNX inference happens in-process on content nodes, so the top-K candidates are scored in a single pass with no network round-trips.
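The multi-phase idea can be sketched in a few lines: a cheap first-phase score orders every candidate, and only the top-K are passed to the expensive model in one batched call. This is a generic illustration, not Vespa's API (Vespa expresses this in rank profiles rather than application code); `two_phase_rank`, `first_phase`, `second_phase_batch`, and the sample scores are hypothetical.

```python
def two_phase_rank(candidates, first_phase, second_phase_batch, rerank_count):
    """Order candidates cheaply, then re-rank only the top `rerank_count`."""
    # Phase 1: cheap score for every candidate.
    ordered = sorted(candidates, key=first_phase, reverse=True)
    head, tail = ordered[:rerank_count], ordered[rerank_count:]

    # Phase 2: a single batched call to the expensive model for the head.
    scores = second_phase_batch(head)
    reranked = [
        doc
        for _, doc in sorted(zip(scores, head), key=lambda p: p[0], reverse=True)
    ]

    # Candidates outside the top-K keep their first-phase order.
    return reranked + tail


# Hypothetical data: "bm25" plays the cheap score, "model" the ONNX output.
docs = [
    {"id": "a", "bm25": 3.0, "model": 0.20},
    {"id": "b", "bm25": 2.0, "model": 0.90},
    {"id": "c", "bm25": 1.0, "model": 0.99},
]

batch_calls = []


def model_batch(batch):
    batch_calls.append(len(batch))  # record batch size per invocation
    return [d["model"] for d in batch]


result = two_phase_rank(docs, lambda d: d["bm25"], model_batch, rerank_count=2)
order = [d["id"] for d in result]
```

Here `c` stays last even though its model score is the highest, because it never reaches the second phase; that is the trade-off of bounding the expensive work to the top-K.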

3

Vespa | Product

via “ml-model-ranking-integration”
