Capability
Batch Embedding Generation with ONNX Acceleration
20 artifacts provide this capability.
Top Matches
via “batch inference with automatic padding and tokenization”
Sentence-similarity model (author not listed). 12,843,377 downloads.
Unique: Automatic batch padding with attention masks and a 2048-token context window (vs. 512 in standard sentence-transformers models) allow variable-length documents to be processed efficiently without manual chunking or padding logic.
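The padding behavior described above can be sketched in a few lines. This is a minimal, self-contained illustration of the concept, not the artifact's actual API: `pad_batch` is a hypothetical helper that pads variable-length token-id sequences to the batch maximum and emits the matching attention mask, which is what "automatic batch padding with attention masks" means in practice.

```python
import numpy as np

def pad_batch(token_id_seqs, pad_id=0):
    """Pad variable-length token-id sequences to the longest sequence
    in the batch and build an attention mask (1 = real token, 0 = pad).
    Illustrative helper; not part of the listed artifact's API."""
    max_len = max(len(seq) for seq in token_id_seqs)
    input_ids = np.full((len(token_id_seqs), max_len), pad_id, dtype=np.int64)
    attention_mask = np.zeros((len(token_id_seqs), max_len), dtype=np.int64)
    for i, seq in enumerate(token_id_seqs):
        input_ids[i, : len(seq)] = seq
        attention_mask[i, : len(seq)] = 1
    return input_ids, attention_mask

# Three sequences of lengths 3, 2, and 1 become one (3, 3) batch.
ids, mask = pad_batch([[5, 6, 7], [8, 9], [10]])
print(ids.shape)         # (3, 3)
print(mask.sum(axis=1))  # [3 2 1]
```

The mask lets the model ignore pad positions during attention and pooling, so the padded batch produces the same per-sequence results as running each sequence alone.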
vs. others: A simpler API than the raw transformers library (no manual tokenization or padding) and more efficient than embedding sequences one at a time, since batching reduces per-token overhead by roughly 10-20x; it also has explicit support for long documents that competing models must chunk.
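One reason a batched, mask-aware pipeline matches sequential results is mean pooling that excludes pad positions. Below is a hedged sketch of that step, assuming token embeddings and an attention mask like those produced by any padded batch; the function name and shapes are illustrative, not taken from the artifact above.

```python
import numpy as np

def masked_mean_pool(token_embeddings, attention_mask):
    """Mean-pool token embeddings over the sequence axis, skipping
    padding positions via the attention mask.
    token_embeddings: (batch, seq_len, dim); attention_mask: (batch, seq_len)."""
    mask = attention_mask[:, :, None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)          # (batch, dim)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)          # avoid divide-by-zero
    return summed / counts

# Two sequences, three token slots, 2-dim embeddings; the second
# sequence's last slot is padding and must not affect its mean.
emb = np.array([[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
                [[2.0, 2.0], [4.0, 4.0], [0.0, 0.0]]])
mask = np.array([[1, 1, 1], [1, 1, 0]])
pooled = masked_mean_pool(emb, mask)
# Row 0 averages all three tokens -> [3., 4.]; row 1 averages only
# its two real tokens -> [3., 3.], ignoring the padded zeros.
print(pooled)
```

Because the whole batch goes through the model in one forward pass, the fixed per-call overhead is amortized across all sequences, which is where the batching speedup over sequential embedding comes from.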