Capability
Distilled Transformer Inference With Reduced Parameter Footprint
2 artifacts provide this capability.
zero-shot-classification model. 228,990 downloads.
Unique: Distilled from RoBERTa-Large specifically for NLI tasks, achieving a 15x parameter reduction while retaining >90% of the teacher model's accuracy on SNLI/MultiNLI benchmarks; most lightweight NLI alternatives either use non-distilled architectures or sacrifice accuracy more severely.
vs others: 3-5x faster CPU inference than full-size cross-encoders (RoBERTa-Large, BERT-Large); more accurate on entailment tasks than simple bi-encoder baselines thanks to its cross-encoder architecture, despite the smaller size.
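A minimal sketch of how a distilled NLI cross-encoder like this is typically used for zero-shot classification via the Hugging Face transformers pipeline. The checkpoint id below is a placeholder assumption, not a confirmed hub id; substitute the actual model identifier.

```python
from transformers import pipeline

# The zero-shot-classification pipeline wraps an NLI cross-encoder: each
# candidate label is rewritten as a hypothesis ("This example is about {label}.")
# and scored for entailment against the input text.
classifier = pipeline(
    "zero-shot-classification",
    model="distilled-nli-model-id",  # placeholder; replace with the real checkpoint
    device=-1,  # CPU; the distilled model is small enough for CPU inference
)

result = classifier(
    "The new graphics card renders 4K games at 120 frames per second.",
    candidate_labels=["technology", "sports", "politics"],
)
print(result["labels"][0], result["scores"][0])  # highest-scoring label and its score
```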