Capability
Compression Metrics And Accuracy Evaluation Framework
18 artifacts provide this capability.
Top Matches
Toolkit for LLM quantization, pruning, and distillation.
Unique: Implements an integrated evaluation framework that supports standard benchmarks (MMLU, HellaSwag, TruthfulQA), task-specific metrics (perplexity, BLEU), and custom evaluation functions, so accuracy can be assessed systematically without external evaluation tools
vs others: More convenient than manual evaluation because benchmarks are pre-configured; more flexible than fixed-metric tools because custom functions are supported; more tightly integrated than external evaluation tools because evaluation is built into the compression pipeline
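To make the idea of task-specific metrics plus custom evaluation functions concrete, here is a minimal Python sketch. It is not the toolkit's API: the names `perplexity`, `mean_logprob`, `compression_ratio`, and `compare` are hypothetical, and the inputs are assumed to be per-token natural-log probabilities already produced by the baseline and compressed models.

```python
import math
from typing import Callable, Dict, Sequence

# Hypothetical metric signature: per-token log-probabilities in, one score out.
Metric = Callable[[Sequence[float]], float]


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity = exp(mean negative log-likelihood), natural log assumed."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def mean_logprob(token_logprobs: Sequence[float]) -> float:
    """Example of a user-supplied custom metric."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return sum(token_logprobs) / len(token_logprobs)


def compression_ratio(original_bytes: int, compressed_bytes: int) -> float:
    """Size of the original model divided by the size of the compressed one."""
    if compressed_bytes <= 0:
        raise ValueError("compressed size must be positive")
    return original_bytes / compressed_bytes


def compare(
    baseline: Sequence[float],
    compressed: Sequence[float],
    metrics: Dict[str, Metric],
) -> Dict[str, Dict[str, float]]:
    """Run every registered metric on both models and report the difference."""
    report: Dict[str, Dict[str, float]] = {}
    for name, fn in metrics.items():
        b, c = fn(baseline), fn(compressed)
        report[name] = {"baseline": b, "compressed": c, "delta": c - b}
    return report


if __name__ == "__main__":
    # Illustrative inputs only, not measured results.
    base = [math.log(0.5)] * 4
    comp = [math.log(0.25)] * 4
    metrics = {"perplexity": perplexity, "mean_logprob": mean_logprob}
    print(compare(base, comp, metrics))
    print(compression_ratio(original_bytes=16_000, compressed_bytes=4_000))
```

Keeping metrics in a plain name-to-function dictionary is one simple way to let built-in and custom metrics share the same comparison loop; a real pipeline would also need to run the benchmarks that generate these log-probabilities.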