Beam Search Decoding With Configurable Search Width And Length Normalization

1

madlad400-3b-mt (Model, 45/100)

via “beam-search-decoding-with-length-penalty”

Translation model. 4,72,848 downloads.

Unique: Implements standard T5 beam search with length normalization to counter the bias toward short outputs in sequence-to-sequence models. Integrates with the HuggingFace generate() API, where beam width is set with num_beams and length normalization with length_penalty.

vs others: Produces higher-quality translations than greedy decoding at the cost of latency; more practical than exhaustive search while maintaining reasonable quality-latency tradeoffs
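As a rough illustration of what a configurable search width buys, the sketch below runs beam search over a made-up next-token table. TOY_MODEL and beam_search are hypothetical names for this example only; they are not part of madlad400 or the HuggingFace API. Scores are raw summed log-probabilities with no length normalization, and every hypothesis has the same length.

```python
import math

# Hypothetical toy "model": next-token probabilities that depend only on the previous token.
TOY_MODEL = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a": {"x": 0.5, "y": 0.5},
    "b": {"z": 0.9, "x": 0.1},
}


def beam_search(next_probs, start, steps, num_beams):
    """Return the num_beams best (tokens, summed_logprob) pairs after `steps` expansions."""
    beams = [([start], 0.0)]
    for _ in range(steps):
        candidates = []
        for tokens, score in beams:
            # Expand every surviving hypothesis by every possible next token.
            for token, prob in next_probs[tokens[-1]].items():
                candidates.append((tokens + [token], score + math.log(prob)))
        # Keep only the num_beams highest-scoring candidates.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:num_beams]
    return beams


greedy_tokens, greedy_score = beam_search(TOY_MODEL, "<s>", steps=2, num_beams=1)[0]
beam_tokens, beam_score = beam_search(TOY_MODEL, "<s>", steps=2, num_beams=2)[0]
```

With num_beams=1 the search is greedy: it commits to "a" (probability 0.6) and ends with a sequence of probability 0.30. With num_beams=2 it keeps "b" alive for one more step and finds "b z" with probability 0.36, at the cost of scoring twice as many hypotheses per step.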

2

t5-3b (Model, 45/100)

via “efficient inference with configurable beam search decoding”

Translation model. 8,75,782 downloads.

Unique: Configurable beam search with length normalization and early stopping enables fine-grained latency-quality tuning without model retraining; batching support with GPU acceleration optimizes throughput for production inference

vs others: More flexible than models with a fixed decoding strategy; supports both a high-quality mode (num_beams=8) and a low-latency greedy mode (num_beams=1) in a single model, unlike setups that ship separate fast and accurate variants
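The two modes can be expressed as keyword-argument presets for HuggingFace generate(). This is a minimal sketch: num_beams, length_penalty, early_stopping, do_sample and max_new_tokens are real generate() arguments, but the preset names, the helper function and the specific values (1.0, 128) are assumptions chosen for illustration.

```python
# Keyword arguments for HuggingFace generate(); preset names and values are illustrative.
HIGH_QUALITY = {
    "num_beams": 8,          # wider search, higher latency
    "length_penalty": 1.0,   # divide the summed log-probability by length ** 1.0
    "early_stopping": True,  # stop once num_beams finished hypotheses exist
    "max_new_tokens": 128,
}

LOW_LATENCY = {
    "num_beams": 1,          # greedy decoding
    "do_sample": False,
    "max_new_tokens": 128,
}


def generation_kwargs(low_latency: bool) -> dict:
    """Return a copy of the preset for the requested mode (hypothetical helper)."""
    return dict(LOW_LATENCY if low_latency else HIGH_QUALITY)


# Assumed usage with an already loaded model and tokenized inputs:
# output_ids = model.generate(**inputs, **generation_kwargs(low_latency=False))
```

Switching presets changes only the decoding call, so the same loaded weights serve both modes.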

3

opus-mt-en-de (Model, 44/100)

via “beam search decoding with configurable beam width and length penalties”

Translation model. 8,14,426 downloads.

Unique: Marian's beam search implementation uses efficient batch processing to decode all beams in parallel on GPU, reducing per-beam overhead compared to sequential decoding. Length penalty is applied during beam search (not post-hoc), enabling early pruning of degenerate hypotheses.

vs others: Better translation quality than greedy decoding (1-3 BLEU points) with reasonable latency overhead; comparable to sampling-based decoding but more deterministic and reproducible; inferior to larger models (GPT-4) but with 100x lower latency and cost.

4

opus-mt-ko-en (Model, 44/100)

Translation model. 5,45,011 downloads.

Unique: Marian's beam search implementation includes efficient batched computation of multiple hypotheses and length normalization specifically tuned for translation (not generic text generation), reducing the probability of pathological short translations common in other seq2seq models.

vs others: More efficient beam search than generic transformer implementations due to Marian's translation-specific optimizations, though less flexible than sampling-based approaches for exploring diverse translations.

5

t5-large (Model, 44/100)

via “efficient inference with beam search decoding and length penalty control”

Translation model. 4,73,953 downloads.

Unique: Configurable beam search with a length penalty parameter enables dynamic output length control at inference time without retraining, allowing a single model to generate variable-length summaries or translations. Length normalization via the length penalty counters beam search's bias toward shorter sequences, improving the quality of longer outputs.

vs others: More flexible than controlling length with max_length alone, because the length penalty can be tuned; deterministic where sampling-based decoding is not, while maintaining quality comparable to nucleus sampling
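To make the length bias concrete, here is a small sketch of length-normalized scoring. It assumes the convention documented for HuggingFace's length_penalty, where the summed log-probability is divided by the sequence length raised to the penalty. The function names and the two example hypotheses are invented for illustration.

```python
def length_normalized_score(sum_logprobs: float, length: int, length_penalty: float) -> float:
    """Summed log-probability divided by length ** length_penalty."""
    return sum_logprobs / (length ** length_penalty)


def best_hypothesis(hypotheses, length_penalty):
    """Pick the hypothesis with the highest normalized score (hypothetical helper)."""
    return max(
        hypotheses,
        key=lambda h: length_normalized_score(h["sum_logprobs"], h["length"], length_penalty),
    )


# Invented finished hypotheses: every extra token adds a negative log-probability,
# so the longer one has the lower raw score.
HYPOTHESES = [
    {"text": "short", "sum_logprobs": -2.0, "length": 2},
    {"text": "longer and more complete", "sum_logprobs": -3.0, "length": 6},
]

no_penalty_choice = best_hypothesis(HYPOTHESES, length_penalty=0.0)["text"]
normalized_choice = best_hypothesis(HYPOTHESES, length_penalty=1.0)["text"]
```

With length_penalty=0.0 the raw scores are compared (-2.0 against -3.0) and the short hypothesis wins. With length_penalty=1.0 the comparison is per token (-1.0 against -0.5) and the longer hypothesis wins, which is the effect the length penalty is tuned for.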

6

opus-mt-nl-en (Model, 43/100)

via “beam search decoding with configurable beam width and length penalties”

Translation model. 8,97,699 downloads.

Unique: When run through CTranslate2, Marian models use efficient C++ kernels for beam search, enabling a beam size of 8 with only 2-3x latency overhead instead of the 4-8x typical of pure Python implementations; supports length normalization via a configurable alpha parameter, allowing fine-grained control over translation length without retraining

vs others: Faster beam search than generic seq2seq implementations due to optimized inference backend; more flexible than single-hypothesis translation APIs (e.g., Google Translate) which don't expose beam alternatives or confidence scores

7

opus-mt-ru-en (Model, 42/100)

via “beam search decoding with configurable beam width and length penalties”

Translation model. 2,43,797 downloads.

Unique: Implements Marian's optimized beam search with efficient batching and GPU memory management, allowing larger beam widths (8+) without proportional memory overhead. Supports length normalization specifically tuned for translation tasks, reducing the common problem of overly short translations.

vs others: More efficient than naive beam search implementations because Marian uses fused CUDA kernels for attention computation; produces better translations than greedy decoding at the cost of latency, with tunable quality-speed tradeoff.

8

t5-small-booksum (Model, 34/100)

via “configurable-beam-search-decoding-with-length-constraints”

Summarization model. 16,506 downloads.

Unique: Leverages HuggingFace transformers' native beam search implementation with length normalization (the length_penalty parameter) tuned for narrative text, avoiding custom decoding logic that would introduce maintenance overhead

vs others: Standard HuggingFace beam search is simpler to implement than custom constrained decoding libraries (e.g., Guidance, LMQL) but lacks hard length constraints; trade-off favors ease of use for most summarization workflows

9

rut5-base-summ (Model, 33/100)

via “beam search decoding with configurable length penalties and early stopping”

Summarization model. 10,019 downloads.

Unique: Uses the transformers library's native beam search implementation with length normalization and early stopping, avoiding custom decoding logic. Supports batched beam search across multiple documents, enabling efficient GPU utilization for production inference.

vs others: More flexible than fixed-length truncation and more efficient than sampling-based decoding for deterministic, high-quality summaries.

10

faster-whisper (Repository, 28/100)

via “configurable beam search decoding with temperature fallback”

Faster Whisper transcription with CTranslate2

Unique: Implements automatic fallback from beam search to temperature sampling without user intervention, ensuring transcription robustness across edge-case audio. Beam width and temperature are configurable per-transcription, enabling dynamic strategy adjustment.

vs others: The automatic fallback mechanism reduces transcription failures on problematic audio (a fixed beam search may fail on such input), and per-transcription configuration enables adaptive strategies without model reloading.
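The fallback idea can be sketched as a loop over increasing temperatures that stops at the first acceptable result. This is a simplified illustration, not faster-whisper's actual code: decode_with_fallback and fake_decode are hypothetical, and the threshold names and default values are assumed to mirror Whisper-style decoding options, so check them against the library's transcribe() signature before relying on them.

```python
def decode_with_fallback(
    decode,
    temperatures=(0.0, 0.2, 0.4, 0.6, 0.8, 1.0),
    log_prob_threshold=-1.0,
    compression_ratio_threshold=2.4,
):
    """Try each temperature in order and return the first result that passes both checks.

    `decode` is a hypothetical callable: temperature -> dict with "avg_logprob" and
    "compression_ratio". Temperature 0.0 stands for beam search, higher values for sampling.
    """
    result = None
    for temperature in temperatures:
        result = decode(temperature)
        too_uncertain = result["avg_logprob"] < log_prob_threshold
        too_repetitive = result["compression_ratio"] > compression_ratio_threshold
        if not (too_uncertain or too_repetitive):
            break  # acceptable output, no further fallback needed
    return result  # last attempt if nothing passed


def fake_decode(temperature):
    """Stand-in decoder: beam search (temperature 0.0) loops, sampling recovers."""
    if temperature == 0.0:
        return {"text": "la la la la", "avg_logprob": -0.3,
                "compression_ratio": 3.1, "temperature": temperature}
    return {"text": "hello world", "avg_logprob": -0.4,
            "compression_ratio": 1.2, "temperature": temperature}


result = decode_with_fallback(fake_decode)
```

In this toy run the beam search output is rejected for being too repetitive, and the result from the next temperature (0.2) is returned instead.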

11

Whisper API (Product)

via “beam-search-size-configuration”
