Capability

Multi-GPU Distributed Inference with Tensor Parallelism and Pipeline Parallelism

20 artifacts provide this capability.


Top Matches

via “parallel and multi-device inference orchestration”

Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

Unique: Builds on PaddlePaddle's distributed inference framework to support heterogeneous hardware (NVIDIA GPU, Kunlun XPU, Ascend NPU) with automatic device selection and load balancing. Implements both data parallelism (batches split across devices) and pipeline parallelism (model stages distributed across devices) without requiring code changes. Includes dynamic batching to raise throughput while staying within memory constraints.
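To make the dynamic batching and data-parallel ideas in this entry concrete, here is a minimal sketch in plain Python. It is not PaddlePaddle's API or the toolkit's actual implementation; the function names `dynamic_batches` and `assign_round_robin`, the per-item `costs`, and the `budget` and `max_batch` parameters are all hypothetical, chosen only for illustration.

```python
from typing import Dict, List, Sequence


def dynamic_batches(costs: Sequence[int], budget: int, max_batch: int) -> List[List[int]]:
    """Group item indices into batches whose summed cost fits a memory budget.

    `costs` is a hypothetical per-item memory estimate (for example, pixels
    per page). An item that alone exceeds `budget` still gets its own batch.
    """
    batches: List[List[int]] = []
    current: List[int] = []
    used = 0
    for idx, cost in enumerate(costs):
        # Close the current batch when adding this item would break a limit.
        if current and (used + cost > budget or len(current) >= max_batch):
            batches.append(current)
            current, used = [], 0
        current.append(idx)
        used += cost
    if current:
        batches.append(current)
    return batches


def assign_round_robin(batches: List[List[int]], num_devices: int) -> Dict[int, List[List[int]]]:
    """Data parallelism sketch: spread batches over devices in round-robin order."""
    plan: Dict[int, List[List[int]]] = {d: [] for d in range(num_devices)}
    for i, batch in enumerate(batches):
        plan[i % num_devices].append(batch)
    return plan


if __name__ == "__main__":
    batches = dynamic_batches([4, 3, 5, 2, 6], budget=8, max_batch=3)
    print(batches)
    print(assign_round_robin(batches, num_devices=2))
```

With the sample costs above, the items are grouped as `[[0, 1], [2, 3], [4]]`, and two devices receive `[[0, 1], [4]]` and `[[2, 3]]` respectively. A real scheduler would also weigh device capability and current load, which this sketch deliberately leaves out.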
