Capability
CUDA and ROCm Kernel Compilation with Automatic Backend Selection
5 artifacts provide this capability.
Top Matches
GPTQ-based LLM quantization with fast CUDA inference.
Unique: Detects the GPU architecture and compiles matching kernels at install time. If specialized kernels (Marlin, Exllama) are unavailable, a fallback chain degrades gracefully to generic CUDA kernels. A single build system supports both NVIDIA CUDA and AMD ROCm without manual configuration.
vs others: More convenient than manual kernel compilation because it detects GPU architecture automatically, and more flexible than pre-built wheels because it supports custom CUDA/ROCm versions and GPU architectures. Fallback chains prevent installation failures on unsupported hardware.
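The backend detection and fallback chain described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual build logic: the function names `select_backend` and `kernel_chain`, the `KERNELS` table, and the capability thresholds in it are all hypothetical. The version strings are assumed to come from something like PyTorch's `torch.version.cuda` and `torch.version.hip`, each of which is `None` when that toolkit is absent.

```python
from typing import List, Optional, Tuple

# Hypothetical kernel table, ordered from most to least specialized:
# (name, backends it is assumed to build on, assumed minimum CUDA compute capability).
KERNELS = [
    ("marlin", {"cuda"}, (8, 0)),
    ("exllama", {"cuda", "rocm"}, (6, 0)),
    ("generic", {"cuda", "rocm"}, (0, 0)),
]


def select_backend(cuda_version: Optional[str], hip_version: Optional[str]) -> Optional[str]:
    """Pick a backend label from toolkit version strings (None means not present)."""
    if hip_version is not None:
        return "rocm"
    if cuda_version is not None:
        return "cuda"
    return None  # no GPU toolkit detected


def kernel_chain(backend: str, capability: Optional[Tuple[int, int]] = None) -> List[str]:
    """Return kernel names to try in order, ending with the generic fallback."""
    chain = []
    for name, backends, min_capability in KERNELS:
        if backend not in backends:
            continue
        # Capability gating is applied only to CUDA in this sketch.
        if backend == "cuda" and capability is not None and capability < min_capability:
            continue
        chain.append(name)
    return chain


if __name__ == "__main__":
    backend = select_backend(cuda_version="12.1", hip_version=None)
    print(backend, kernel_chain(backend, capability=(7, 5)))
```

Because the generic entry has no capability requirement in this sketch, the chain is never empty for a detected GPU backend, which mirrors the claim that fallbacks prevent installation failures on unsupported hardware.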