vllm/csrc/quantization/marlin
Robert Shaw c0c2335ce0
Integrate Marlin Kernels for Int4 GPTQ inference (#2497)
Co-authored-by: Robert Shaw <114415538+rib-2@users.noreply.github.com>
Co-authored-by: alexm <alexm@neuralmagic.com>
2024-03-01 12:47:51 -08:00
..
LICENSE Integrate Marlin Kernels for Int4 GPTQ inference (#2497) 2024-03-01 12:47:51 -08:00
marlin_cuda_kernel.cu Integrate Marlin Kernels for Int4 GPTQ inference (#2497) 2024-03-01 12:47:51 -08:00