vllm/csrc/quantization/marlin/sparse
2024-05-30 21:02:11 -05:00
Name | Last commit | Date
common | [Kernel] Marlin_24: Ensure the mma.sp instruction is using the ::ordered_metadata modifier (introduced with PTX 8.5) (#5136) | 2024-05-30 21:02:11 -05:00
LICENSE | Add GPTQ Marlin 2:4 sparse structured support (#4790) | 2024-05-16 12:56:15 -04:00
marlin_24_cuda_kernel.cu | Marlin 24 prefill performance improvement (about 25% better on average) (#4983) | 2024-05-23 02:39:27 -04:00