vllm/csrc/quantization/marlin/sparse
2024-05-30 22:00:26 -07:00
..
common Revert "[Kernel] Marlin_24: Ensure the mma.sp instruction is using the ::ordered_metadata modifier (introduced with PTX 8.5)" (#5149) 2024-05-30 22:00:26 -07:00
LICENSE Add GPTQ Marlin 2:4 sparse structured support (#4790) 2024-05-16 12:56:15 -04:00
marlin_24_cuda_kernel.cu Marlin 24 prefill performance improvement (about 25% better on average) (#4983) 2024-05-23 02:39:27 -04:00