vllm/csrc/quantization/gptq_marlin
2024-05-16 09:55:29 -04:00
gptq_marlin_dtypes.cuh [Kernel] add bfloat16 support for gptq marlin kernel (#4788) 2024-05-16 09:55:29 -04:00
gptq_marlin_repack.cu [Kernel] Support running GPTQ 8-bit models in Marlin (#4533) 2024-05-02 12:56:22 -04:00
gptq_marlin.cu [Kernel] add bfloat16 support for gptq marlin kernel (#4788) 2024-05-16 09:55:29 -04:00
gptq_marlin.cuh [Kernel] Support running GPTQ 8-bit models in Marlin (#4533) 2024-05-02 12:56:22 -04:00