vllm/csrc/quantization/gguf
2024-08-16 14:00:11 -07:00
File            Last commit message                                                         Date
dequantize.cuh  [Kernel] fix types used in aqlm and ggml kernels to support dynamo (#7596)  2024-08-16 14:00:11 -07:00
ggml-common.h   [Core] Support loading GGUF model (#5191)                                   2024-08-05 17:54:23 -06:00
gguf_kernel.cu  [Kernel] fix types used in aqlm and ggml kernels to support dynamo (#7596)  2024-08-16 14:00:11 -07:00
mmq.cuh         [Core] Support loading GGUF model (#5191)                                   2024-08-05 17:54:23 -06:00
mmvq.cuh        [Core] Support loading GGUF model (#5191)                                   2024-08-05 17:54:23 -06:00
vecdotq.cuh     [Core] Support loading GGUF model (#5191)                                   2024-08-05 17:54:23 -06:00