vllm/quantization at 07feecde1a69859d565786a7ad64c0f604f17b28 - vllm

Tyler Michael Smith 348616ac4b [Kernel] Suppress mma.sp warning on CUDA 12.5 and later (#5401)	2024-06-14 10:02:00 -07:00
aqlm	[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)	2024-06-09 16:23:30 -04:00
awq	[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)	2024-06-09 16:23:30 -04:00
compressed_tensors	[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)	2024-06-09 16:23:30 -04:00
cutlass_w8a8	[Kernel] Fix CUTLASS 3.x custom broadcast load epilogue (#5516)	2024-06-14 09:30:15 -07:00
fp8	[Kernel] Vectorized FP8 quantize kernel (#5396)	2024-06-12 14:07:26 -07:00
gptq	[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)	2024-06-09 16:23:30 -04:00
gptq_marlin	[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)	2024-06-09 16:23:30 -04:00
marlin	[Kernel] Suppress mma.sp warning on CUDA 12.5 and later (#5401)	2024-06-14 10:02:00 -07:00
squeezellm	[Kernel][Misc] Use TORCH_LIBRARY instead of PYBIND11_MODULE for custom ops (#5047)	2024-06-09 16:23:30 -04:00