vllm/csrc
Robert Shaw c0c2335ce0
Integrate Marlin Kernels for Int4 GPTQ inference (#2497)
Co-authored-by: Robert Shaw <114415538+rib-2@users.noreply.github.com>
Co-authored-by: alexm <alexm@neuralmagic.com>
2024-03-01 12:47:51 -08:00
attention                        Fix compile error when using rocm (#2648)                      2024-02-01 09:35:09 -08:00
moe                              Add fused top-K softmax kernel for MoE (#2769)                 2024-02-05 17:38:02 -08:00
punica                           Add LoRA support for Gemma (#3050)                             2024-02-28 13:03:28 -08:00
quantization                     Integrate Marlin Kernels for Int4 GPTQ inference (#2497)       2024-03-01 12:47:51 -08:00
activation_kernels.cu            Optimize GeGLU layer in Gemma (#2975)                          2024-02-21 20:17:52 -08:00
cache_kernels.cu                 [Minor] Remove gather_cached_kv kernel (#3043)                 2024-02-26 15:00:54 -08:00
cache.h                          [Minor] Remove gather_cached_kv kernel (#3043)                 2024-02-26 15:00:54 -08:00
cuda_compat.h                    Merge EmbeddedLLM/vllm-rocm into vLLM main (#1836)             2023-12-07 23:16:52 -08:00
cuda_utils_kernels.cu            [ROCm] add support to ROCm 6.0 and MI300 (#2274)               2024-01-26 12:41:10 -08:00
cuda_utils.h                     [ROCm] add support to ROCm 6.0 and MI300 (#2274)               2024-01-26 12:41:10 -08:00
custom_all_reduce_test.cu        Implement custom all reduce kernels (#2192)                    2024-01-27 12:46:35 -08:00
custom_all_reduce.cu             Implement custom all reduce kernels (#2192)                    2024-01-27 12:46:35 -08:00
custom_all_reduce.cuh            No repeated IPC open (#2642)                                   2024-01-29 10:46:29 -08:00
dispatch_utils.h                 DeepseekMoE support with Fused MoE kernel (#2453)              2024-01-29 21:19:48 -08:00
layernorm_kernels.cu             [FIX] Support non-zero CUDA devices in custom kernels (#1959)  2024-01-02 19:09:59 -08:00
moe_align_block_size_kernels.cu  Fused MOE for Mixtral (#2542)                                  2024-01-29 22:43:37 -08:00
ops.h                            Integrate Marlin Kernels for Int4 GPTQ inference (#2497)       2024-03-01 12:47:51 -08:00
pos_encoding_kernels.cu          [FIX] Support non-zero CUDA devices in custom kernels (#1959)  2024-01-02 19:09:59 -08:00
pybind.cpp                       Integrate Marlin Kernels for Int4 GPTQ inference (#2497)       2024-03-01 12:47:51 -08:00
reduction_utils.cuh              Merge EmbeddedLLM/vllm-rocm into vLLM main (#1836)             2023-12-07 23:16:52 -08:00