vllm/kernels at main - vllm

ElizaWszola b00b33d77e [Model][Quantization] HQQ support through Marlin kernel expansion (#9766) Signed-off-by: ElizaWszola <eliza@neuralmagic.com>	2024-11-19 13:31:12 -08:00
benchmark_aqlm.py	[Frontend] Add FlexibleArgumentParser to support both underscore and dash in names (#5718)	2024-06-20 17:00:13 -06:00
benchmark_layernorm.py	[Hardware] using current_platform.seed_everything (#9785)	2024-10-29 14:47:44 +00:00
benchmark_machete.py	[Model][Quantization] HQQ support through Marlin kernel expansion (#9766)	2024-11-19 13:31:12 -08:00
benchmark_marlin.py	[Model][Quantization] HQQ support through Marlin kernel expansion (#9766)	2024-11-19 13:31:12 -08:00
benchmark_moe.py	[Hardware] using current_platform.seed_everything (#9785)	2024-10-29 14:47:44 +00:00
benchmark_paged_attention.py	[Hardware] using current_platform.seed_everything (#9785)	2024-10-29 14:47:44 +00:00
benchmark_quant.py	[Hardware] using current_platform.seed_everything (#9785)	2024-10-29 14:47:44 +00:00
benchmark_rope.py	[Hardware] using current_platform.seed_everything (#9785)	2024-10-29 14:47:44 +00:00
benchmark_shapes.py	Add marlin unit tests and marlin benchmark script (#4815)	2024-05-16 09:36:49 -04:00
graph_machete_bench.py	[Kernel] Initial Machete W4A8 support + Refactors (#9855)	2024-11-18 12:59:29 -07:00
requirements.txt	[Kernel] (2/N) Machete - Integrate into CompressedTensorsWNA16 and GPTQMarlin (#7701)	2024-09-23 13:46:26 -04:00
weight_shapes.py	[Kernel] Initial Machete W4A8 support + Refactors (#9855)	2024-11-18 12:59:29 -07:00