vllm/kernels at 77d9e514a2284d5d0bd34b1518b9483ae7d8a05a - vllm

Luka Govedič 7937009a7e [Kernel] Replaced `blockReduce[...]` functions with `cub::BlockReduce` (#7233) Co-authored-by: Michael Goin <michael@neuralmagic.com>	2024-08-21 20:18:00 -04:00
benchmark_aqlm.py	[Frontend] Add FlexibleArgumentParser to support both underscore and dash in names (#5718)	2024-06-20 17:00:13 -06:00
benchmark_layernorm.py	[Kernel] Replaced `blockReduce[...]` functions with `cub::BlockReduce` (#7233)	2024-08-21 20:18:00 -04:00
benchmark_machete.py	[Kernel] (1/N) Machete - Hopper Optimized Mixed Precision Linear Kernel (#7174)	2024-08-20 07:09:33 -06:00
benchmark_marlin.py	[Misc] Disambiguate quantized types via a new ScalarType (#6396)	2024-08-02 13:51:58 -07:00
benchmark_moe.py	[Kernel] W8A16 Int8 inside FusedMoE (#7415)	2024-08-16 10:06:51 -07:00
benchmark_paged_attention.py	[Model] H2O Danube3-4b (#6451)	2024-07-26 20:47:50 -07:00
benchmark_quant.py	[Kernel] Replaced `blockReduce[...]` functions with `cub::BlockReduce` (#7233)	2024-08-21 20:18:00 -04:00
benchmark_rope.py	[Model] H2O Danube3-4b (#6451)	2024-07-26 20:47:50 -07:00
benchmark_shapes.py	Add marlin unit tests and marlin benchmark script (#4815)	2024-05-16 09:36:49 -04:00
graph_machete_bench.py	[Kernel] (1/N) Machete - Hopper Optimized Mixed Precision Linear Kernel (#7174)	2024-08-20 07:09:33 -06:00
weight_shapes.py	[Kernel] (1/N) Machete - Hopper Optimized Mixed Precision Linear Kernel (#7174)	2024-08-20 07:09:33 -06:00