vllm/benchmarks/kernels
2024-03-14 08:11:48 +00:00
benchmark_mixtral_moe.py [Kernel] change benchmark script so that result can be directly used; tune moe kernel in A100/H100 with tp=2,4,8 (#3389) 2024-03-14 08:11:48 +00:00
benchmark_paged_attention.py Remove hardcoded device="cuda" to support more devices (#2503) 2024-02-01 15:46:39 -08:00
benchmark_rope.py Add batched RoPE kernel (#3095) 2024-03-13 13:45:26 -07:00