vllm/benchmarks/kernels
Aaron Pham 9d104b5beb
[CI/Build] Update Ruff version (#8469)
Signed-off-by: Aaron Pham <contact@aarnphm.xyz>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2024-09-18 11:00:56 +00:00
File | Last commit | Date
benchmark_aqlm.py | [Frontend] Add FlexibleArgumentParser to support both underscore and dash in names (#5718) | 2024-06-20 17:00:13 -06:00
benchmark_layernorm.py | [CI/Build] Avoid CUDA initialization (#8534) | 2024-09-18 10:38:11 +00:00
benchmark_machete.py | [Kernel] (1/N) Machete - Hopper Optimized Mixed Precision Linear Kernel (#7174) | 2024-08-20 07:09:33 -06:00
benchmark_marlin.py | [Misc] Disambiguate quantized types via a new ScalarType (#6396) | 2024-08-02 13:51:58 -07:00
benchmark_moe.py | [CI/Build] Avoid CUDA initialization (#8534) | 2024-09-18 10:38:11 +00:00
benchmark_paged_attention.py | [CI/Build] Avoid CUDA initialization (#8534) | 2024-09-18 10:38:11 +00:00
benchmark_quant.py | [CI/Build] Avoid CUDA initialization (#8534) | 2024-09-18 10:38:11 +00:00
benchmark_rope.py | [CI/Build] Avoid CUDA initialization (#8534) | 2024-09-18 10:38:11 +00:00
benchmark_shapes.py | Add marlin unit tests and marlin benchmark script (#4815) | 2024-05-16 09:36:49 -04:00
graph_machete_bench.py | [CI/Build] Update Ruff version (#8469) | 2024-09-18 11:00:56 +00:00
weight_shapes.py | [Kernel] (1/N) Machete - Hopper Optimized Mixed Precision Linear Kernel (#7174) | 2024-08-20 07:09:33 -06:00