vllm/tests/kernels
2024-05-17 18:43:34 +09:00
File | Last commit | Date
__init__.py | [CI/Build] Move test_utils.py to tests/utils.py (#4425) | 2024-05-13 23:50:09 +09:00
allclose_default.py | [ROCm] Fix some kernels failed unit tests (#2498) | 2024-02-05 14:25:36 -08:00
conftest.py | [Kernel] Use flashinfer for decoding (#4353) | 2024-05-03 15:51:27 -07:00
test_activation.py | [CI/Build] Move test_utils.py to tests/utils.py (#4425) | 2024-05-13 23:50:09 +09:00
test_attention.py | [CI/Build] Move test_utils.py to tests/utils.py (#4425) | 2024-05-13 23:50:09 +09:00
test_cache.py | [Kernel] Refactor FP8 kv-cache with NVIDIA float8_e4m3 support (#4535) | 2024-05-09 18:04:17 -06:00
test_cutlass.py | [Kernel] Add w8a8 CUTLASS kernels (#4749) | 2024-05-16 18:32:50 -04:00
test_layernorm.py | [Kernel] Layernorm performance optimization (#3662) | 2024-03-30 14:26:38 -07:00
test_marlin_gemm.py | Add marlin unit tests and marlin benchmark script (#4815) | 2024-05-16 09:36:49 -04:00
test_moe.py | [Kernel] Support MoE Fp8 Checkpoints for Mixtral (Static Weights with Dynamic/Static Activations) (#4527) | 2024-05-04 11:45:16 -07:00
test_pos_encoding.py | [Bugfix] fix rope error when load models with different dtypes (#4835) | 2024-05-17 18:43:34 +09:00
test_prefix_prefill.py | [Bugfix][Kernel] allow non-power-of-2 for prefix prefill with alibi (#4573) | 2024-05-08 09:19:58 -07:00
test_rand.py | [CI] Try introducing isort. (#3495) | 2024-03-25 07:59:47 -07:00
test_sampler.py | [CI] Try introducing isort. (#3495) | 2024-03-25 07:59:47 -07:00