vllm/tests/basic_correctness
wang.yuqi 6e36f4fa6c
improve chunked prefill performance
[Bugfix] Fix #7592 vllm 0.5.4 enable_chunked_prefill throughput is slightly lower than 0.5.3~0.5.0. (#7874)
2024-09-02 14:20:12 -07:00
__init__.py [CI/Build] Move test_utils.py to tests/utils.py (#4425) 2024-05-13 23:50:09 +09:00
test_basic_correctness.py [core][distributed] simplify code to support pipeline parallel (#6406) 2024-07-14 21:20:51 -07:00
test_chunked_prefill.py improve chunked prefill performance 2024-09-02 14:20:12 -07:00
test_cpu_offload.py [CI] Move quantization cpu offload tests out of fastcheck (#7574) 2024-08-15 21:16:20 -07:00
test_preemption.py [Core] Asynchronous Output Processor (#7049) 2024-08-26 20:53:20 -07:00