vllm/tests/entrypoints/llm
Joe Runde de4008e2ab
[Bugfix][Core] Use torch.cuda.memory_stats() to profile peak memory usage (#9352)
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
2024-10-17 22:47:27 -04:00
__init__.py [CI/Build] [3/3] Reorganize entrypoints tests (#5966) 2024-06-30 12:58:49 +08:00
test_encode.py [Core] rename PromptInputs and inputs (#8876) 2024-09-26 20:35:15 -07:00
test_generate_multiple_loras.py [Core] Support load and unload LoRA in api server (#6566) 2024-09-05 18:10:33 -07:00
test_generate.py [Core] rename PromptInputs and inputs (#8876) 2024-09-26 20:35:15 -07:00
test_guided_generate.py [Frontend][Core] Move guided decoding params into sampling params (#8252) 2024-10-01 09:34:25 +08:00
test_lazy_outlines.py [Bugfix][Core] Use torch.cuda.memory_stats() to profile peak memory usage (#9352) 2024-10-17 22:47:27 -04:00
test_prompt_validation.py [BugFix] Fix server crash on empty prompt (#7746) 2024-08-23 13:12:44 +00:00