vllm/entrypoints at 37ca5581039271d4bf69e5cb1f1ec8e04775777c - vllm

Woosuk Kwon 37ca558103 Optimize model execution with CUDA graph (#1926) Co-authored-by: Chen Shen <scv119@gmail.com> Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>	2023-12-16 21:12:08 -08:00
openai	Fix completion API echo and logprob combo (#1992)	2023-12-10 13:20:30 -08:00
__init__.py	Change the name to vLLM (#150)	2023-06-17 03:07:40 -07:00
api_server.py	Add `/health` Endpoint for both Servers (#1540)	2023-11-01 10:29:44 -07:00
llm.py	Optimize model execution with CUDA graph (#1926)	2023-12-16 21:12:08 -08:00