vllm/vllm/entrypoints
Woosuk Kwon 37ca558103
Optimize model execution with CUDA graph (#1926)
Co-authored-by: Chen Shen <scv119@gmail.com>
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
2023-12-16 21:12:08 -08:00
openai         Fix completion API echo and logprob combo (#1992)  2023-12-10 13:20:30 -08:00
__init__.py    Change the name to vLLM (#150)                     2023-06-17 03:07:40 -07:00
api_server.py  Add /health Endpoint for both Servers (#1540)      2023-11-01 10:29:44 -07:00
llm.py         Optimize model execution with CUDA graph (#1926)   2023-12-16 21:12:08 -08:00