vllm/vllm/entrypoints
youkaichao 1c27d25fb5
[core][model] yet another cpu offload implementation (#6496)
Co-authored-by: Michael Goin <michael@neuralmagic.com>
2024-07-17 20:54:35 -07:00
openai | [Frontend] Support for chat completions input in the tokenize endpoint (#5923) | 2024-07-16 20:18:09 +08:00
__init__.py | Change the name to vLLM (#150) | 2023-06-17 03:07:40 -07:00
api_server.py | [misc][frontend] log all available endpoints (#6195) | 2024-07-07 15:11:12 -07:00
llm.py | [core][model] yet another cpu offload implementation (#6496) | 2024-07-17 20:54:35 -07:00