vllm/vllm/entrypoints/openai

Latest commit: ed812a73fa (Robert Shaw)
[ Frontend ] Multiprocessing for OpenAI Server with zeromq (#6883)
Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>
Co-authored-by: Joe Runde <Joseph.Runde@ibm.com>
Co-authored-by: Joe Runde <joe@joerun.de>
Co-authored-by: Nick Hill <nickhill@us.ibm.com>
Co-authored-by: Simon Mo <simon.mo@hey.com>
2024-08-02 18:27:28 -07:00
Name                     Last commit message                                                                              Date
rpc                      [ Frontend ] Multiprocessing for OpenAI Server with zeromq (#6883)                               2024-08-02 18:27:28 -07:00
__init__.py              Change the name to vLLM (#150)                                                                   2023-06-17 03:07:40 -07:00
api_server.py            [ Frontend ] Multiprocessing for OpenAI Server with zeromq (#6883)                               2024-08-02 18:27:28 -07:00
cli_args.py              [ Frontend ] Multiprocessing for OpenAI Server with zeromq (#6883)                               2024-08-02 18:27:28 -07:00
logits_processors.py     [ Frontend ] Multiprocessing for OpenAI Server with zeromq (#6883)                               2024-08-02 18:27:28 -07:00
protocol.py              [Bugfix] Set SamplingParams.max_tokens for OpenAI requests if not provided by user (#6954)       2024-07-31 21:13:34 -07:00
run_batch.py             [Frontend] Refactor prompt processing (#4028)                                                    2024-07-22 10:13:53 -07:00
serving_chat.py          [ Frontend ] Multiprocessing for OpenAI Server with zeromq (#6883)                               2024-08-02 18:27:28 -07:00
serving_completion.py    [ Frontend ] Multiprocessing for OpenAI Server with zeromq (#6883)                               2024-08-02 18:27:28 -07:00
serving_embedding.py     [ Frontend ] Multiprocessing for OpenAI Server with zeromq (#6883)                               2024-08-02 18:27:28 -07:00
serving_engine.py        [ Frontend ] Multiprocessing for OpenAI Server with zeromq (#6883)                               2024-08-02 18:27:28 -07:00
serving_tokenization.py  [ Frontend ] Multiprocessing for OpenAI Server with zeromq (#6883)                               2024-08-02 18:27:28 -07:00