vllm/docs/source/serving

Latest commit: 49d849b3ab by Yuan Tang
docs: Add tutorial on deploying vLLM model with KServe (#2586)
Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
2024-03-01 11:04:14 -08:00
File                          Last commit                                                     Date
deploying_with_docker.rst     [Docker] Add cuda arch list as build option (#1950)             2023-12-08 09:53:47 -08:00
deploying_with_kserve.rst     docs: Add tutorial on deploying vLLM model with KServe (#2586)  2024-03-01 11:04:14 -08:00
deploying_with_triton.rst     Add documentation to Triton server tutorial (#983)              2023-09-20 10:32:40 -07:00
distributed_serving.rst       [Doc] Documentation for distributed inference (#261)            2023-06-26 11:34:23 -07:00
metrics.rst                   Add Production Metrics in Prometheus format (#1890)             2023-12-02 16:37:44 -08:00
run_on_sky.rst                Update run_on_sky.rst (#2025)                                   2023-12-11 10:32:58 -08:00
serving_with_langchain.rst    docs: fix langchain (#2736)                                     2024-02-03 18:17:55 -08:00