vllm/worker at 03dccc886ef7e5d0dd67512f3e9748ee00c21fb2 - vllm

youkaichao ea3890a5f0 [Core][Distributed] code deduplication in tp&pp with coordinator(#5293)	2024-06-12 17:27:08 -07:00
__init__.py	Change the name to vLLM (#150)	2023-06-17 03:07:40 -07:00
cache_engine.py	[Hardware] Initial TPU integration (#5292)	2024-06-12 11:53:03 -07:00
cpu_model_runner.py	[Bugfix] Fix wrong multi_modal_input format for CPU runner (#5451)	2024-06-12 16:20:18 -07:00
cpu_worker.py	[Misc] Enhance attention selector (#4751)	2024-05-13 10:47:25 -07:00
embedding_model_runner.py	[Core] Support image processor (#4197)	2024-06-02 22:56:41 -07:00
model_runner.py	[Core][Distributed] code deduplication in tp&pp with coordinator(#5293)	2024-06-12 17:27:08 -07:00
neuron_model_runner.py	[Core][Model runner refactoring 1/N] Refactor attn metadata term (#4518)	2024-05-03 10:20:12 -07:00
neuron_worker.py	[Core] RayWorkerVllm --> WorkerWrapper to reduce duplication (#4024)	2024-04-17 08:34:33 +00:00
tpu_model_runner.py	[Hardware] Initial TPU integration (#5292)	2024-06-12 11:53:03 -07:00
tpu_worker.py	[Hardware] Initial TPU integration (#5292)	2024-06-12 11:53:03 -07:00
worker_base.py	[Core][Optimization] remove vllm-nccl (#5091)	2024-05-29 05:13:52 +00:00
worker.py	[Frontend] [Core] Support for sharded tensorized models (#4990)	2024-06-12 14:13:52 -07:00