vllm/vllm/worker
Michał Moskal d4f3985907
[Core] Sliding window for block manager v2 (#4545)
Co-authored-by: Ruth Evans <ruthevans@Ruths-MacBook-Pro.local>
2024-05-28 11:07:07 +09:00
| File | Last commit | Date |
| --- | --- | --- |
| __init__.py | Change the name to vLLM (#150) | 2023-06-17 03:07:40 -07:00 |
| cache_engine.py | [Core] Sliding window for block manager v2 (#4545) | 2024-05-28 11:07:07 +09:00 |
| cpu_model_runner.py | [Core][2/N] Model runner refactoring part 2. Combine prepare prefill / decode to a single API (#4681) | 2024-05-15 14:00:10 +09:00 |
| cpu_worker.py | [Misc] Enhance attention selector (#4751) | 2024-05-13 10:47:25 -07:00 |
| embedding_model_runner.py | [Core] Eliminate parallel worker per-step task scheduling overhead (#4894) | 2024-05-23 06:17:27 +09:00 |
| model_runner.py | [Core] Sliding window for block manager v2 (#4545) | 2024-05-28 11:07:07 +09:00 |
| neuron_model_runner.py | [Core][Model runner refactoring 1/N] Refactor attn metadata term (#4518) | 2024-05-03 10:20:12 -07:00 |
| neuron_worker.py | [Core] RayWorkerVllm --> WorkerWrapper to reduce duplication (#4024) | 2024-04-17 08:34:33 +00:00 |
| worker_base.py | [Core] Eliminate parallel worker per-step task scheduling overhead (#4894) | 2024-05-23 06:17:27 +09:00 |
| worker.py | [Core] Eliminate parallel worker per-step task scheduling overhead (#4894) | 2024-05-23 06:17:27 +09:00 |