vllm/vllm/worker
2024-02-28 09:34:34 -08:00
spec_decode [Speculative decoding 2/9] Multi-step worker for draft model (#2424) 2024-01-21 16:31:47 -08:00
__init__.py Change the name to vLLM (#150) 2023-06-17 03:07:40 -07:00
cache_engine.py [Neuron] Support inference with transformers-neuronx (#2569) 2024-02-28 09:34:34 -08:00
model_runner.py [Neuron] Support inference with transformers-neuronx (#2569) 2024-02-28 09:34:34 -08:00
neuron_worker.py [Neuron] Support inference with transformers-neuronx (#2569) 2024-02-28 09:34:34 -08:00
worker.py [Minor] Small fix to make distributed init logic in worker looks cleaner (#2905) 2024-02-18 14:39:00 -08:00