vllm/worker at c0c2335ce027486d254c31f665ce00d7db427d22 - vllm

Liangfu Chen	3b7178cfa4	[Neuron] Support inference with transformers-neuronx (#2569)	2024-02-28 09:34:34 -08:00
spec_decode	[Speculative decoding 2/9] Multi-step worker for draft model (#2424)	2024-01-21 16:31:47 -08:00
__init__.py	Change the name to vLLM (#150)	2023-06-17 03:07:40 -07:00
cache_engine.py	[Neuron] Support inference with transformers-neuronx (#2569)	2024-02-28 09:34:34 -08:00
model_runner.py	[Neuron] Support inference with transformers-neuronx (#2569)	2024-02-28 09:34:34 -08:00
neuron_worker.py	[Neuron] Support inference with transformers-neuronx (#2569)	2024-02-28 09:34:34 -08:00
worker.py	[Minor] Small fix to make distributed init logic in worker looks cleaner (#2905)	2024-02-18 14:39:00 -08:00