vllm/worker at f48c6791b7bfc2579ad575d33ed83912f0bfb011 - vllm

Antoni Baum 22de45235c Push logprob generation to LLMEngine (#3065) Co-authored-by: Avnish Narayan <avnish@anyscale.com>	2024-03-04 19:54:06 +00:00
spec_decode	Push logprob generation to LLMEngine (#3065)	2024-03-04 19:54:06 +00:00
__init__.py	[Speculative decoding 2/9] Multi-step worker for draft model (#2424)	2024-01-21 16:31:47 -08:00
test_model_runner.py	Remove hardcoded `device="cuda"` to support more devices (#2503)	2024-02-01 15:46:39 -08:00