vllm/vllm/executor
2024-04-18 16:15:12 -07:00
__init__.py Add distributed model executor abstraction (#3191) 2024-03-11 11:03:45 -07:00
cpu_executor.py [Speculative decoding 6/9] Integrate speculative decoding with LLMEngine (#3894) 2024-04-16 13:09:21 -07:00
executor_base.py [Speculative decoding 6/9] Integrate speculative decoding with LLMEngine (#3894) 2024-04-16 13:09:21 -07:00
gpu_executor.py [Misc] [CI] Fix CI failure caught after merge (#4126) 2024-04-16 17:56:01 -07:00
neuron_executor.py [CI/CD] add neuron docker and ci test scripts (#3571) 2024-04-18 15:26:01 -07:00
ray_gpu_executor.py [Core] add an option to log every function call to for debugging hang/crash in distributed inference (#4079) 2024-04-18 16:15:12 -07:00