vllm/vllm/worker
2024-08-16 21:15:13 -07:00
__init__.py Change the name to vLLM (#150) 2023-06-17 03:07:40 -07:00
cache_engine.py [Model] Jamba support (#4115) 2024-07-02 23:11:29 +00:00
cpu_model_runner.py [Bugfix] Fix broadcasting logic for multi_modal_kwargs (#6836) 2024-07-31 10:38:45 +08:00
cpu_worker.py [Hardware] [Intel] Enable Multiprocessing and tensor parallel in CPU backend and update documentation (#6125) 2024-07-26 13:50:10 -07:00
embedding_model_runner.py [Core] Add span metrics for model_forward, scheduler and sampler time (#7089) 2024-08-09 13:55:13 -07:00
enc_dec_model_runner.py [VLM][Core] Support profiling with multiple multi-modal inputs per prompt (#7126) 2024-08-14 17:55:42 +00:00
model_runner_base.py [BugFix] Fix use of per-request seed with pipeline parallel (#6698) 2024-07-30 10:40:08 -07:00
model_runner.py [Core] Fix tracking of model forward time in case of PP>1 (#7440) 2024-08-16 13:46:01 -07:00
neuron_model_runner.py [Bugfix] update neuron for version > 0.5.0 (#7175) 2024-08-15 09:44:14 -07:00
neuron_worker.py [Bugfix] update neuron for version > 0.5.0 (#7175) 2024-08-15 09:44:14 -07:00
openvino_model_runner.py [Bugfix] Fix broadcasting logic for multi_modal_kwargs (#6836) 2024-07-31 10:38:45 +08:00
openvino_worker.py [core][distributed] support n layers % pp size != 0 (#6115) 2024-07-03 16:40:31 -07:00
tpu_model_runner.py [TPU] Use mark_dynamic to reduce compilation time (#7340) 2024-08-10 18:12:22 -07:00
tpu_worker.py [TPU] Set per-rank XLA cache (#7533) 2024-08-14 14:47:51 -07:00
utils.py [Core] Subclass ModelRunner to support cross-attention & encoder sequences (towards eventual encoder/decoder model support) (#4942) 2024-08-06 16:51:47 -04:00
worker_base.py [misc][plugin] add plugin system implementation (#7426) 2024-08-13 16:24:17 -07:00
worker.py [misc] use nvml to get consistent device name (#7582) 2024-08-16 21:15:13 -07:00
xpu_model_runner.py [VLM][Core] Support profiling with multiple multi-modal inputs per prompt (#7126) 2024-08-14 17:55:42 +00:00
xpu_worker.py [ci] set timeout for test_oot_registration.py (#7082) 2024-08-02 10:03:24 -07:00