vllm/spec_decode at f205c09854853172a446c92aa81eb7199da324ab - vllm

Jonas M. Kübler f205c09854 [Bugfix] Unify rank computation across regular decoding and speculative decoding (#7899)		2024-08-28 22:18:13 -07:00
__init__.py	[Bugfix] Add `__init__.py` files for `vllm/core/block/` and `vllm/spec_decode/` (#3798)	2024-04-02 12:35:31 -07:00
batch_expansion.py	[Spec Decoding] Streamline batch expansion tensor manipulation (#7851)	2024-08-25 15:45:14 -07:00
draft_model_runner.py	[Speculative Decoding] EAGLE Implementation with Top-1 proposer (#6830)	2024-08-22 02:42:24 -07:00
interfaces.py	[BUGFIX] Raise an error for no draft token case when draft_tp>1 (#6369)	2024-07-19 06:01:09 -07:00
medusa_worker.py	[Speculative decoding] [Multi-Step] decouple should_modify_greedy_probs_inplace (#6971)	2024-08-09 05:42:45 +00:00
metrics.py	[Core] Optimize SPMD architecture with delta + serialization optimization (#7109)	2024-08-18 17:57:20 -07:00
mlp_speculator_worker.py	[BugFix] Fix use of per-request seed with pipeline parallel (#6698)	2024-07-30 10:40:08 -07:00
multi_step_worker.py	[Speculative Decoding] EAGLE Implementation with Top-1 proposer (#6830)	2024-08-22 02:42:24 -07:00
ngram_worker.py	[BugFix] Fix use of per-request seed with pipeline parallel (#6698)	2024-07-30 10:40:08 -07:00
proposer_worker_base.py	[Speculative decoding] [Multi-Step] decouple should_modify_greedy_probs_inplace (#6971)	2024-08-09 05:42:45 +00:00
smaller_tp_proposer_worker.py	[Speculative decoding] [Multi-Step] decouple should_modify_greedy_probs_inplace (#6971)	2024-08-09 05:42:45 +00:00
spec_decode_worker.py	[Spec Decoding] Streamline batch expansion tensor manipulation (#7851)	2024-08-25 15:45:14 -07:00
target_model_runner.py	[VLM] Refactor `MultiModalConfig` initialization and profiling (#7530)	2024-08-17 13:30:55 -07:00
top1_proposer.py	[Spec Decoding] Streamline batch expansion tensor manipulation (#7851)	2024-08-25 15:45:14 -07:00
util.py	[Bugfix] Unify rank computation across regular decoding and speculative decoding (#7899)	2024-08-28 22:18:13 -07:00