vllm/spec_decode at 6fc5b0f249396c6fb3a63b3175cd9892e7fedd9b - vllm

Mahesh Keralapura 933790c209 [Core] Add span metrics for model_forward, scheduler and sampler time (#7089)		2024-08-09 13:55:13 -07:00
__init__.py	[Bugfix] Add `__init__.py` files for `vllm/core/block/` and `vllm/spec_decode/` (#3798)	2024-04-02 12:35:31 -07:00
batch_expansion.py	[BugFix] Fix use of per-request seed with pipeline parallel (#6698)	2024-07-30 10:40:08 -07:00
draft_model_runner.py	[Core] Add span metrics for model_forward, scheduler and sampler time (#7089)	2024-08-09 13:55:13 -07:00
interfaces.py	[BUGFIX] Raise an error for no draft token case when draft_tp>1 (#6369)	2024-07-19 06:01:09 -07:00
medusa_worker.py	[Speculative decoding] [Multi-Step] decouple should_modify_greedy_probs_inplace (#6971)	2024-08-09 05:42:45 +00:00
metrics.py	[Bugfix] [SpecDecode] AsyncMetricsCollector: update time since last collection (#6578)	2024-07-19 14:01:03 -07:00
mlp_speculator_worker.py	[BugFix] Fix use of per-request seed with pipeline parallel (#6698)	2024-07-30 10:40:08 -07:00
multi_step_worker.py	[Speculative decoding] [Multi-Step] decouple should_modify_greedy_probs_inplace (#6971)	2024-08-09 05:42:45 +00:00
ngram_worker.py	[BugFix] Fix use of per-request seed with pipeline parallel (#6698)	2024-07-30 10:40:08 -07:00
proposer_worker_base.py	[Speculative decoding] [Multi-Step] decouple should_modify_greedy_probs_inplace (#6971)	2024-08-09 05:42:45 +00:00
smaller_tp_proposer_worker.py	[Speculative decoding] [Multi-Step] decouple should_modify_greedy_probs_inplace (#6971)	2024-08-09 05:42:45 +00:00
spec_decode_worker.py	[Speculative decoding] [Multi-Step] decouple should_modify_greedy_probs_inplace (#6971)	2024-08-09 05:42:45 +00:00
target_model_runner.py	[Core] Add span metrics for model_forward, scheduler and sampler time (#7089)	2024-08-09 13:55:13 -07:00
top1_proposer.py	[BUGFIX] Raise an error for no draft token case when draft_tp>1 (#6369)	2024-07-19 06:01:09 -07:00
util.py	[Speculative decoding] Add periodic log with time spent in proposal/scoring/verification (#6963)	2024-08-05 08:46:44 +00:00