vllm/vllm/spec_decode
Travis Johnson cc0eaf12b1
[Bugfix] spec decode handle None entries in topk args in create_sequence_group_output (#7232)
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
2024-08-22 09:33:48 -04:00
..
__init__.py [Bugfix] Add __init__.py files for vllm/core/block/ and vllm/spec_decode/ (#3798) 2024-04-02 12:35:31 -07:00
batch_expansion.py [Speculative Decoding] Fixing hidden states handling in batch expansion (#7508) 2024-08-19 17:58:14 -07:00
draft_model_runner.py [Speculative Decoding] EAGLE Implementation with Top-1 proposer (#6830) 2024-08-22 02:42:24 -07:00
interfaces.py [BUGFIX] Raise an error for no draft token case when draft_tp>1 (#6369) 2024-07-19 06:01:09 -07:00
medusa_worker.py [Speculative decoding] [Multi-Step] decouple should_modify_greedy_probs_inplace (#6971) 2024-08-09 05:42:45 +00:00
metrics.py [Core] Optimize SPMD architecture with delta + serialization optimization (#7109) 2024-08-18 17:57:20 -07:00
mlp_speculator_worker.py [BugFix] Fix use of per-request seed with pipeline parallel (#6698) 2024-07-30 10:40:08 -07:00
multi_step_worker.py [Speculative Decoding] EAGLE Implementation with Top-1 proposer (#6830) 2024-08-22 02:42:24 -07:00
ngram_worker.py [BugFix] Fix use of per-request seed with pipeline parallel (#6698) 2024-07-30 10:40:08 -07:00
proposer_worker_base.py [Speculative decoding] [Multi-Step] decouple should_modify_greedy_probs_inplace (#6971) 2024-08-09 05:42:45 +00:00
smaller_tp_proposer_worker.py [Speculative decoding] [Multi-Step] decouple should_modify_greedy_probs_inplace (#6971) 2024-08-09 05:42:45 +00:00
spec_decode_worker.py [Speculative Decoding] EAGLE Implementation with Top-1 proposer (#6830) 2024-08-22 02:42:24 -07:00
target_model_runner.py [VLM] Refactor MultiModalConfig initialization and profiling (#7530) 2024-08-17 13:30:55 -07:00
top1_proposer.py [Speculative Decoding] Fixing hidden states handling in batch expansion (#7508) 2024-08-19 17:58:14 -07:00
util.py [Bugfix] spec decode handle None entries in topk args in create_sequence_group_output (#7232) 2024-08-22 09:33:48 -04:00