vllm/vllm/spec_decode
2024-10-12 05:13:37 +00:00
__init__.py [Bugfix] Add __init__.py files for vllm/core/block/ and vllm/spec_decode/ (#3798) 2024-04-02 12:35:31 -07:00
batch_expansion.py [Spec Decode] (1/2) Remove batch expansion (#8839) 2024-10-01 16:04:42 -07:00
draft_model_runner.py [Bugfix] Fix try-catch conditions to import correct Flash Attention Backend in Draft Model (#9101) 2024-10-06 13:00:04 +08:00
interfaces.py [Spec Decode] (1/2) Remove batch expansion (#8839) 2024-10-01 16:04:42 -07:00
medusa_worker.py [Core] Logprobs support in Multi-step (#7652) 2024-08-29 19:19:08 -07:00
metrics.py [CI/Build] Update Ruff version (#8469) 2024-09-18 11:00:56 +00:00
mlp_speculator_worker.py [Core] Logprobs support in Multi-step (#7652) 2024-08-29 19:19:08 -07:00
mqa_scorer.py [SpecDec] Remove Batch Expansion (2/3) (#9298) 2024-10-12 05:13:37 +00:00
multi_step_worker.py [Core] Logprobs support in Multi-step (#7652) 2024-08-29 19:19:08 -07:00
ngram_worker.py [Core] Logprobs support in Multi-step (#7652) 2024-08-29 19:19:08 -07:00
proposer_worker_base.py [Core] Logprobs support in Multi-step (#7652) 2024-08-29 19:19:08 -07:00
smaller_tp_proposer_worker.py [Core] Logprobs support in Multi-step (#7652) 2024-08-29 19:19:08 -07:00
spec_decode_worker.py [SpecDec] Remove Batch Expansion (2/3) (#9298) 2024-10-12 05:13:37 +00:00
target_model_runner.py [VLM] Refactor MultiModalConfig initialization and profiling (#7530) 2024-08-17 13:30:55 -07:00
top1_proposer.py [Core] Logprobs support in Multi-step (#7652) 2024-08-29 19:19:08 -07:00
util.py [Core][Bugfix] Support prompt_logprobs returned with speculative decoding (#8047) 2024-09-24 17:29:56 -07:00