vllm/vllm/spec_decode
Cyrus Leung e0191a95d8
[0/N] Rename MultiModalInputs to MultiModalKwargs (#10040)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2024-11-09 11:31:02 +08:00
__init__.py [Bugfix] Add __init__.py files for vllm/core/block/ and vllm/spec_decode/ (#3798) 2024-04-02 12:35:31 -07:00
batch_expansion.py [Feature] [Spec decode]: Combine chunked prefill with speculative decoding (#9291) 2024-11-07 08:15:14 -08:00
draft_model_runner.py [0/N] Rename MultiModalInputs to MultiModalKwargs (#10040) 2024-11-09 11:31:02 +08:00
interfaces.py [Spec Decode] (1/2) Remove batch expansion (#8839) 2024-10-01 16:04:42 -07:00
medusa_worker.py [Core] Logprobs support in Multi-step (#7652) 2024-08-29 19:19:08 -07:00
metrics.py [CI/Build] Update Ruff version (#8469) 2024-09-18 11:00:56 +00:00
mlp_speculator_worker.py [Core] Logprobs support in Multi-step (#7652) 2024-08-29 19:19:08 -07:00
mqa_scorer.py [Feature] [Spec decode]: Combine chunked prefill with speculative decoding (#9291) 2024-11-07 08:15:14 -08:00
multi_step_worker.py [Bugfix][SpecDecode] kv corruption with bonus tokens in spec decode (#9730) 2024-11-06 01:45:45 +00:00
ngram_worker.py [2/N] executor pass the complete config to worker/modelrunner (#9938) 2024-11-02 07:35:05 -07:00
proposer_worker_base.py [Core] Logprobs support in Multi-step (#7652) 2024-08-29 19:19:08 -07:00
smaller_tp_proposer_worker.py [Core] Logprobs support in Multi-step (#7652) 2024-08-29 19:19:08 -07:00
spec_decode_worker.py [Feature] [Spec decode]: Combine chunked prefill with speculative decoding (#9291) 2024-11-07 08:15:14 -08:00
target_model_runner.py [2/N] executor pass the complete config to worker/modelrunner (#9938) 2024-11-02 07:35:05 -07:00
top1_proposer.py [Feature] [Spec decode]: Combine chunked prefill with speculative decoding (#9291) 2024-11-07 08:15:14 -08:00
util.py [Core][Bugfix] Support prompt_logprobs returned with speculative decoding (#8047) 2024-09-24 17:29:56 -07:00