vllm/vllm/spec_decode
| File | Last commit message | Last commit date |
|---|---|---|
| __init__.py | [Bugfix] Add __init__.py files for vllm/core/block/ and vllm/spec_decode/ (#3798) | 2024-04-02 12:35:31 -07:00 |
| batch_expansion.py | [Model] MLPSpeculator speculative decoding support (#4947) | 2024-06-20 20:23:12 -04:00 |
| draft_model_runner.py | [CORE] Adding support for insertion of soft-tuned prompts (#4645) | 2024-07-09 13:26:36 -07:00 |
| interfaces.py | [Speculative Decoding] Enabling bonus token in speculative decoding for KV cache based models (#5765) | 2024-07-10 16:02:47 -07:00 |
| medusa_worker.py | [Speculative Decoding] Enabling bonus token in speculative decoding for KV cache based models (#5765) | 2024-07-10 16:02:47 -07:00 |
| metrics.py | [Speculative Decoding 2/2 ] Integrate typical acceptance sampler into Spec Decode Worker (#5348) | 2024-07-01 00:33:05 -07:00 |
| mlp_speculator_worker.py | [Speculative Decoding] Enabling bonus token in speculative decoding for KV cache based models (#5765) | 2024-07-10 16:02:47 -07:00 |
| multi_step_worker.py | [Speculative Decoding] Enabling bonus token in speculative decoding for KV cache based models (#5765) | 2024-07-10 16:02:47 -07:00 |
| ngram_worker.py | [Speculative Decoding] Enabling bonus token in speculative decoding for KV cache based models (#5765) | 2024-07-10 16:02:47 -07:00 |
| proposer_worker_base.py | [Speculative Decoding] Enabling bonus token in speculative decoding for KV cache based models (#5765) | 2024-07-10 16:02:47 -07:00 |
| smaller_tp_proposer_worker.py | [Speculative Decoding] Enabling bonus token in speculative decoding for KV cache based models (#5765) | 2024-07-10 16:02:47 -07:00 |
| spec_decode_worker.py | [Speculative Decoding] Enabling bonus token in speculative decoding for KV cache based models (#5765) | 2024-07-10 16:02:47 -07:00 |
| top1_proposer.py | [Speculative Decoding] Enabling bonus token in speculative decoding for KV cache based models (#5765) | 2024-07-10 16:02:47 -07:00 |
| util.py | [Model] MLPSpeculator speculative decoding support (#4947) | 2024-06-20 20:23:12 -04:00 |
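The files above implement vLLM's speculative decoding path: proposer workers for a draft model (multi_step_worker.py, draft_model_runner.py), n-gram lookup (ngram_worker.py), MLP speculator (mlp_speculator_worker.py), and Medusa (medusa_worker.py), coordinated by spec_decode_worker.py with batch-expansion scoring and acceptance metrics. As a point of reference, the sketch below shows how draft-model speculation was typically enabled through the offline `LLM` entrypoint around the time of this snapshot (vLLM ~0.5.x); the model names and parameter values are illustrative, not taken from this listing.

```python
from vllm import LLM, SamplingParams

# Minimal sketch, assuming a vLLM build from roughly this snapshot (~0.5.x).
# The draft model proposes up to `num_speculative_tokens` tokens per step;
# the target model then scores and accepts or rejects them.
llm = LLM(
    model="facebook/opt-6.7b",              # target model (illustrative choice)
    speculative_model="facebook/opt-125m",  # smaller draft model (illustrative choice)
    num_speculative_tokens=5,               # proposal length per step
    use_v2_block_manager=True,              # spec decode required the v2 block manager at this point
)

outputs = llm.generate(
    ["The future of AI is"],
    SamplingParams(temperature=0.0, max_tokens=64),
)
print(outputs[0].outputs[0].text)
```

For the prompt-lookup path backed by ngram_worker.py, the same era's configuration passed `speculative_model="[ngram]"` together with an `ngram_prompt_lookup_max` window instead of a separate draft model.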