vllm/tests/spec_decode/e2e
Swapnil Parekh 4d6ada947c
[CORE] Adding support for insertion of soft-tuned prompts (#4645)
Co-authored-by: Swapnil Parekh <swapnilp@ibm.com>
Co-authored-by: Joe G <joseph.granados@h2o.ai>
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
2024-07-09 13:26:36 -07:00
__init__.py [Speculative decoding 7/9] Speculative decoding end-to-end correctness tests. (#3951) 2024-04-23 08:02:36 +00:00
conftest.py [CORE] Adding support for insertion of soft-tuned prompts (#4645) 2024-07-09 13:26:36 -07:00
test_compatibility.py [Speculative decoding][Re-take] Enable TP>1 speculative decoding (#4840) 2024-05-16 00:53:51 -07:00
test_integration_dist_tp2.py [Speculative Decoding] MLPSpeculator Tensor Parallel support (1/2) (#6050) 2024-07-02 07:20:29 -07:00
test_integration_dist_tp4.py [Speculative Decoding] Support draft model on different tensor-parallel size than target model (#5414) 2024-06-25 09:56:06 +00:00
test_integration.py [Speculative decoding][Re-take] Enable TP>1 speculative decoding (#4840) 2024-05-16 00:53:51 -07:00
test_logprobs.py [Speculative decoding] Support target-model logprobs (#4378) 2024-05-03 15:52:01 -07:00
test_mlp_correctness.py [CORE] Quantized lm-head Framework (#4442) 2024-07-02 22:25:17 +00:00
test_multistep_correctness.py [Speculative Decoding 2/2 ] Integrate typical acceptance sampler into Spec Decode Worker (#5348) 2024-07-01 00:33:05 -07:00
test_ngram_correctness.py [Dynamic Spec Decoding] Minor fix for disabling speculative decoding (#5000) 2024-05-25 10:00:14 -07:00