diff --git a/docs/source/models/spec_decode.rst b/docs/source/models/spec_decode.rst
index 57ff4517..9fb62397 100644
--- a/docs/source/models/spec_decode.rst
+++ b/docs/source/models/spec_decode.rst
@@ -17,6 +17,7 @@ Speculating with a draft model
 The following code configures vLLM to use speculative decoding with a draft model, speculating 5 tokens at a time.
 
 .. code-block:: python
+
     from vllm import LLM, SamplingParams
 
     prompts = [
@@ -45,6 +46,7 @@ The following code configures vLLM to use speculative decoding where proposals a
 matching n-grams in the prompt. For more information read `this thread. `_
 
 .. code-block:: python
+
     .. code-block:: python
     from vllm import LLM, SamplingParams
 
     prompts = [
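
For context, the doc sections these hunks patch configure speculative decoding through keyword arguments to `LLM`. Below is a minimal sketch of the two variants the surrounding prose describes (a draft model proposing 5 tokens at a time, and n-gram lookup in the prompt). The model names and the `ngram_prompt_lookup_max` value are illustrative assumptions, not taken from this diff; the kwargs are collected in plain dicts so the sketch runs without vLLM installed, whereas in practice they would be passed as `vllm.LLM(**config)`:

```python
# Sketch of the two speculative-decoding configurations described in the
# patched docs. Model names are assumed examples, not from this diff.

# Variant 1: draft-model speculation, proposing 5 tokens per step.
draft_model_config = {
    "model": "facebook/opt-6.7b",              # target model (assumed example)
    "speculative_model": "facebook/opt-125m",  # small draft model (assumed example)
    "num_speculative_tokens": 5,               # speculate 5 tokens at a time
}

# Variant 2: proposals generated by matching n-grams in the prompt,
# selected via the "[ngram]" sentinel instead of a separate draft model.
ngram_config = {
    "model": "facebook/opt-6.7b",              # target model (assumed example)
    "speculative_model": "[ngram]",            # sentinel enabling n-gram lookup
    "num_speculative_tokens": 5,
    "ngram_prompt_lookup_max": 4,              # max n-gram length to match (assumed)
}

# In practice: llm = LLM(**draft_model_config)  # or LLM(**ngram_config)
print(draft_model_config["num_speculative_tokens"])  # → 5
```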