vllm/csrc/prepare_inputs
Varun Sundar Rabindranath afb050b29d
[Core] CUDA Graphs for Multi-Step + Chunked-Prefill (#8645)
Co-authored-by: Varun Sundar Rabindranath <varun@neuralmagic.com>
2024-10-02 19:44:39 +00:00
advance_step.cu [Core] CUDA Graphs for Multi-Step + Chunked-Prefill (#8645) 2024-10-02 19:44:39 +00:00
advance_step.cuh [Core] draft_model_runner: Implement prepare_inputs on GPU for advance_step (#6338) 2024-07-17 14:30:28 -07:00