vllm/docs/source/models
Eric Xihui Lin 8e192ff967
[Kernel][Backend][Model] Blocksparse flash attention kernel and Phi-3-Small model (#4799)
Co-authored-by: beagleski <yunanzhang@microsoft.com>
Co-authored-by: bapatra <bapatra@microsoft.com>
Co-authored-by: Barun Patra <codedecde@users.noreply.github.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
2024-05-24 22:00:52 -07:00
adding_model.rst [Doc]: Update the doc of adding new models (#4236) 2024-04-21 09:57:08 -07:00
engine_args.rst Don't show default value for flags in EngineArgs (#4223) 2024-04-21 09:15:28 -07:00
lora.rst [Doc] Add docs about OpenAI compatible server (#3288) 2024-03-18 22:05:34 -07:00
performance.rst [Scheduler] Warning upon preemption and Swapping (#4647) 2024-05-13 23:50:44 +09:00
supported_models.rst [Kernel][Backend][Model] Blocksparse flash attention kernel and Phi-3-Small model (#4799) 2024-05-24 22:00:52 -07:00