vllm/models at 5bd3c650721cc5de451f034bcbed37d1a1a4116c - vllm

Eric Xihui Lin 8e192ff967 [Kernel][Backend][Model] Blocksparse flash attention kernel and Phi-3-Small model (#4799) Co-authored-by: beagleski <yunanzhang@microsoft.com> Co-authored-by: bapatra <bapatra@microsoft.com> Co-authored-by: Barun Patra <codedecde@users.noreply.github.com> Co-authored-by: Michael Goin <michael@neuralmagic.com>	2024-05-24 22:00:52 -07:00
adding_model.rst	[Doc]: Update the doc of adding new models (#4236)	2024-04-21 09:57:08 -07:00
engine_args.rst	Don't show default value for flags in `EngineArgs` (#4223)	2024-04-21 09:15:28 -07:00
lora.rst	[Doc] Add docs about OpenAI compatible server (#3288)	2024-03-18 22:05:34 -07:00
performance.rst	[Scheduler] Warning upon preemption and Swapping (#4647)	2024-05-13 23:50:44 +09:00
supported_models.rst	[Kernel][Backend][Model] Blocksparse flash attention kernel and Phi-3-Small model (#4799)	2024-05-24 22:00:52 -07:00